NumPy Flashcards

1
Q

Import numpy and assign to the alias np.

A

import numpy as np

2
Q

Create a NumPy ndarray from the list [10, 20, 30]. Assign the result to the variable data_ndarray.

A

data_ndarray = np.array([10,20,30])

3
Q

What is the difference between using NumPy and using a list of lists?

A

With a list of lists, we loop over the data one row at a time, so our computer would take eight processor cycles to process the eight rows of our data.

The NumPy library takes advantage of a processor feature called Single Instruction Multiple Data (SIMD) to process data faster. SIMD allows a processor to perform the same operation on multiple data points in a single processor cycle.
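
The contrast above can be sketched as follows. The eight rows are made-up values, and whether SIMD instructions are actually used depends on the NumPy build and the processor, so treat this as an illustration of the two styles rather than a benchmark.

```python
import numpy as np

# Hypothetical data: eight rows, two columns (not from the taxi data set).
rows = [[6, 5], [1, 3], [5, 6], [1, 4], [3, 7], [5, 8], [3, 5], [8, 4]]

# List of lists: the Python loop handles one row per iteration.
loop_sums = []
for first, second in rows:
    loop_sums.append(first + second)

# ndarray: a single vectorized expression adds the two columns,
# and the per-element work happens inside NumPy's compiled code.
data = np.array(rows)
vector_sums = data[:, 0] + data[:, 1]

print(loop_sums)
print(vector_sums)
```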

4
Q

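Convert every value in a list of lists called taxi_list to a float, so that it can later be converted to a NumPy array. Assign the result to converted_taxi_list.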

A

converted_taxi_list = []
for row in taxi_list:
    converted_row = []
    for item in row:
        converted_row.append(float(item))
    converted_taxi_list.append(converted_row)

5
Q

Convert a variable called converted_taxi_list to a NumPy array. Assign the result to taxi.

A

taxi = np.array(converted_taxi_list)

6
Q

What is the difference between tuples and lists?

A

Tuples are very similar to Python lists, but can’t be modified.
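
A short sketch of that difference, using made-up values (sizes_list and sizes_tuple are illustrative names):

```python
sizes_list = [2, 3]
sizes_list[0] = 5              # works: lists can be modified in place

sizes_tuple = (2, 3)
try:
    sizes_tuple[0] = 5         # fails: tuples can't be modified
except TypeError as error:
    message = str(error)
    print(message)
```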

7
Q

If data_ndarray.shape returns the tuple (2, 3), how do you interpret it?

A

The first number tells us that there are 2 rows in data_ndarray.
The second number tells us that there are 3 columns in data_ndarray.
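
As a small check of that reading, the sketch below builds a made-up array with 2 rows and 3 columns and unpacks its shape (the values are arbitrary):

```python
import numpy as np

# Two inner lists (rows), each with three items (columns).
data_ndarray = np.array([[5, 10, 15],
                         [20, 25, 30]])

n_rows, n_columns = data_ndarray.shape
print(data_ndarray.shape)
print(n_rows, n_columns)
```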

8
Q

How do you get the shape of a numpy array called taxi?

A

taxi_shape = taxi.shape

9
Q

What is the syntax for selecting data from any 2D array?

A

ndarray[row_index,column_index]

10
Q

Which slice selects the items at indexes 1, 2, and 3?

A

[1:4]
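
A minimal sketch of both selection forms, using small made-up arrays (values and grid are illustrative names, not part of the taxi data). The stop index of a slice is excluded, which is why 1:4 covers indexes 1, 2, and 3.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])
middle = values[1:4]            # items at indexes 1, 2, and 3

# A 3x4 grid holding the numbers 0 to 11, row by row.
grid = np.arange(12).reshape(3, 4)
item = grid[1, 2]               # row index 1, column index 2
block = grid[0:2, 1:3]          # rows 0-1, columns 1-2

print(middle)
print(item)
print(block)
```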

11
Q

Select the item at row index 21 and column index 5. Assign it to row_21_column_5.

A

row_21_column_5= taxi[21,5]

12
Q

Select every column for the rows at indexes 391 to 500 inclusive. Assign them to rows_391_to_500.

A

rows_391_to_500=taxi[391:501]

13
Q

Select the row at index 0. Assign it to row_0.

A

row_0=taxi[0]

14
Q

Select every row for the columns at indexes 1, 4, and 7. Assign them to columns_1_4_7.

A

columns = [1, 4, 7]
columns_1_4_7 = taxi[:, columns]
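
To see how a list of column indexes behaves, here is a sketch on a made-up 3x8 array (grid stands in for taxi, which is assumed to be loaded elsewhere):

```python
import numpy as np

# Hypothetical stand-in for taxi: 3 rows, 8 columns, values 0 to 23.
grid = np.arange(24).reshape(3, 8)

columns = [1, 4, 7]
selected = grid[:, columns]     # every row, only columns 1, 4, and 7

print(selected)
print(selected.shape)
```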

15
Q

Select the columns at indexes 5 to 8 inclusive for the row at index 99. Assign them to row_99_columns_5_to_8.

A

row_99_columns_5_to_8= taxi[99,5:9]

16
Q

Select the rows at indexes 100 to 200 inclusive for the column at index 14. Assign them to rows_100_to_200_column_14.

A

rows_100_to_200_column_14= taxi[100:201,14]

17
Q

Use vector addition to add fare_amount and fees_amount. Assign the result to fare_and_fees.

A
fare_amount = taxi[:,9]
fees_amount = taxi[:,10]

fare_and_fees=fare_amount+fees_amount
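
Vector addition pairs up items by position. A sketch with made-up fare and fee values (the real columns come from taxi):

```python
import numpy as np

# Hypothetical values standing in for taxi[:, 9] and taxi[:, 10].
fare_amount = np.array([52.0, 45.0, 31.0])
fees_amount = np.array([0.8, 1.3, 0.8])

# Item 0 is added to item 0, item 1 to item 1, and so on.
fare_and_fees = fare_amount + fees_amount
print(fare_and_fees)
```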

18
Q

Select all of the rows from columns 0 and 1 of my_numbers. Assign them to col1 and col2.

A
col1 = my_numbers[:,0]
col2 = my_numbers[:,1]
19
Q

Use vector division to divide trip_distance_miles by trip_length_hours. Assign the result to trip_mph.

trip_distance_miles = taxi[:,7]
trip_length_seconds = taxi[:,8]
A
trip_length_hours = trip_length_seconds / 3600 
# 3600 seconds is one hour
trip_mph=trip_distance_miles/trip_length_hours
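
The same calculation on two made-up trips, chosen so the result is easy to verify by hand (30 miles in one hour, 10 miles in half an hour):

```python
import numpy as np

# Hypothetical values standing in for taxi[:, 7] and taxi[:, 8].
trip_distance_miles = np.array([30.0, 10.0])
trip_length_seconds = np.array([3600.0, 1800.0])

trip_length_hours = trip_length_seconds / 3600   # seconds -> hours
trip_mph = trip_distance_miles / trip_length_hours
print(trip_mph)
```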
20
Q

Calculate the maximum value of trip_mph. Assign the result to mph_max.

A

mph_max=trip_mph.max()

21
Q

Calculate the average value of trip_mph. Assign the result to mph_mean.

A

mph_mean=trip_mph.mean()

22
Q

Calculate the min value of trip_mph

A

mph_min = trip_mph.min()

23
Q

Calculate the median average value of trip_mph

A

np.median(trip_mph)

24
Q

What are the function representations and method representations of the common calculations on trip_mph?

A

Calculation                                    | Function Representation | Method Representation
Calculate the minimum value of trip_mph        | np.min(trip_mph)        | trip_mph.min()
Calculate the maximum value of trip_mph        | np.max(trip_mph)        | trip_mph.max()
Calculate the mean average value of trip_mph   | np.mean(trip_mph)       | trip_mph.mean()
Calculate the median average value of trip_mph | np.median(trip_mph)     | There is no ndarray median method
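
A quick sketch comparing the two forms on made-up speeds. The function and method versions return the same values, and the hasattr check confirms that ndarrays have no median method:

```python
import numpy as np

# Hypothetical speeds, not taken from the taxi data.
trip_mph = np.array([37.1, 38.6, 31.3, 25.4])

same_min = np.min(trip_mph) == trip_mph.min()
same_max = np.max(trip_mph) == trip_mph.max()
same_mean = np.mean(trip_mph) == trip_mph.mean()

mph_median = np.median(trip_mph)            # function form only
has_median_method = hasattr(trip_mph, "median")

print(same_min, same_max, same_mean)
print(mph_median, has_median_method)
```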

25
Q

Use the ndarray.sum() method to calculate the sum of each row in fare_components. Assign the result to fare_sums.

A

fare_sums = fare_components.sum(axis=1)
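
The axis argument decides the direction of the calculation. A sketch on a made-up 2x3 array: axis=1 gives one result per row, axis=0 gives one result per column.

```python
import numpy as np

# Hypothetical stand-in for fare_components: 2 rows, 3 columns.
fare_components = np.array([[1, 2, 3],
                            [4, 5, 6]])

fare_sums = fare_components.sum(axis=1)     # sum across each row
column_sums = fare_components.sum(axis=0)   # sum down each column

print(fare_sums)
print(column_sums)
```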

26
Q

Extract the 14th column in taxi_first_five. Assign to fare_totals.

A

fare_totals= taxi_first_five[:,13]

27
Q

Use the numpy.genfromtxt() function to read the nyc_taxis.csv file into NumPy. Assign the result to taxi.

A

taxi = np.genfromtxt("nyc_taxis.csv", delimiter=",")
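
One thing worth checking after reading a CSV this way is the header row: text that can't be converted to a float comes back as nan. The sketch below uses a tiny in-memory CSV (made-up column names and values, not the real nyc_taxis.csv) and the skip_header parameter of numpy.genfromtxt to drop that row.

```python
import io
import numpy as np

# Hypothetical three-line CSV standing in for the real file.
csv_text = "pickup_year,pickup_month,fare_amount\n2016,1,52.0\n2016,2,45.0\n"

# Without skip_header, the header row is read as a row of nan values.
with_header = np.genfromtxt(io.StringIO(csv_text), delimiter=",")

# skip_header=1 ignores the first line of the file.
without_header = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

print(with_header.shape)
print(without_header)
```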

28
Q

Evaluate whether the elements in array a are less than 3. Assign the result to a_bool.
a = np.array([1, 2, 3, 4, 5])
b = np.array(["blue", "blue", "red", "blue"])
c = np.array([80.0, 103.4, 96.9, 200.3])

A

a_bool = a < 3

29
Q

Evaluate whether the elements in array b are equal to “blue”. Assign the result to b_bool.
a = np.array([1, 2, 3, 4, 5])
b = np.array(["blue", "blue", "red", "blue"])
c = np.array([80.0, 103.4, 96.9, 200.3])

A

b_bool = b == "blue"

30
Q

Evaluate whether the elements in array c are greater than 100. Assign the result to c_bool.

A

c_bool = c > 100

31
Q

Create a boolean array, february_bool, that evaluates whether the items in pickup_month are equal to 2.

A

february_bool = pickup_month== 2

32
Q

Use the february_bool boolean array to index pickup_month. Assign the result to february.

A

february = pickup_month[february_bool]

33
Q

Use the ndarray.shape attribute to find the number of items in february. Assign the result to february_rides.

A

february_rides = february.shape[0]
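
The three steps above (build the boolean array, index with it, count the result) can be sketched on a made-up pickup_month array. Summing the boolean array is shown as an alternative way to count, since each True counts as 1.

```python
import numpy as np

# Hypothetical month values, not taken from the taxi data.
pickup_month = np.array([1, 2, 2, 3, 2])

february_bool = pickup_month == 2           # True where the month is 2
february = pickup_month[february_bool]      # keeps only the True positions
february_rides = february.shape[0]

# Alternative count: True is treated as 1 when summed.
february_rides_alt = february_bool.sum()

print(february_bool)
print(february_rides, february_rides_alt)
```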

34
Q

Create a boolean array, tip_bool, that determines which rows have values for the tip_amount column of more than 50.

tip_amount = taxi[:,12]

A

tip_bool= tip_amount > 50

35
Q

Use the tip_bool array to select all rows from taxi with tip amounts of more than 50, and the columns from indexes 5 to 13 inclusive. Assign the resulting array to top_tips.

A

top_tips= taxi[tip_bool,5:14]
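
A boolean array for the rows can be combined with a slice for the columns. The sketch below uses a made-up 4x3 array in place of taxi, so the column indexes differ from the card:

```python
import numpy as np

# Hypothetical stand-in for taxi: 4 rows, 3 columns, values 0 to 11.
grid = np.arange(12).reshape(4, 3)

# Boolean array built from the last column, one value per row.
big_bool = grid[:, 2] > 5

# Rows where big_bool is True, columns 0 and 1 only.
selected = grid[big_bool, 0:2]

print(big_bool)
print(selected)
```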

36
Q

What does this code print?

a = np.array(['red', 'blue', 'black', 'blue', 'purple'])
a[0] = 'orange'
print(a)

A

['orange' 'blue' 'black' 'blue' 'purple']

37
Q

What does this code print?

a = np.array(['red', 'blue', 'black', 'blue', 'purple'])
a[3:] = 'pink'
print(a)

A

['red' 'blue' 'black' 'pink' 'pink']

38
Q
ones = np.array([[1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1]])
ones[1,2] = 99
print(ones)
A

[[ 1  1  1  1  1]
 [ 1  1 99  1  1]
 [ 1  1  1  1  1]]

39
Q
ones = np.array([[1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1]])
ones[0] = 42
print(ones)
A

[[42 42 42 42 42]
 [ 1  1  1  1  1]
 [ 1  1  1  1  1]]

40
Q
ones = np.array([[1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1],
                 [1, 1, 1, 1, 1]])
ones[:,2] = 0
print(ones)
A

[[1 1 0 1 1]
 [1 1 0 1 1]
 [1 1 0 1 1]]

41
Q

The value at column index 5 (pickup_location) of row index 28214 is incorrect. Use assignment to change this value to 1 in the taxi_modified ndarray.

A

taxi_modified[28214,5]= 1

42
Q

The first column (index 0) contains year values as four digit numbers in the format YYYY (2016, since all trips in our data set are from 2016). Use assignment to change these values to the YY format (16) in the taxi_modified ndarray.

A

taxi_modified[:,0]=16

43
Q

The values at column index 7 (trip_distance) of rows index 1800 and 1801 are incorrect. Use assignment to change these values in the taxi_modified ndarray to the mean value for that column.

A

taxi_modified[1800:1802, 7] = taxi_modified[:, 7].mean()

44
Q

a2 = np.array([1, 2, 3, 4, 5])

a2_bool = a2 > 2

a2[a2_bool] = 99

print(a2)

A

[ 1  2 99 99 99]

45
Q

Select the fourteenth column (index 13) in taxi_copy. Assign it to a variable named total_amount.

A

total_amount= taxi_copy[:,13]

46
Q

For rows where the value of total_amount is less than 0, use assignment to change the value to 0.

A

total_amount[total_amount < 0] = 0

47
Q

Create a new ndarray, taxi_modified, containing taxi with an additional column that holds the value 0 for every row.

A
zeros = np.zeros([taxi.shape[0], 1])
taxi_modified = np.concatenate([taxi, zeros], axis=1)
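
A sketch of the same steps on a made-up 3x2 array, to show how the shapes line up. The zeros array needs the same number of rows as taxi and exactly one column, so that concatenating along axis=1 adds a single column.

```python
import numpy as np

# Hypothetical stand-in for taxi: 3 rows, 2 columns.
taxi = np.array([[1.0, 2.0],
                 [3.0, 4.0],
                 [5.0, 6.0]])

zeros = np.zeros([taxi.shape[0], 1])        # shape (3, 1)
taxi_modified = np.concatenate([taxi, zeros], axis=1)

print(zeros.shape)
print(taxi_modified.shape)
print(taxi_modified)
```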
48
Q

For rows where the value for the column index 5 is equal to 2 (JFK Airport), assign the value 1 to column index 15.

A

taxi_modified[taxi_modified[:,5]==2,15]=1

49
Q

For rows where the value for the column index 5 is equal to 3 (LaGuardia Airport), assign the value 1 to column index 15.

A

taxi_modified[taxi_modified[:,5]==3,15]=1

50
Q

For rows where the value for the column index 5 is equal to 5 (Newark Airport), assign the value 1 to column index 15.

A

taxi_modified[taxi_modified[:,5]==5,15]=1
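
The three assignments follow one pattern: a boolean array selects the rows, and an integer selects the column to overwrite. A sketch on a made-up two-column array, where column 0 holds a location code and column 1 is the flag (the loop is just a compact way to repeat the pattern, not part of the original answers):

```python
import numpy as np

# Hypothetical stand-in: column 0 = location code, column 1 = airport flag.
trips = np.array([[2, 0],
                  [4, 0],
                  [3, 0],
                  [5, 0]])

# Codes 2, 3, and 5 are the airport codes named in the cards above.
for code in (2, 3, 5):
    trips[trips[:, 0] == code, 1] = 1

print(trips)
```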

51
Q

Use boolean indexing to select only the rows where the dropoff_location_code column (column index 6) has a value that corresponds to JFK (number 2). Assign the result to jfk.

A

jfk= taxi[taxi[:,6]==2]

52
Q

Calculate how many rows are in the new jfk array and assign the result to jfk_count.

jfk= taxi[taxi[:,6]==2]

A

jfk_count= jfk.shape[0]

53
Q

Calculate how many trips from taxi had LaGuardia Airport as their destination:
Use boolean indexing to select only the rows where the dropoff_location_code column (column index 6) has a value that corresponds to LaGuardia (number 3). Assign the result to laguardia.

A

laguardia=taxi[taxi[:,6]==3]

54
Q

Calculate how many rows are in the new laguardia array. Assign the result to laguardia_count.

A

laguardia_count=laguardia.shape[0]

55
Q

Create a new ndarray, cleaned_taxi, containing only rows for which the values of trip_mph are less than 100.

trip_mph = taxi[:,7] / (taxi[:,8] / 3600)

A

cleaned_taxi= taxi[trip_mph<100]
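
The boolean array does not have to come from a column of the array being filtered; it only needs one value per row. A sketch with three made-up trips, where column 0 is the distance in miles and column 1 is the length in seconds (the real data uses column indexes 7 and 8):

```python
import numpy as np

# Hypothetical stand-in for taxi: [distance in miles, length in seconds].
trips = np.array([[10.0, 1800.0],
                  [50.0, 600.0],     # 50 miles in 10 minutes: implausible
                  [3.0, 900.0]])

trip_mph = trips[:, 0] / (trips[:, 1] / 3600)

# Keep only the rows with a speed under 100 mph.
cleaned_trips = trips[trip_mph < 100]
mean_distance = cleaned_trips[:, 0].mean()

print(trip_mph)
print(cleaned_trips)
print(mean_distance)
```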

56
Q

Calculate the mean of the trip_distance column of cleaned_taxi. Assign the result to mean_distance.

A

mean_distance=cleaned_taxi[:,7].mean()

57
Q

Calculate the mean of the trip_length column of cleaned_taxi. Assign the result to mean_length.

A

mean_length=cleaned_taxi[:,8].mean()

58
Q

Calculate the mean of the total_amount column of cleaned_taxi. Assign the result to mean_total_amount.

A

mean_total_amount=cleaned_taxi[:,13].mean()