Python Pandas Flashcards

Question 1

Q

Create a dictionary called my_dict with the following three key-value pairs:
key 'country' and value names.
key 'drives_right' and value dr.
key 'cars_per_cap' and value cpc.

Answer

A

my_dict = {'country': names, 'drives_right': dr, 'cars_per_cap': cpc}
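
A minimal runnable sketch of this card. The contents of names, dr and cpc are not given in the deck, so the sample values below are assumptions for illustration only.

```python
# Hypothetical sample values; the original lists are not shown in this deck.
names = ["United States", "Australia", "Japan"]
dr = [True, False, False]
cpc = [809, 731, 588]

# Each key maps to the list stored under it.
my_dict = {"country": names, "drives_right": dr, "cars_per_cap": cpc}

print(list(my_dict.keys()))
```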

Question 2

Q

Use pd.read_csv() to import cars.csv data as a DataFrame. Store this DataFrame as cars.

Answer

A

cars = pd.read_csv('cars.csv')

Question 3

Q

Specify the index_col argument inside pd.read_csv(): set it to 0, so that the first column is used as row labels.

Answer

A

cars = pd.read_csv('cars.csv', index_col=0)
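
To see what index_col=0 changes, the sketch below reads the same text twice. The CSV content is a made-up stand-in for cars.csv (an assumption, since the real file is not shown here), read from memory so the example runs without a file on disk.

```python
import io
import pandas as pd

# Hypothetical stand-in for cars.csv; the real file is not shown in this deck.
csv_text = """,country,drives_right,cars_per_cap
US,United States,True,809
AUS,Australia,False,731
JPN,Japan,False,588
"""

# Without index_col, pandas adds a default 0, 1, 2, ... index
# and keeps the first column as ordinary data.
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row labels.
cars = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_index.index))
print(list(cars.index))
```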

Question 4

Q

Use single square brackets to print out the ‘country’ column of cars as a Pandas Series.

Answer

A

print(cars['country'])

Question 5

Q

Use double square brackets to print out the ‘country’ column of cars as a Pandas DataFrame.

Answer

A

print(cars[['country']])
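
The difference between single and double brackets is easiest to see by checking the type of each result. The small cars DataFrame below is a hypothetical stand-in for the one used in these cards.

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"country": ["United States", "Japan"], "drives_right": [True, False]},
    index=["US", "JPN"],
)

as_series = cars["country"]    # single brackets: one-dimensional Series
as_frame = cars[["country"]]   # double brackets: one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```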

Question 6

Q

Use double square brackets to print out a DataFrame with both the ‘country’ and ‘drives_right’ columns of cars, in this order.

Answer

A

print(cars[['country', 'drives_right']])

Question 7

Q

Select the first 3 observations from cars and print them out.

Answer

A

print(cars[0:3])

Question 8

Q

Select the fourth, fifth and sixth observations, corresponding to row positions 3, 4 and 5, and print them out.

Answer

A

print(cars[3:6])
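
A slice in square brackets selects rows by position, and the end of the slice is excluded. The sketch below uses a hypothetical seven-row cars DataFrame (the values are assumptions) to show which rows each slice returns.

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"cars_per_cap": [809, 731, 588, 18, 200, 70, 45]},
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

first_three = cars[0:3]   # positions 0, 1, 2
next_three = cars[3:6]    # positions 3, 4, 5 (6 is excluded)

print(list(first_three.index))
print(list(next_three.index))
```

The same rows can be selected more explicitly with cars.iloc[3:6].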

Question 9

Q

Name three ways to inspect a DataFrame.

Answer

A

# Inspect the first few rows (including index labels)
print(df.head())

# Inspect the last few rows
print(df.tail())

# Inspect 5 randomly sampled rows
print(df.sample(5))
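
A runnable version of the three calls on a small made-up DataFrame. The df below is an assumption for illustration, and random_state is added only so that the sample is repeatable.

```python
import pandas as pd

# Hypothetical ten-row DataFrame to inspect.
df = pd.DataFrame({"value": range(10)})

first_rows = df.head()                     # first 5 rows by default
last_rows = df.tail(3)                     # last 3 rows
random_rows = df.sample(5, random_state=0) # 5 random rows, repeatable

print(first_rows)
print(last_rows)
print(random_rows)
```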

Question 10

Q

Use loc or iloc to select the observation corresponding to Japan as a Series. The label of this row is JPN, the index is 2. Make sure to print the resulting Series.

Answer

A

print(cars.loc['JPN'])

print(cars.iloc[2])
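
Both answers return the same row: loc looks it up by label, iloc by position. A small check, assuming a hypothetical cars DataFrame in which JPN is the third row:

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"country": ["United States", "Australia", "Japan"],
     "cars_per_cap": [809, 731, 588]},
    index=["US", "AUS", "JPN"],
)

by_label = cars.loc["JPN"]    # label-based lookup
by_position = cars.iloc[2]    # position-based lookup, counting from 0

print(by_label.equals(by_position))
```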

Question 11

Q

Print out the ‘drives_right’ value of the row corresponding to Morocco (its row label is MOR)

Answer

A

print(cars.loc['MOR', 'drives_right'])

Question 12

Q

Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns ‘country’ and ‘drives_right’.

Answer

A

print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])
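
With loc, the first argument selects rows and the second selects columns. Passing single labels returns one value, while passing lists returns a sub-DataFrame. The sketch below uses a hypothetical cars DataFrame with assumed values.

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"country": ["United States", "Russia", "Morocco"],
     "drives_right": [True, True, True],
     "cars_per_cap": [809, 200, 70]},
    index=["US", "RU", "MOR"],
)

value = cars.loc["MOR", "drives_right"]                      # single value
sub = cars.loc[["RU", "MOR"], ["country", "drives_right"]]   # 2 x 2 DataFrame

print(value)
print(sub)
```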

Question 13

Q

Print out the drives_right column of the cars DataFrame as a Series using loc

Answer

A

print(cars.loc[:, 'drives_right'])

Question 14

Q

Print out the drives_right column as a DataFrame using loc

Answer

A

print(cars.loc[:, ['drives_right']])

Question 15

Q

Print out both the cars_per_cap and drives_right columns as a DataFrame using loc

Answer

A

print(cars.loc[:, ['cars_per_cap', 'drives_right']])

Question 16

Q

Which areas in my_house are greater than 18.5 or smaller than 10?

Answer

A

print(np.logical_or(my_house > 18.5, my_house < 10))

Question 17

Q

Which areas are smaller than 11 in both my_house and your_house? Wrap the command in a print() call so that you can inspect the output.

Answer

A

print(np.logical_and(my_house < 11, your_house < 11))
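
The two NumPy cards above can be run on small arrays to see the element-wise results. The values of my_house and your_house below are assumptions for illustration; the deck does not list them.

```python
import numpy as np

# Hypothetical room areas; the original arrays are not shown in this deck.
my_house = np.array([18.0, 20.0, 10.75, 9.50])
your_house = np.array([14.0, 24.0, 14.25, 9.0])

# True where an area is above 18.5 or below 10.
either = np.logical_or(my_house > 18.5, my_house < 10)

# True where the area is below 11 in both houses.
both_small = np.logical_and(my_house < 11, your_house < 11)

print(either)
print(both_small)
```

For boolean arrays like these, the operators | and & give the same element-wise results, as long as each comparison is wrapped in parentheses.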

Question 18

Q

Write an if statement that prints out "looking around in the kitchen." if room equals "kit".

Answer

A

if room == "kit":
    print("looking around in the kitchen.")

Question 19

Q

Write another if statement that prints out "big place!" if area is greater than 15.

Answer

A

if area > 15:
    print("big place!")

Question 20

Q

Extract the drives_right column as a Pandas Series and store it as dr.

Answer

A

dr = cars['drives_right']

Question 21

Q

Use dr, a boolean Series, to subset the cars DataFrame. Store the resulting selection in sel.

Answer

A

sel = cars[dr]
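
Indexing a DataFrame with a boolean Series keeps only the rows where the Series is True. A small sketch, assuming a hypothetical four-row cars DataFrame:

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"country": ["United States", "Australia", "Japan", "Russia"],
     "drives_right": [True, False, False, True]},
    index=["US", "AUS", "JPN", "RU"],
)

dr = cars["drives_right"]   # boolean Series, one value per row
sel = cars[dr]              # keeps only the rows where dr is True

print(sel)
```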

Question 22

Q

Select the cars_per_cap column as a Pandas Series and store it as cpc

Answer

A

cpc = cars['cars_per_cap']

Question 23

Q

Create car_maniac: observations that have a cars_per_cap over 500

Answer

A

cpc = cars['cars_per_cap']
many_cars = cpc > 500  # Boolean Series: True where cars_per_cap is over 500
car_maniac = cars[many_cars]
print(car_maniac)

Question 24

Q

Create medium: observations with cars_per_cap between 100 and 500

Answer

A

cpc = cars['cars_per_cap']
between = np.logical_and(cpc > 100, cpc < 500)
medium = cars[between]
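
The same filter can also be written with the & operator instead of np.logical_and. The sketch below compares the two on a hypothetical cars DataFrame (the values are assumptions); note the parentheses around each comparison, which & requires.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"cars_per_cap": [809, 731, 588, 18, 200, 70, 45]},
    index=["US", "AUS", "JPN", "IN", "RU", "MOR", "EG"],
)

cpc = cars["cars_per_cap"]

# Two equivalent ways to build the same boolean filter.
with_numpy = cars[np.logical_and(cpc > 100, cpc < 500)]
with_operator = cars[(cpc > 100) & (cpc < 500)]

print(with_operator)
```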

Question 25

Q

Write a for loop that iterates over all elements of the areas list and prints each one.

Answer

A

for area in areas:
    print(area)

Question 26

Q

Write a for loop using enumerate(). On each run, print a line of the form "room x: y", where x is the index of the list element and y is the list element itself, i.e. the area.

Answer

A

for index, area in enumerate(areas):
    print("room " + str(index) + ": " + str(area))

Question 27

Q

Adapt the following so that the first printout becomes "room 1: 11.25", the second one "room 2: 18.0" and so on:

for index, area in enumerate(areas):
    print("room " + str(index) + ": " + str(area))

Answer

A

for index, area in enumerate(areas):
    print("room " + str(index + 1) + ": " + str(area))
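
An alternative to adding 1 inside the loop is to let enumerate() start counting at 1. This is a sketch of that variant, not the answer the card expects. The areas values are assumed from the printouts named in the question, and the lines list exists only to make the output easy to check.

```python
# Areas assumed from the example printouts in the question.
areas = [11.25, 18.0, 20.0, 10.75, 9.50]

lines = []
for index, area in enumerate(areas, start=1):  # index starts at 1 instead of 0
    line = f"room {index}: {area}"
    lines.append(line)
    print(line)
```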

Question 28

Q

Write a for loop that goes through each sublist of house and prints out "the x is y sqm", where x is the name of the room and y is the area of the room.

Answer

A

for x in house:
    print("the " + x[0] + " is " + str(x[1]) + " sqm")

Question 29

Q

Write a for loop that goes through each key:value pair of europe. On each iteration, “the capital of x is y” should be printed out, where x is the key and y is the value of the pair.

Answer

A

for key, value in europe.items():
    print("the capital of " + str(key) + " is " + str(value))

Question 30

Q

Write a for loop that iterates over all elements in np_height and prints out “x inches” for each element, where x is the value in the array.

Answer

A

for x in np_height:
    print(str(x) + " inches")

Question 31

Q

Write a for loop that visits every element of the np_baseball array and prints it out.

Answer

A

for x in np.nditer(np_baseball):
    print(x)
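
np.nditer() matters here because a plain for loop over a 2D array yields whole rows, not single elements. The sketch below shows both on a small made-up np_baseball array (the values are assumptions).

```python
import numpy as np

# Hypothetical 2D array: one row per player, columns are height and weight.
np_baseball = np.array([[74, 180], [74, 215], [72, 210]])

# A plain loop over a 2D array yields one row (a 1D array) per iteration.
rows = [row for row in np_baseball]

# np.nditer visits every single element, row by row.
values = [int(x) for x in np.nditer(np_baseball)]

print(len(rows))
print(values)
```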

Question 32

Q

Write a for loop that iterates over the rows of cars and, on each iteration, performs two print() calls: one to print out the row label and one to print out all of the row's contents.

Answer

A

for lab, row in cars.iterrows():
    print(lab)
    print(row)

Question 33

Q

Add the length of each country name in the brics DataFrame as a new column, name_length

Answer

A

for lab, row in brics.iterrows():
    brics.loc[lab, "name_length"] = len(row["country"])
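
The loop works, but the same column can usually be built without iterating over rows. The sketch below is an alternative using the pandas string accessor, not the answer this card expects. The brics DataFrame is a hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for the brics DataFrame.
brics = pd.DataFrame(
    {"country": ["Brazil", "Russia", "India", "China", "South Africa"]},
    index=["BR", "RU", "IN", "CH", "SA"],
)

# .str.len() computes the length of every string in the column at once.
brics["name_length"] = brics["country"].str.len()

print(brics)
```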

Question 34

Q

Use a for loop to add a new column, named COUNTRY, that contains an uppercase version of the country names in the "country" column. You can use the string method upper() for this.

Answer

A

for lab, row in cars.iterrows():
    cars.loc[lab, "COUNTRY"] = row["country"].upper()

Note that the body of the for loop must be indented.
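
For comparison, the same COUNTRY column can be created without a loop by applying str.upper to the whole column. This is a sketch of an alternative, not the loop-based answer the card asks for. The cars DataFrame below is a hypothetical stand-in.

```python
import pandas as pd

# Hypothetical stand-in for the cars DataFrame.
cars = pd.DataFrame(
    {"country": ["United States", "Japan"]},
    index=["US", "JPN"],
)

# apply() calls str.upper once for every value in the column.
cars["COUNTRY"] = cars["country"].apply(str.upper)

print(cars)
```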
