pandas Flashcards

Question 1

Q

Create a DataFrame df from the dictionary data, using the list labels as its index.

Answer

A

df = pd.DataFrame(data, index=labels)
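The card does not show what data and labels contain. A minimal sketch, assuming a small made-up data set (the values below are illustrative only and are not taken from the deck):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real values are not shown on the card.
data = {
    "animal": ["cat", "snake", "dog", "cat"],
    "age": [2.5, 0.5, np.nan, 3.0],
    "visits": [1, 2, 3, 3],
    "priority": ["yes", "no", "yes", "no"],
}
labels = ["a", "b", "c", "d"]

# Each dictionary key becomes a column; labels become the row index.
df = pd.DataFrame(data, index=labels)
print(df)
```

The length of labels has to match the length of each list in data, otherwise the constructor raises an error.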

Question 2

Q

Display a summary of the basic information about this DataFrame and its data.

Answer

A

df.info()

…or, for summary statistics of the numeric columns…

df.describe()
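The two calls answer different questions, which the following sketch tries to show on a tiny made-up DataFrame (the column values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing age.
df = pd.DataFrame(
    {"animal": ["cat", "dog", "cat"], "age": [2.5, np.nan, 3.0], "visits": [1, 3, 2]},
    index=["a", "b", "c"],
)

# info() prints column dtypes and non-null counts, and returns None.
result = df.info()

# describe() returns a DataFrame of count, mean, std, min, quartiles and max
# for the numeric columns.
stats = df.describe()
print(stats)
```

Because info() only prints, it is useful for a quick look at dtypes and missing values, while the result of describe() can be stored and indexed like any other DataFrame.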

Question 3

Q

Return the first 3 rows of the DataFrame df.

Answer

A

df.iloc[:3]

or equivalently

df.head(3)

Question 4

Q

Select just the ‘animal’ and ‘age’ columns from the DataFrame df.

Answer

A

df.loc[:, ['animal', 'age']]

or

df[['animal', 'age']]

Question 5

Q

Select the data in rows [3, 4, 8] and in columns [‘animal’, ‘age’].

Answer

A

df.loc[df.index[[3, 4, 8]], ['animal', 'age']]
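The answer mixes row positions with column labels, so it first translates positions into labels via df.index. A sketch of two position-based alternatives, assuming a made-up ten-row DataFrame with unique index labels:

```python
import pandas as pd

# Hypothetical ten-row frame labelled 'a' to 'j'.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "snake", "dog", "dog", "cat", "snake", "cat", "dog", "dog"],
        "age": [2.5, 3.0, 0.5, 4.0, 5.0, 2.0, 4.5, 1.0, 7.0, 3.0],
        "visits": [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
    },
    index=list("abcdefghij"),
)

# The card's answer: turn positions into labels, then select by label.
by_label = df.loc[df.index[[3, 4, 8]], ["animal", "age"]]

# Alternative 1: select rows by position, then columns by label.
by_chain = df.iloc[[3, 4, 8]][["animal", "age"]]

# Alternative 2: turn column labels into positions and use iloc only.
col_positions = df.columns.get_indexer(["animal", "age"])
by_position = df.iloc[[3, 4, 8], col_positions]
```

All three give the same result here. If the index contained duplicate labels, the loc version could return extra rows, so the iloc variants are the safer choice in that case.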

Question 6

Q

Select only the rows where the number of visits is greater than 3.

Answer

A

df[df['visits'] > 3]

Question 7

Q

Select the rows where the age is missing, i.e. is NaN.

Answer

A

df[df['age'].isnull()]
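A method is needed here because NaN never compares equal to anything, including itself. A short sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical ages with one missing value.
df = pd.DataFrame({"age": [2.5, np.nan, 3.0]}, index=["a", "b", "c"])

# Comparing with == never matches, because NaN != NaN.
naive = df[df["age"] == np.nan]

# isna() is the same method as isnull() under a different name.
missing = df[df["age"].isna()]
print(missing)
```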

Question 8

Q

Select the rows where the animal is a cat and the age is less than 3.

Answer

A

df[(df['animal'] == 'cat') & (df['age'] < 3)]

Question 9

Q

Select the rows where the age is between 2 and 4 (inclusive).

Answer

A

df[df['age'].between(2, 4)]
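By default between() includes both endpoints, which makes it shorthand for two comparisons joined with &. A sketch on a made-up Series; the string form of the inclusive argument assumes pandas 1.3 or later:

```python
import pandas as pd

# Hypothetical ages, including both boundary values.
ages = pd.Series([1.5, 2.0, 3.0, 4.0, 4.5])

# Default behaviour: both endpoints are included.
inclusive = ages.between(2, 4)

# The same mask written out by hand.
explicit = (ages >= 2) & (ages <= 4)

# Excluding both endpoints (string values need pandas >= 1.3).
exclusive = ages.between(2, 4, inclusive="neither")
```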

Question 10

Q

Change the age in row ‘f’ to 1.5.

Answer

A

df.loc['f', 'age'] = 1.5

Question 11

Q

Calculate the sum of all visits (the total number of visits).

Answer

A

df['visits'].sum()

Question 12

Q

Calculate the mean age for each different animal in df.

Answer

A

df.groupby('animal')['age'].mean()
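Since the age column can contain NaN (see the earlier card on missing ages), it helps to know how the group mean treats it. A sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one dog and the only snake have no recorded age.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "dog", "dog", "snake"],
        "age": [2.0, 4.0, np.nan, 5.0, np.nan],
    }
)

# Missing ages are skipped within each group; a group with no valid
# ages at all gets NaN as its mean.
means = df.groupby("animal")["age"].mean()
print(means)
```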

Question 13

Q

Append a new row ‘k’ to df with your choice of values for each column. Then delete that row to return the original DataFrame.

Answer

A

df.loc['k'] = [5.5, 'dog', 'no', 2]  # values must be listed in the same order as df.columns

and then deleting the new row…

df = df.drop('k')
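Assigning a plain list depends on the order of the columns. One alternative that does not, sketched here with a made-up two-row DataFrame, is to build the new row as a one-row DataFrame keyed by column name and concatenate it:

```python
import numpy as np
import pandas as pd

# Hypothetical starting frame; the column order is an assumption.
df = pd.DataFrame(
    {
        "animal": ["cat", "dog"],
        "age": [2.5, np.nan],
        "visits": [1, 3],
        "priority": ["yes", "no"],
    },
    index=["a", "b"],
)

# Keys are column names, so their order here does not matter.
new_row = pd.DataFrame(
    [{"age": 5.5, "animal": "dog", "priority": "no", "visits": 2}],
    index=["k"],
)

# concat aligns the new row on column names.
extended = pd.concat([df, new_row])

# drop() returns a new frame without row 'k'.
restored = extended.drop("k")
```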

Question 14

Q

Count the number of each type of animal in df.

Answer

A

df['animal'].value_counts()

Question 15

Q

Sort df first by the values in the ‘age’ column in descending order, then by the values in the ‘visits’ column in ascending order.

Answer

A

df.sort_values(by=['age', 'visits'], ascending=[False, True])
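A small worked example may make the tie-breaking and the handling of missing ages clearer. The data is made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: rows 'a' and 'c' tie on age, row 'b' has no age.
df = pd.DataFrame(
    {"age": [3.0, np.nan, 3.0, 5.0], "visits": [2, 1, 1, 3]},
    index=["a", "b", "c", "d"],
)

# Age descending first; ties on age are broken by visits ascending.
# Rows with a missing age go last by default (na_position="last").
ordered = df.sort_values(by=["age", "visits"], ascending=[False, True])
print(ordered)
```

sort_values returns a new DataFrame, so the result has to be assigned back to df if the sorted order should be kept.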

Question 16

Q

The ‘priority’ column contains the values ‘yes’ and ‘no’. Replace this column with a column of boolean values: ‘yes’ should be True and ‘no’ should be False.

Answer

A

df['priority'] = df['priority'].map({'yes': True, 'no': False})
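One thing to watch with map() and a dictionary is that any value missing from the dictionary becomes NaN. A sketch on made-up Series, where 'maybe' is a hypothetical unexpected entry:

```python
import pandas as pd

mapping = {"yes": True, "no": False}

# Only expected values: every entry is translated.
clean = pd.Series(["yes", "no"]).map(mapping)

# An unexpected value has no key in the mapping and turns into NaN.
dirty = pd.Series(["yes", "no", "maybe"]).map(mapping)
print(dirty)
```

Checking the result with isna().any() after mapping is a cheap way to catch unexpected values in the column.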

Question 17

Q

In the ‘animal’ column, change the ‘snake’ entries to ‘python’.

Answer

A

df['animal'] = df['animal'].replace('snake', 'python')

Question 18

Q

For each animal type and each number of visits, find the mean age. In other words, each row is an animal, each column is a number of visits and the values are the mean ages (hint: use a pivot table).

Answer

A

df.pivot_table(index='animal', columns='visits', values='age', aggfunc='mean')
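To see what shape the result takes, here is a sketch on a made-up four-row DataFrame:

```python
import pandas as pd

# Hypothetical data: two cats with 1 visit, one dog with 2, one dog with 1.
df = pd.DataFrame(
    {
        "animal": ["cat", "cat", "dog", "dog"],
        "visits": [1, 1, 2, 1],
        "age": [2.0, 4.0, 5.0, 3.0],
    }
)

# Rows are animals, columns are visit counts, cells are mean ages.
table = df.pivot_table(index="animal", columns="visits", values="age", aggfunc="mean")
print(table)
```

Combinations that never occur in the data, such as a cat with 2 visits here, show up as NaN in the table.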

importing, filtering, slicing, querying (18 cards)