advanced_pandas Flashcards

1
Q

Given a dataframe ‘purchases’, how do you add a new column named ‘Date’ with the following values?

  • December 1
  • January 1
  • mid-May
A
  • df[‘Date’] = [‘December 1’, ‘January 1’, ‘mid-May’]
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q
  • What is a outer (full) join?
  • What is an inner (intersection) join?
A
  • Outer join: When there are two groups and we want to join everything between those two groups
  • Inner join: When we have two groups and we only want to join what both groups have in common
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q
  • Given two DataFrames ‘staff_df’ and ‘student_df’:
    • How do you outer join the two dataframes together?
    • How do you inner join the two datframes together?
    • How would you join if you wanted to get all ‘staff’ data, and any additional data if the staff were ‘students’ as well?
    • How would you join if you wanted to get all ‘student’ data, and any additional data if the students were ‘staff’ as well?
    • How would you join if you wanted to get all ‘staff’ data, and any additional data if the staff were ‘students’ as well AND you wanted to join on the specific common column ‘Name’?
A
  • pd.merge(staff_df, student_df, how=’outer’, left_index=True, right_index=True)
    • We have to tell Pandas that we want to ‘join on’ the left and right indices
  • pd.merge(staff_df, student_df, how=’inner’, left_index=True, right_index=True_
  • pd.merge(staff_df, student_df, how=’left’, left_index=True, right_index=True)
  • pd.merge(staff_df, student_df, how=’right’, left_index=True, right_index=True)
  • pd.merge(staff_df, student_df, how=’left’, left_on=’Name’, right_on=’Name’)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How would you merge the following two DataFrames together to generate totals into one DataFrame?

  • products DataFrame:

Price

Product

Product ID

4109

5.0

Sushi Roll

1412

0.5

Egg

8931

1.5

Bagel

  • invoices DataFrame

Customer

Product ID

Product

0

Ali

4109

1

1

Eric

1412

12

2

Ande

8931

6

3

Sam

4109

2

A
  • pd.merge(products, invoices, left_index=True, right_on=’Product ID’)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Given a DataFrame ‘df’, how would you write a single query to do the following?

  • Filter on ‘SUMLEV’ column that equals 50
  • Remove the NaN values
  • Set the index to the ‘STNAME’ and ‘CTYNAME’ columns
  • Rename the column ‘ESTIMATESBASE2010’ to ‘Estimates Base 2010’
A

df.where(df[‘SUMLEV’] == 50)

.dropna()

.set_index([‘STNAME’, ‘CTYNAME’])

.rename(columns={‘ESIMATESBASE2010’ : ‘Estimates Base 2010’} ) )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Use method chaining to modify the DataFrame ‘df’ in one statement to drop any entries where ‘Quantity’ is 0 and rename the column ‘Weight’ to ‘Weight (oz.)’

A

df.drop(df[df[‘Quantity’] == 0].index)

.rename(columns={‘Weight’ : ‘Weight (oz.)’} ) )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the difference between axis=0 and axis=1 in Pandas?

A
  • It specifies the axis along which the means are computed.
    • axis=0 is within the column (vertical)
    • axis=1 is across the row (horizontal)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly