Pandas Pt 2 (UCSD) Flashcards

Question 1

Q

replace all of the cells in a dataframe that have value 9999 with 0

Answer

A

df = df.replace(9999, 0)
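A minimal sketch of the replacement on a small made-up dataframe (the column names and values are assumptions for illustration). It shows that replace() returns a new dataframe, which is why the answer assigns the result back to df.

```python
import pandas as pd

# Hypothetical sample data; 9999 stands in for a "missing" sentinel value.
df = pd.DataFrame({"a": [1, 9999, 3], "b": [9999, 5, 6]})

# replace() returns a new dataframe and leaves the original untouched.
cleaned = df.replace(9999, 0)
print(cleaned)
```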

Question 2

Q

fill each missing value in a dataframe with the last known value before it, or the next known value after it

Answer

A

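df.fillna(method='ffill') ## forward; deprecated since pandas 2.1 in favor of df.ffill()

df.fillna(method='backfill') ## backward; deprecated since pandas 2.1 in favor of df.bfill()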
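A short sketch of forward and backward filling on made-up data. It uses the ffill() and bfill() methods, which do the same job as passing method='ffill' or method='backfill' to fillna(); the column name x is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical column with a gap of two missing values in the middle.
df = pd.DataFrame({"x": [1.0, np.nan, np.nan, 4.0]})

forward = df.ffill()   # each gap takes the last known value before it
backward = df.bfill()  # each gap takes the next known value after it
print(forward["x"].tolist(), backward["x"].tolist())
```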

Question 3

Q

drop rows or columns with any NaN values

Answer

A

df.dropna(axis=0) ## rows

df.dropna(axis=1) ## columns
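To see the difference between the two axes, here is a small sketch on made-up data with a single NaN (column names are assumptions). Dropping along axis 0 removes the row holding the NaN, while axis 1 removes the whole column.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only column "a", row 1 holds a NaN.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

rows_dropped = df.dropna(axis=0)  # removes row 1
cols_dropped = df.dropna(axis=1)  # removes column "a"
print(rows_dropped.index.tolist(), cols_dropped.columns.tolist())
```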

Question 4

Q

interpolate missing values (the default is linear interpolation)

Answer

A

df.interpolate() ## fills in missing values using linear interpolation by default; other methods can be chosen with the method argument
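A worked example of the default behavior, on made-up numbers. With the default linear method, the values are treated as equally spaced, so each gap is filled with the midpoint of its neighbors here.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two single-value gaps.
df = pd.DataFrame({"x": [1.0, np.nan, 3.0, np.nan, 7.0]})

# Default method is linear: (1 + 3) / 2 = 2 and (3 + 7) / 2 = 5.
filled = df.interpolate()
print(filled["x"].tolist())
```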

Question 5

Q

create a dataframe of boolean values, where True marks each null value

Answer

A

df.isnull()
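A small sketch on made-up data (column names are assumptions). Besides showing the boolean mask, it adds one common follow-up that is not part of the card: summing the mask to count nulls per column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one null in each column.
df = pd.DataFrame({"a": [1.0, np.nan], "b": ["x", None]})

mask = df.isnull()         # same shape as df; True where a value is null
null_counts = mask.sum()   # True counts as 1, giving nulls per column
print(mask)
print(null_counts)
```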

Question 6

Q

some common plot functions (but many more), df.plot.func()

Answer

A

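funcs = bar(), box(), hist(), line(), pie(), scatter() ## calling df.plot() with no function name draws a line plot
## some plots are also available as dataframe methods without .plot, e.g. df.hist() and df.boxplot()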

Question 7

Q

use a magic function to display matplotlib plots inline in a jupyter notebook

Answer

A

%matplotlib inline

Question 8

Q

get the histogram of the ratings column of the df dataframe in jupyter notebook

Answer

A

df.hist(column='ratings', figsize=(15, 10)) ## figsize is the (width, height) of the figure in inches

Question 9

Q

get the boxplot of the ratings column of the df dataframe in jupyter notebook

Answer

A

df.boxplot(column='ratings', figsize=(15, 10)) ## figsize is the (width, height) of the figure in inches

Question 10

Q

return all of the rows from a dataframe where ‘col2’ values are greater than 5

Answer

A

df[df['col2'] > 5]
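A sketch of how the filter works, using made-up data (col1 and the values are assumptions). The comparison produces a boolean Series, and indexing the dataframe with that Series keeps only the rows where it is True.

```python
import pandas as pd

# Hypothetical data; only rows 1 and 3 have col2 greater than 5.
df = pd.DataFrame({"col1": ["a", "b", "c", "d"], "col2": [3, 7, 5, 9]})

mask = df["col2"] > 5   # boolean Series: False, True, False, True
result = df[mask]       # keeps rows where the mask is True
print(result)
```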

Question 11

Q

delete rows indexed 5 and 6 in a dataframe

Answer

A

df.drop(df.index[[5, 6]]) or df.drop(['rowName', 'row2name']) ## by position, or by row label
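A sketch of both ways to drop rows, on a made-up dataframe with letter labels (the labels and column name are assumptions). Note that drop() returns a new dataframe rather than changing the original.

```python
import pandas as pd

# Hypothetical dataframe with 8 rows labeled "a" through "h".
df = pd.DataFrame({"v": list(range(8))}, index=list("abcdefgh"))

# By position: df.index[[5, 6]] looks up the labels at positions 5 and 6.
by_position = df.drop(df.index[[5, 6]])

# By label: pass the row labels directly.
by_label = df.drop(["a", "b"])
print(by_position.index.tolist(), by_label.index.tolist())
```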

Question 12

Q

delete column ‘col2’ from a dataframe

Answer

A

del df['col2']
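A brief sketch contrasting del with drop(), on made-up data (col1 and the values are assumptions). del changes the dataframe in place; drop(columns=...) is an alternative, not mentioned on the card, that returns a new dataframe instead.

```python
import pandas as pd

# Hypothetical two-column dataframe.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})

# Alternative: drop() leaves df unchanged and returns a copy without col2.
without_col2 = df.drop(columns="col2")

# del removes the column from df itself.
del df["col2"]
print(df.columns.tolist(), without_col2.columns.tolist())
```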

Question 13

Q

get the mean of rows aggregated on ‘studentID’

Answer

A

df.groupby('studentID').mean() ## groupby groups the rows by the specified column; mean() then averages each group
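A worked example with made-up scores (the score column and values are assumptions). Student 1 has two rows that are averaged together, and the grouping column becomes the index of the result.

```python
import pandas as pd

# Hypothetical scores: student 1 has two rows, student 2 has one.
df = pd.DataFrame({"studentID": [1, 1, 2], "score": [80, 90, 70]})

# One output row per studentID; score is averaged within each group.
means = df.groupby("studentID").mean()
print(means)
```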

Question 14

Q

get the count of unique rows from a set of columns in a dataframe (each distinct combination of column values counts as unique)

Answer

A

df[['col1', 'col2']].value_counts() ## the column list is optional; without it the entire df is used. returns each unique combination of column values and its frequency
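A sketch on made-up data (the city and year columns are assumptions). It assumes pandas 1.1 or later, where value_counts() is available on dataframes; the result is a Series indexed by the value combinations.

```python
import pandas as pd

# Hypothetical data: the pair ("SD", 2020) appears twice.
df = pd.DataFrame({
    "city": ["SD", "SD", "LA", "SD"],
    "year": [2020, 2020, 2020, 2021],
})

# One entry per distinct (city, year) pair, with how often it occurs.
counts = df[["city", "year"]].value_counts()
print(counts)
```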

(14 cards)