Transforming Data Flashcards
Function used to find missing data?
df.isna()
Function used to count the missing values in each column?
df.isna().sum()
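A minimal sketch of these two checks. The DataFrame below is made up for illustration; its column names and values are assumptions, not a real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with a few missing values
df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0, np.nan],
    "Embarked": ["S", "C", None, "S"],
})

missing_mask = df.isna()          # True wherever a value is missing
missing_counts = df.isna().sum()  # number of missing values per column
print(missing_counts)
```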
Function used to drop all rows that are missing data?
df.dropna(inplace=True)
Using inplace=True modifies df itself instead of returning a new DataFrame, so the change sticks
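A short sketch of the difference inplace=True makes, using a made-up DataFrame (the columns are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the first row is complete
df = pd.DataFrame({"Age": [22.0, np.nan, 35.0], "Fare": [7.25, 71.28, np.nan]})

cleaned = df.dropna()     # returns a new DataFrame; df is unchanged
rows_before = len(df)     # df still has all of its rows here
df.dropna(inplace=True)   # modifies df itself and returns None
```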
Function used to drop only the rows that are missing data in one column?
df.dropna(subset=['Embarked'], inplace=True)
df.isna().sum()
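A small sketch showing that subset= only looks at the named column, so rows missing data elsewhere are kept. The data is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row is missing Age, another is missing Embarked
df = pd.DataFrame({
    "Age": [22.0, np.nan, 35.0],
    "Embarked": ["S", "C", None],
})

df.dropna(subset=["Embarked"], inplace=True)  # drops only the row missing Embarked
remaining = df.isna().sum()                   # Age still has its missing value
```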
Function used to drop a column directly?
df.drop(columns='Cabinet', inplace=True)
df.isna().sum()
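A quick sketch of dropping a column by name. The DataFrame is made up; 'Cabinet' is simply the column name used on the card.

```python
import pandas as pd

# Hypothetical data with a mostly empty column
df = pd.DataFrame({"Name": ["Ann", "Bo"], "Cabinet": [None, "C85"]})

df.drop(columns="Cabinet", inplace=True)  # removes the column regardless of its contents
remaining_columns = list(df.columns)
```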
Function used to drop all columns that are missing any data?
df.dropna(axis=1, inplace=True)
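What argument can be used to decide how much incomplete data to drop in columns?
thresh= sets the minimum number of non-missing values a column needs in order to be kept. It is a count, not a fraction, so to drop columns with less than 45% of their data, convert the percentage to a row count first
df.dropna(axis=1, thresh=int(0.45 * len(df)), inplace=True)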
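A sketch of a 45% threshold on a made-up DataFrame. The column names and the helper variable min_present are assumptions for illustration; thresh is passed as a count of non-missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 5 rows, columns with different amounts of missing data
df = pd.DataFrame({
    "Full": [1, 2, 3, 4, 5],                          # 5 of 5 present
    "Half": [1.0, 2.0, 3.0, np.nan, np.nan],          # 3 of 5 present
    "Sparse": [1.0, np.nan, np.nan, np.nan, np.nan],  # 1 of 5 present
})

# 45% of 5 rows is 2.25, rounded down to 2 required non-missing values
min_present = int(0.45 * len(df))
df.dropna(axis=1, thresh=min_present, inplace=True)
kept_columns = list(df.columns)
```

Only "Sparse" falls below the threshold here, so it is the only column dropped.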
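How to fill missing data with a new category? What code is used?
df['Gender'] = df['Gender'].fillna('Missing')
Replaces the missing values in the Gender column with the string 'Missing'. Assigning the result back is the safer form, since inplace=True on a selected column may not update df in recent pandas versions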
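A minimal sketch with made-up data, showing that only the missing entries change:

```python
import pandas as pd

# Hypothetical data with two missing Gender values
df = pd.DataFrame({"Gender": ["F", None, "M", None]})

# Fill the gaps with a new category and assign the result back to the column
df["Gender"] = df["Gender"].fillna("Missing")
filled = df["Gender"].tolist()
```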
How to fill missing data with an average (here, the median)?
median_age = df['Age'].median()
df['Age'] = df['Age'].fillna(median_age)
df.isna().sum()
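A sketch of median filling on made-up ages. median() skips missing values by default, so the gaps do not affect the result.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two ages are missing
df = pd.DataFrame({"Age": [20.0, np.nan, 30.0, 40.0, np.nan]})

median_age = df["Age"].median()           # median of the three known ages
df["Age"] = df["Age"].fillna(median_age)  # fill the gaps and assign back
filled_ages = df["Age"].tolist()
```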
How to fill categorical data with the most common value in the column?
most_common_pet = df['Pet Type'].mode()[0]
df['Pet Type'] = df['Pet Type'].fillna(most_common_pet)
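A sketch of mode filling on made-up data. mode() returns a Series because several values can tie for most common, so the example takes the first entry with [0] before filling.

```python
import pandas as pd

# Hypothetical data: "Dog" appears most often, two values are missing
df = pd.DataFrame({"Pet Type": ["Dog", "Cat", None, "Dog", None]})

most_common_pet = df["Pet Type"].mode()[0]  # first (here, only) most common value
df["Pet Type"] = df["Pet Type"].fillna(most_common_pet)
filled_pets = df["Pet Type"].tolist()
```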