Data Manipulation With Pandas and NumPy Flashcards
How do you access a specific column from a dataframe?
df['Column1']
How do you create a pandas DataFrame from a dictionary?
data = {'Column1': [1, 2, 3], 'Column2': [4, 5, 6]}
df = pd.DataFrame(data)
How do you filter a dataframe where ‘Column1’ is greater than 2?
filtered_df = df[df['Column1'] > 2]
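A minimal sketch of how the filter works, using made-up sample data (the values and the names mask and both_df are illustrative assumptions). The comparison produces a boolean Series, and indexing with it keeps only the rows marked True.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({'Column1': [1, 2, 3, 4], 'Column2': [10, 20, 30, 40]})

# A comparison on a column yields a boolean Series (the mask).
mask = df['Column1'] > 2

# Indexing with the mask keeps only the rows where it is True.
filtered_df = df[mask]

# Conditions combine with & (and) or | (or); each one needs its own parentheses.
both_df = df[(df['Column1'] > 2) & (df['Column2'] < 40)]
```

The parentheses matter because & binds more tightly than > and <.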
How do you fill missing values with the mean of a column?
df['Column1'] = df['Column1'].fillna(df['Column1'].mean())
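A minimal sketch of mean imputation on made-up data (the values and the name col_mean are assumptions). It assigns the filled column back to the DataFrame, since calling fillna with inplace=True on a selected column may trigger a warning, or leave df unchanged, in recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value.
df = pd.DataFrame({'Column1': [1.0, np.nan, 3.0]})

# mean() skips NaN by default, so this is (1.0 + 3.0) / 2.
col_mean = df['Column1'].mean()

# fillna returns a new Series; assign it back to update the column.
df['Column1'] = df['Column1'].fillna(col_mean)
```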
How do you group data in a dataframe by a specific column and calculate the mean for each group?
grouped_df = df.groupby('Column1').mean()
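A small sketch of the group-and-average step on made-up data (the group labels and values are assumptions). The grouping column becomes the index of the result, with one row per distinct value.

```python
import pandas as pd

# Hypothetical sample data: two groups, 'a' and 'b'.
df = pd.DataFrame({
    'Column1': ['a', 'b', 'a', 'b'],
    'Column2': [1.0, 2.0, 3.0, 4.0],
})

# One row per group; each remaining column holds that group's mean.
grouped_df = df.groupby('Column1').mean()

# Look up a single group's mean by its label.
mean_a = grouped_df.loc['a', 'Column2']
```

If the DataFrame also holds non-numeric columns, passing numeric_only=True to mean() restricts the calculation to the numeric ones.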
How do you apply a function to every element in a dataframe column?
df['Column1'] = df['Column1'].apply(np.square)
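A short sketch comparing apply with a vectorized expression, on made-up data (the variable names are assumptions). For simple arithmetic the two give the same result, and the vectorized form is usually the faster choice; apply is most useful for custom functions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'Column1': [1, 2, 3]})

# apply with a NumPy function.
squared_apply = df['Column1'].apply(np.square)

# Equivalent vectorized arithmetic on the whole column.
squared_vec = df['Column1'] ** 2

# apply with a custom function, called once per element.
doubled = df['Column1'].apply(lambda x: x * 2)
```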
How do you create a numpy array filled with zeros?
zeros_array = np.zeros((3, 3))
How do you access the element at the second row, third column of a NumPy array?
element = array[1, 2]
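A minimal sketch of zero-based indexing on a made-up 3x3 array (the contents are an assumption). The first index selects the row and the second selects the column, so row 2, column 3 is array[1, 2].

```python
import numpy as np

# Hypothetical 3x3 array holding 1..9, filled row by row:
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
array = np.arange(1, 10).reshape(3, 3)

# Second row, third column.
element = array[1, 2]

# A whole row, and a whole column via a slice.
second_row = array[1]
third_col = array[:, 2]
```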
How do you drop a specific column from a dataframe?
df = df.drop('Column1', axis=1)
How do you rename a column in a dataframe?
df = df.rename(columns={'OldName': 'NewName'})
How do you concatenate two dataframes along rows (vertically)?
df_combined = pd.concat([df1, df2], axis=0)
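A small sketch of vertical concatenation on made-up frames (the data and the name df_reset are assumptions). By default each frame keeps its own index labels, so they can repeat; ignore_index=True renumbers the rows instead.

```python
import pandas as pd

# Two hypothetical frames with the same column.
df1 = pd.DataFrame({'Column1': [1, 2]})
df2 = pd.DataFrame({'Column1': [3, 4]})

# Rows of df2 are stacked under df1; original index labels are kept.
df_combined = pd.concat([df1, df2], axis=0)

# Same rows, but with a fresh 0..n-1 index.
df_reset = pd.concat([df1, df2], axis=0, ignore_index=True)
```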
How do you drop rows with missing values?
df_clean = df.dropna()
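A minimal sketch of what dropna removes, on made-up data (the values and the names df_subset and df_all are assumptions). With no arguments it drops every row that has at least one missing value; subset and how narrow that rule.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 and row 2 each have one missing value.
df = pd.DataFrame({
    'Column1': [1.0, np.nan, 3.0],
    'Column2': [4.0, 5.0, np.nan],
})

# Drop rows with a missing value in any column.
df_clean = df.dropna()

# Only consider missing values in Column1.
df_subset = df.dropna(subset=['Column1'])

# Drop a row only when every value in it is missing.
df_all = df.dropna(how='all')
```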
How do you access a specific row in a dataframe by position?
row = df.iloc[3]  # Accesses the 4th row (position 3, counting from 0)
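A short sketch contrasting position-based and label-based row access, on made-up data with string index labels (the data and variable names are assumptions). iloc counts rows from 0 regardless of the index, while loc looks a row up by its index label.

```python
import pandas as pd

# Hypothetical frame whose index labels are letters, not numbers.
df = pd.DataFrame({'Column1': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

# Fourth row by position.
row_by_position = df.iloc[3]

# The same row by its index label.
row_by_label = df.loc['d']
```

Both calls return the row as a Series whose name is the row's index label.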