Pandas Flashcards
How to read data.csv into a pandas DataFrame df, with ID (the first column) as the index column?
df = pd.read_csv('data.csv', index_col = 0)
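A minimal sketch of the card above. The CSV content and the column names x and z are made up for illustration, and an in-memory string stands in for data.csv so the snippet runs on its own.

```python
import io
import pandas as pd

# Hypothetical CSV content; the first column is the ID.
csv_text = "ID,x,z\n10,1.0,a\n20,2.0,b\n"

# index_col=0 uses the first column (ID) as the row index.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df)
```

Passing index_col='ID' should select the same column by name, which is handy when ID is not the first column.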
How to get summary statistics (count, mean, std, min, 25%, …) from a pandas DataFrame df?
df.describe()
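A small illustration with made-up numbers: for numeric columns, the result has one row per statistic and one column per original column.

```python
import pandas as pd

# Hypothetical numeric data.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [10.0, 20.0, 30.0, 40.0]})

stats = df.describe()
print(stats)
```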
How to extract the column 'z' from df?
df.loc[:, 'z']
How to plot a Series s using the pandas plot method with no further parameters?
s.plot()
What are the outputs of:
1. df.isna()
2. df.isna().sum()
3. df.isna().sum().sum()
What is the overall result after step 3?
- 1: a boolean DataFrame with the same shape as df, True wherever a cell is NaN
- 2: a Series whose index is the columns of df and whose values are the number of True entries per column
- 3: the sum over the values of that Series
Overall result: the total count of NaN cells in df
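The three steps can be traced on a tiny frame. The data and the variable names mask, per_column and total are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with three missing cells.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

mask = df.isna()          # step 1: boolean DataFrame, True where NaN
per_column = mask.sum()   # step 2: Series of NaN counts per column
total = per_column.sum()  # step 3: total NaN count over the whole frame
print(per_column)
print(total)
```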
How to get the row for user ID 3292879998 as a Series, if the index column is the user ID and its dtype is Int64?
df.loc[3292879998]
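One detail worth checking: a label lookup with .loc returns a Series only when the label occurs once in the index, and a DataFrame when it occurs several times. The sketch below uses a plain integer index and a made-up clicks column for simplicity.

```python
import pandas as pd

# Hypothetical data: user 42 appears twice, user 3292879998 once.
df = pd.DataFrame(
    {"clicks": [5, 7, 9]},
    index=pd.Index([3292879998, 42, 42], name="user_id"),
)

unique_hit = df.loc[3292879998]  # label occurs once: a Series
repeated_hit = df.loc[42]        # label occurs twice: a DataFrame
print(type(unique_hit).__name__, type(repeated_hit).__name__)
```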
How to convert the index of a pandas DataFrame df from Float64 to Int64?
df.index = df.index.astype('Int64')
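A short sketch of the conversion, assuming a recent pandas version in which an Index can hold the nullable Float64 and Int64 dtypes. The data is made up.

```python
import pandas as pd

# Hypothetical frame whose index was read in as nullable floats.
df = pd.DataFrame({"v": [10, 20]}, index=pd.Index([1.0, 2.0], dtype="Float64"))

df.index = df.index.astype("Int64")
print(df.index.dtype)
```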
How to use lists of keys and values to generate a dict?
new_dict = dict(zip(keys, values))
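A quick example with made-up lists. Note that zip stops at the shorter input, so surplus keys or values are silently dropped.

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

# Pair each key with the value at the same position.
new_dict = dict(zip(keys, values))

# zip truncates to the shorter list: "c" has no partner here.
short_dict = dict(zip(keys, [1, 2]))
print(new_dict, short_dict)
```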
How to drop the first row of a pandas DataFrame df if its index label is NaN?
df = df.drop(index=pd.NA)  # drop the row whose index label is NA
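Whether a label-based drop matches a missing label may depend on the index dtype and the pandas version. An alternative sketch that avoids label matching is to filter on the index itself. This swaps in a different technique from the card: it removes every row with a missing index label, not only the first. The data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose first index label is missing.
df = pd.DataFrame({"v": [0, 1, 2]}, index=[np.nan, 1.0, 2.0])

# Keep only rows whose index label is not missing.
cleaned = df[df.index.notna()]
print(cleaned)
```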
How to count the number of attendees if the attendee ID is the index column of df?
len(df.index)
How to get all rows from df where gender is neither 'Male' nor 'Female'?
df.loc[~df['gender'].isin(['Male', 'Female'])]
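A small check of the negated isin filter on made-up data. Rows with a missing gender are selected too, because isin returns False for them and the ~ flips that to True.

```python
import pandas as pd

# Hypothetical data including a missing value.
df = pd.DataFrame({"gender": ["Male", "Female", "Other", None]})

# ~ negates the boolean mask produced by isin.
other = df.loc[~df["gender"].isin(["Male", "Female"])]
print(other)
```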
How to replace all cells that are True with 'no'?
df.replace(True, 'no', inplace=True)
How to select only col1 and col2 from df filtered with the boolean condition bc?
df.loc[bc, ['col1', 'col2']]
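A sketch of combined row and column selection. The data and the condition col1 > 1 are made up; bc can be any boolean Series aligned with df.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["a", "b", "c"], "col3": [0.1, 0.2, 0.3]})

bc = df["col1"] > 1                     # boolean condition, one value per row
subset = df.loc[bc, ["col1", "col2"]]   # rows where bc is True, two columns only
print(subset)
```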
How to count the rows of a df (or of a selection from df)?
df.shape[0]
How to count the number of rows for each distinct value of 'location' in df?
df['location'].value_counts()
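An illustration with made-up city names. The result is a Series indexed by the distinct values, sorted by count in descending order by default.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"location": ["Berlin", "Paris", "Berlin", "Rome", "Berlin", "Paris"]})

counts = df["location"].value_counts()
print(counts)
```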