Pandas Flashcards

1
Q

How do you read data.csv into a pandas DataFrame df, using the ID column as the index?

A

df = pd.read_csv('data.csv', index_col='ID')  # index_col=0 also works when ID is the first column
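A minimal sketch of this card, assuming a small in-memory CSV instead of a real data.csv file (the column names name and score are made up for illustration):

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for data.csv
csv_text = "ID,name,score\n7,Ana,1.5\n9,Ben,2.0\n"

# index_col="ID" turns the ID column into the row index
df = pd.read_csv(io.StringIO(csv_text), index_col="ID")

print(df.index.tolist())   # [7, 9]
print(list(df.columns))    # ['name', 'score']
```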

2
Q

How do you get summary statistics (count, mean, std, min, 25%, …) from a pandas DataFrame df?

A

df.describe()

3
Q

How do you extract the column ‘z’ from df?

A

df.loc[:, 'z']

4
Q

How do you plot a Series s using the pandas plot method with no further parameters?

A

s.plot()

5
Q

What does each of the following return?
1. df.isna()
2. df.isna().sum()
3. df.isna().sum().sum()
What is the overall result after step 3?

A
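  1. A boolean DataFrame with the same shape as df: True where a cell is NaN, False otherwise.
  2. A Series whose index is the column names of df and whose values are the number of True entries (NaN cells) in each column.
  3. The sum of that Series' values, a single number.
    Overall result: the total count of NaN cells in df.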
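
The three steps can be traced on a small DataFrame. This is only a sketch; the columns a and b are invented for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data with three missing cells in total
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

mask = df.isna()          # step 1: boolean DataFrame, same shape as df
per_column = mask.sum()   # step 2: Series with the NaN count per column
total = per_column.sum()  # step 3: one number for the whole DataFrame

print(per_column.to_dict())  # {'a': 1, 'b': 2}
print(total)                 # 3
```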
6
Q

How do you get the data for user ID 3292879998 as a Series, if the index column is the user ID and the dtype of this column is Int64?

A

df.loc[3292879998]  # a Series if the ID is unique in the index, otherwise a DataFrame
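
A short sketch of how the return type of df.loc depends on whether the label is unique. It assumes a plain integer index rather than the nullable Int64 dtype, and the steps column and user_id name are hypothetical:

```python
import pandas as pd

# Hypothetical data: user 3292879998 appears twice, user 42 once
df = pd.DataFrame(
    {"steps": [100, 200, 300]},
    index=pd.Index([3292879998, 3292879998, 42], name="user_id"),
)

one = df.loc[42]           # label occurs once: a Series (one row)
many = df.loc[3292879998]  # label occurs twice: a DataFrame (two rows)

print(type(one).__name__)   # Series
print(type(many).__name__)  # DataFrame
```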

7
Q

How do you convert the index of a pandas DataFrame df from Float64 to Int64?

A

df.index = df.index.astype('Int64')

8
Q

How do you use a list of keys and a list of values to generate a dict?

A

new_dict = dict(zip(keys, values))
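
A small sketch with made-up lists. Note that zip stops at the shorter input, so surplus keys are silently dropped:

```python
# Hypothetical example lists
keys = ["a", "b", "c"]
values = [1, 2, 3]

new_dict = dict(zip(keys, values))
print(new_dict)  # {'a': 1, 'b': 2, 'c': 3}

# With fewer values than keys, the extra key "c" gets no entry
short = dict(zip(keys, [10, 20]))
print(short)     # {'a': 10, 'b': 20}
```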

9
Q

How do you drop the first row of a pandas DataFrame df if its index label is NaN?

A
# Keep only rows whose index label is not missing; this drops the NaN-indexed first row
df = df.loc[df.index.notna()]
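
A sketch of dropping NaN-indexed rows with a boolean mask on the index, assuming a float index whose first label is NaN (the score column is invented). The mask removes every row with a missing index label, not only the first one:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row has a missing index label
df = pd.DataFrame({"score": [0, 10, 20]}, index=[np.nan, 1.0, 2.0])

# Index.notna() is True for every label that is not missing
cleaned = df.loc[df.index.notna()]

print(cleaned.index.tolist())  # [1.0, 2.0]
```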
10
Q

How do you count the number of attendees if the attendee ID is the index column of df?

A

len(df.index)

11
Q

How do you get all rows of df where gender is neither 'Male' nor 'Female'?

A

df.loc[~df['gender'].isin(['Male', 'Female'])]
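
A sketch on invented data. Because isin matches exact values, missing entries and differently cased strings such as 'male' also end up in the result:

```python
import pandas as pd

# Hypothetical data, including a missing value and a lowercase variant
df = pd.DataFrame({"gender": ["Male", "Female", "Other", None, "male"]})

# ~ negates the boolean mask produced by isin
other = df.loc[~df["gender"].isin(["Male", "Female"])]

print(other.index.tolist())  # [2, 3, 4]
```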

12
Q

How do you replace every cell of df that is True with ‘no’?

A

df.replace(True, 'no', inplace=True)

13
Q

How do you select only col1 and col2 from the rows of df that satisfy a boolean condition bc?

A

df.loc[bc, ['col1', 'col2']]
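
A sketch with an invented DataFrame, where bc is built from a comparison on col1 (col3 and the threshold are assumptions for illustration):

```python
import pandas as pd

# Hypothetical data with one extra column that should be left out
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": ["a", "b", "c"], "col3": [0.1, 0.2, 0.3]}
)

bc = df["col1"] > 1                    # boolean Series: False, True, True
subset = df.loc[bc, ["col1", "col2"]]  # rows where bc is True, two columns only

print(subset["col1"].tolist())  # [2, 3]
print(list(subset.columns))     # ['col1', 'col2']
```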

14
Q

How do you count the rows of a DataFrame df (or of a selection of df)?

A

df.shape[0]

15
Q

How do you count the number of rows for each distinct value of ‘location’ in df?

A

df['location'].value_counts()
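
A sketch with made-up locations. By default value_counts leaves out missing values, so the None entry below is not counted:

```python
import pandas as pd

# Hypothetical data with one missing location
df = pd.DataFrame({"location": ["Berlin", "Paris", "Berlin", None]})

counts = df["location"].value_counts()  # sorted by count, descending

print(counts["Berlin"])  # 2
print(counts["Paris"])   # 1
```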

16
Q

What does pd.cut do?

A

pandas.cut(x, bins) assigns each value of the array or Series x to one of bins equal-width intervals (categories). bins can also be a list specifying the bin edges.
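
A sketch using explicit bin edges. The ages, edges, and labels are invented, and intervals are closed on the right by default, so 18 would fall into the first bin:

```python
import pandas as pd

# Hypothetical ages to be grouped into three labelled ranges
ages = pd.Series([5, 17, 30, 64, 90])

# Edges define the intervals (0, 18], (18, 65], (65, 120]
binned = pd.cut(ages, bins=[0, 18, 65, 120], labels=["child", "adult", "senior"])

print(binned.tolist())  # ['child', 'child', 'adult', 'adult', 'senior']
```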

17
Q

What does pd.qcut do?

A

pandas.qcut(x, q) divides the values of x into q quantile-based bins, that is, intervals that each contain approximately the same number of observations.
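
A sketch with eight invented values split into quartiles. Unlike pd.cut, the bin edges here come from the data, so each bin receives the same number of observations:

```python
import pandas as pd

# Hypothetical values 1..8
values = pd.Series(range(1, 9))

# q=4 requests quartiles; the labels are made up for readability
quartile = pd.qcut(values, q=4, labels=["Q1", "Q2", "Q3", "Q4"])

print(quartile.tolist())
# ['Q1', 'Q1', 'Q2', 'Q2', 'Q3', 'Q3', 'Q4', 'Q4']
```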