Pandas Flashcards
Axis?
Axis 0 = rows
Axis 1 = columns
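A small sketch of what the axis argument does for an aggregation, using a made-up two-column frame. For reductions such as sum, axis=0 collapses the rows (one result per column) and axis=1 collapses the columns (one result per row).

```python
import pandas as pd

# Toy data, invented for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: move down the rows, giving one value per column
col_sums = df.sum(axis=0)

# axis=1: move across the columns, giving one value per row
row_sums = df.sum(axis=1)

print(col_sums)
print(row_sums)
```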
Reorder columns in dataframe
Option 1)
Option 2)
1) df.sort_index(axis=1)
2) df.reindex(columns=sorted(df.columns))
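Both options sort the columns alphabetically and return a new dataframe. A quick sketch on a made-up frame; the last line shows an arbitrary order by selecting with a list of column names.

```python
import pandas as pd

# Toy frame with columns deliberately out of order
df = pd.DataFrame({"c": [1], "a": [2], "b": [3]})

by_sort = df.sort_index(axis=1)                      # option 1
by_reindex = df.reindex(columns=sorted(df.columns))  # option 2

# Any custom order: select with an explicit list of names
custom = df[["b", "c", "a"]]

print(list(by_sort.columns))
print(list(custom.columns))
```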
Explore document
1) first rows
2)
3)
1) df.head()
2) df.info()
3) df.describe()
Reorder dataframe
1) by index
2) by a particular column
1) df.sort_index()
2) df.sort_values(by=column)
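A minimal sketch of both sorts. The city and population values are invented; ascending=False is an optional extra that reverses the order.

```python
import pandas as pd

# Toy data with a shuffled index
df = pd.DataFrame(
    {"city": ["Berlin", "Paris", "Rome"], "population": [3.6, 2.1, 2.8]},
    index=[2, 0, 1],
)

by_index = df.sort_index()  # rows ordered by index label
by_value = df.sort_values(by="population", ascending=False)  # largest first

print(by_index)
print(by_value)
```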
Remove columns from dataframe
df.drop([col1, col2], axis=1)
Filter dataframe by a value in a column?
df[df['column'] < 5]
Filter dataframe by several conditions
df[(df['col1'] > 0) & (df['col2'] == 'Berlin')]
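Boolean masks are combined element-wise with & (and) and | (or), not with the Python keywords and/or, and each condition needs its own parentheses. A sketch with invented column names and values:

```python
import pandas as pd

# Toy data, invented for illustration
df = pd.DataFrame({
    "value": [-1, 3, 7],
    "city": ["Berlin", "Berlin", "Paris"],
})

# Both conditions must hold
both = df[(df["value"] > 0) & (df["city"] == "Berlin")]

# At least one condition must hold
either = df[(df["value"] > 0) | (df["city"] == "Berlin")]

print(both)
print(either)
```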
Filter column in df by list of values
values = ['one', 'two']
df[df['col'].isin(values)]
Filter column values by contains
df[df['col'].str.contains(pattern)]
Unique values in column
df['col'].unique()
Filter column values by does not contain?
df[~df['col'].str.contains('blabla')]
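"Does not contain" is the same mask negated with ~. A sketch on invented strings; na=False is an optional argument that treats missing values as non-matching, which is one way to keep the mask strictly boolean when the column has gaps.

```python
import pandas as pd

# Toy column with one missing value
df = pd.DataFrame({"col": ["blabla one", "two", None, "three blabla"]})

mask = df["col"].str.contains("blabla", na=False)

contains = df[mask]        # rows whose text matches
not_contains = df[~mask]   # everything else, including the missing row

print(contains)
print(not_contains)
```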
Join dataframes
1) merge
2) join
1)
pd.merge(left, right, how='inner', on=None, left_on=None, right_on=None,
…)
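A sketch of merge and join side by side, on two invented frames that share a "key" column. merge matches on columns; DataFrame.join matches on the index by default, so the key is moved into the index first.

```python
import pandas as pd

# Toy frames, invented for illustration
left = pd.DataFrame({"key": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"key": [2, 3, 4], "score": [20, 30, 40]})

# 1) merge: only keys present in both frames
inner = pd.merge(left, right, how="inner", on="key")

# how="outer" keeps every key from either side
outer = pd.merge(left, right, how="outer", on="key")

# 2) join: index-based, keeps all rows of the left frame here
joined = left.set_index("key").join(right.set_index("key"), how="left")

print(inner)
print(outer)
print(joined)
```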
Concatenate datasets
pd.concat([df1, df2], axis=0)
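A sketch of concat in both directions on two invented frames. ignore_index=True is optional; without it the original index labels are kept, so they can repeat.

```python
import pandas as pd

# Toy frames, invented for illustration
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3, 4]})

# axis=0: stack rows on top of each other
stacked = pd.concat([df1, df2], axis=0, ignore_index=True)

# axis=1: place frames side by side, aligned on the index
side_by_side = pd.concat([df1, df2], axis=1)

print(stacked)
print(side_by_side)
```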
Create new columns based on existing columns?
1) 2 conditions
2) 2 conditions
1) ifelse
np.where(condition, 'yes', 'no')
2) ternary expression
df['col'] = df['number'].apply(lambda x:
'more than 5' if x > 5 else '5 or less')
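Both approaches on the same invented column. np.where evaluates the whole column at once, while apply calls the lambda once per value; the column names "flag" and "label" are made up for this sketch.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration
df = pd.DataFrame({"number": [3, 5, 8]})

# 1) vectorised if/else
df["flag"] = np.where(df["number"] > 5, "yes", "no")

# 2) ternary expression applied element by element
df["label"] = df["number"].apply(
    lambda x: "more than 5" if x > 5 else "5 or less"
)

print(df)
```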
Create column based on existing column
1) one column, 3+ conditions
df['ncol'] = df['oldcol'].apply(function)
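With three or more outcomes, a named function passed to apply is usually easier to read than nested ternaries. A sketch where size_label and its thresholds are hypothetical, made up for illustration:

```python
import pandas as pd

# Toy data, invented for illustration
df = pd.DataFrame({"oldcol": [2, 7, 12]})

def size_label(x):
    # Hypothetical helper: three outcomes from one value
    if x < 5:
        return "small"
    elif x < 10:
        return "medium"
    return "large"

df["ncol"] = df["oldcol"].apply(size_label)

print(df)
```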
Create new column based on multiple columns
df['newcol'] = df.apply(lambda row: function(row), axis=1)
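With axis=1 the function receives one row at a time and can read several columns from it. A sketch where total and the column names are hypothetical; for simple arithmetic like this, operating on the columns directly gives the same result without a Python-level loop.

```python
import pandas as pd

# Toy data, invented for illustration
df = pd.DataFrame({"price": [2.0, 5.0], "qty": [3, 4]})

def total(row):
    # Hypothetical helper: combines two columns of a single row
    return row["price"] * row["qty"]

# Row-wise apply
df["newcol"] = df.apply(lambda row: total(row), axis=1)

# Same result computed on whole columns
df["vectorized"] = df["price"] * df["qty"]

print(df)
```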
Convert x to string
1)
2)
1) .astype(str)
2) str(x)
Check missing values in column
df['col'].isnull().any()
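A sketch of the check on an invented frame. any() answers "is there at least one missing value?"; summing the mask instead counts them, shown here as an optional extra for all columns at once.

```python
import pandas as pd

# Toy data with one missing value in column "a"
df = pd.DataFrame({"a": [1, None, 3], "b": ["x", "y", "z"]})

# True if the column has at least one missing value
has_missing = df["a"].isnull().any()

# Number of missing values per column
missing_per_column = df.isnull().sum()

print(has_missing)
print(missing_per_column)
```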