Pandas Flashcards
Pandas
Python library for data analysis
makes it easy to load and manipulate data
integrates with many analysis and visualization libraries
DataFrames
pandas structure for two-dimensional data (rows and columns)
pd.DataFrame()
What method do you use to sort by a series in pandas?
df.sort_values(by='series') (by can also be a list, so you can sort by more than one column)
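A minimal sketch of sorting by one column and then by several. The cat_df frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "calico", "bengal", "tabby"],
    "price": [50, 120, 300, 80],
})

# Sort by a single column (ascending by default)
by_price = cat_df.sort_values(by="price")

# Sort by several columns, with a separate direction for each
by_breed_then_price = cat_df.sort_values(
    by=["breed", "price"], ascending=[True, False]
)
```

sort_values returns a new DataFrame and leaves cat_df unchanged.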
Which method lists the count of each unique value in a series?
value_counts()
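A short sketch of value_counts() on a made-up series of breeds. The result is a Series indexed by the unique values, sorted by count in descending order.

```python
import pandas as pd

# Hypothetical data, for illustration only
breeds = pd.Series(["tabby", "calico", "tabby", "bengal", "tabby"])

# Count how many times each unique value appears
counts = breeds.value_counts()
```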
List the breed of the cat with the highest price
cat_df.loc[cat_df['price'].idxmax(), 'breed']
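One way to look up the breed of the most expensive cat, sketched on made-up data: idxmax() gives the index label of the highest price, and .loc then reads the breed on that row.

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "calico", "bengal"],
    "price": [50, 120, 300],
})

# Index label of the row with the highest price
top_label = cat_df["price"].idxmax()

# Breed stored on that row
top_breed = cat_df.loc[top_label, "breed"]
```

If several cats tie for the highest price, idxmax() returns only the first one.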
List info on cats with breed tabby, calico, or bengal
cat_df.loc[(cat_df['breed'] == 'calico') | (cat_df['breed'] == 'tabby') | (cat_df['breed'] == 'bengal')]
or shorter: cat_df.loc[cat_df['breed'].isin(['tabby', 'calico', 'bengal'])]
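A sketch comparing the chained | conditions with isin() on made-up data. Both select the same rows.

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "siamese", "calico", "bengal", "persian"],
    "price": [50, 200, 120, 300, 250],
})

# Chained conditions: each comparison needs its own parentheses
with_or = cat_df.loc[
    (cat_df["breed"] == "calico")
    | (cat_df["breed"] == "tabby")
    | (cat_df["breed"] == "bengal")
]

# Same filter written with isin()
with_isin = cat_df.loc[cat_df["breed"].isin(["tabby", "calico", "bengal"])]
```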
In a dataset of car companies and prices, find each company's highest-priced car
price = new.groupby('company')['price'].max()
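A sketch of the per-company maximum using groupby, on made-up data. The cars frame and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data, for illustration only
cars = pd.DataFrame({
    "company": ["audi", "audi", "bmw", "bmw", "toyota"],
    "price": [30000, 45000, 40000, 38000, 20000],
})

# Highest price per company (a Series indexed by company)
highest = cars.groupby("company")["price"].max()

# Full rows of each company's most expensive car
top_rows = cars.loc[cars.groupby("company")["price"].idxmax()]
```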
how to do a join in pandas ex) on column 'cat'
pd.merge(df1, df2, on='cat')
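A sketch of joining two made-up frames on a shared 'cat' column. pd.merge does an inner join by default, and the how parameter selects other join types.

```python
import pandas as pd

# Hypothetical data, for illustration only
owners = pd.DataFrame({
    "cat": ["Tom", "Felix", "Salem"],
    "owner": ["Ann", "Bob", "Cy"],
})
breeds = pd.DataFrame({
    "cat": ["Tom", "Felix", "Garfield"],
    "breed": ["tabby", "calico", "persian"],
})

# Inner join (default): only cats present in both frames
inner = pd.merge(owners, breeds, on="cat")

# Left join: keep every row of owners, fill missing breeds with NaN
left = pd.merge(owners, breeds, on="cat", how="left")
```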
how to create a dataframe from a dictionary
pd.DataFrame.from_dict(d), or simply pd.DataFrame(d)
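A sketch of building a DataFrame from a dictionary, using made-up data. By default the keys become columns, and orient="index" treats the keys as row labels instead.

```python
import pandas as pd

# Hypothetical data, for illustration only
data = {"breed": ["tabby", "calico"], "price": [50, 120]}

# Keys become column names
df_cols = pd.DataFrame.from_dict(data)
same = pd.DataFrame(data)  # equivalent for this shape of dict

# Keys become row labels when orient="index"
rows = {
    "tom": {"breed": "tabby", "price": 50},
    "felix": {"breed": "calico", "price": 120},
}
df_rows = pd.DataFrame.from_dict(rows, orient="index")
```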
method to get rid of duplicates
df.drop_duplicates()
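A sketch of drop_duplicates() on made-up data. With no arguments it removes fully identical rows, and subset limits the comparison to chosen columns.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "breed": ["tabby", "tabby", "calico", "tabby"],
    "price": [50, 50, 120, 80],
})

# Drop rows that are identical in every column (row 1 repeats row 0)
deduped = df.drop_duplicates()

# Keep only the first row seen for each breed
one_per_breed = df.drop_duplicates(subset="breed", keep="first")
```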
All the unique items
df['col'].unique(), or df['col'].nunique() for the number of unique items
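A sketch of unique() and nunique() on a made-up series. unique() returns the distinct values in order of first appearance, and nunique() returns how many there are.

```python
import pandas as pd

# Hypothetical data, for illustration only
breeds = pd.Series(["tabby", "calico", "tabby", "bengal"])

# Distinct values, in order of first appearance
unique_breeds = breeds.unique()

# Number of distinct values
n_breeds = breeds.nunique()
```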
combine datasets
make a list of DataFrames, then pd.concat(list_of_dfs)
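A sketch of stacking DataFrames with pd.concat, using made-up data. ignore_index=True renumbers the rows of the result from 0.

```python
import pandas as pd

# Hypothetical data, for illustration only
jan = pd.DataFrame({"breed": ["tabby"], "price": [50]})
feb = pd.DataFrame({"breed": ["calico", "bengal"], "price": [120, 300]})

# Stack the frames on top of each other and renumber the index
combined = pd.concat([jan, feb], ignore_index=True)
```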
in a cat dataframe how to get all info where cat breed = 'tabby' or 'calico'
df.loc[(df['breed'] == 'tabby') | (df['breed'] == 'calico')]
if there is only one condition, the parentheses are not needed
add a row to dataframe
pd.concat([df, new_rows]) (df.append() was removed in pandas 2.0)
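A sketch of two ways to add rows that work in current pandas, on made-up data: pd.concat with a one-row DataFrame, and assignment through .loc with a new index label.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"breed": ["tabby"], "price": [50]})

# Option 1: wrap the new row in a DataFrame and concatenate
new_row = pd.DataFrame([{"breed": "calico", "price": 120}])
df = pd.concat([df, new_row], ignore_index=True)

# Option 2: assign to a new label (assumes a default 0..n-1 index)
df.loc[len(df)] = ["bengal", 300]
```

When adding many rows, collecting them in a list and calling pd.concat once is usually faster than growing the frame row by row.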
iterate through a dataframe
df.iterrows() returns (index, row) pairs for the dataframe ex)
win_percs = []
for i, row in df.iterrows():
    wins = row['w']
    games = row['g']
    win_percs.append(wins / games)
df['win_percentage'] = win_percs
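A runnable sketch of the win-percentage loop on made-up data, next to the vectorized version that gives the same result without iterating. The column names 'w' and 'g' follow the card, and the team values are assumptions.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"team": ["a", "b"], "w": [6, 3], "g": [10, 12]})

# Row-by-row with iterrows(): each step yields (index, row)
win_percs = []
for i, row in df.iterrows():
    win_percs.append(row["w"] / row["g"])
df["win_percentage"] = win_percs

# Vectorized alternative: divide the columns directly
df["win_percentage_vec"] = df["w"] / df["g"]
```

The column arithmetic is usually preferable to iterrows() on large frames, since it avoids a Python-level loop.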