Pandas Flashcards

1
Q

Pandas

A

Python library
makes it easy to load and manipulate data
integrates with many analysis and visualization libraries

2
Q

dataframes

A

pandas structure for two-dimensional, labeled data (rows and columns)

pd.DataFrame()

3
Q

What method do you use to sort by a series in pandas?

A

df.sort_values(by='column') (by can also be a list of column names, so you can sort by more than one column)
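
As a minimal sketch of both forms of `by`, assuming a small made-up `cat_df` (the column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "calico", "bengal", "tabby"],
    "price": [50, 120, 900, 75],
})

# Sort by a single column, highest price first
by_price = cat_df.sort_values(by="price", ascending=False)

# Sort by several columns: breed A-Z, then price high to low within each breed
by_breed_then_price = cat_df.sort_values(
    by=["breed", "price"], ascending=[True, False]
)
```

sort_values returns a new DataFrame; the original is left unchanged unless you reassign it.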

4
Q

which method lists the count of each unique value in a series

A

value_counts()
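
A short sketch of value_counts() on a hypothetical Series of breeds (the data is made up for illustration):

```python
import pandas as pd

# Hypothetical data, for illustration only
breeds = pd.Series(["tabby", "calico", "tabby", "bengal", "tabby", "calico"])

# One row per distinct value, sorted from most to least frequent
counts = breeds.value_counts()

# normalize=True returns proportions instead of raw counts
shares = breeds.value_counts(normalize=True)
```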

5
Q

List the breed of the cat with the highest price

A

cat_df.loc[cat_df['price'].idxmax(), 'breed']
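
One way to do this, sketched on a made-up `cat_df`, is to find the row label of the maximum price with idxmax() and then look up the breed in that row:

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "calico", "bengal"],
    "price": [50, 120, 900],
})

# idxmax() returns the index label of the row holding the largest price
top_label = cat_df["price"].idxmax()

# Use that label to read the breed from the same row
top_breed = cat_df.loc[top_label, "breed"]
```

If several rows tie for the highest price, idxmax() returns the first of them.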

6
Q

List info on cats with breed tabby, calico, or bengal

A

cat_df.loc[(cat_df['breed'] == 'calico') | (cat_df['breed'] == 'tabby') | (cat_df['breed'] == 'bengal')]
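
When the same column is compared against several values, Series.isin() is a more compact alternative to chaining `|`. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical data, for illustration only
cat_df = pd.DataFrame({
    "breed": ["tabby", "siamese", "calico", "bengal", "sphynx"],
    "price": [50, 300, 120, 900, 1500],
})

wanted = ["tabby", "calico", "bengal"]

# isin() builds a boolean mask that is True where breed is one of the wanted values
selected = cat_df.loc[cat_df["breed"].isin(wanted)]
```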

7
Q

In a dataset with car companies and prices, find each company's highest-priced car

A

price = new.groupby('company')['price'].max()
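
A sketch of the group-then-aggregate pattern, assuming a made-up DataFrame named `new` with `company` and `price` columns:

```python
import pandas as pd

# Hypothetical data, for illustration only
new = pd.DataFrame({
    "company": ["audi", "bmw", "audi", "bmw", "toyota"],
    "price": [30000, 41000, 52000, 38000, 21000],
})

# Highest price per company: a Series indexed by company
price = new.groupby("company")["price"].max()

# To keep the full row of each company's most expensive car,
# look up the row labels returned by idxmax()
top_rows = new.loc[new.groupby("company")["price"].idxmax()]
```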

8
Q

how to do a join in pandas, e.g. on column 'cat'

A

pd.merge(df1, df2, on='cat')
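
A small sketch of merging two made-up DataFrames on a shared `cat` column. The default is an inner join; the `how` parameter selects other join types:

```python
import pandas as pd

# Hypothetical data, for illustration only
df1 = pd.DataFrame({"cat": ["tom", "felix", "salem"],
                    "breed": ["tabby", "tuxedo", "bombay"]})
df2 = pd.DataFrame({"cat": ["tom", "salem", "garfield"],
                    "price": [50, 200, 80]})

# Inner join (default): only cats present in both frames
inner = pd.merge(df1, df2, on="cat")

# Left join: every row of df1, with NaN where df2 has no match
left = pd.merge(df1, df2, on="cat", how="left")
```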

9
Q

how to create a dataframe from a dictionary

A

pd.DataFrame.from_dict()
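
A sketch of the two common dictionary layouts, using made-up data. By default the keys become columns; with orient="index" the keys become row labels:

```python
import pandas as pd

# Hypothetical data, for illustration only
by_column = {"breed": ["tabby", "calico"], "price": [50, 120]}
by_row = {
    "cat_1": {"breed": "tabby", "price": 50},
    "cat_2": {"breed": "calico", "price": 120},
}

# Keys become column names (same result as pd.DataFrame(by_column))
df_cols = pd.DataFrame.from_dict(by_column)

# Keys become the row index
df_rows = pd.DataFrame.from_dict(by_row, orient="index")
```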

10
Q

method to get rid of duplicates

A

df.drop_duplicates()
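
drop_duplicates() is a method on a DataFrame (or Series), not a top-level pandas function. A sketch on made-up data, including the `subset` and `keep` parameters:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "breed": ["tabby", "tabby", "calico", "tabby"],
    "price": [50, 50, 120, 75],
})

# Drop rows that are identical in every column
deduped = df.drop_duplicates()

# Treat rows as duplicates when only 'breed' matches, keeping the first of each
one_per_breed = df.drop_duplicates(subset="breed", keep="first")
```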

11
Q

All the unique items

A

df['column'].unique() (unique() is a Series method), or nunique() for the number of unique items
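
A sketch of both calls on made-up data. unique() works on a single column, while nunique() works on a column or on the whole DataFrame:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "breed": ["tabby", "calico", "tabby", "bengal"],
    "price": [50, 120, 50, 900],
})

# Distinct values of one column, in order of first appearance
breeds = df["breed"].unique()

# Number of distinct values in one column
n_breeds = df["breed"].nunique()

# Number of distinct values in every column (returns a Series)
per_column = df.nunique()
```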

12
Q

combine datasets

A

make a list of dataframes, then pd.concat(list_of_dfs)
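
concat is a top-level pandas function that takes the list of DataFrames. A sketch with two made-up frames:

```python
import pandas as pd

# Hypothetical data, for illustration only
df_a = pd.DataFrame({"breed": ["tabby", "calico"], "price": [50, 120]})
df_b = pd.DataFrame({"breed": ["bengal"], "price": [900]})

frames = [df_a, df_b]

# Stack the frames vertically; ignore_index=True renumbers the rows 0..n-1
combined = pd.concat(frames, ignore_index=True)
```

Without ignore_index=True each frame keeps its own row labels, so the combined index would contain 0 twice here.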

13
Q

in a cat dataframe, how to get all info where cat breed = 'tabby' or 'calico'

A

df.loc[(df['breed'] == 'tabby') | (df['breed'] == 'calico')]

with only one condition, the parentheses are not needed
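
The parentheses matter because `|` and `&` bind more tightly than `==` and `>` in Python. A sketch on made-up data, showing a single condition and a combined one:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "breed": ["tabby", "calico", "tabby"],
    "price": [50, 120, 75],
})

# A single condition needs no parentheses
tabbies = df.loc[df["breed"] == "tabby"]

# Combined conditions: wrap each comparison so it is evaluated before & or |
pricey_tabbies = df.loc[(df["breed"] == "tabby") & (df["price"] > 60)]
```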

14
Q

add a row to dataframe

A

pd.concat([df, new_rows_df]) (df.append() was removed in pandas 2.0)
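
DataFrame.append() was deprecated and then removed in pandas 2.0, so on current versions a row is usually added with pd.concat() or by assigning to a new label with .loc. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({"breed": ["tabby"], "price": [50]})

# Option 1: wrap the new row in a one-row DataFrame and concatenate
new_row = pd.DataFrame([{"breed": "calico", "price": 120}])
df = pd.concat([df, new_row], ignore_index=True)

# Option 2: assign a list to the next integer label
# (assumes the index is the default 0..n-1 range)
df.loc[len(df)] = ["bengal", 900]
```

Adding rows one at a time in a loop is slow; collecting the rows in a list and building or concatenating once is generally preferable.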

15
Q

iterate through a dataframe

A
df.iterrows() yields (index, row) pairs, where each row is a Series
ex)
win_percs = []
for i, row in df.iterrows():
    wins = row['w']
    games = row['g']
    win_perc = wins / games
    win_percs.append(win_perc)

df['win_percentage'] = win_percs
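
A runnable sketch of the loop above next to the equivalent column arithmetic, assuming a made-up `baseball_df` with wins in `w` and games played in `g`:

```python
import pandas as pd

# Hypothetical data, for illustration only
baseball_df = pd.DataFrame({"w": [90, 81, 60], "g": [162, 162, 150]})

# Row-by-row version with iterrows()
win_percs = []
for i, row in baseball_df.iterrows():
    win_percs.append(row["w"] / row["g"])
baseball_df["win_perc"] = win_percs

# Vectorized version: the same calculation on whole columns at once
baseball_df["win_perc_vectorized"] = baseball_df["w"] / baseball_df["g"]
```

Both columns hold the same numbers; the vectorized form is shorter and usually much faster on large frames.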

16
Q

grab columns as numpy arrays

A

use .values (or .to_numpy())
run_diffs_np = baseball_df['run'].values - baseball_df['ra'].values
gives a NumPy array that you can assign back to the dataframe as a new column
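
A sketch of the round trip from columns to NumPy and back, on a made-up `baseball_df` with `run` and `ra` columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only
baseball_df = pd.DataFrame({"run": [800, 700], "ra": [650, 720]})

# .values (or .to_numpy()) exposes each column as a NumPy array
run_diffs_np = baseball_df["run"].values - baseball_df["ra"].values

# Assign the array back as a new column; its length must match the row count
baseball_df["run_diff"] = run_diffs_np
```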

17
Q

identify na values when creating dataframe

A

na_values argument in read_csv
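
A sketch of na_values, reading a made-up CSV from an in-memory string so the example is self-contained. The placeholder strings "missing" and "-" are assumptions for illustration:

```python
import io

import pandas as pd

# Hypothetical CSV content, for illustration only
csv_text = "breed,price\ntabby,50\ncalico,missing\nbengal,-\n"

# Treat the listed strings as NaN while parsing
df = pd.read_csv(io.StringIO(csv_text), na_values=["missing", "-"])
```

With the placeholders recognized as missing, the price column is parsed as numeric instead of as strings.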

18
Q

replace na values

A

df.fillna()
df.fillna(0) - fills every NaN with zero
values = {"A": 0, "B": 1, "C": 2, "D": 3}
df.fillna(value=values) - fills NaNs in each column with the value given for that column
df.fillna(value=values, limit=1) - replaces only the first NaN in each column
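
The three calls can be compared on a small made-up frame with two columns:

```python
import numpy as np
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    "A": [np.nan, 1.0, np.nan],
    "B": [np.nan, np.nan, 5.0],
})
values = {"A": 0, "B": 1}

# Every NaN becomes 0
filled_zero = df.fillna(0)

# NaNs in A become 0, NaNs in B become 1
filled_per_column = df.fillna(value=values)

# Same mapping, but at most one NaN is filled per column
filled_limited = df.fillna(value=values, limit=1)
```

fillna() returns a new DataFrame, so `df` itself still contains its NaNs afterwards.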