Pandas Flashcards

1
Q

To combine DataFrames loaded from scraped CSV tables side by side in pandas

A

DataframeName = pd.concat([df, df2, df3], axis=1, sort=False)
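
A minimal sketch of what this call does, assuming three small hypothetical frames in place of the scraped tables (the column names and values are invented for illustration):

```python
import pandas as pd

# Hypothetical frames standing in for three scraped CSV tables.
df = pd.DataFrame({"name": ["a", "b"]})
df2 = pd.DataFrame({"price": [1.0, 2.0]})
df3 = pd.DataFrame({"qty": [10, 20]})

# axis=1 aligns rows on the index and places the columns side by side.
combined = pd.concat([df, df2, df3], axis=1, sort=False)
```

With the default axis=0 the frames would be stacked on top of each other instead.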

2
Q

To find if a column has duplicate values

A

df['column_name'].duplicated()

3
Q

To display the duplicate rows

A

df.loc[df.duplicated(), :]

4
Q

To mark duplicates except for the first occurrence

A

df.loc[df.duplicated(keep='first'), :]

5
Q

To mark duplicates except for the last occurrence

A

df.loc[df.duplicated(keep='last'), :]

6
Q

To mark all duplicates as True (all will be displayed)

A

df.loc[df.duplicated(keep = False), :]
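
The three keep settings can be compared on one small frame. This is a sketch with invented data, not taken from the deck:

```python
import pandas as pd

# Rows 0, 2 and 3 are identical; row 1 is unique.
df = pd.DataFrame({"city": ["NY", "LA", "NY", "NY"], "n": [1, 2, 1, 1]})

first = df.duplicated(keep="first").tolist()  # [False, False, True, True]
last = df.duplicated(keep="last").tolist()    # [True, False, True, False]
every = df.duplicated(keep=False).tolist()    # [True, False, True, True]

# Using the mask inside .loc displays only the rows marked True.
all_copies = df.loc[df.duplicated(keep=False), :]
```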

7
Q

To drop duplicates from the data frame

A

df.drop_duplicates(keep='first').shape
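
A short sketch with invented data showing that drop_duplicates returns a new frame and leaves the original untouched; .shape then reports the size after dropping:

```python
import pandas as pd

# Rows 0 and 2 are identical.
df = pd.DataFrame({"city": ["NY", "LA", "NY"], "n": [1, 2, 1]})

# Keep the first copy of each duplicated row; df itself is not modified.
deduped = df.drop_duplicates(keep="first")

shape_before = df.shape      # (3, 2)
shape_after = deduped.shape  # (2, 2)
```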

8
Q

Save file in Numpy

Load file in Numpy

A

arr = np.arange(10)

np.save('file_name', arr)
np.load('file_name.npy')
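
A runnable round trip, sketched with a temporary directory so nothing is left on disk (the directory handling is an addition for illustration, not part of the card):

```python
import os
import tempfile

import numpy as np

arr = np.arange(10)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file_name")  # hypothetical file name
    np.save(path, arr)                     # np.save appends the .npy extension
    loaded = np.load(path + ".npy")        # so the extension is needed when loading
```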

9
Q

Drop row(s)

A

DataframeName.drop(['row name', 'row name'])

10
Q

Transpose DataFrame (swap rows and columns)

A

DataframeName.T

11
Q

Add 2 DataFrames, treating a value missing from one of them as 0 so that rows and columns that don't match keep their values.

A

df1.add(df2, fill_value = 0)
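
A sketch with two invented frames that overlap only partly. Note that a cell missing from both frames still comes out as NaN:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"A": [1.0, 2.0]}, index=["x", "y"])
df2 = pd.DataFrame({"A": [10.0], "B": [5.0]}, index=["y"])

plain = df1 + df2                    # any cell missing from either frame becomes NaN
filled = df1.add(df2, fill_value=0)  # a cell missing from one frame counts as 0

# filled.loc["x", "A"] is 1.0 (only df1 has it)
# filled.loc["y", "B"] is 5.0 (only df2 has it)
# filled.loc["x", "B"] is NaN (neither frame has it)
```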

12
Q

Creating a DataFrame from 12 values arranged in 4 rows and 3 columns, with A, B, C as column names and 4 states as the index.

A

df = pd.DataFrame(np.arange(12.).reshape((4, 3)), columns=list('ABC'), index=['New York', 'Florida', 'California', 'Nevada'])

13
Q

Create your own DataFrame (every column must have the same number of values)

A

df = pd.DataFrame({'A': [0, 1, 2, 3], 'B': [5, 6, 7, 8], 'C': [9, 10, 11, 12]})

14
Q

Load csv into Pandas and create header row

A

pd.read_csv('examples/ex2.csv', names=['a', 'b', 'c', 'd', 'message'])
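
A sketch of the same call on in-memory text, assuming a headerless file with five fields per line (the sample rows are invented; examples/ex2.csv itself is not reproduced here):

```python
import io

import pandas as pd

# Hypothetical headerless CSV content, used in place of a file on disk.
raw = io.StringIO("1,2,3,4,hello\n5,6,7,8,world\n")

# Passing names supplies the header row, so the first line is read as data.
data = pd.read_csv(raw, names=["a", "b", "c", "d", "message"])
```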

15
Q

To check for duplicate rows

A

data.duplicated()

16
Q

To drop duplicates

A

data.drop_duplicates()

17
Q

Add mean/average column for averages

A

DataframeName['Mean'] = DataframeName.mean(numeric_only=True, axis=1)
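
A sketch with an invented scores table. axis=1 averages across each row, and numeric_only=True skips the text column:

```python
import pandas as pd

# Hypothetical data: one text column and two numeric columns.
scores = pd.DataFrame({"name": ["ann", "bob"], "t1": [80, 90], "t2": [100, 70]})

# Row-wise mean over the numeric columns only.
scores["Mean"] = scores.mean(numeric_only=True, axis=1)
```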

18
Q

Suppressing scientific notation

A

pd.set_option('display.float_format', '{:.2f}'.format)

19
Q

Suppress scientific notation and format with dollar sign and commas

A

pd.set_option('display.float_format', '${:,.2f}'.format)
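
A sketch of the effect on a printed Series. It uses pd.option_context rather than pd.set_option so the setting applies only inside the with block; that swap is a choice made for this example:

```python
import pandas as pd

# The option expects a function that turns one float into a string.
fmt = "${:,.2f}".format

with pd.option_context("display.float_format", fmt):
    text = str(pd.Series([1234567.891]))  # rendered with the dollar format
```

pd.reset_option('display.float_format') restores the default after a set_option call.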

20
Q

To view rows or select specific row and all columns

A

DataFrameName.loc['Row_Name', :]
.loc[which rows do I want, which columns do I want]
: = all columns

21
Q

To view multiple rows

A

DataFrameName.loc[['Row_Name', 'Row_Name', 'Row_Name'], :]

22
Q

To select a column and all rows

A

df9.loc[:, 'column_name']

23
Q

To select multiple columns and all rows

A

df9.loc[:, ['column_name', 'column_name']]
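
The four .loc patterns from cards 20 to 23 side by side, sketched on an invented frame (row labels r1 to r3 and columns A to C are hypothetical):

```python
import pandas as pd

df9 = pd.DataFrame(
    {"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]},
    index=["r1", "r2", "r3"],
)

one_row = df9.loc["r1", :]            # single label: returns a Series
some_rows = df9.loc[["r1", "r3"], :]  # list of labels: returns a DataFrame
one_col = df9.loc[:, "A"]             # single column: returns a Series
some_cols = df9.loc[:, ["A", "C"]]    # list of columns: returns a DataFrame
```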

24
Q

To arrange a list into an array (convert it to a NumPy array first; a plain list has no reshape)

A

np.array(ListName).reshape((number_of_rows, number_of_columns))
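
A sketch with an invented six-item list. The row and column counts must multiply to the length of the list:

```python
import numpy as np

values = [0, 1, 2, 3, 4, 5]  # hypothetical list

# Convert to an array, then arrange it as 2 rows by 3 columns.
grid = np.array(values).reshape((2, 3))
```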