Pandas Flashcards

1
Q

Drop Columns

A

df.drop(columns=['Column1', 'Column2'])

2
Q

Pivots

A

df.pivot(columns='var', values='val')
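A minimal sketch of what the pivot does, assuming a small made-up long-format table (the frame `long_df` and its `date` column are illustrative and not part of the card):

```python
import pandas as pd

# Hypothetical long-format data: one row per (date, var) pair.
long_df = pd.DataFrame({
    "date": ["d1", "d1", "d2", "d2"],
    "var": ["x", "y", "x", "y"],
    "val": [1, 2, 3, 4],
})

# Spread the distinct values of 'var' into columns; 'date' becomes the index.
wide_df = long_df.pivot(index="date", columns="var", values="val")
```

Each distinct value of `var` becomes its own column, so `wide_df` has one row per date and the columns `x` and `y`.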

3
Q

Sort

A

df.sort_values('column1')

Order rows by values of a column (low to high).

4
Q

Rename Columns

A

df.rename(columns={'y': 'year'})

Rename the columns of a DataFrame

5
Q

Head

A

df.head(n)

Select first n rows

6
Q

Tail

A

df.tail(n)

Select last n rows

7
Q

Using Query

A
query() allows Boolean expressions for filtering rows.
df.query('Length > 7')
df.query('Length > 7 and Width < 8')
df.query('Name.str.startswith("abc")', engine="python")
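As a rough illustration, the three queries can be run against a small invented frame (the data values are assumptions; only the column names come from the card). The last line shows what is presumably the equivalent boolean-mask form:

```python
import pandas as pd

# Hypothetical data using the column names from the card.
df = pd.DataFrame({
    "Name": ["abc1", "xyz", "abc2"],
    "Length": [8, 9, 5],
    "Width": [3, 10, 2],
})

long_rows = df.query("Length > 7")
long_narrow = df.query("Length > 7 and Width < 8")
abc_rows = df.query('Name.str.startswith("abc")', engine="python")

# The same filter written with a boolean mask instead of query().
mask_rows = df[(df["Length"] > 7) & (df["Width"] < 8)]
```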
8
Q

Select rows at positions 10 through 19 (the slice end, 20, is excluded).

A

df.iloc[10:20]
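A short sketch of the slice bounds, assuming a 30-row frame with the default integer index (the frame is invented for illustration). Like any Python slice, `iloc[10:20]` includes position 10 and stops before position 20:

```python
import pandas as pd

# Hypothetical frame with 30 rows and the default index 0..29.
df = pd.DataFrame({"n": range(100, 130)})

subset = df.iloc[10:20]  # positions 10, 11, ..., 19
```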

9
Q

Select columns in positions 1, 2 and 5 (first column is 0).

A

df.iloc[:, [1, 2, 5]]

10
Q

Access single value by integer position

A

df.iat[1, 2]

11
Q

Access single value by label

A

df.at[4, 'A']
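To contrast label-based access with position-based access, here is a small sketch on a made-up frame whose row labels (4, 5, 6) deliberately differ from the row positions (0, 1, 2):

```python
import pandas as pd

# Hypothetical frame; the row labels are not the same as the row positions.
df = pd.DataFrame(
    {"A": [10, 20, 30], "B": [1, 2, 3], "C": [7, 8, 9]},
    index=[4, 5, 6],
)

by_label = df.at[4, "A"]     # row labeled 4, column 'A'
by_position = df.iat[1, 2]   # second row, third column
```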

12
Q

Select rows meeting a logical condition, and only the specified columns.

A

df.loc[df['a'] > 10, ['a', 'c']]
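A minimal sketch of the row-and-column selection, assuming an invented three-row frame (the values and the extra column `b` are illustrative only):

```python
import pandas as pd

# Hypothetical data; only columns 'a' and 'c' are named on the card.
df = pd.DataFrame({"a": [5, 12, 20], "b": [1, 2, 3], "c": ["x", "y", "z"]})

# Rows where a > 10, restricted to columns 'a' and 'c'.
result = df.loc[df["a"] > 10, ["a", "c"]]
```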

13
Q

Append rows of DataFrames

A

pd.concat([df1,df2])

14
Q

Append columns of DataFrames

A

pd.concat([df1,df2], axis=1)
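The two concat directions can be sketched on tiny invented frames. `ignore_index=True` is a real `pd.concat` option, included here only as an optional extra that renumbers the stacked rows:

```python
import pandas as pd

# Hypothetical two-row frames.
df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3, 4]})

stacked = pd.concat([df1, df2])                       # rows appended, index 0,1,0,1
renumbered = pd.concat([df1, df2], ignore_index=True)  # rows appended, index 0..3
side_by_side = pd.concat([df1, df2.rename(columns={"a": "b"})], axis=1)
```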

15
Q

Gather columns into rows.

A

pd.melt(df)
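A small sketch of melting, using an invented wide frame. With no arguments every column is gathered; passing `id_vars` (a real `pd.melt` parameter, shown here as an optional variation) keeps an identifier column in place:

```python
import pandas as pd

# Hypothetical wide-format data.
wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

long_all = pd.melt(wide)               # all three columns gathered into rows
long_id = pd.melt(wide, id_vars="id")  # 'id' kept, 'x' and 'y' gathered
```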

16
Q

Logic in Python (and pandas)

A
<   Less than
>   Greater than
==  Equals
<=  Less than or equals
>=  Greater than or equals
!=  Not equal to
df.column.isin(values)  Group membership
pd.isnull(obj)  Is NaN
pd.notnull(obj)  Is not NaN
&, |, ~, ^, df.any(), df.all()  Logical and, or, not, xor, any, all
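A brief sketch of combining these operators into boolean masks, on an invented frame. Each comparison is wrapped in parentheses because `&`, `|` and `~` bind more tightly than the comparison operators:

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame({"a": [1, 5, np.nan, 8], "b": ["x", "y", "z", "x"]})

both = (df["a"] > 2) & df["b"].isin(["x", "y"])  # and
not_missing = pd.notnull(df["a"])
either = (df["a"] < 2) | ~not_missing            # or, not
```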
17
Q

Count number of rows with each unique value of variable

A

df['w'].value_counts()

18
Q

Tuple of # of rows, # of columns in DataFrame.

A

df.shape

19
Q

# of distinct values in a column.

A

df['w'].nunique()

20
Q

Basic descriptive statistics for each column (or GroupBy).

A

df.describe()

21
Q

Drop rows with any column having NA/null data.

A

df.dropna()

22
Q

Replace all NA/null data with value

A

df.fillna(value)

23
Q

Compute and append one or more new columns.

A

df.assign(Area=lambda df: df.Length*df.Height)
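A minimal sketch of `assign`, assuming a made-up frame with the `Length` and `Height` columns from the card. Note that `assign` returns a new DataFrame and leaves the original untouched:

```python
import pandas as pd

# Hypothetical measurements.
df = pd.DataFrame({"Length": [2, 3], "Height": [4, 5]})

# The lambda receives the frame and returns the values for the new column.
with_area = df.assign(Area=lambda d: d.Length * d.Height)
```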

24
Q

Return a GroupBy object, grouped by values in column named "col"

A

df.groupby(by="col")

size()
Size of each group.
agg(function)
Aggregate group using function
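The grouping and the two methods listed above can be sketched as follows, on an invented frame (the `val` column and its values are assumptions):

```python
import pandas as pd

# Hypothetical data: two rows in group 'a', one row in group 'b'.
df = pd.DataFrame({"col": ["a", "b", "a"], "val": [1, 2, 3]})

grouped = df.groupby(by="col")
sizes = grouped.size()       # number of rows per group
totals = grouped.agg("sum")  # one aggregated row per group
```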

25
Q

Histogram for each column

A

df.plot.hist()

26
Q

Scatter chart using pairs of points

A

df.plot.scatter(x='w', y='h')

27
Q

Merge:

Join matching rows from bdf to adf. (Left join)

A

pd.merge(adf, bdf, how='left', on='x1')

28
Q

Merge:

Join matching rows from adf to bdf. (Right join)

A

pd.merge(adf, bdf, how='right', on='x1')

29
Q

Merge:

Join data. Retain only rows in both sets.

A

pd.merge(adf, bdf, how='inner', on='x1')

30
Q

Merge:

Join data. Retain all values, all rows.

A

pd.merge(adf, bdf, how='outer', on='x1')
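To compare the four `how` options side by side, here is a sketch on two invented frames that share the keys A and B, with C appearing only in `adf` and D only in `bdf` (the columns `x2` and `x3` are assumptions):

```python
import pandas as pd

# Hypothetical frames keyed on 'x1'.
adf = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
bdf = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

left = pd.merge(adf, bdf, how="left", on="x1")    # keys from adf: A, B, C
right = pd.merge(adf, bdf, how="right", on="x1")  # keys from bdf: A, B, D
inner = pd.merge(adf, bdf, how="inner", on="x1")  # keys in both: A, B
outer = pd.merge(adf, bdf, how="outer", on="x1")  # keys in either: A, B, C, D
```

Rows without a match on the other side are kept in the left, right and outer joins, with the missing columns filled with NaN.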

31
Q

Filtering Joins:

All rows in adf that have a match in bdf.

A

adf[adf.x1.isin(bdf.x1)]

32
Q

Filtering Joins:

All rows in adf that do not have a match in bdf.

A

adf[~adf.x1.isin(bdf.x1)]

33
Q

Set-like Operations:

A

pd.merge(ydf, zdf)
Rows that appear in both ydf and zdf
(Intersection).

pd.merge(ydf, zdf, how='outer')
Rows that appear in either or both ydf and zdf
(Union).

(pd.merge(ydf, zdf, how='outer', indicator=True)
   .query('_merge == "left_only"')
   .drop(columns=['_merge']))
Rows that appear in ydf but not zdf (Setdiff).
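The three set-like operations can be tried on two invented frames with identical columns (the data is an assumption). With no `on` argument, `pd.merge` joins on all shared columns, which is what makes whole rows behave like set elements here:

```python
import pandas as pd

# Hypothetical frames with the same columns; rows B and C are shared.
ydf = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
zdf = pd.DataFrame({"x1": ["B", "C", "D"], "x2": [2, 3, 4]})

intersection = pd.merge(ydf, zdf)
union = pd.merge(ydf, zdf, how="outer")
setdiff = (
    pd.merge(ydf, zdf, how="outer", indicator=True)
    .query('_merge == "left_only"')
    .drop(columns=["_merge"])
)
```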
34
Q

Creating Dataframes:

Specify values for each column.

A
df = pd.DataFrame(
    {"a": [4, 5, 6],
     "b": [7, 8, 9],
     "c": [10, 11, 12]},
    index=[1, 2, 3])
35
Q

Creating Dataframes:

Specify values for each row.

A
df = pd.DataFrame(
    [[4, 7, 10],
     [5, 8, 11],
     [6, 9, 12]],
    index=[1, 2, 3],
    columns=['a', 'b', 'c'])
36
Q

Creating Dataframes:

Create DataFrame with a MultiIndex

A
df = pd.DataFrame(
    {"a": [4, 5, 6],
     "b": [7, 8, 9],
     "c": [10, 11, 12]},
    index=pd.MultiIndex.from_tuples(
        [('d', 1), ('d', 2), ('e', 2)],
        names=['n', 'v']))
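As a hedged illustration of why a MultiIndex is useful, the sketch below rebuilds the frame from the card and selects from it by one or both index levels (the selections themselves are additions, not part of the card):

```python
import pandas as pd

df = pd.DataFrame(
    {"a": [4, 5, 6], "b": [7, 8, 9], "c": [10, 11, 12]},
    index=pd.MultiIndex.from_tuples(
        [("d", 1), ("d", 2), ("e", 2)], names=["n", "v"]
    ),
)

d_rows = df.loc["d"]              # all rows with n == 'd', indexed by v
one_cell = df.loc[("e", 2), "b"]  # a single value addressed by both levels
v2_rows = df.xs(2, level="v")     # all rows with v == 2, indexed by n
```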
37
Q

Read csv with Pandas

A

pd.read_csv("ruta.csv", index_col=False)

38
Q

Alternative way of creating a pivot table

A

pd.pivot_table(df, values=0, index=['col1'], columns=['col2'], aggfunc='sum')
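A small sketch of `pivot_table` on invented data (the value column is named `val` here purely for illustration). Unlike `df.pivot`, it aggregates duplicate index/column pairs rather than raising an error:

```python
import pandas as pd

# Hypothetical data; the pair (r1, c1) appears twice and will be summed.
df = pd.DataFrame({
    "col1": ["r1", "r1", "r2", "r2"],
    "col2": ["c1", "c1", "c2", "c1"],
    "val": [1, 2, 3, 4],
})

table = pd.pivot_table(
    df, values="val", index=["col1"], columns=["col2"], aggfunc="sum"
)
```

Combinations that never occur in the data, such as (r1, c2) here, come out as NaN.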

39
Q

Save df as csv

A

df.to_csv("filename", index=False)
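To see the effect of `index=False` without touching the disk, this sketch writes to an in-memory buffer instead of a file (the buffer and the sample frame are assumptions for illustration) and reads the result back:

```python
import io

import pandas as pd

# Hypothetical frame to save.
df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # no index column is written
csv_text = buffer.getvalue()

buffer.seek(0)
restored = pd.read_csv(buffer)
```

Because the index was not written, the header line contains only the data columns and the frame reads back with a fresh default index.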

40
Q

Save as XLSX

A

with pd.ExcelWriter("file_name.xlsx") as writer:
    df.to_excel(writer, sheet_name="name", index=False)
    df2.to_excel(writer, sheet_name="name2", index=False)