Pandas Flashcards

Review key Pandas concepts

1
Q

What are the commands to load and save CSV files?

A

pd.read_csv(path) and DataFrame.to_csv(path, index=False)
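
A minimal round-trip sketch of the two calls. The sample data is hypothetical, and an in-memory buffer stands in for a file path so the example runs without touching disk.

```python
import io
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"name": ["Ann", "Bob"], "score": [90, 85]})

# to_csv is a DataFrame method; index=False leaves the row index out of the output
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# read_csv is a top-level pandas function; rewind the buffer before reading it back
buffer.seek(0)
loaded = pd.read_csv(buffer)
```

With a real file, the buffer would simply be replaced by a path string in both calls.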

2
Q

Name the three methods to combine DataFrames.

A

pd.concat([DataFrame1, DataFrame2], axis=0/1)
pd.merge(DataFrame1, DataFrame2, how='outer'/'inner'/'right'/'left', on=column_name)
DataFrame1.join(DataFrame2, how='outer'/'inner'/'right'/'left')
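
The three calls can be compared side by side on two small frames. This is only a sketch; the frames and the "key" column are hypothetical.

```python
import pandas as pd

# Hypothetical frames sharing a "key" column
left = pd.DataFrame({"key": ["a", "b"], "x": [1, 2]})
right = pd.DataFrame({"key": ["b", "c"], "y": [3, 4]})

# concat stacks the frames; axis=0 appends rows and takes the union of columns
stacked = pd.concat([left, right], axis=0)

# merge matches rows on a column; an inner merge keeps only keys present in both
merged = pd.merge(left, right, how="inner", on="key")

# join matches rows on the index, so "key" is moved into the index first
joined = left.set_index("key").join(right.set_index("key"), how="outer")
```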

3
Q

What is the key difference (other than syntax) between merge and join?

A

merge matches rows on one or more columns by default (it can also use the index via left_index/right_index).
join matches rows on the index by default.
(Note: DataFrame.set_index(column_name) turns an existing column into the index.)
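
As a rough illustration, the same lookup can be written either way once the shared column has been moved into the index. The frames and the "ticker" column are made up for this sketch.

```python
import pandas as pd

# Hypothetical frames that share a "ticker" column
prices = pd.DataFrame({"ticker": ["AAA", "BBB"], "price": [10.0, 20.0]})
names = pd.DataFrame({"ticker": ["AAA", "BBB"], "name": ["Alpha", "Beta"]})

# merge: match rows on the "ticker" column
by_column = prices.merge(names, on="ticker")

# join: match rows on the index, so set "ticker" as the index on both sides first
by_index = prices.set_index("ticker").join(names.set_index("ticker"))
```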

4
Q

What does the inplace=True/False parameter do?

A

With inplace=True, the method modifies the DataFrame itself and returns None. With inplace=False (the default), it returns a new object and leaves the original untouched.
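
A short sketch of both behaviours, using sort_values as one example of a method that accepts inplace. The data is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"a": [3, 1, 2]})

# Default (inplace=False): a sorted copy is returned, df keeps its order
sorted_copy = df.sort_values("a")
before = df["a"].tolist()

# inplace=True: df itself is reordered and the call returns None
result = df.sort_values("a", inplace=True)
after = df["a"].tolist()
```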

5
Q

When the axis parameter is used, which value refers to rows and which to columns?

A

axis=0 refers to the rows (the index)
axis=1 refers to the columns
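
A small sketch of how the two axis values behave in practice, on hypothetical data. For a reduction such as sum, the named axis is the one that gets collapsed.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [10, 20]})

# axis=0 works down the rows, giving one result per column
col_totals = df.sum(axis=0)

# axis=1 works across the columns, giving one result per row
row_totals = df.sum(axis=1)

# For drop, axis=1 means the label names a column rather than a row
trimmed = df.drop("b", axis=1)
```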

6
Q

Describe the difference between the loc and iloc methods.

A

loc selects by index labels, meaning the names of rows and columns (which can be numbers, but don’t have to be). iloc selects by integer position, counting from 0, regardless of the labels.
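
The same selection written both ways, as a sketch on a hypothetical frame whose row labels are strings so that labels and positions are easy to tell apart.

```python
import pandas as pd

df = pd.DataFrame(
    {"date": ["2024-01-01", "2024-01-02", "2024-01-03"],
     "stock": ["AAA", "BBB", "CCC"]},
    index=["x", "y", "z"],
)

# loc: row and column labels
by_label = df.loc[["y", "z"], ["date", "stock"]]

# iloc: row and column positions, counted from 0
by_position = df.iloc[[1, 2], [0, 1]]
```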

7
Q

What would DataFrame.iloc[[1,2], ['date', 'stock']] return?

A

An IndexError. The iloc method requires integer positions and raises an error if labels are given instead.
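
A sketch that triggers the error on a hypothetical frame and captures it, then shows the positional form that works.

```python
import pandas as pd

df = pd.DataFrame(
    {"date": ["2024-01-01", "2024-01-02", "2024-01-03"],
     "stock": ["AAA", "BBB", "CCC"]}
)

# Column labels are not valid positions, so iloc rejects them
raised = False
try:
    df.iloc[[1, 2], ["date", "stock"]]
except IndexError:
    raised = True

# The positional equivalent uses column numbers instead
fixed = df.iloc[[1, 2], [0, 1]]
```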

8
Q

Explain the groupby() method.

A

DataFrame.groupby() groups rows that share the same value in a given column (or columns). An aggregation such as mean or median can then be computed for the other columns within each group.
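
For instance, a per-group mean might look like this. The "team" and "points" columns are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"team": ["red", "blue", "red", "blue"],
     "points": [10, 20, 30, 40]}
)

# Group rows by team, then average the points within each group
mean_points = df.groupby("team")["points"].mean()
```

The result is a Series indexed by team name, with one mean per group.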

9
Q

What method can you use to execute a custom function across the entirety of a DataFrame?

A

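DataFrame[column_name].apply(function) runs the function on every value in one column.
DataFrame.apply(function, axis=0/1) runs it on each column (axis=0) or each row (axis=1) of the whole DataFrame.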
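
A sketch of apply on one column and on a whole frame. The data and the double helper are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"price": [1.0, 2.0], "qty": [3, 4]})

def double(value):
    """Hypothetical custom function used for illustration."""
    return value * 2

# On a single column, the function receives one value at a time
doubled_price = df["price"].apply(double)

# On the whole frame with axis=1, the function receives one row at a time
row_total = df.apply(lambda row: row["price"] * row["qty"], axis=1)

# With axis=0, the function receives one column at a time
col_range = df.apply(lambda col: col.max() - col.min(), axis=0)
```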

10
Q

What does strftime stand for?

A

String Format Time
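
As a brief sketch, strftime turns a date or time into text according to a format string. The dates below are made up, and only numeric format codes are used so the output does not depend on locale.

```python
from datetime import datetime
import pandas as pd

# Standard library: format a single datetime
stamp = datetime(2024, 3, 5, 14, 30)
text = stamp.strftime("%Y-%m-%d %H:%M")

# pandas: the same format codes work on a datetime column via .dt.strftime
dates = pd.Series(pd.to_datetime(["2024-03-05", "2024-12-25"]))
labels = dates.dt.strftime("%d/%m/%Y")
```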

11
Q

Describe how to format a loc call.

A

Write the DataFrame name followed by .loc, then put the selectors in square brackets. Inside those brackets go two lists separated by a comma: the first lists the row labels to select, the second lists the column labels.
e.g. DataFrame.loc[[row1, row2, row3], ['col1', 'col2', 'col3']]
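
A runnable sketch of the list form, plus two common variations (a single label and a label slice). The row and column names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [4, 5, 6], "col3": [7, 8, 9]},
    index=["row1", "row2", "row3"],
)

# Lists of row labels and column labels
subset = df.loc[["row1", "row3"], ["col1", "col3"]]

# A single row label and column label returns one value
single = df.loc["row2", "col2"]

# Label slices include both end points
sliced = df.loc["row1":"row2", "col1":"col2"]
```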

12
Q

Write an example of a conditional DataFrame call.

A

DataFrame.loc[DataFrame['column'] > x]
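
A sketch of the same filter on hypothetical data. The comparison produces a boolean mask, and loc keeps the rows where the mask is True.

```python
import pandas as pd

df = pd.DataFrame({"column": [5, 15, 25], "label": ["a", "b", "c"]})
x = 10

# The comparison yields one True/False per row
mask = df["column"] > x
filtered = df.loc[mask]

# Conditions combine with & and |, each wrapped in parentheses
combined = df.loc[(df["column"] > x) & (df["label"] != "c")]
```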

13
Q

Execute the .value_counts() function on the 'Name' column of a DataFrame so that it returns each distinct value's share of the whole set rather than its raw count.

A

DataFrame['Name'].value_counts(normalize=True)
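
A sketch comparing raw counts with normalized shares, on a hypothetical 'Name' column.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ann", "Bob", "Ann", "Ann"]})

# Raw counts per distinct value
counts = df["Name"].value_counts()

# normalize=True divides each count by the total, so the shares sum to 1
shares = df["Name"].value_counts(normalize=True)
```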

14
Q

Extract month from the date column in the data DataFrame.

A

data['date'].dt.month
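
The .dt accessor only works on datetime columns, so this sketch assumes the dates start out as strings and converts them first. The sample dates are hypothetical.

```python
import pandas as pd

data = pd.DataFrame({"date": ["2024-01-15", "2024-07-04"]})

# Convert text to datetime so the .dt accessor is available
data["date"] = pd.to_datetime(data["date"])

# Month number, 1 through 12
data["month"] = data["date"].dt.month
```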

15
Q

Explain what the .map() function does.

A

Series.map() replaces each value in a Series with its counterpart from a dictionary or another Series, or with the result of calling a function on it. Values with no match in the dictionary or Series become NaN.
import pandas as pd

# Sample Series
s = pd.Series(['apple', 'banana', 'cherry', 'date'])

# Mapping dictionary
fruit_codes = {'apple': 1, 'banana': 2, 'cherry': 3}

# Apply the mapping
coded_s = s.map(fruit_codes)

print(coded_s)
# Expected output
# 0    1.0
# 1    2.0
# 2    3.0
# 3    NaN
# dtype: float64
