Pandas Flashcards

1
Q

If you pass a dictionary to the data parameter of pd.Series and don’t pass anything to the index parameter,

A

then pandas uses the dictionary keys as the index labels.
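
A minimal sketch of this behaviour, using made-up data (the `prices` dictionary and its keys are only for illustration). It also shows what happens when an index is passed alongside a dictionary: pandas looks the labels up among the keys.

```python
import pandas as pd

# Hypothetical data: with no index argument, the keys become the index labels.
prices = {"apple": 1.2, "banana": 0.5, "cherry": 3.0}
ser = pd.Series(prices)
print(ser.index.tolist())

# With an explicit index, values are looked up by key;
# labels that are not keys of the dictionary get NaN.
ser_subset = pd.Series(prices, index=["banana", "durian"])
print(ser_subset)
```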

2
Q

How to import pandas and check the version?

A

import pandas as pd

print(pd.__version__)

pd.show_versions(as_json=True)

3
Q

Create a pandas series from each of the items below: a list, a NumPy array, and a dictionary

A

pd.Series(item)
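
For reference, a short sketch of all three cases. The variable names and sample values are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Illustrative inputs (not from the original card).
my_list = list("abc")
my_arr = np.arange(3)
my_dict = dict(zip(my_list, my_arr))

ser_from_list = pd.Series(my_list)  # default integer index 0, 1, 2
ser_from_arr = pd.Series(my_arr)    # default integer index 0, 1, 2
ser_from_dict = pd.Series(my_dict)  # dictionary keys become the index
```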

4
Q

What does zip(*iterables) do?

A

zip takes any number of iterables as arguments and returns an iterator of tuples, where the i-th tuple contains the i-th element of each iterable. It stops as soon as the shortest iterable is exhausted.
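
A small sketch of zip in action, with illustrative lists. The second half shows the `*` form from the question, which unpacks a sequence of tuples back into separate arguments.

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3, 4]  # one element longer; the extra 4 is dropped

# zip returns a lazy iterator, so wrap it in list() to see the tuples.
pairs = list(zip(letters, numbers))
print(pairs)

# zip(*pairs) passes each tuple as its own argument,
# which regroups the first elements together and the second elements together.
unzipped_letters, unzipped_numbers = zip(*pairs)
```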

5
Q

Convert the series ser into a dataframe with its index as another column on the dataframe

A

df = ser.to_frame().reset_index()
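
A sketch with sample data showing the resulting column names. For an unnamed series with an unnamed index, the new columns are called `index` and `0`. The names `letter` and `value` in the second variant are made up for the example.

```python
import pandas as pd

ser = pd.Series([10, 20, 30], index=["a", "b", "c"])

# The old index becomes an ordinary column named "index";
# the values land in a column named 0 because the series has no name.
df = ser.to_frame().reset_index()
print(df.columns.tolist())

# Optional variant: name the values and the index first to get readable columns.
df_named = ser.rename("value").rename_axis("letter").reset_index()
```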

6
Q

Combine ser1 and ser2 to form a dataframe.

A

# Option 1: concatenate side by side
df = pd.concat([ser1, ser2], axis=1)

# Option 2: build from a dictionary of named columns
df = pd.DataFrame({'col1': ser1, 'col2': ser2})

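
The two approaches differ mainly in how the columns get named. A sketch with sample series (the values are assumptions for the example):

```python
import pandas as pd

ser1 = pd.Series(list("abc"))
ser2 = pd.Series([1, 2, 3])

# concat on axis=1 keeps each series' name as the column label;
# unnamed series end up as columns 0 and 1.
df_concat = pd.concat([ser1, ser2], axis=1)

# The dictionary form lets you choose the column names directly.
df_dict = pd.DataFrame({"col1": ser1, "col2": ser2})
```

Both forms align the series on their index labels, so rows are matched by label rather than by position.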
7
Q

Give a name to the series ser calling it ‘alphabets’.

A

ser.name = 'alphabets'

8
Q

From ser1 remove items present in ser2.

A

ser1[~ser1.isin(ser2)]
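
A quick sketch with sample values: `isin` builds a boolean mask and `~` inverts it, so only the items absent from ser2 are kept.

```python
import pandas as pd

# Illustrative data (not from the original card).
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

mask = ser1.isin(ser2)   # True where the value also appears in ser2
remaining = ser1[~mask]  # keep only the rows where the mask is False
print(remaining.tolist())
```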

9
Q

Get all items of ser1 and ser2 not common to both.

A

ser_u = pd.Series(np.union1d(ser1, ser2))  # union
ser_i = pd.Series(np.intersect1d(ser1, ser2))  # intersection
ser_u[~ser_u.isin(ser_i)]
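
A sketch with sample data. As an alternative to the union/intersection approach on the card, NumPy's `np.setxor1d` computes the same symmetric difference in one call.

```python
import numpy as np
import pandas as pd

# Illustrative data (not from the original card).
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

# Union minus intersection, as on the card.
ser_u = pd.Series(np.union1d(ser1, ser2))
ser_i = pd.Series(np.intersect1d(ser1, ser2))
not_common = ser_u[~ser_u.isin(ser_i)]

# Alternative: symmetric difference directly (sorted, unique values).
not_common_xor = pd.Series(np.setxor1d(ser1, ser2))
```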

10
Q

Compute the minimum, 25th percentile, median, 75th percentile, and maximum of ser.

ser = pd.Series(np.random.normal(10, 5, 25))

A

np.percentile(ser, q=[0, 25, 50, 75, 100])
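
A sketch comparing the NumPy call with the pandas equivalent, `Series.quantile`, which takes fractions between 0 and 1 instead of percentages. The seed is an arbitrary choice to make the example repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
ser = pd.Series(rng.normal(10, 5, 25))

# NumPy: q is given in percent (0 to 100).
five_num_np = np.percentile(ser, q=[0, 25, 50, 75, 100])

# pandas: q is given as a fraction (0 to 1); the result is indexed by q.
five_num_pd = ser.quantile([0, 0.25, 0.5, 0.75, 1.0])
```

`ser.describe()` reports the same five numbers along with the count, mean, and standard deviation.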

11
Q

Calculate the frequency counts of each unique value in ser.

A

ser = pd.Series(np.take(list('abcdefgh'), np.random.randint(8, size=30)))
ser.value_counts()
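
A deterministic sketch (fixed sample data instead of random draws) that makes the output easy to check. `value_counts` returns a series indexed by the unique values, sorted by count in descending order.

```python
import pandas as pd

# Fixed illustrative data: 'a' appears 3 times, 'b' twice, 'c' once.
ser = pd.Series(list("abacab"))

counts = ser.value_counts()
print(counts)

# normalize=True returns relative frequencies instead of raw counts.
shares = ser.value_counts(normalize=True)
```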

12
Q

From ser, keep the top 2 most frequent items as they are and replace everything else with ‘Other’.

A

ser[~ser.isin(ser.value_counts().index[:2])] = 'Other'
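
A sketch with fixed sample data, written with `Series.where` so the original series is left unmodified (the card's version assigns in place). Note that when counts are tied, which items count as the "top 2" depends on the order `value_counts` returns them in.

```python
import pandas as pd

# Illustrative data: 'a' (3 times) and 'b' (2 times) are the two most frequent.
ser = pd.Series(list("aaabbcd"))

# value_counts sorts by frequency, so the first two index labels are the top 2.
top2 = ser.value_counts().index[:2]

# Keep values where the condition holds, replace the rest with 'Other'.
relabelled = ser.where(ser.isin(top2), "Other")
print(relabelled.tolist())
```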

13
Q

Bin the series ser into deciles (10 bins with an equal number of values each) and replace the values with the bin name.

A

pd.qcut(ser, q=[0, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1],
        labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th'])
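
A sketch on simple sample data (the integers 0 to 99, chosen so each decile holds exactly 10 values). Passing `q=10` is shorthand for the explicit list of ten equally spaced quantile edges.

```python
import numpy as np
import pandas as pd

# Illustrative data: 100 distinct values, so every decile gets 10 of them.
ser = pd.Series(np.arange(100))

labels = ["1st", "2nd", "3rd", "4th", "5th", "6th", "7th", "8th", "9th", "10th"]

# q=10 asks for 10 quantile-based bins; each value is replaced by its bin label.
binned = pd.qcut(ser, q=10, labels=labels)
print(binned.head())
```

The result is a categorical series, so the labels keep their order from 1st to 10th.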
