Pandas Flashcards

Question 1

Q

If you don’t pass anything to the index parameter of pd.Series and a dictionary is given to the data parameter,

Answer

A

then pandas will use the dictionary keys as the index labels.
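A minimal sketch of this behaviour, assuming a made-up dictionary named prices (the name and values are illustrative only):

```python
import pandas as pd

# Hypothetical data: the keys become index labels, the values become the data.
prices = {"apple": 1.2, "banana": 0.5, "cherry": 3.0}
ser = pd.Series(prices)

print(ser.index.tolist())  # ['apple', 'banana', 'cherry']
print(ser["banana"])       # 0.5
```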

Question 2

Q

How to import pandas and check the version?

Answer

A

import pandas as pd
print(pd.__version__)

# show_versions prints its report itself and returns None, so no print() is needed
pd.show_versions(as_json=True)

Question 3

Q

Create a pandas Series from each of the items below: a list, a NumPy array, and a dictionary.

Answer

A

pd.Series(item)
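One way this could look in practice, assuming three small sample inputs (my_list, my_arr, and my_dict are illustrative names, not part of the card):

```python
import numpy as np
import pandas as pd

# Hypothetical inputs for the three cases.
my_list = list("abcde")
my_arr = np.arange(5)
my_dict = dict(zip(my_list, my_arr))

ser_from_list = pd.Series(my_list)  # default integer index 0..4
ser_from_arr = pd.Series(my_arr)    # default integer index 0..4
ser_from_dict = pd.Series(my_dict)  # dictionary keys become the index
```

The list and the array get a default integer index, while the dictionary supplies its own labels.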

Question 4

Q

What does zip(*iterables) do?

Answer

A

The function takes any number of iterables as arguments and returns an iterator. This iterator yields tuples in which the i-th tuple contains the i-th element of each iterable, and it stops as soon as the shortest iterable is exhausted.
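A short illustration, using two made-up lists of unequal length to show where iteration stops and how the * unpacking reverses the pairing:

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3, 4]  # one element longer on purpose

# zip stops at the shortest input, so the trailing 4 is dropped.
pairs = list(zip(letters, numbers))
print(pairs)  # [('a', 1), ('b', 2), ('c', 3)]

# The * passes each tuple as a separate argument, which "unzips" the pairs.
unzipped = list(zip(*pairs))
print(unzipped)  # [('a', 'b', 'c'), (1, 2, 3)]
```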

Question 5

Q

Convert the series ser into a dataframe with its index as another column on the dataframe

Answer

A

df = ser.to_frame().reset_index()
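A small sketch with a made-up series. For an unnamed series the resulting columns are called index and 0; the second variant, with the hypothetical names letter and value, is one possible way to get readable column names:

```python
import pandas as pd

# Hypothetical series with string labels as its index.
ser = pd.Series([10, 20, 30], index=["a", "b", "c"])

df = ser.to_frame().reset_index()
print(df.columns.tolist())  # ['index', 0]

# Optional: name the index and the value column in one step.
df_named = ser.rename_axis("letter").reset_index(name="value")
print(df_named.columns.tolist())  # ['letter', 'value']
```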

Question 6

Q

Combine ser1 and ser2 to form a dataframe.

Answer

A

df = pd.concat([ser1, ser2], axis=1)
df = pd.DataFrame({'col1': ser1, 'col2': ser2})
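The two answers differ mainly in how the columns are named. A sketch with two made-up series:

```python
import pandas as pd

# Hypothetical inputs sharing the same default index.
ser1 = pd.Series(list("abc"))
ser2 = pd.Series([0, 1, 2])

# concat keeps each series' name; unnamed series get the column labels 0 and 1.
df_concat = pd.concat([ser1, ser2], axis=1)

# The dictionary form sets the column names explicitly.
df_dict = pd.DataFrame({"col1": ser1, "col2": ser2})
```

Both approaches align the series on their index, so rows are matched by label rather than by position.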

Question 7

Q

Give a name to the series ser calling it ‘alphabets’.

Answer

A

ser.name = 'alphabets'

Question 8

Q

From ser1 remove items present in ser2.

Answer

A

ser1[~ser1.isin(ser2)]
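A worked example, assuming two small integer series chosen for illustration:

```python
import pandas as pd

# Hypothetical inputs with the values 4 and 5 in common.
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

# isin builds a boolean mask; ~ inverts it so only items absent from ser2 remain.
result = ser1[~ser1.isin(ser2)]
print(result.tolist())  # [1, 2, 3]
```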

Question 9

Q

Get all items of ser1 and ser2 not common to both.

Answer

A

ser_u = pd.Series(np.union1d(ser1, ser2)) # union
ser_i = pd.Series(np.intersect1d(ser1, ser2)) # intersect
ser_u[~ser_u.isin(ser_i)]
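The same idea as a runnable sketch, using made-up inputs. NumPy's setxor1d is shown as an alternative that should give the same values in one call:

```python
import numpy as np
import pandas as pd

# Hypothetical inputs with the values 4 and 5 in common.
ser1 = pd.Series([1, 2, 3, 4, 5])
ser2 = pd.Series([4, 5, 6, 7, 8])

ser_u = pd.Series(np.union1d(ser1, ser2))      # all distinct values
ser_i = pd.Series(np.intersect1d(ser1, ser2))  # values present in both
result = ser_u[~ser_u.isin(ser_i)]
print(result.tolist())  # [1, 2, 3, 6, 7, 8]

# Alternative: the symmetric difference directly.
alt = np.setxor1d(ser1, ser2)
```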

Question 10

Q

Compute the minimum, 25th percentile, median, 75th percentile, and maximum of ser.

ser = pd.Series(np.random.normal(10, 5, 25))

Answer

A

np.percentile(ser, q=[0, 25, 50, 75, 100])
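A sketch comparing the answer with the equivalent pandas call. The seeded generator is an assumption added here only to make the run reproducible:

```python
import numpy as np
import pandas as pd

# Seeded generator (an assumption for reproducibility, not part of the card).
rng = np.random.default_rng(42)
ser = pd.Series(rng.normal(10, 5, 25))

# NumPy takes percentiles on a 0-100 scale.
five_num = np.percentile(ser, q=[0, 25, 50, 75, 100])

# pandas takes quantiles on a 0-1 scale and returns a labelled Series.
five_num_pd = ser.quantile([0, 0.25, 0.5, 0.75, 1])
```

Both use linear interpolation by default, so the two results should agree.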

Question 11

Q

Calculate the frequency counts of each unique value in ser.

Answer

A

ser = pd.Series(np.take(list('abcdefgh'), np.random.randint(8, size=30)))
ser.value_counts()

Question 12

Q

From ser, keep the two most frequent items as they are and replace everything else with ‘Other’.

Answer

A

ser[~ser.isin(ser.value_counts().index[:2])] = 'Other'
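A worked example with a small made-up series whose counts are unambiguous (a appears four times, b three times):

```python
import pandas as pd

# Hypothetical data: counts are a=4, b=3, c=2, d=1.
ser = pd.Series(list("aaaabbbccd"))

# value_counts sorts by frequency, so the first two index labels are the top 2.
top2 = ser.value_counts().index[:2]

# Overwrite every item that is not one of the two most frequent values.
ser[~ser.isin(top2)] = "Other"
print(ser.tolist())
```

Note that the assignment modifies ser in place; work on ser.copy() if the original values are still needed.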

Question 13

Q

Bin the series ser into 10 equal-sized groups (deciles) and replace the values with the bin name.

Answer

A

pd.qcut(ser, q=[0, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1],
        labels=['1st', '2nd', '3rd', '4th', '5th', '6th', '7th', '8th', '9th', '10th'])
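A runnable sketch on made-up data. Passing q=10 is shorthand for the ten equally spaced quantile edges listed in the answer; the seeded generator is an assumption added for reproducibility:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 20 random values in [0, 1).
rng = np.random.default_rng(0)
ser = pd.Series(rng.random(20))

labels = ["1st", "2nd", "3rd", "4th", "5th",
          "6th", "7th", "8th", "9th", "10th"]

# q=10 asks for deciles; each value is replaced by the label of its bin.
deciles = pd.qcut(ser, q=10, labels=labels)
print(deciles.head())
```

The result is a categorical series, so the labels keep their order from 1st to 10th.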

(13 cards)