Pandas Flashcards
What is a Series in pandas?
A one-dimensional labeled array: a sequence of values of a single dtype, each paired with an index label.
How would you create a Series from a list or a dictionary? How do you specify the index?
pd.Series(['a', 'b', 'c'], index=[1, 11, 111])
pd.Series({1: 'a', 11: 'b', 111: 'c'})  (the dictionary keys become the index)
How do you get the values, index, shape and data type from a series?
s.values
s.index
s.shape
s.dtype
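A minimal sketch tying the two cards above together. The variable names from_list and from_dict are illustrative, not part of the original cards.

```python
import pandas as pd

# Build the same Series two ways.
from_list = pd.Series(['a', 'b', 'c'], index=[1, 11, 111])
from_dict = pd.Series({1: 'a', 11: 'b', 111: 'c'})  # keys become the index

print(from_list.equals(from_dict))  # True

# Inspect the pieces of a Series.
print(from_list.values)  # the underlying array of values
print(from_list.index)   # the index labels
print(from_list.shape)   # (3,)
print(from_list.dtype)   # the dtype shared by all elements
```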
What is the underlying data structure of a Pandas Series?
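A NumPy array (ndarray) by default; extension dtypes such as category are backed by pandas extension arrays instead.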
What is the difference between a DataFrame and a Series?
A Series is one-dimensional. A DataFrame is a two-dimensional table whose columns are Series that share a common index.
Construct a DataFrame from a list, a Series, a list of lists, a list of dictionaries, a dictionary of lists, and a dictionary of Series.
pd.DataFrame(['a', 'b', 'c'])
pd.DataFrame(s1)
pd.DataFrame([['a', 1, True], ['b', 2, False]])
pd.DataFrame([{'a': 1, 'b': 2}, {'a': 11, 'b': 22}])
pd.DataFrame({'a': [1, 2, 3], 'b': [True, True, False]})
pd.DataFrame({'a': s1, 'b': s2})
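The constructors above can be tried out as follows. This is a sketch: the contents of s1 and s2 and the df_* variable names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical Series standing in for s1 and s2.
s1 = pd.Series([1, 2, 3])
s2 = pd.Series([True, True, False])

df_list = pd.DataFrame(['a', 'b', 'c'])                    # flat list: one column
df_series = pd.DataFrame(s1)                               # Series: one column
df_rows = pd.DataFrame([['a', 1, True], ['b', 2, False]])  # each inner list is a row
df_records = pd.DataFrame([{'a': 1, 'b': 2}, {'a': 11, 'b': 22}])  # keys become columns
df_cols = pd.DataFrame({'a': [1, 2, 3], 'b': [True, True, False]})  # each list is a column
df_from_series = pd.DataFrame({'a': s1, 'b': s2})          # Series are aligned on their index

print(df_rows.shape)             # (2, 3)
print(list(df_records.columns))  # ['a', 'b']
```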
How do you get the index, columns, values, shape from a DataFrame?
df.index
df.columns
df.values
df.shape
How would you create a pandas index object?
pd.Index([1,2,3])
What is the difference between .loc and .iloc?
.loc[ : , : ] selects by index and column labels; slices include both endpoints.
.iloc[ : , : ] selects by integer position in df.index/df.columns; slices exclude the upper bound.
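A small sketch of the slicing difference, using a made-up frame with string labels so that labels and positions cannot be confused.

```python
import pandas as pd

df = pd.DataFrame({'x': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

by_label = df.loc['a':'c', 'x']  # labels 'a', 'b', 'c': the end label is included
by_position = df.iloc[0:2, 0]    # positions 0 and 1: position 2 is excluded

print(by_label.tolist())     # [10, 20, 30]
print(by_position.tolist())  # [10, 20]
```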
How do you do Boolean Indexing?
df[mask]
where mask is a boolean Series with the same index as df, for example df['a'] > 1
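For illustration, a sketch of building and combining masks. The frame and its column names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [True, True, False]})

mask = df['a'] > 1   # boolean Series aligned with df.index
filtered = df[mask]  # keeps only the rows where mask is True

# Combine conditions with & (and) or | (or); parenthesize each comparison.
both = df[(df['a'] > 1) & df['b']]

print(filtered['a'].tolist())  # [2, 3]
print(both['a'].tolist())      # [2]
```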
How do you select just the desired columns in pandas?
df[['col1', 'col2']]  (pass a list of column names)
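A short sketch of the difference between selecting with a list of names and with a single name. The column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

subset = df[['a', 'c']]  # list of names: returns a DataFrame
single = df['a']         # single name: returns a Series

print(list(subset.columns))  # ['a', 'c']
print(type(single).__name__)  # Series
```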
What is a ufunc? Why should you use them?
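A universal function (ufunc) is a vectorized NumPy function, such as np.sqrt or np.add, that operates element-wise on whole arrays in compiled code. Applied to pandas objects, ufuncs preserve the index, and arithmetic between two objects aligns them on their index labels. Prefer them to Python loops, which are much slower.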
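A minimal sketch of a ufunc next to the equivalent Python loop, plus index alignment in arithmetic. The data and variable names are made up for the example.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0], index=['x', 'y', 'z'])

roots = np.sqrt(s)  # vectorized; the result keeps the index of s
looped = pd.Series([v ** 0.5 for v in s], index=s.index)  # same result, Python loop

# Arithmetic matches values by label, not by position.
other = pd.Series([10.0, 20.0, 30.0], index=['z', 'y', 'x'])
total = s + other

print(roots.tolist())  # [1.0, 2.0, 3.0]
print(total['x'])      # 31.0, i.e. 1.0 + 30.0
```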
What is ‘broadcasting’ in pandas?
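An operation between operands of different shapes stretches the smaller operand to match the larger one. For example, series + 3 behaves like series + pd.Series([3,3,…], index=series.index).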
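A sketch of broadcasting a scalar over a Series and a Series over the rows of a DataFrame. The values and names are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
df = pd.DataFrame({'a': [1, 2], 'b': [10, 20]})

shifted = s + 3                      # the scalar is applied to every element
explicit = s + pd.Series([3, 3, 3])  # the equivalent explicit form

# A Series added to a DataFrame is matched to the column labels
# and applied to every row.
row = pd.Series({'a': 100, 'b': 200})
df_plus = df + row

print(shifted.equals(explicit))  # True
print(df_plus.values.tolist())   # [[101, 210], [102, 220]]
```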
How do you apply a function element-wise to a series or DataFrame?
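s.map(func) or s.apply(func) for a Series; df.applymap(func) for a DataFrame (renamed df.map in pandas 2.1). Note that df.apply(func) is not element-wise: it passes each whole column (axis=0) or row (axis=1) to func.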
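A sketch contrasting element-wise application with column-wise apply. The hasattr check is one way to cope with the DataFrame.applymap to DataFrame.map rename in pandas 2.1; the data and names are made up.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

squared = s.map(lambda v: v ** 2)           # element-wise on a Series
col_sums = df.apply(lambda col: col.sum())  # column-wise: func receives a whole column

# Element-wise on a DataFrame: DataFrame.map in pandas >= 2.1, applymap before that.
elementwise = df.map if hasattr(df, 'map') else df.applymap
doubled = elementwise(lambda v: v * 2)

print(squared.tolist())         # [1, 4, 9]
print(col_sums.tolist())        # [3, 7]
print(doubled.values.tolist())  # [[2, 6], [4, 8]]
```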
Give two ways to handle missing data, explain when you might use either
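df.fillna(value), df.ffill(), df.bfill(): fill the gaps when the rows must be kept (fillna(method='ffill'/'bfill') is the older spelling, deprecated in pandas 2.1)
df.fillna(df.mean()) or df.interpolate(): fill with a column statistic or an interpolated value
df.dropna(axis=0 or 1): drop the rows (axis=0) or columns (axis=1) that contain missing values, when few are affected and losing them is acceptable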
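A small sketch of the options above on a made-up frame with one gap per column. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [4.0, 5.0, np.nan]})

filled_const = df.fillna(0)         # replace gaps with a constant
filled_fwd = df.ffill()             # carry the last valid value forward
filled_mean = df.fillna(df.mean())  # replace gaps with each column's mean
interpolated = df.interpolate()     # linear interpolation between neighbours
dropped_rows = df.dropna(axis=0)    # drop rows containing any NaN
dropped_cols = df.dropna(axis=1)    # drop columns containing any NaN

print(filled_mean['b'].tolist())   # [4.0, 5.0, 4.5]
print(interpolated['a'].tolist())  # [1.0, 2.0, 3.0]
print(dropped_rows.shape)          # (1, 2)
```

Here dropping columns removes everything, since both columns contain a gap, which is why filling is often the better choice on small or sparse data.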