Pandas Flashcards
Describe the pandas Series class.
A one-dimensional array-like object that stores a sequence of values (i.e. data) together with an associated array of data labels, called its index.
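A minimal sketch of the idea, assuming pandas is imported as pd. The values and labels are made up for illustration:

```python
import pandas as pd

# Values plus an explicit index of labels (both illustrative).
obj = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

values = obj.to_numpy()   # the underlying data as a NumPy array
labels = list(obj.index)  # the associated index labels
by_label = obj["a"]       # look up a value by its label
```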
Does a pandas Series have both row and column labels?
No, a Series has only row labels (a.k.a. the index). Since a pandas Series is a one-dimensional array-like object, it has a single axis, so there is no need for column labels.
Does a pandas DataFrame have both row and column labels?
Yes, a DataFrame has both, since it represents tabular data, i.e. data organized into both rows and columns.
How do you create a DataFrame from a dict of equal-length lists? What happens to the dict's keys?
Pass the dict to the DataFrame constructor, i.e. pd.DataFrame(data).
data = {
    'state': ['Ohio', 'Ohio', 'Ohio', 'Nevada', 'Nevada', 'Nevada'],
    'year': [2000, 2001, 2002, 2001, 2002, 2003],
    'pop': [1.5, 1.7, 3.6, 2.4, 2.9, 3.2]
}
The keys become the column labels.
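A short sketch of this card, assuming pandas is imported as pd. The names frame, column_labels and row_labels are illustrative:

```python
import pandas as pd

data = {
    "state": ["Ohio", "Ohio", "Ohio", "Nevada", "Nevada", "Nevada"],
    "year": [2000, 2001, 2002, 2001, 2002, 2003],
    "pop": [1.5, 1.7, 3.6, 2.4, 2.9, 3.2],
}

frame = pd.DataFrame(data)

column_labels = list(frame.columns)  # taken from the dict keys
row_labels = list(frame.index)       # default integer labels 0..5
```

Since no index was passed, the rows get default integer labels.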
How do you create a DataFrame from a nested dict? What happens to the dict's keys?
Pass the dict to the DataFrame constructor, i.e. pd.DataFrame(pop).
pop = {
    'Nevada': {2001: 2.4, 2002: 2.9},
    'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}
}
The outer keys become the column labels, while the inner keys become the row labels (a.k.a. the index).
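A sketch of the nested-dict case, assuming pandas is imported as pd. The row order of the result can differ between pandas versions, so the example does not rely on it:

```python
import pandas as pd

pop = {
    "Nevada": {2001: 2.4, 2002: 2.9},
    "Ohio": {2000: 1.5, 2001: 1.7, 2002: 3.6},
}

frame = pd.DataFrame(pop)

column_labels = list(frame.columns)          # outer keys
row_labels = sorted(frame.index)             # union of the inner keys
missing = pd.isna(frame.loc[2000, "Nevada"])  # no Nevada entry for 2000
```

Inner keys that are absent from one of the inner dicts show up as missing values (NaN) in that column.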
How do you create a Series from a dict? What happens to the dict's keys?
Pass the dict to the Series constructor, i.e. pd.Series(sdata). The name sdata is used here instead of dict to avoid shadowing Python's built-in dict type.
sdata = {'Ohio': 35000, 'Texas': 71000, 'Oregon': 16000, 'Utah': 5000}
The keys become the row labels (a.k.a. the index).
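A brief sketch, assuming pandas is imported as pd. The variable name sdata is an illustrative choice that avoids shadowing the built-in dict:

```python
import pandas as pd

sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}

obj = pd.Series(sdata)

labels = set(obj.index)  # the dict keys, now index labels
texas = obj["Texas"]     # values are looked up by those labels
```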
How do you check whether a particular row label is included in the index of a Series?
A Series behaves like a dict in some ways; one of them is that you can test whether a label is in its index with the in operator.
For example:
IN: 'b' in object1 # object1 being a Series
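A small sketch of the membership test, assuming pandas is imported as pd. The Series contents are made up:

```python
import pandas as pd

obj = pd.Series([1, 2, 3], index=["a", "b", "c"])

has_b = "b" in obj      # "b" is one of the index labels
has_z = "z" in obj      # "z" is not
has_value_2 = 2 in obj  # membership checks labels, not values
```

To test against the values instead, one option is 2 in obj.to_numpy().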
How do you check whether a particular column label is included among the existing column labels of a DataFrame?
A DataFrame also behaves like a dict in some ways: the in operator tests whether a label is among its column labels (not its row index).
For example:
IN: 'b' in object1 # object1 being a DataFrame
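A sketch contrasting column and row membership, assuming pandas is imported as pd. The frame is illustrative:

```python
import pandas as pd

frame = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

col_present = "b" in frame        # checks the column labels
row_as_col = "x" in frame         # "x" is a row label, not a column
row_present = "x" in frame.index  # check row labels via .index
```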
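Whenever I index or slice a Series or DataFrame, what is the output and how does it relate to the original Series/DataFrame?
Indexing or slicing returns a new Series/DataFrame that may be a view on the original data or a copy, depending on the operation and the pandas version. With a view, changes can propagate back to the original; with Copy-on-Write enabled (the default from pandas 3.0), they never do. Call .copy() to get an independent object explicitly.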
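Whether a change to a slice reaches the original depends on the pandas version and its Copy-on-Write setting, so a sketch that does not rely on either is to take an explicit copy. This assumes pandas is imported as pd; the data is made up:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

# .copy() guarantees an independent object, whatever the pandas version.
subset = s.loc["b":"c"].copy()
subset.loc["b"] = 99

original_b = s["b"]     # untouched by the edit to the copy
copied_b = subset["b"]  # holds the new value
```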
One useful Series feature is that it automatically aligns by index label in arithmetic operations. Provide an example of how this works in practice.
Take the following dict, called pop:
pop = {'Ohio': 500, 'Texas': 1000, 'Oregon': 1500, 'Utah': 2000}
Build a Series from it with s = pd.Series(pop) (a plain dict does not support +). Evaluating s + s gives a Series with these values:
Ohio 1000
Texas 2000
Oregon 3000
Utah 4000
Both operands have the label 'Ohio', so the values under that label are aligned and added; the same happens for every other shared label. A label present in only one operand would produce NaN in the result.
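Alignment is easier to see when the two operands do not share all labels. A sketch, assuming pandas is imported as pd; s2 and its values are made up for illustration:

```python
import pandas as pd

s1 = pd.Series({"Ohio": 500, "Texas": 1000, "Oregon": 1500, "Utah": 2000})
s2 = pd.Series({"Utah": 1, "Ohio": 2, "Nevada": 3})  # different order and labels

# Values are matched by label, not by position.
total = s1 + s2

ohio = total["Ohio"]                      # 500 + 2
utah = total["Utah"]                      # 2000 + 1
nevada_missing = pd.isna(total["Nevada"])  # only in s2, so no partner value
```

The result's index is the union of both indexes, and labels without a partner hold NaN.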
What is the index attribute of a Series/DataFrame? How do you view it?
It holds the row labels.
IN: obj.index # obj being a Series
IN: frame.index # frame being a DataFrame
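A sketch of reading the index on both kinds of object, assuming pandas is imported as pd. The names obj and frame are illustrative:

```python
import pandas as pd

obj = pd.Series([1.5, 2.5], index=["a", "b"])
frame = pd.DataFrame({"x": [1, 2, 3]})  # no index given

series_index = obj.index   # the explicit labels "a" and "b"
frame_index = frame.index  # a default RangeIndex 0, 1, 2
```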
What is the columns attribute (a.k.a. the column labels) of a DataFrame? How do you view it?
It holds the column headers.
IN: frame.columns # frame being a DataFrame
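A sketch of reading the column labels, assuming pandas is imported as pd. The frame is made up:

```python
import pandas as pd

frame = pd.DataFrame({"state": ["Ohio", "Nevada"], "year": [2000, 2001]})

cols = frame.columns     # an Index object holding the column labels
col_list = list(cols)    # plain Python list, if that is more convenient
```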
How do you add a new column to a DataFrame?
Assign to a column label that does not yet exist in the DataFrame, e.g. frame['new_col'] = values; pandas then creates the column.
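A sketch of column creation by assignment, assuming pandas is imported as pd. The column names debt and big are hypothetical:

```python
import pandas as pd

frame = pd.DataFrame({"state": ["Ohio", "Nevada"], "pop": [1.5, 2.4]})

frame["debt"] = 0.0                # a scalar is broadcast to every row
frame["big"] = frame["pop"] > 2.0  # a column derived from an existing one

column_labels = list(frame.columns)
```

New columns are appended after the existing ones.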
How do you delete a column from a DataFrame using the del keyword?
IN: del frame['column_name'] # frame being a DataFrame
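A sketch of deleting a column with del, assuming pandas is imported as pd. The frame and the debt column are made up:

```python
import pandas as pd

frame = pd.DataFrame(
    {"state": ["Ohio", "Nevada"], "pop": [1.5, 2.4], "debt": [0.0, 0.0]}
)

# Removes the column in place, like deleting a key from a dict.
del frame["debt"]

remaining = list(frame.columns)
```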
Describe the pandas DataFrame class.
A DataFrame is a 2-dimensional labeled data structure whose columns can hold different types. Along with the data, you can optionally pass the index (row labels) and columns (column labels) arguments.
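A sketch of passing both optional arguments, assuming pandas is imported as pd. The data and labels are illustrative:

```python
import pandas as pd

frame = pd.DataFrame(
    [[1, "a"], [2, "b"], [3, "c"]],  # one inner list per row
    index=["r1", "r2", "r3"],        # row labels
    columns=["num", "letter"],       # column labels
)

cell = frame.loc["r2", "letter"]     # look up by row and column label
num_is_int = pd.api.types.is_integer_dtype(frame["num"])
```

Each column keeps its own type here: num holds integers while letter holds strings.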