Python Pandas Flashcards
Create a dictionary called my_dict with the following three key-value pairs:
key 'country' and value names.
key 'drives_right' and value dr.
key 'cars_per_cap' and value cpc.
my_dict = {'country': names, 'drives_right': dr, 'cars_per_cap': cpc}
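A dictionary shaped like this is typically passed to pd.DataFrame(). The sketch below assumes names, dr and cpc are equal-length lists; the values are hypothetical sample data, since the flashcards do not show the originals.

```python
import pandas as pd

# Hypothetical sample data; the original lists are not shown in the flashcards.
names = ['United States', 'Australia', 'Japan']
dr = [True, False, False]
cpc = [809, 731, 588]

my_dict = {'country': names, 'drives_right': dr, 'cars_per_cap': cpc}

# Each key becomes a column label; each list becomes that column's values.
cars = pd.DataFrame(my_dict)
print(cars)
```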
Use pd.read_csv() to import cars.csv data as a DataFrame. Store this DataFrame as cars.
cars = pd.read_csv('cars.csv')
Specify the index_col argument inside pd.read_csv(): set it to 0, so that the first column is used as row labels.
cars = pd.read_csv('cars.csv', index_col=0)
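To see what index_col=0 changes, the sketch below reads the same CSV text twice. The CSV content is a hypothetical stand-in for cars.csv, loaded from a string so the example runs without the file.

```python
import io
import pandas as pd

# Hypothetical stand-in for cars.csv; the real file's contents are not shown here.
csv_text = """,cars_per_cap,country,drives_right
US,809,United States,True
AUS,731,Australia,False
JPN,588,Japan,False
"""

# Without index_col, pandas adds a default 0, 1, 2, ... index.
default_index = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column supplies the row labels.
labelled = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(default_index.index.tolist())
print(labelled.index.tolist())
```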
Use single square brackets to print out the 'country' column of cars as a Pandas Series.
print(cars['country'])
Use double square brackets to print out the 'country' column of cars as a Pandas DataFrame.
print(cars[['country']])
Use double square brackets to print out a DataFrame with both the 'country' and 'drives_right' columns of cars, in this order.
print(cars[['country', 'drives_right']])
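The difference between single and double brackets shows up in the type of the result. A minimal sketch, using a small hypothetical cars DataFrame in place of the real file:

```python
import pandas as pd

# Hypothetical two-row version of cars.
cars = pd.DataFrame(
    {'country': ['United States', 'Japan'], 'drives_right': [True, False]},
    index=['US', 'JPN'],
)

as_series = cars['country']    # a string key returns a Series
as_frame = cars[['country']]   # a list of keys returns a DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.shape)
```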
Select the first 3 observations from cars and print them out.
print(cars[0:3])
Select the fourth, fifth and sixth observations, corresponding to row positions 3, 4 and 5, and print them out.
print(cars[3:6])
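Slicing with integers inside square brackets selects rows by position, and the end of the slice is excluded. The sketch below assumes a seven-row cars DataFrame; the row labels and values are hypothetical.

```python
import pandas as pd

# Hypothetical seven-row version of cars with string row labels.
cars = pd.DataFrame(
    {'cars_per_cap': [809, 731, 588, 18, 200, 70, 45]},
    index=['US', 'AUS', 'JPN', 'IN', 'RU', 'MOR', 'EG'],
)

first_three = cars[0:3]  # positions 0, 1, 2
next_three = cars[3:6]   # positions 3, 4, 5 (6 is excluded)

print(first_three.index.tolist())
print(next_three.index.tolist())
```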
Three ways to inspect a DataFrame
Inspect the first few rows (including index labels)
print(df.head())
Inspect the last few rows
print(df.tail())
Inspect random sample rows
print(df.sample(5))
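All three methods accept a row count; head() and tail() default to 5 rows. A small sketch on a hypothetical 20-row DataFrame, with random_state added as an assumption so the sample is repeatable:

```python
import pandas as pd

# Hypothetical DataFrame with 20 rows.
df = pd.DataFrame({'value': range(20)})

top = df.head()                       # first 5 rows by default
bottom = df.tail(3)                   # last 3 rows
picked = df.sample(5, random_state=0) # 5 random rows; random_state fixes the draw

print(top)
print(bottom)
print(picked)
```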
Use loc or iloc to select the observation corresponding to Japan as a Series. The label of this row is JPN, and its integer position is 2. Make sure to print the resulting Series.
print(cars.loc['JPN'])
print(cars.iloc[2])
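loc looks rows up by label, while iloc looks them up by 0-based integer position. A minimal sketch on hypothetical data where JPN sits at position 2, so both calls return the same row:

```python
import pandas as pd

# Hypothetical data; JPN is deliberately placed at position 2.
cars = pd.DataFrame(
    {'country': ['United States', 'Australia', 'Japan'],
     'drives_right': [True, False, False]},
    index=['US', 'AUS', 'JPN'],
)

by_label = cars.loc['JPN']   # look up by row label
by_position = cars.iloc[2]   # look up by integer position

print(by_label.equals(by_position))
print(by_label.name)
```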
Print out the 'drives_right' value of the row corresponding to Morocco (its row label is MOR).
print(cars.loc['MOR', 'drives_right'])
Print out a sub-DataFrame, containing the observations for Russia and Morocco and the columns 'country' and 'drives_right'.
print(cars.loc[['RU', 'MOR'], ['country', 'drives_right']])
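With loc, a row label plus a column label returns a single value, while lists of row and column labels return a sub-DataFrame. A sketch on hypothetical data:

```python
import pandas as pd

# Hypothetical three-row version of cars.
cars = pd.DataFrame(
    {'cars_per_cap': [809, 200, 70],
     'country': ['United States', 'Russia', 'Morocco'],
     'drives_right': [True, True, True]},
    index=['US', 'RU', 'MOR'],
)

value = cars.loc['MOR', 'drives_right']                        # single value
subset = cars.loc[['RU', 'MOR'], ['country', 'drives_right']]  # 2x2 DataFrame

print(value)
print(subset)
```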
Using loc, print out the drives_right column of cars as a Series.
print(cars.loc[:, 'drives_right'])
Print out the drives_right column as a DataFrame using loc.
print(cars.loc[:, ['drives_right']])
Print out both the cars_per_cap and drives_right columns as a DataFrame using loc.
print(cars.loc[:, ['cars_per_cap', 'drives_right']])
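In these loc calls the colon means "all rows", and the second argument picks the columns. The sketch below, on hypothetical data, checks that the loc forms match plain square-bracket selection:

```python
import pandas as pd

# Hypothetical two-row version of cars.
cars = pd.DataFrame(
    {'cars_per_cap': [809, 588],
     'country': ['United States', 'Japan'],
     'drives_right': [True, False]},
    index=['US', 'JPN'],
)

col_series = cars.loc[:, 'drives_right']                  # Series
col_frame = cars.loc[:, ['drives_right']]                 # one-column DataFrame
two_cols = cars.loc[:, ['cars_per_cap', 'drives_right']]  # two-column DataFrame

print(col_series.equals(cars['drives_right']))
print(col_frame.equals(cars[['drives_right']]))
print(two_cols.columns.tolist())
```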