PANDAS Flashcards
SERIES FROM A DICTIONARY
sports = {'Archery': 'Bhutan',
          'Golf': 'Scotland',
          'Sumo': 'Japan',
          'Taekwondo': 'South Korea'}
Can you create a Series from a dictionary? How do you do it?
# Import pandas and pass the dictionary to pd.Series(); the keys become the index labels
import pandas as pd
s = pd.Series(sports)
SERIES QUERYING
import pandas as pd
sports = {'Archery': 'Bhutan',
          'Golf': 'Scotland',
          'Sumo': 'Japan',
          'Taekwondo': 'South Korea'}
s = pd.Series(sports)
a) Retrieve the value associated with the index label "Sumo"
b) Retrieve the value of the item at index position 2
# Use s.loc[ ] for labels and s.iloc[ ] for integer positions
print(s.loc["Sumo"])   # a) label-based lookup
print(s.iloc[2])       # b) position-based lookup
SHOWING A SAMPLE FROM SERIES
Create a series with a thousand random integers, then print the first and last elements of the series.
# Use numpy.random.randint(); head() and tail() return five elements by default
import pandas as pd
import numpy as np
s = pd.Series(np.random.randint(0, 1000, 1000))
print("First elements")
print(s.head())
print("Last elements")
print(s.tail())
REPEATED INDEXES
import pandas as pd
original_sports = pd.Series({'Archery': 'Bhutan',
                             'Golf': 'Scotland',
                             'Sumo': 'Japan',
                             'Taekwondo': 'South Korea'})
cricket_loving_countries = pd.Series(['Australia',
                                      'Barbados',
                                      'Pakistan',
                                      'England'],
                                     index=['Cricket',
                                            'Cricket',
                                            'Cricket',
                                            'Cricket'])
# Series.append() was removed in pandas 2.0; pd.concat() does the same job
full_series = pd.concat([original_sports, cricket_loving_countries])
How can you return all the cricket-loving countries?
# Use the .loc[ ] property; a repeated label returns every matching item
cricket_countries = full_series.loc["Cricket"]
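A minimal, self-contained sketch of the same idea, using a shortened version of the data above (the variable golf_country is added here for illustration only). It contrasts a unique label, which returns a single value, with a repeated label, which returns a Series.

```python
import pandas as pd

# Shortened, illustrative version of the data on this card
original_sports = pd.Series({"Archery": "Bhutan", "Golf": "Scotland"})
cricket_loving_countries = pd.Series(
    ["Australia", "Barbados"],
    index=["Cricket", "Cricket"],
)

# pd.concat stacks the two Series and keeps the duplicate labels
full_series = pd.concat([original_sports, cricket_loving_countries])

# A unique label returns a single value
golf_country = full_series.loc["Golf"]

# A repeated label returns a Series holding every matching item
cricket_countries = full_series.loc["Cricket"]

print(golf_country)
print(cricket_countries)
```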
INDEX AND COLUMN LABELS
How can you retrieve the labels of the index and columns of a data frame?
# Use the .index and .columns properties
idx = df.index
col = df.columns
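The card assumes a data frame named df already exists. As a sketch, the hypothetical data frame below (its contents are invented for illustration) shows what the two properties return.

```python
import pandas as pd

# Hypothetical data frame, used only for illustration
df = pd.DataFrame(
    {
        "Country": ["Bhutan", "Scotland", "Japan"],
        "Continent": ["Asia", "Europe", "Asia"],
    },
    index=["Archery", "Golf", "Sumo"],
)

idx = df.index    # row labels, returned as a pandas Index object
col = df.columns  # column labels, also returned as an Index object

print(list(idx))
print(list(col))
```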
DATA QUERYING
Explain the workings of .loc[ ] and .iloc[ ] properties
df.loc[ ] -> label-based querying (uses the index and column labels)
df.iloc[ ] -> position-based querying (uses integer positions, starting at 0)
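A short sketch of the difference, assuming a hypothetical data frame with string labels (all names and values below are illustrative). It also shows one detail worth remembering: a .loc[ ] slice includes its end label, while an .iloc[ ] slice excludes its end position.

```python
import pandas as pd

# Hypothetical data frame, used only for illustration
df = pd.DataFrame(
    {
        "Country": ["Bhutan", "Scotland", "Japan"],
        "Continent": ["Asia", "Europe", "Asia"],
    },
    index=["Archery", "Golf", "Sumo"],
)

# Same cell, addressed two ways
by_label = df.loc["Sumo", "Country"]  # row label, column label
by_position = df.iloc[2, 0]           # row position, column position

# Slicing: .loc includes the end label, .iloc excludes the end position
label_slice = df.loc["Archery":"Golf"]  # rows Archery and Golf
position_slice = df.iloc[0:1]           # row Archery only

print(by_label, by_position)
print(len(label_slice), len(position_slice))
```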