pandas Flashcards
import pandas library
import pandas as pd
create a series in pandas
s = pd.Series([2, 3, 4, 3], index=['a', 'b', 'c', 'd'])
create a DataFrame in pandas
a = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]}, columns=['a', 'b'])
read a csv file into a dataframe in pandas
df = pd.read_csv('file.csv')
write dataframe to csv file
df.to_csv('/filepath/file.csv')
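A minimal sketch of a csv round trip, assuming an in-memory buffer in place of a real file path (the buffer and the sample frame are illustrative, not part of the cards above). It shows that to_csv writes the index as an extra column unless index=False is passed.

```python
import io

import pandas as pd

# Illustrative data; any small frame would do.
df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]})

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False skips the row-label column

# Rewind and read the same text back into a new frame.
buffer.seek(0)
restored = pd.read_csv(buffer)
```

Without index=False, the file gains a leading unnamed column holding the row labels, which read_csv then loads as ordinary data.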
get the value at a given integer row and column position
a.iloc[row, col]
get the value at a given index label and column name
a.loc[row_name, col_name]
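A short sketch of how the indexer arguments change the result, assuming a small frame with hypothetical row labels 'x', 'y', 'z'. Scalar arguments return a single value; list arguments return a DataFrame.

```python
import pandas as pd

# Row labels 'x', 'y', 'z' are made up for this example.
a = pd.DataFrame({"a": [1, 2, 3], "b": [3, 4, 5]}, index=["x", "y", "z"])

by_position = a.iloc[1, 0]    # second row, first column, as a scalar
by_label = a.loc["y", "a"]    # same cell, addressed by labels
as_frame = a.iloc[[1], [0]]   # lists keep the result as a 1x1 DataFrame
```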
get values from series that are greater than 2
s[s > 2]
get values from series equal to 2 or 3
s[(s == 2) | (s == 3)]
get rows with population greater than 1 million
df[df['population'] > 1000000]
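The filtering cards can be sketched together as boolean masks. The city names and population figures below are invented for illustration only, and isin is shown as one possible alternative to chaining == comparisons with |.

```python
import pandas as pd

s = pd.Series([2, 3, 4, 3], index=["a", "b", "c", "d"])

greater_than_two = s[s > 2]               # keeps labels b, c, d
two_or_three = s[(s == 2) | (s == 3)]     # each comparison needs parentheses
same_with_isin = s[s.isin([2, 3])]        # equivalent membership test

# Hypothetical data: names and numbers are made up.
df = pd.DataFrame(
    {"city": ["A", "B", "C"], "population": [500_000, 2_000_000, 12_000_000]}
)
large = df[df["population"] > 1_000_000]  # keeps rows B and C
```

The parentheses around each comparison matter because | binds more tightly than == in Python.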
set the value at series index 'a' to 5
s['a'] = 5
drop rows 'a' and 'c' from the series
s.drop(['a', 'c'])
drop column 'population' from the dataframe
df.drop('population', axis=1)
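A brief sketch of drop on a series and on a dataframe, using invented sample data. It illustrates that drop returns a new object and leaves the original unchanged unless the result is assigned back.

```python
import pandas as pd

s = pd.Series([2, 3, 4, 3], index=["a", "b", "c", "d"])
trimmed = s.drop(["a", "c"])  # new series; s still has four entries

# Hypothetical frame: names and numbers are made up.
df = pd.DataFrame({"city": ["A", "B"], "population": [500_000, 2_000_000]})
without_pop = df.drop("population", axis=1)  # axis=1 targets columns
same_result = df.drop(columns="population")  # keyword form of the same call
```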
sort the dataframe by population size
df.sort_values(by='population')
rank the values in each column of the dataframe
df.rank()
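A small sketch contrasting sort_values and rank, with made-up numbers and hypothetical row labels. Sorting reorders the rows; ranking keeps the row order and replaces each value with its rank within its column (1 for the smallest by default).

```python
import pandas as pd

# Hypothetical data: labels and values are invented.
df = pd.DataFrame({"population": [300, 100, 200]}, index=["A", "B", "C"])

ascending = df.sort_values(by="population")                    # smallest first
descending = df.sort_values(by="population", ascending=False)  # largest first
ranks = df.rank()  # same row order, values replaced by float ranks
```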