pandas Flashcards
import pandas library
import pandas as pd
create a hardcoded dataframe
dfHardCoded = pd.DataFrame(
    # records
    [
        ['Jan', 23, 62],
        ['Feb', 11, 50],
        ['Mar', 40, 45],
        ['Apr', 22, 26],
        ['May', 40, 60],
        ['Jun', 10, 62]
    ],
    # indices
    index=[0, 1, 2, 3, 4, 5],
    # column headers
    columns=["month", "lowest", "highest"]
)
read data.csv, which is actually tab-separated (a TSV)
df = pd.read_csv("data.csv", sep="\t")
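A minimal sketch of reading tab-separated data, assuming a small made-up table: the in-memory `io.StringIO` buffer and its contents are hypothetical stand-ins for data.csv, so the snippet runs without a file on disk.

```python
import io
import pandas as pd

# Hypothetical tab-separated content standing in for data.csv
tsv_text = "color\tprice\nred\t10\nblue\t7\n"

# sep="\t" tells read_csv that fields are separated by tabs, not commas
df = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(df.shape)
print(list(df.columns))
```

Without `sep="\t"`, each line would be parsed as a single comma-separated field, so the frame would end up with one column.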
descriptive statistics table for dataframe “df”
df.describe()
first 5 rows of “df”
df.head()
last 10 rows of “df”
df.tail(10)
datatypes of “df”
df.dtypes
information about the indices of records of “df”
df.index
values of “df” (i.e. records) as a numpy array
df.values
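The inspection attributes above can be tried on a toy frame. This is only a sketch: the "color" and "price" data are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, not from any real dataset
df = pd.DataFrame({"color": ["red", "blue", "green"], "price": [10, 7, 12]})

print(df.dtypes)   # one dtype per column
print(df.index)    # row labels; a RangeIndex when none were given
arr = df.values    # the records as a 2-D NumPy array
print(arr.shape)
```

`df.to_numpy()` returns the same array and is the spelling the pandas documentation recommends for new code.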
sort values of “df” based on feature “color”, descending
df.sort_values("color", ascending=False)
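In current pandas, sorting by a column is done with `DataFrame.sort_values`; the older `DataFrame.sort` method was removed in pandas 0.20. A short sketch on made-up data:

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({"color": ["red", "blue", "green"], "price": [10, 7, 12]})

# sort_values returns a new DataFrame; df itself is left unchanged
sorted_df = df.sort_values("color", ascending=False)

print(sorted_df["color"].tolist())  # → ['red', 'green', 'blue']
```

The sorted rows keep their original index labels, so `sorted_df.index` here is 0, 2, 1.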
return “color” column of dataframe as a pandas series object
df["color"]
return “color” and “price” columns of dataframe as a pandas dataframe object
df[["color", "price"]]
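The difference between the two column-selection cards comes down to what goes inside the brackets: a single label gives a Series, while a list of labels gives a DataFrame. A small sketch, with a made-up frame:

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({"color": ["red", "blue"], "price": [10, 7], "size": ["S", "M"]})

one_col = df["color"]               # single label -> Series
two_cols = df[["color", "price"]]   # list of labels -> DataFrame

print(type(one_col).__name__, type(two_cols).__name__)  # → Series DataFrame
```

With a default (non-MultiIndex) set of columns, `df["color", "price"]` is read as one tuple-valued column label and raises a `KeyError`, which is why the inner list is needed.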
return a range of records between indices 13-50
df[13:51]
return “color” at indices 2 and 3
df.loc[2:3, ["color"]]
return price at indices 3 and 4. You don’t know the exact name of the price column, but you know its index is 7 in the array of columns
df.iloc[3:5,[7]]
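The slicing cards are easy to mix up because `.loc` slices by label and includes both endpoints, while `.iloc` and plain `df[a:b]` slice by position and exclude the stop. A sketch on a made-up six-row frame with the default integer index, where "price" is assumed to sit at column position 1:

```python
import pandas as pd

# Hypothetical frame with the default integer index 0..5
df = pd.DataFrame({
    "color": ["red", "blue", "green", "black", "white", "pink"],
    "price": [10, 7, 12, 5, 9, 3],
})

by_label = df.loc[2:3, ["color"]]   # labels 2 and 3, both endpoints included
by_position = df.iloc[3:5, [1]]     # positions 3 and 4; column position 1 is "price"
row_range = df[1:4]                 # positions 1, 2, 3; stop excluded

print(by_label["color"].tolist())     # → ['green', 'black']
print(by_position["price"].tolist())  # → [5, 9]
```

Passing the column as a list (`["color"]`, `[1]`) keeps the result a DataFrame; passing a bare label or integer would return a Series.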