pandas Flashcards
what is loc?
it selects rows by values. for example: elections.loc[0:4] returns the first 4 rows in the election table, as they are rows 0 to rows 4 end inclusive.
what is the tail function?
returns the last n of table: elections.tail(2) returns the last 2 rows in the elections table.
how can you use loc for column names?
example: elections.loc[0:4, “year” : “party”]
what is iloc?
selects values by number. example: elections.iloc[[1, 2, 3][0,1,2]] returns the 2nd 3rd and 4th columns then the 1st 2nd and 3rd columns
how do loc and iloc differ?
loc gets rows (and/or columns) with particular labels. iloc gets rows (and/or columns) at integer locations.
how do we get a datafram out of our loc or iloc function?
use to frame:
elections[“candidate”].tail(5).to_frame()
what is isin?
isin helps us find the values we are looking for in the data frame:
wanted = [“Anti-Masonic”, “American”, “Anti-Monopoly”, “American Independent”]
elections[elections[“Party”].isin(wanted)]
what is str.strartswith?
str.startswith helps us find the values in a specific column that start with the specified letter:
elections[elections[“Party”].str.startswith(“A”)]
how do we use query?
query lets you choose when values equal or are more/less than speific numbers, or when values = win, i.e. a specific value.
elections.query(‘Year >= 2000 and Result == “win”’)
what does .describe() return?
std, mean, etc.
what is value_counts()?
value_counts() retunrs the # of occurences of unique values;
what is unique()?
returns every unique value in the specified column:
elections[‘party’].unique()
what is sort_values()?
this function sorts values in word columns A-Z without specification, can sort numbers as well, like in column ‘%’:
elections.sort_values(“%”, ascending = False)
what is str.len()?
a funcion that returns the length of the values in a specified column.
babyname_lengths = babynames[“Name”].str.len()
how do we drop a column?
babynames = babynames.drop(“name_lengths”, axis = ‘columns’)