Python (UBC) Flashcards
What statement imports pandas, and what is pandas typically abbreviated as?
>>> import pandas as pd
What function converts a .csv file into a pandas DataFrame? Use Dingo.csv in the example code.
>>> Dingo = pd.read_csv('Dingo.csv')
How would you make Python show the DataFrame ‘Dingo’? (Assume ‘Dingo’ is defined.) How many observations of this DataFrame are shown by default?
>>> Dingo
(Entering the object’s name at the prompt displays it.)
If Dingo has more rows than the display limit (the display.max_rows option, 60 by default), pandas shows only the first 5 and last 5 observations unless otherwise instructed. A shorter DataFrame is shown in full.
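As a rough illustration of the default display behaviour, the sketch below builds a hypothetical 100-row stand-in for Dingo (the column name 'id' is invented) and uses head() and tail(), which each return 5 rows unless told otherwise. The display limit is read from the session rather than assumed, since it can be changed.

```python
import pandas as pd

# Hypothetical stand-in for Dingo; the real data would come from Dingo.csv.
Dingo = pd.DataFrame({"id": range(100)})

first_rows = Dingo.head()   # first 5 rows by default
last_rows = Dingo.tail()    # last 5 rows by default

# Row limit above which pandas truncates a printed DataFrame.
max_rows = pd.get_option("display.max_rows")
print(len(first_rows), len(last_rows), max_rows)
```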
Identify which parts of the following code snippets are an object, an attribute, a function, and an argument:
>>> Dingo = pd.read_csv('Dingo.csv')
>>> Dingo.shape
Dingo is an object in both cases, pd.read_csv is a function (the tell-tale sign is the brackets), ‘Dingo.csv’ is an argument, and shape is an attribute.
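The same distinction can be sketched without the csv file. The DataFrame below is a hypothetical stand-in for Dingo (the 'name' column and all values are invented), and head() is used as an extra example of a call that takes an argument.

```python
import pandas as pd

# Hypothetical stand-in for Dingo.csv so the example runs without the file.
Dingo = pd.DataFrame({
    "name": ["Bluey", "Matilda", "Jack"],
    "location": ["Fraser Island", "Kakadu", "Fraser Island"],
})

# Attribute: no brackets, it simply reports a stored property of the object.
n_rows, n_cols = Dingo.shape

# Function call: brackets, with 2 passed in as the argument.
first_rows = Dingo.head(2)

print(n_rows, n_cols, len(first_rows))
```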
What attribute would you use to find the variables of the DataFrame ‘Dingo’? Provide the code required to answer the question.
>>> Dingo.columns
N.B. Vertical columns are variables; horizontal rows are observations.
Let’s assume Dingo.shape returns (30, 5). What indexer would allow us to look at the first five variables for only observations 10-19? Assume that the fifth variable is ‘location’.
>>> Dingo.loc[10:19, :'location']
N.B. The pattern is df.loc[row start:row end, column start:column end]. .loc slices by label and includes both endpoints, so 10:19 returns ten rows (assuming the default integer index). Leaving out a start or end tells pandas to run to that edge of the data, i.e. [..., :'location'] is every column to the left of ‘location’ as well as the ‘location’ column itself.
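A minimal sketch of this slice, assuming a made-up 30 x 5 DataFrame with the default integer index. Every column name except 'location' and 'Nutterness' is hypothetical. The .iloc line shows the position-based equivalent, where the end of each slice is excluded.

```python
import pandas as pd

# Hypothetical 30 x 5 stand-in for Dingo; 'location' is the fifth column.
Dingo = pd.DataFrame({
    "name": [f"dingo_{i}" for i in range(30)],
    "age": [i % 12 for i in range(30)],
    "weight": [10.0 + i for i in range(30)],
    "Nutterness": [i / 30 for i in range(30)],
    "location": ["north" if i % 2 else "south" for i in range(30)],
})

# Label-based: rows 10 through 19 inclusive, columns up to and including 'location'.
by_label = Dingo.loc[10:19, :"location"]

# Position-based equivalent: the end positions (20 and 5) are excluded.
by_position = Dingo.iloc[10:20, :5]

print(by_label.shape)
```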
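Let’s assume Dingo.shape returns (30, 5), and that the fifth variable is ‘location’. What are three ways we could display only the ‘location’ column?
>>> Dingo.loc[:, ['location']]
>>> Dingo.iloc[:, [4]]
>>> Dingo[['location']]
N.B. .iloc counts positions from 0, so the fifth column is position 4; position 5 would be out of bounds for a five-column DataFrame.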
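The three selections can be checked against each other. This is a sketch on a small hypothetical DataFrame (all column names other than 'location' are invented), and it looks up the column position with columns.get_loc rather than hard-coding it.

```python
import pandas as pd

# Hypothetical five-column stand-in for Dingo; 'location' is the fifth column.
Dingo = pd.DataFrame({
    "name": ["Bluey", "Matilda", "Jack"],
    "age": [3, 5, 2],
    "weight": [14.2, 16.8, 12.5],
    "Nutterness": [0.35, 0.99, 0.10],
    "location": ["Fraser Island", "Kakadu", "Fraser Island"],
})

# Zero-based position of the 'location' column (4 here).
position = Dingo.columns.get_loc("location")

by_label = Dingo.loc[:, ["location"]]      # label-based selection
by_position = Dingo.iloc[:, [position]]    # position-based selection
by_brackets = Dingo[["location"]]          # plain column lookup

print(position, by_label.shape)
```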
Let’s assume Dingo.shape returns (30, 5), and that one variable is ‘Nutterness’ (a percentile stored as a float). What method would allow us to sort from most Nutter to least Nutter within this DataFrame?
>>> Dingo.sort_values(by='Nutterness', ascending=False)
N.B. sort_values is a method (not an attribute), and it sorts in ascending order by default (ascending=True).
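A short sketch of the descending sort on a hypothetical three-row DataFrame (names and values are invented). It also shows that the sort returns a new DataFrame and leaves the original order alone.

```python
import pandas as pd

# Hypothetical stand-in for Dingo with a 'Nutterness' float column.
Dingo = pd.DataFrame({
    "name": ["Bluey", "Matilda", "Jack"],
    "Nutterness": [0.35, 0.99, 0.10],
})

# Most Nutter first; without ascending=False the order would be reversed.
most_to_least = Dingo.sort_values(by="Nutterness", ascending=False)

print(list(most_to_least["name"]))
print(list(Dingo["name"]))  # original order is unchanged
```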
What method would you use to show the basic statistics of a DataFrame? Provide the code required to answer the question using the Dingo DataFrame.
>>> Dingo.describe()
What method would you use to show the basic statistics AND the categorical summaries of a DataFrame? Provide the code required to answer the question using the Dingo DataFrame.
>>> Dingo.describe(include='all')
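To see the difference between the two calls, here is a sketch on a hypothetical DataFrame with one text column and one numeric column (values invented). When a DataFrame mixes types, describe() summarises only the numeric columns, while include='all' also adds rows such as unique, top and freq for the text column.

```python
import pandas as pd

# Hypothetical stand-in for Dingo with one categorical and one numeric column.
Dingo = pd.DataFrame({
    "location": ["Fraser Island", "Kakadu", "Fraser Island"],
    "Nutterness": [0.35, 0.99, 0.10],
})

numeric_summary = Dingo.describe()            # numeric columns only
full_summary = Dingo.describe(include="all")  # every column

print(list(numeric_summary.columns))
print(list(full_summary.columns))
```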
What’s the method for displaying the count or frequency of each value in an object?
>>> object.value_counts()
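Why does the following example not give a simple count of each location?
>>> Dingo_location = Dingo[['location']]
>>> Dingo_location.value_counts()
Assume Dingo is defined and contains the column ‘location’.
Double square brackets ‘[[…]]’ return a DataFrame, while single square brackets return a Series. value_counts was originally a Series method, so on older pandas versions (before 1.1) the call above raises an AttributeError. Newer versions run it, but they count unique rows and return a result with a MultiIndex. To count the values of one column, select it with single brackets:
>>> Dingo_location = Dingo['location']
>>> Dingo_location.value_counts()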
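The bracket difference can be checked directly. This sketch uses a hypothetical three-row DataFrame (values invented) and only relies on the single-bracket form, which works the same way across pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for Dingo with a 'location' column.
Dingo = pd.DataFrame({
    "location": ["Fraser Island", "Kakadu", "Fraser Island"],
})

as_series = Dingo["location"]       # single brackets: a Series
as_frame = Dingo[["location"]]      # double brackets: a one-column DataFrame

# Frequency of each location, most common first.
counts = as_series.value_counts()

print(type(as_series).__name__, type(as_frame).__name__)
print(counts["Fraser Island"], counts["Kakadu"])
```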
What is the method to export a DataFrame to a csv?
>>> Dataframe.to_csv('desired filename', index=False)
N.B. index=False (lowercase) omits the index column from the csv.
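The effect of index=False can be seen without writing a file: when no filename is given, to_csv returns the csv text as a string. The DataFrame below is a hypothetical stand-in (names and values invented).

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame being exported.
Dingo = pd.DataFrame({
    "name": ["Bluey", "Matilda"],
    "location": ["Fraser Island", "Kakadu"],
})

# With no filename, to_csv returns the csv text instead of writing a file.
with_index = Dingo.to_csv()
without_index = Dingo.to_csv(index=False)

# Compare the header rows: the first has an extra, unnamed index column.
header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]
print(header_with_index)
print(header_without_index)
```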