Basic Python Flashcards

Question 1

Q

Write a lambda function that adds a column to the dataframe ‘shoes’ that returns Not Vegan if the shoe is made out of leather and Vegan if it is not.

Answer

A

shoes['vegan'] = shoes.shoe_type.apply(lambda x: 'Not Vegan' if x == 'Leather' else 'Vegan')

Question 2

Q

Use the columns ‘gender’ and ‘last_name’ from ‘df’ to make a new column called ‘salutation’ that returns ‘Dear Mr.’ and the last name if gender is male or ‘Dear Ms.’ and last name if female.

Answer

A

df['salutation'] = df.apply(lambda row: 'Dear Mr. ' + row.last_name if row.gender == 'male' else 'Dear Ms. ' + row.last_name, axis=1)
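A quick way to check a row-wise lambda like this is to run it on a tiny frame. The sketch below assumes made-up sample data (only the column names come from the card) and parenthesizes the conditional so the last name is appended in both branches.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the card.
df = pd.DataFrame({
    'gender': ['male', 'female'],
    'last_name': ['Long', 'Smith'],
})

# The parentheses pick the title first, then the last name is added either way.
df['salutation'] = df.apply(
    lambda row: ('Dear Mr. ' if row.gender == 'male' else 'Dear Ms. ') + row.last_name,
    axis=1,
)
print(df['salutation'].tolist())
```

Without the parentheses, the conditional expression covers the whole concatenation, so each branch has to include the last name itself.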

Question 3

Q

Get the last name only from a column that has the format “Skye Long” called “name”

Answer

A

get_last_name = lambda x: x.split()[-1]

df['last_name'] = df.name.apply(get_last_name)

Question 4

Q

Rename individual columns in a df (as opposed to renaming all of them at once)

Answer

A

df.rename(columns={'old': 'new', 'old2': 'new2'}, inplace=True)

Question 5

Q

Rename the following columns in order:

name, last name, user

Answer

A

df.columns = ['first_name', 'last_name', 'id']

This only works if the list has exactly as many names as the DataFrame has columns. Because the names are assigned by position, it is also easy to overwrite the wrong column by accident. It is safer to use df.rename(columns={'old': 'new'}, inplace=True).
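The difference between the two renaming styles can be seen on a one-row frame. This is a minimal sketch with made-up data: rename works by label, while assigning to df.columns is positional and fails when the counts differ.

```python
import pandas as pd

# Hypothetical one-row frame using the column names from the card.
df = pd.DataFrame({'name': ['Skye'], 'last name': ['Long'], 'user': [1]})

# rename targets columns by label, so order and column count do not matter.
renamed = df.rename(columns={'name': 'first_name', 'user': 'id'})

# Assigning to df.columns is positional: the list must match the column count.
try:
    df.columns = ['first_name', 'last_name']  # only 2 names for 3 columns
    error = None
except ValueError as exc:
    error = exc

print(list(renamed.columns))
print(type(error).__name__)
```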

Question 6

Q

What would this produce:

[1,2] + [3,4]

Answer

A

[1,2,3,4]
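A short sketch of what + does to the operands: it builds a new list and leaves both inputs alone, whereas .extend changes the list in place. The variable names here are illustrative.

```python
a = [1, 2]
b = [3, 4]

combined = a + b        # new list; a and b are left untouched
a_after_plus = list(a)  # snapshot of a, still [1, 2]

a.extend(b)             # in-place: a itself grows
```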

Question 7

Q

What is the simple syntax to apply lambda to a row in a dataframe?

Answer

A

df.apply(lambda row: value_computed_from(row['column_name']), axis=1)

Here value_computed_from(...) is a placeholder for whatever expression should be returned for each row.

Example:

total_earned = lambda row: (row.hourly_wage * 40) + ((row.hourly_wage * 1.5) * (row.hours_worked - 40)) if row.hours_worked > 40 else row.hourly_wage * row.hours_worked

df['total_earned'] = df.apply(total_earned, axis=1)

Here the lambda is written separately, but it could be combined into the .apply call.

Question 8

Q

What is the syntax for the .apply method?

Answer

A

df.col.apply(func)

Example:

get_last_name = lambda x: x.split()[-1]

df['last_name'] = df.name.apply(get_last_name)

For a function that acts on whole rows:

df.apply(row_func, axis=1)

Question 9

Q

List Comprehension with a Condition on a Second List

list1 = [0,1,2,3,4,5]

list2 = ['a','b','c','d','e','f']

I want the value of list2 at every position where the value of list1 is less than 3

Answer

A

new_list = [list2[i] for i in range (0, len(list2)) if list1[i] < 3]
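The same selection can be written without index arithmetic by pairing the lists with zip. This is an alternative sketch, not the card's answer; the names by_index and by_zip are illustrative.

```python
list1 = [0, 1, 2, 3, 4, 5]
list2 = ['a', 'b', 'c', 'd', 'e', 'f']

# Index-based version, as on the card.
by_index = [list2[i] for i in range(len(list2)) if list1[i] < 3]

# zip pairs up items at the same position, so no index is needed.
by_zip = [letter for number, letter in zip(list1, list2) if number < 3]
```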

Question 10

Q

Create a new list named double_nums by multiplying each number in nums by two

nums = [4, 8, 15, 16, 23, 42]

Answer

A

double_nums = [i * 2 for i in nums]

Question 11

Q

create a list from 0 to 5

Answer

A

new_list = list(range(6))
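In Python 3, range(6) on its own is a lazy range object rather than a list, so it has to be wrapped in list() to get actual list behavior such as .append. A small sketch:

```python
lazy = range(6)          # a range object, not a list
new_list = list(lazy)    # materialize it

is_list_before = isinstance(lazy, list)
is_list_after = isinstance(new_list, list)
```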

Question 12

Q

Write this for loop as a list comp:

for x in l:
    if x >= 45: x + 1
    else: x + 5

Answer

A

[x+1 if x >=45 else x + 5 for x in l]

Question 13

Q

Write this as an if statement

[x+1 for x in l if x >= 45]

Answer

A

new_l = []
for x in l:
    if x >= 45:
        new_l.append(x + 1)
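The position of the if changes what a comprehension does. The sketch below uses a short made-up list to contrast a conditional expression (every element kept) with a trailing filter (elements dropped).

```python
l = [22, 45, 50, 1]

# Conditional expression before `for`: every element is kept and transformed.
mapped = [x + 1 if x >= 45 else x + 5 for x in l]

# `if` after the iterable: a filter, so elements failing the test are dropped.
filtered = [x + 1 for x in l if x >= 45]
```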

Question 14

Q

What will this code produce?

nums = [4, 8, 15, 16, 23, 42]

parity = [0 if i%2 == 0 else 1 for i in nums]

print(parity)

Answer

A

[0, 0, 1, 0, 1, 0]

Question 15

Q

Write this as a list comp:

nums2 = [4, 8, 15, 16, 23, 42]

parity2 = []

for i in nums2:
    if i%2 == 0:
        parity2.append(0)
    else:
        parity2.append(1)

Answer

A

parity2 = [0 if i%2 == 0 else 1 for i in nums2]

Question 16

Q

If numbers are above 45 then add 1, if num <10 subtract 1, else add 5

l = [22, 13, 45, 50, 98, 69, 43, 44, 1]

Answer

A

l_2 = []
for i in l:
    if i > 45:
        l_2.append(i + 1)
    elif i < 10:
        l_2.append(i - 1)
    else:
        l_2.append(i + 5)

print(l_2)

l_3 = [i+1 if i > 45 else (i-1 if i<10 else i+5) for i in l]
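Nested conditional expressions get hard to read quickly. One alternative, sketched here with a hypothetical helper named adjust, is to move the three-way rule into a function and keep the comprehension simple.

```python
def adjust(n):
    """Hypothetical helper: the card's three-way rule, one branch per line."""
    if n > 45:
        return n + 1
    if n < 10:
        return n - 1
    return n + 5

l = [22, 13, 45, 50, 98, 69, 43, 44, 1]

l_4 = [adjust(i) for i in l]

# The nested one-liner from the card, for comparison.
l_3 = [i + 1 if i > 45 else (i - 1 if i < 10 else i + 5) for i in l]
```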

Question 17

Q

Create a new list named first_character that contains the first character from every name in the list names

names = ["Elaine", "George", "Jerry", "Cosmo"]

Answer

A

names = ["Elaine", "George", "Jerry", "Cosmo"]

first_character = [i[0] for i in names]

print(first_character)

Question 18

Q

Create a new list called greater_than_two, in which an entry at position i is True if the entry in nums at position i is greater than 2.

Answer

A

nums = [5, -10, 40, 20, 0]

greater_than_two = [True if i >2 else False for i in nums]

print(greater_than_two)

Question 19

Q

Create a new list named product that contains the product of each sub-list of nested_lists

Answer

A

product = [x1 * x2 for (x1, x2) in nested_lists]

Question 20

Q

Create a new list named greater_than that contains True if the first number in the sub-list is greater than the second number in the sub-list, and False otherwise.

Answer

A

nested_lists = [[4, 8], [16, 15], [23, 42]]

greater_than = [True if x1 > x2 else False for (x1, x2) in nested_lists]

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=f9aK6u58ROeq

Question 21

Q

Create a new list named first_only that contains the first element in each sub-list of nested_lists.

Answer

A

nested_lists = [[4, 8], [16, 15], [23, 42]]

first_only = [x1 for (x1, x2) in nested_lists]

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=f2y5829cSLPQ

Question 22

Q

Use list comprehension and the zip function to create a new list named sums that sums corresponding items in lists a and b. For example, the first item in the new list should be 5 from adding 1 and 4 together.

a = [1.0, 2.0, 3.0]

b = [4.0, 5.0, 6.0]

Answer

A

sums = [x1 + x2 for (x1, x2) in zip(a,b)]

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=zzd_TyqLS24I&line=4&uniqifier=1
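One thing worth remembering about zip is that it stops at the shortest input without raising an error. A small sketch; the one-element list is made up for the demonstration.

```python
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

sums = [x1 + x2 for x1, x2 in zip(a, b)]

# zip silently truncates to the shorter list, so only one pair is produced here.
short = [x1 + x2 for x1, x2 in zip(a, [4.0])]
```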

Question 23

Q

You’ve been given two lists: a list of capitals and a list of countries. Create a new list named locations that contains the string “capital, country” for each item in the original lists. For example, if the 5th item in the capitals list is “Lima” and the 5th item in the countries list is “Peru”, then the 5th item in the new list should be “Lima, Peru”

Answer

A

capitals = ["Santiago", "Paris", "Copenhagen"]

countries = ["Chile", "France", "Denmark"]

locations = [x1 + ", " + x2 for (x1, x2) in zip(capitals, countries)]

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=jOeCN_NHUIy7

Question 24

Q

You’ve been given two lists: a list of names and a list of ages. Create a new list named users that contains the string “Name: name, Age: age” for each pair of elements in the original lists. For example, if the 5th item in the names list is “John” and the 5th item in ages is 42, then the 5th item in the new list should be “Name: John, Age: 42”.

As you did in the previous exercise, concatenate your strings together using +. Make sure to add proper capitalization and spaces.

Answer

A

names = ["Jon", "Arya", "Ned"]

ages = [14, 9, 35]

users = ["Name: " + x1 + ", Age: " + str(x2) for (x1, x2) in zip(names, ages)]

print(users)

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=6Ol4V5VjUva5

Question 25

Q

Create a new list named greater_than that contains True or False depending on whether the corresponding item in list a is greater than the one in list b. For example, if the 2nd item in list a is 3, and the 2nd item in list b is 5, the 2nd item in the new list should be False.

Answer

A

a = [30, 42, 10]

b = [15, 16, 17]

greater_than = [True if x1 > x2 else False for (x1, x2) in zip(a,b)]

print(greater_than)

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/List_Comprehension_Code_Challenge.ipynb#scrollTo=qefmBpfVVkUh

Question 26

Q

Create a lambda function named contains_a that takes an input word and returns True if the input contains the letter ‘a’. Otherwise, return False.

Answer

A

contains_a = lambda n: "a" in n

https://colab.research.google.com/github/skyelong/Code-Academy/blob/master/Python/Lambda.ipynb#scrollTo=bnAxko3shzmc

Question 27

Q

Create a lambda function named long_string that takes an input str and returns True if the string has over 12 characters in it. Otherwise, return False.

Answer

A

long_string = lambda x: True if len(x) > 12 else False

Question 28

Q

Create a lambda function named ends_in_a that takes an input str and returns True if the last character in the string is an a. Otherwise, return False.

Answer

A

ends_in_a = lambda x: True if x[-1] == "a" else False

Question 29

Q

Create a lambda function named add_random that takes an input named num. The function should return num plus a random integer number between 1 and 10 (inclusive).

Answer

A

import random

add_random = lambda num: num + random.randint(1,10)
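random.randint includes both endpoints, which is what the card's "between 1 and 10 (inclusive)" needs. The sketch below calls the lambda repeatedly to confirm the range. The seed is only there to make the demo repeatable.

```python
import random

add_random = lambda num: num + random.randint(1, 10)

random.seed(0)  # seeding only makes this demo repeatable
results = [add_random(5) for _ in range(100)]

# Every result should be 5 plus something from 1 to 10.
in_range = all(6 <= r <= 15 for r in results)
```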

Question 30

Q

You run an online clothing store called Panda’s Wardrobe. You need a DataFrame containing information about your products.

Create a DataFrame with the following data that your inventory manager sent you:

Product ID Product Name Color

1 t-shirt blue

2 t-shirt green

3 skirt red

4 skirt black

Answer

A

df1 = pd.DataFrame({
    'Product ID': [1, 2, 3, 4],
    'Product Name': ['t-shirt', 't-shirt', 'skirt', 'skirt'],
    'Color': ['blue', 'green', 'red', 'black']})

Question 31

Q

From this dataframe, select the columns clinic_north and clinic_south

Answer

A

clinic_north_south = df[['clinic_north', 'clinic_south']]

Question 32

Q

Use iloc to return the third row from df

Answer

A

march = df.iloc[2]

Question 33

Q

select rows 1-5 with .iloc from ‘df’

Answer

A

df_1 = df.iloc[1:6]

Question 34

Q

Select rows from ‘df’ where the month is equal to ‘January’ and store them in a new DataFrame

Answer

A

january = df[df.month == 'January']

Question 35

Q

select all rows for both ‘march’ and ‘april’ from df and store them in a new df called march_april

Answer

A

march_april = df[(df.month == 'March') | (df.month == 'April')]

Question 36

Q

Use .isin to find rows containing ‘January’, ‘February’, or ‘March’ in the column month

Answer

A

january_february_march = df[df.month.isin(['January', 'February', 'March'])]
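.isin is shorthand for a chain of == comparisons joined with |. The sketch below, on a made-up four-row frame, shows the two forms selecting the same rows.

```python
import pandas as pd

# Hypothetical frame with one row per month.
df = pd.DataFrame({'month': ['January', 'February', 'March', 'April']})

with_or = df[(df.month == 'January') | (df.month == 'March')]
with_isin = df[df.month.isin(['January', 'March'])]

print(with_isin.month.tolist())
```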

Question 37

Q

Reset the index for a dataframe that you have subsetted. Remove the old index

Answer

A

df2.reset_index(drop=True, inplace = True)

Question 38

Q

Create a new column that changes the names to lower case using the str.lower and .apply

df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']
],
columns=['Name', 'Email'])

Answer

A

df['Lowercase Name'] = df.Name.apply(str.lower)

Question 39

Q

subject = ["physics", "calculus", "poetry", "history"]

append ‘computer science’ to this list

Answer

A

subject.append("computer science")

Question 40

Q

subject = ["physics", "calculus", "poetry", "history"]

grades = [98, 97, 85, 88]

zip these two together

and add ‘visual arts’ and the grade ‘93’

Answer

A

gradebook = list(zip(subject, grades))

gradebook.append(("visual arts", 93))

Question 41

Q

inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']

find the len of this inventory

Answer

A

inventory_len = len(inventory)

Question 42

Q

inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']

return the third object

return the last object

return objects 1-4

Answer

A

third = inventory[2]

last = inventory[-1]

inventory1_4 = inventory[1:5]

Question 43

Q

inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']

return the number of twin beds in the inventory

Answer

A

twin_beds = inventory.count('twin bed')

Question 44

Q

Write a function named append_sum that has one parameter: a list named lst.

The function should add the last two elements of lst together and append the result to lst. It should do this process three times and then return lst.

For example, if lst started as [1, 1, 2], the final result should be [1, 1, 2, 3, 5, 8].

Answer

A

def append_sum(lst):
    for x in range(3):
        lst.append(lst[-1] + lst[-2])
    return lst
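Note that append_sum modifies the list passed in and returns that same object. If the caller's list should stay unchanged, one option is to work on a copy. The wrapper append_sum_copy below is a hypothetical addition, not part of the exercise.

```python
def append_sum(lst):
    for _ in range(3):
        lst.append(lst[-1] + lst[-2])
    return lst

def append_sum_copy(lst):
    """Hypothetical variant that leaves the caller's list unchanged."""
    return append_sum(list(lst))

original = [1, 1, 2]
copied_result = append_sum_copy(original)
untouched = list(original)                 # snapshot: still [1, 1, 2] here

same_object = append_sum(original) is original  # mutates original in place
```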

Question 45

Q

Write a function named larger_list that has two parameters named lst1 and lst2.

The function should return the last element of the list that contains more elements. If both lists are the same size, then return the last element of lst1.

Answer

A

def larger_list(lst1, lst2):
    list1_len = len(lst1)
    list2_len = len(lst2)
    if list1_len > list2_len:
        return lst1[-1]
    elif list1_len < list2_len:
        return lst2[-1]
    else:
        # Same size: the tie goes to lst1.
        return lst1[-1]

Question 46

Q

Create a function named more_than_n that has three parameters named lst, item, and n.

The function should return True if item appears in the list more than n times. The function should return False otherwise.

Answer

A

def more_than_n(lst, item, n):
    if lst.count(item) > n:
        return True
    else:
        return False

Question 47

Q

Create a function called append_size that has one parameter named lst.

The function should append the size of lst (inclusive) to the end of lst. The function should then return this new list.

For example, if lst was [23, 42, 108], the function should return [23, 42, 108, 3] because the size of lst was originally 3.

Answer

A

def append_size(lst):
    x = len(lst)
    lst.append(x)
    return lst

lst = [23,42,108]

append_size(lst)

Question 48

Q

Create a function called every_three_nums that has one parameter named start.

The function should return a list of every third number between start and 100 (inclusive). For example, every_three_nums(91) should return the list [91, 94, 97, 100]. If start is greater than 100, the function should return an empty list.

Answer

A

def every_three_nums(start):
    if start <= 100:
        lst = list(range(start, 101, 3))
        return lst
    else:
        lst = []
        return lst

Question 49

Q

Create a function named remove_middle which has three parameters named lst, start, and end.

The function should return a list where all elements in lst with an index between start and end (inclusive) have been removed.

For example, the following code should return [4, 23, 42] because elements at indices 1, 2, and 3 have been removed: remove_middle([4, 8 , 15, 16, 23, 42], 1, 3)

Answer

A

def remove_middle(lst, start, end):
    new_list = lst[0:start] + lst[end+1:]
    return new_list

Question 50

Q

Create a function named more_frequent_item that has three parameters named lst, item1, and item2.

Return either item1 or item2 depending on which item appears more often in lst.

If the two items appear the same number of times, return item1.

Answer

A

def more_frequent_item(lst, item1, item2):
    cnt_item1 = lst.count(item1)
    cnt_item2 = lst.count(item2)
    if cnt_item1 > cnt_item2:
        return item1
    elif cnt_item1 < cnt_item2:
        return item2
    else:
        # Equal counts: the tie goes to item1.
        return item1

print(more_frequent_item([2, 3], 2, 3))

Question 51

Q

Create a function named double_index that has two parameters: a list named lst and a single number named index.

The function should return a new list where all elements are the same as in lst except for the element at index. The element at index should be double the value of the element at index of the original lst.

If index is not a valid index, the function should return the original list.

For example, the following code should return [1,2,6,4] because the element at index 2 has been doubled:

double_index([1, 2, 3, 4], 2)

Answer

A

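def double_index(lst, index):
    # Return the original list for an invalid index.
    # Treating negative indices as invalid is an assumption.
    if index < 0 or index >= len(lst):
        return lst
    before = lst[:index]
    after = lst[index+1:]
    new = [lst[index] * 2]
    new_list = before + new + after
    return new_list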

Question 52

Q

Given a dataframe df, add a new column square which contains the square of each value in the points column for each row.

Answer

A

df['square'] = df.points.apply(lambda x: x**2)

Question 53

Q

Select the rows from the column location that contain the information for Staten Island from the dataframe inventory and save them to staten_island.

Answer

A

staten_island = inventory[inventory.location == 'Staten Island']

Question 54

Q

A customer just emailed you asking what products are sold at your Staten Island location. Select the column product_description from staten_island and save it to the variable product_request.

Answer

A

product_request = staten_island.product_description

Question 55

Q

Another customer emails to ask what types of seeds are sold at the Brooklyn location.

Select all rows where location is equal to Brooklyn and product_type is equal to seeds and save them to the variable seed_request.

Answer

A

seed_request = inventory[(inventory.location == 'Brooklyn') & (inventory.product_type == 'seeds')]
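Since the card asks for rows where both conditions hold, the choice between & and | matters. The sketch below uses a made-up three-row inventory to show how differently the two operators select.

```python
import pandas as pd

# Hypothetical inventory; only the column names come from the card.
inventory = pd.DataFrame({
    'location': ['Brooklyn', 'Brooklyn', 'Queens'],
    'product_type': ['seeds', 'tools', 'seeds'],
})

# & keeps rows where both conditions are true.
both = inventory[(inventory.location == 'Brooklyn') & (inventory.product_type == 'seeds')]

# | keeps rows where at least one condition is true.
either = inventory[(inventory.location == 'Brooklyn') | (inventory.product_type == 'seeds')]

print(len(both), len(either))
```

The parentheses around each comparison are required, because & and | bind more tightly than ==.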

Question 56

Q

Add a column to inventory called in_stock which is True if quantity is greater than 0 and False if quantity equals 0.

Answer

A

inventory['in_stock'] = inventory.quantity.apply(lambda x: True if x > 0 else False)

Question 57

Q

Petal Power wants to know how valuable their current inventory is.

Create a column called total_value that is equal to price multiplied by quantity.

Answer

A

total = lambda row: row.price * row.quantity

inventory['total_value'] = inventory.apply(total, axis=1)

Question 58

Q

The DataFrame customers contains the names and ages of all of your customers. You want to find the median age:

Answer

A

median_age = customers.age.median()

print(median_age)

Question 59

Q

how many unique types of shoes from the df orders were bought? The name of the column is shoe_type

Answer

A

unique_type = orders.shoe_type.nunique()

print(unique_type)

Question 60

Q

Print out all the unique types of shoes that are in ‘shoe_type’ in the dataframe orders

Answer

A

print(orders.shoe_type.unique())

Question 61

Q

Our finance department wants to know the price of the most expensive pair of shoes purchased. Save your answer to the variable most_expensive.

Answer

A

most_expensive = orders.price.max()

Question 62

Q

Our fashion department wants to know how many different colors of shoes we are selling. Save your answer to the variable num_colors.

Answer

A

num_colors = orders.shoe_color.nunique()

num_colors

Question 63

Q

Suppose we have a grade book with columns student, assignment_name, and grade. We want to get an average grade for each student across all assignments. We could do some sort of loop, but Pandas gives us a much easier option: the method .groupby. Use .groupby to get the average grade

Answer

A

grades = df.groupby('student').grade.mean()
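A minimal sketch of the groupby pattern on a made-up gradebook (the student names and grades are invented). It also shows that the result is a Series indexed by student until reset_index turns it back into a DataFrame.

```python
import pandas as pd

# Hypothetical gradebook; only the column names come from the card.
df = pd.DataFrame({
    'student': ['Amy', 'Amy', 'Bob', 'Bob'],
    'assignment_name': ['hw1', 'hw2', 'hw1', 'hw2'],
    'grade': [90, 80, 70, 90],
})

grades = df.groupby('student').grade.mean()   # Series indexed by student
grades_df = grades.reset_index()              # DataFrame with student and grade columns

print(grades_df)
```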

Question 64

Q

This is the general syntax of .groupby

Answer

A

df.groupby('column1').column2.measurement()

Answer 65

A

pricey_shoes = orders.groupby('shoe_type').price.max()

Answer 66

A

df.groupby('column1').column2.measurement().reset_index()

Answer 67

A

teas_counts = teas.groupby('category').id.count().reset_index()

Answer 68

A

df = df.rename(columns={'id': 'counts'})

Answer 69

A

pricey_shoes = orders.groupby('shoe_type').price.max().reset_index()

pricey_shoes

Answer 70

A

high_earners = df.groupby('category').wage.apply(lambda x: np.percentile(x, 75)).reset_index()

Answer 71

A

cheap_shoes = orders.groupby('shoe_color').price.apply(lambda x: np.percentile(x, 25)).reset_index()

Answer 72

A

df.pivot(columns='ColumnToPivot', index='ColumnToBeRows', values='ColumnToBeValues')

Answer 73

A

shoe_counts.pivot(columns='shoe_color', index='shoe_type', values='id').reset_index()

Answer 74

A

click_source = user_visits.groupby('utm_source').id.count().reset_index()

Answer 75

A

click_source_by_month = user_visits.groupby(['utm_source', 'month']).id.count().reset_index()

Answer 76

A

click_source_by_month_pivot = click_source_by_month.pivot(index='utm_source', columns='month', values='id').reset_index()
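To see what the pivot does to the shape of the table, here is a sketch on a made-up table of counts (the sources, month labels, and numbers are invented). Each distinct month becomes its own column and each utm_source becomes one row.

```python
import pandas as pd

# Hypothetical long-format counts, one row per (utm_source, month) pair.
counts = pd.DataFrame({
    'utm_source': ['email', 'email', 'google', 'google'],
    'month': ['1-Jan', '2-Feb', '1-Jan', '2-Feb'],
    'id': [3, 5, 7, 2],
})

# Wide format: one row per source, one column per month.
pivoted = counts.pivot(index='utm_source', columns='month', values='id').reset_index()

print(pivoted)
```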

Answer 77

A

movie_ratings.groupby('movie').rating.mean().reset_index()

Answer 78

A

checkouts.groupby('location').book_title.count().reset_index()

Answer 79

A

The ~ is a NOT operator, and isnull() tests whether or not the value of ad_click_timestamp is null.

Answer 80

A

ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()
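A small sketch of the ~ and isnull() combination on made-up data (the timestamps are invented). It also shows that .notnull() gives the same result directly.

```python
import pandas as pd

# Hypothetical data: None marks a visit with no ad click.
ad_clicks = pd.DataFrame({'ad_click_timestamp': ['7:18', None, '19:03']})

# isnull() is True where the timestamp is missing; ~ flips each value.
ad_clicks['is_click'] = ~ad_clicks.ad_click_timestamp.isnull()

# notnull() is the built-in equivalent of ~isnull().
same_as_notnull = ad_clicks.is_click.equals(ad_clicks.ad_click_timestamp.notnull())

print(ad_clicks)
```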

Answer 81

A

The top will provide you with a df that contains only the rows where revenue is greater than target.

The bottom will provide you with a series of True/False values for the condition revenue > target.

Answer 82

A

A Left Merge includes all rows from the first (left) table, but only rows from the second (right) table that match the first table.

Answer 83

A

This will result in a table that has only the rows with matching values. Non-matching rows will be dropped.

Answer 84

A

this will result in a table with ‘NaN’ values for rows that do not match

Answer 85

A

Here, the merged table will include all rows from the second (right) table, but only rows from the first (left) table that match the second table.

Answer 86

A

It stacks the two dfs together. This is most useful when the two dataframes are chunks of the same original df.

Answer 87

A

pets_owners = pd.merge(pets, customers.rename(columns={'id': 'owner_id'}))

Answer 88

A

appointments_all = pd.concat([greg_appointments, susan_appointments])

Answer 89

A

merged_df = pd.merge(df1, df2, how='outer')
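The merge types are easiest to compare side by side. The sketch below uses two made-up tables that share an id column. Ids 2 and 3 appear in both, id 1 only on the left, and id 4 only on the right.

```python
import pandas as pd

# Hypothetical tables sharing the key column 'id'.
left = pd.DataFrame({'id': [1, 2, 3], 'name': ['a', 'b', 'c']})
right = pd.DataFrame({'id': [2, 3, 4], 'price': [20, 30, 40]})

inner = pd.merge(left, right)                # default: matching rows only
left_m = pd.merge(left, right, how='left')   # all left rows, NaN where no match
outer = pd.merge(left, right, how='outer')   # all rows from both, NaN where no match

print(len(inner), len(left_m), len(outer))
```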

Answer 90

A

null_df = df[df.column1.isnull()]

Answer 91

A

x_values = [0, 1, 2, 3, 4]
y_values = [0, 1, 4, 9, 16]
plt.plot(x_values, y_values)
plt.show()

Answer 92

A

plt.plot(days, money_spent, color='green')
plt.plot(days, money_spent_2, color='#AAAAAA')

Answer 93

A

ax = plt.subplot()

Answer 94

A

plt.figure(figsize=(6,7))

Answer 95

A

plt.axis([-5,5,0,10])

Answer 96

A

plt.xlabel('Time')

Answer 97

A

ax.set_yticks([0,1,2,4,9])

Answer 98

A

ax.set_xticklabels(['Monday', 'Tuesday', 'Wednesday'])

Answer 99

A

plt.plot(x, y, color='green')

Answer 100

A

plt.plot(x, y, linestyle='--')

Answer 101

A

plt.legend(['cats', 'Dogs'])

Answer 102

A

plt.subplot(3,2,3)

Answer 103

A

plt.subplots_adjust(wspace=0.35)

Answer 104

A

pie chart will show the percentages of each slice to the nearest int

Answer 105

A

Dividing the height of each column by a constant so that the area under the curve sums to 1. This maintains the relationship of the data, but allows you to compare data that has different distributions.

Answer 106

A

plt.bar(range(len(y2)), y2, bottom=y1)

Answer 107

A

lower bound y values

Answer 108

A

ax.set_xticklabels([’])

Answer 109

A

newlist = df.col.values

counts = cuisine_counts.name.values