Basic Python Flashcards
Write a lambda function that adds a column to the DataFrame ‘df’ that returns ‘Not Vegan’ if the shoe is made of leather and ‘Vegan’ if it is not.
df['vegan'] = df.shoe_type.apply(lambda x: 'Not Vegan' if x == 'Leather' else 'Vegan')
Use the columns ‘gender’ and ‘last_name’ from ‘df’ to make a new column called ‘salutation’ that returns ‘Dear Mr.’ plus the last name if gender is male, or ‘Dear Ms.’ plus the last name if female.
df['salutation'] = df.apply(lambda row: 'Dear Mr. ' + row.last_name if row.gender == 'male' else 'Dear Ms. ' + row.last_name, axis=1)
Get the last name only from a column that has the format “Skye Long” called “name”
get_last_name = lambda x: x.split()[-1]
df['last_name'] = df.name.apply(get_last_name)
Rename individual columns, rather than all of them, in a DataFrame
df.rename(columns={'old': 'new', 'old2': 'new2'}, inplace=True)
Rename the following columns in order:
name, last name, user
df.columns = ['first_name', 'last_name', 'id']
This only works if the list has the same number of entries as the DataFrame has columns, and because names are assigned by position it is easy to mislabel a column by accident. It is safer to use df.rename(columns={'old': 'new'}, inplace=True).
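A minimal sketch of the difference between the two renaming approaches, assuming a small made-up DataFrame with the three columns from the card (the sample values are hypothetical):

```python
import pandas as pd

# Hypothetical three-column frame matching the column names on the card.
df = pd.DataFrame({"name": ["Skye"], "last name": ["Long"], "user": [7]})

# rename() maps old names to new ones and leaves unlisted columns alone.
renamed = df.rename(columns={"name": "first_name", "last name": "last_name"})
print(list(renamed.columns))

# Assigning to .columns replaces every name by position.
by_position = df.copy()
by_position.columns = ["first_name", "last_name", "id"]
print(list(by_position.columns))

# A list of the wrong length is rejected with a ValueError.
wrong_length = df.copy()
length_error = False
try:
    wrong_length.columns = ["first_name", "last_name"]
except ValueError as err:
    length_error = True
    print("Could not assign:", err)
```

With rename, a misspelled old name is silently ignored by default, so it is still worth checking the resulting column list.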
What would this produce:
[1,2] + [3,4]
[1,2,3,4]
What is the basic syntax for applying a lambda to each row of a DataFrame?
df.apply(lambda row: <expression using row['column_name']>, axis=1)
Example:
total_earned = lambda row: (row.hourly_wage * 40) + ((row.hourly_wage * 1.5) * (row.hours_worked - 40)) if row.hours_worked > 40 else row.hourly_wage * row.hours_worked
df['total_earned'] = df.apply(total_earned, axis=1)
Here the lambda is defined separately, but it could also be written inline inside df.apply.
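To check the overtime logic, here is a small runnable sketch. The two-row payroll table is invented for illustration; only the column names hourly_wage and hours_worked come from the card.

```python
import pandas as pd

# Hypothetical payroll data; column names follow the card above.
df = pd.DataFrame({
    "hourly_wage": [10, 20],
    "hours_worked": [45, 30],
})

# Version 1: the lambda is defined separately, then passed to apply.
total_earned = lambda row: (
    (row.hourly_wage * 40) + (row.hourly_wage * 1.5) * (row.hours_worked - 40)
    if row.hours_worked > 40
    else row.hourly_wage * row.hours_worked
)
df["total_earned"] = df.apply(total_earned, axis=1)

# Version 2: the same lambda written inline inside apply.
df["total_inline"] = df.apply(
    lambda row: (row.hourly_wage * 40) + (row.hourly_wage * 1.5) * (row.hours_worked - 40)
    if row.hours_worked > 40
    else row.hourly_wage * row.hours_worked,
    axis=1,
)

print(df)
```

The first row works 5 overtime hours, so it earns 10 * 40 + 15 * 5 = 475; the second row has no overtime and earns 20 * 30 = 600.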
What is the syntax for the .apply method?
df.col.apply(func)
Example:
get_last_name = lambda x: x.split()[-1]
df['last_name'] = df.name.apply(get_last_name)
df.apply(row_func, axis=1)
Nested List Comprehensions
list1 = [0, 1, 2, 3, 4, 5]
list2 = ['a', 'b', 'c', 'd', 'e', 'f']
I want the value of list2 for every value of list1 that is less than 3
new_list = [list2[i] for i in range(len(list2)) if list1[i] < 3]
Create a new list named double_nums by multiplying each number in nums by two
nums = [4, 8, 15, 16, 23, 42]
double_nums = [i * 2 for i in nums]
Create a list of the numbers from 0 to 5
new_list = list(range(6))
Write this for loop as a list comp:
for x in l:
    if x >= 45: x + 1
    else: x + 5
[x+1 if x >=45 else x + 5 for x in l]
Write this as a for loop with an if statement
[x+1 for x in l if x >= 45]
for x in l:
    if x >= 45:
        x + 1
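The last two cards look similar but do different things: an if/else before the for transforms every item, while an if after the for filters items out. A short sketch with a made-up list l:

```python
# Hypothetical sample list for illustration.
l = [44, 45, 50]

# Conditional expression: every item is kept, but transformed differently.
mapped = [x + 1 if x >= 45 else x + 5 for x in l]
print(mapped)    # [49, 46, 51]

# Filter clause: items that fail the test are dropped entirely.
filtered = [x + 1 for x in l if x >= 45]
print(filtered)  # [46, 51]
```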
What will this code produce?
nums = [4, 8, 15, 16, 23, 42]
parity = [0 if i%2 == 0 else 1 for i in nums]
print(parity)
[0, 0, 1, 0, 1, 0]
Write this as a list comp:
nums2 = [4, 8, 15, 16, 23, 42]
parity2 = []
for i in nums2:
    if i % 2 == 0:
        parity2.append(0)
    else:
        parity2.append(1)
parity2 = [0 if i % 2 == 0 else 1 for i in nums2]
If a number is above 45, add 1; if it is below 10, subtract 1; otherwise add 5
l = [22, 13, 45, 50, 98, 69, 43, 44, 1]
l_2 = []
for i in l:
    if i > 45:
        l_2.append(i + 1)
    elif i < 10:
        l_2.append(i - 1)
    else:
        l_2.append(i + 5)
print(l_2)
l_3 = [i+1 if i > 45 else (i-1 if i<10 else i+5) for i in l]
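Nested conditional expressions get hard to read quickly. One possible alternative, sketched here with a hypothetical helper named adjust, is to move the branching into a small function and keep the comprehension simple. The check at the end confirms both forms agree on the list from the card.

```python
l = [22, 13, 45, 50, 98, 69, 43, 44, 1]

def adjust(i):
    """Hypothetical helper: same rules as the card, written as plain branches."""
    if i > 45:
        return i + 1
    if i < 10:
        return i - 1
    return i + 5

# Comprehension using the helper.
l_helper = [adjust(i) for i in l]

# Nested conditional expression from the card.
l_3 = [i + 1 if i > 45 else (i - 1 if i < 10 else i + 5) for i in l]

print(l_3)              # [27, 18, 50, 51, 99, 70, 48, 49, 0]
print(l_helper == l_3)  # True
```

Note that 45 itself is not above 45, so it falls through to the final branch and becomes 50.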
Create a new list named first_character that contains the first character from every name in the list names
names = ["Elaine", "George", "Jerry", "Cosmo"]
first_character = [i[0] for i in names]
print(first_character)
Create a new list called greater_than_two, in which an entry at position i is True if the entry in nums at position i is greater than 2.
nums = [5, -10, 40, 20, 0]
greater_than_two = [True if i >2 else False for i in nums]
print(greater_than_two)
Create a new list named product that contains the product of each sub-list of nested_lists
nested_lists = [[4, 8], [16, 15], [23, 42]]
product = [x1 * x2 for (x1, x2) in nested_lists]
Create a new list named greater_than that contains True if the first number in the sub-list is greater than the second number in the sub-list, and False otherwise.
nested_lists = [[4, 8], [16, 15], [23, 42]]
greater_than = [True if x1 > x2 else False for (x1, x2) in nested_lists]
Create a new list named first_only that contains the first element in each sub-list of nested_lists.
nested_lists = [[4, 8], [16, 15], [23, 42]]
first_only = [x1 for (x1, x2) in nested_lists]
Use list comprehension and the zip function to create a new list named sums that sums corresponding items in lists a and b. For example, the first item in the new list should be 5 from adding 1 and 4 together.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
sums = [x1 + x2 for (x1, x2) in zip(a,b)]
You’ve been given two lists: a list of capitals and a list of countries. Create a new list named locations that contains the string “capital, country” for each item in the original lists. For example, if the 5th item in the capitals list is “Lima” and the 5th item in the countries list is “Peru”, then the 5th item in the new list should be “Lima, Peru”
capitals = ["Santiago", "Paris", "Copenhagen"]
countries = ["Chile", "France", "Denmark"]
locations = [x1 + ", " + x2 for (x1, x2) in zip(capitals, countries)]
You’ve been given two lists: a list of names and a list of ages. Create a new list named users that contains the string “Name: name, Age: age” for each pair of elements in the original lists. For example, if the 5th item in the names list is “John” and the 5th item in ages is 42, then the 5th item in the new list should be “Name: John, Age: 42”.
As you did in the previous exercise, concatenate your strings together using +. Make sure to add proper capitalization and spaces.
names = ["Jon", "Arya", "Ned"]
ages = [14, 9, 35]
users = ["Name: " + x1 + ", Age: " + str(x2) for (x1, x2) in zip(names, ages)]
print(users)
Create a new list named greater_than that contains True or False depending on whether the corresponding item in list a is greater than the one in list b. For example, if the 2nd item in list a is 3, and the 2nd item in list b is 5, the 2nd item in the new list should be False.
a = [30, 42, 10]
b = [15, 16, 17]
greater_than = [True if x1 > x2 else False for (x1, x2) in zip(a, b)]
print(greater_than)
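Two side notes on the zip cards, shown as a short sketch. First, a comparison such as x1 > x2 already evaluates to True or False, so the "True if ... else False" wrapper is optional. Second, zip stops at the end of the shorter input; the lists in the second half are made up to show that.

```python
a = [30, 42, 10]
b = [15, 16, 17]

# The comparison itself is already a boolean.
greater_than = [x1 > x2 for (x1, x2) in zip(a, b)]
print(greater_than)  # [True, True, False]

# zip silently truncates to the shorter list (hypothetical inputs).
pairs = list(zip([1, 2, 3], ["x", "y"]))
print(pairs)         # [(1, 'x'), (2, 'y')]
```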
Create a lambda function named contains_a that takes an input word and returns True if the input contains the letter ‘a’. Otherwise, return False.
contains_a = lambda n: "a" in n
Create a lambda function named long_string that takes an input str and returns True if the string has over 12 characters in it. Otherwise, return False.
long_string = lambda x: True if len(x) > 12 else False
Create a lambda function named ends_in_a that takes an input str and returns True if the last character in the string is an a. Otherwise, return False.
ends_in_a = lambda x: True if x[-1] == "a" else False
Create a lambda function named add_random that takes an input named num. The function should return num plus a random integer number between 1 and 10 (inclusive).
import random
add_random = lambda num: num + random.randint(1, 10)
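A quick way to convince yourself that random.randint includes both endpoints is to call the lambda many times and look at the range of results. This is only a sketch; the seed and the input value 5 are arbitrary choices.

```python
import random

random.seed(0)  # arbitrary seed so repeated runs give the same draws

add_random = lambda num: num + random.randint(1, 10)

# With num = 5, every result should fall between 6 and 15 inclusive.
results = [add_random(5) for _ in range(1000)]
print(min(results), max(results))
```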
You run an online clothing store called Panda’s Wardrobe. You need a DataFrame containing information about your products.
Create a DataFrame with the following data that your inventory manager sent you:
Product ID   Product Name   Color
1            t-shirt        blue
2            t-shirt        green
3            skirt          red
4            skirt          black
import pandas as pd
df1 = pd.DataFrame({
    'Product ID': [1, 2, 3, 4],
    'Product Name': ['t-shirt', 't-shirt', 'skirt', 'skirt'],
    'Color': ['blue', 'green', 'red', 'black']})
From this DataFrame, select the columns clinic_north and clinic_south
clinic_north_south = df[['clinic_north', 'clinic_south']]
Use iloc to return the third row from df
march = df.iloc[2]
Select rows 1-5 (index positions 1 through 5) with .iloc from ‘df’
df_1 = df.iloc[1:6]
Select the rows from ‘df’ where the month is equal to ‘January’ and store them in a new DataFrame
january = df[df.month == 'January']
Select all rows for both ‘March’ and ‘April’ from df and store them in a new DataFrame called march_april
march_april = df[(df.month == 'March') | (df.month == 'April')]
Use .isin to find rows containing ‘January’, ‘February’, or ‘March’ in the column month
january_february_march = df[df.month.isin(['January', 'February', 'March'])]
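For comparison, .isin gives the same result as chaining equality tests with |, just more compactly. A small sketch, assuming a made-up table with a month column (the visits numbers are hypothetical):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "month": ["January", "February", "March", "April"],
    "visits": [100, 45, 96, 80],
})

# Membership test with .isin.
with_isin = df[df.month.isin(["January", "March"])]

# Equivalent filter written with | between two comparisons.
with_or = df[(df.month == "January") | (df.month == "March")]

print(with_isin)
print(with_isin.equals(with_or))
```

Each comparison needs its own parentheses in the | version, because | binds more tightly than ==.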
Reset the index for a DataFrame that you have subsetted, and remove the old index
df2.reset_index(drop=True, inplace=True)
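A sketch of why the reset is useful: after subsetting, the rows keep their original index labels. The table here is made up, and the example reassigns the result instead of using inplace=True, which is simply an alternative style.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"month": ["January", "February", "March", "April", "May"]})

# Subsetting keeps the original labels, so the index has gaps.
df2 = df[df.month.isin(["March", "May"])]
index_before = list(df2.index)
print(index_before)  # [2, 4]

# Without drop=True the old index is kept as a new column named 'index'.
kept = df2.reset_index()
print(list(kept.columns))  # ['index', 'month']

# With drop=True the old index is discarded.
df2 = df2.reset_index(drop=True)
print(list(df2.index))  # [0, 1]
```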
Create a new column that changes the names to lower case using the str.lower and .apply
df = pd.DataFrame([
    ['JOHN SMITH', 'john.smith@gmail.com'],
    ['Jane Doe', 'jdoe@yahoo.com'],
    ['joe schmo', 'joeschmo@hotmail.com']
],
columns=['Name', 'Email'])
df['Lowercase Name'] = df.Name.apply(str.lower)
subject = ["physics", "calculus", "poetry", "history"]
Append ‘computer science’ to this list
subject.append("computer science")
subject = ["physics", "calculus", "poetry", "history"]
grades = [98, 97, 85, 88]
Zip these two together and add ‘visual arts’ with the grade 93
gradebook = list(zip(subject, grades))
gradebook.append(("visual arts", 93))
inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']
Find the length of this inventory
inventory_len = len(inventory)
inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']
return the third object
return the last object
return objects 1-4
third = inventory[2]
last = inventory[-1]
inventory1_4 = inventory[1:5]
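The indexing rules are easier to see on a shorter list. This sketch uses a made-up six-item inventory; the variable names mirror the card.

```python
# Hypothetical shortened inventory, one item per position.
inventory = ["twin bed", "headboard", "queen bed", "king bed", "dresser", "table"]

# Indexing starts at 0, so the third object is at index 2.
third = inventory[2]
print(third)         # queen bed

# Negative indices count from the end.
last = inventory[-1]
print(last)          # table

# A slice includes the start index and excludes the stop index.
inventory1_4 = inventory[1:5]
print(inventory1_4)  # ['headboard', 'queen bed', 'king bed', 'dresser']
```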
inventory = ['twin bed', 'twin bed', 'headboard', 'queen bed', 'king bed', 'dresser', 'dresser', 'table', 'table', 'nightstand', 'nightstand', 'king bed', 'king bed', 'twin bed', 'twin bed', 'sheets', 'sheets', 'pillow', 'pillow']
Return the number of twin beds in the inventory
twin_beds = inventory.count('twin bed')
Write a function named append_sum that has one parameter, a list named lst.
The function should add the last two elements of lst together and append the result to lst. It should do this process three times and then return lst.
For example, if lst started as [1, 1, 2], the final result should be [1, 1, 2, 3, 5, 8].
def append_sum(lst):
    for x in range(3):
        lst.append(lst[-1] + lst[-2])
    return lst
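One thing worth noticing about append_sum is that it changes the list passed in, because append modifies a list in place. A short usage sketch; passing a copy is one possible way to keep the original untouched.

```python
def append_sum(lst):
    # Append the sum of the last two elements, three times.
    for _ in range(3):
        lst.append(lst[-1] + lst[-2])
    return lst

original = [1, 1, 2]
result = append_sum(original)
print(result)              # [1, 1, 2, 3, 5, 8]
print(result is original)  # True: the same list object was modified

# Passing a copy leaves the starting list unchanged.
start = [1, 1, 2]
extended = append_sum(list(start))
print(start)               # [1, 1, 2]
print(extended)            # [1, 1, 2, 3, 5, 8]
```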