Pandas lesson 2 Flashcards

1
Q

Find out how many sheets are in the Excel file.

A

len(all_sheets)
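
A minimal sketch of why len works here: with sheet_name=None, pd.read_excel returns a dictionary keyed by sheet name, so len counts the sheets. The dictionary below is a hand-built stand-in with made-up sheet names and rows, not the contents of movies.xls.

```python
import pandas as pd

# Hypothetical stand-in for pd.read_excel(path, sheet_name=None),
# which returns a dict mapping sheet names to DataFrames.
all_sheets = {
    "1900s": pd.DataFrame({"Title": ["Film A", "Film B"]}),
    "2000s": pd.DataFrame({"Title": ["Film C"]}),
    "2010s": pd.DataFrame({"Title": ["Film D", "Film E", "Film F"]}),
}

num_sheets = len(all_sheets)    # number of sheets (dict keys)
sheet_names = list(all_sheets)  # sheet names, in insertion order
print(num_sheets, sheet_names)
```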

2
Q

Create a variable that reads only the 2000s sheet (the second sheet, at position 1).

A

selected_sheet_only_w_index = pd.read_excel(r'C:\Users\User\Documents\CFG_DATA\Data_files\movies.xls', sheet_name=1)

3
Q

Create a variable that reads all sheets from the Excel file.

A

all_sheets = pd.read_excel(r'C:\Users\User\Documents\CFG_DATA\Data_files\movies.xls', sheet_name=None)

4
Q

Find the number of rows in the 2010s sheet.

A

selected_sheet_2010s = pd.read_excel(r'C:\Users\User\Documents\CFG_DATA\Data_files\movies.xls', sheet_name='2010s')
len(selected_sheet_2010s)

Alternatively, using the dictionary of all sheets:
len(all_sheets['2010s'])
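
As a small illustration of what len returns for a DataFrame, the sketch below uses made-up rows in place of the 2010s sheet. len counts rows only, while shape gives rows and columns together.

```python
import pandas as pd

# Hypothetical rows standing in for the 2010s sheet.
sheet_2010s = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Year": [2010, 2011, 2012],
})

row_count = len(sheet_2010s)        # number of rows
n_rows, n_cols = sheet_2010s.shape  # (rows, columns)
print(row_count, n_rows, n_cols)
```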

5
Q

Create a new column where you increase the movie budget by 10%. Do this for only the first sheet.

A

def inc_by_10_pc(num):
    return num * 1.10

movies_df['incrsd_budget'] = movies_df['Budget'].apply(inc_by_10_pc)
movies_df[['Budget', 'incrsd_budget']]
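
For a simple arithmetic change like this, multiplying the column directly is a common alternative to apply. The sketch below compares the two on made-up budget values; movies_df here is a toy frame, not the first sheet of movies.xls.

```python
import pandas as pd

# Made-up budgets for illustration only.
movies_df = pd.DataFrame({"Budget": [100.0, 250.0, 40.0]})

def inc_by_10_pc(num):
    return num * 1.10

# Element-by-element, as in the card.
via_apply = movies_df["Budget"].apply(inc_by_10_pc)

# Vectorized: the whole column is multiplied in one operation.
movies_df["incrsd_budget"] = movies_df["Budget"] * 1.10

print(movies_df[["Budget", "incrsd_budget"]])
```

Both approaches give the same values; the vectorized form avoids a Python-level function call per row.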

5
Q

Create a new column where you decrease the IMDB score by 1. Do this for only the first sheet.

A

movies_df = first_sheet_only

# Add a column that subtracts 1 point from the IMDB Score
def minus1(num):
    return num - 1

movies_df['Decrease IMDB Score'] = movies_df['IMDB Score'].apply(minus1)
movies_df[['Decrease IMDB Score', 'IMDB Score']]

6
Q

Group the data by year, then calculate the mean gross earnings for each year. Use movies_df.

A

movies_mean = movies_df.groupby('Year')['Gross Earnings'].mean()
movies_mean
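
To see what the grouped result looks like, here is a small sketch on made-up earnings figures. The result is a Series indexed by Year, with one mean per year.

```python
import pandas as pd

# Made-up rows: two films from 2010, one from 2011.
movies_df = pd.DataFrame({
    "Year": [2010, 2010, 2011],
    "Gross Earnings": [100.0, 300.0, 50.0],
})

movies_mean = movies_df.groupby("Year")["Gross Earnings"].mean()
print(movies_mean)
```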

7
Q

Count the number of movies for each country.

A

count_country = movies_df.groupby('Country')['Title'].count()
count_country
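
One detail worth knowing: count skips rows where Title is missing, whereas size counts every row in the group. The sketch below shows the difference on made-up data in which one UK row has no title.

```python
import pandas as pd

# Made-up rows; the last UK film has a missing title.
movies_df = pd.DataFrame({
    "Country": ["USA", "USA", "UK", "UK"],
    "Title": ["Film A", "Film B", "Film C", None],
})

by_count = movies_df.groupby("Country")["Title"].count()  # non-missing titles only
by_size = movies_df.groupby("Country").size()             # every row in the group

print(by_count)
print(by_size)
```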

8
Q

Create a Series from the dictionary below, then reset its index so the country names become a column.

emojis = dict(
    USA = '🇺🇸',
    UK = '🇬🇧',
    France = '🇫🇷',
    Canada = '🇨🇦',
    Australia = '🇦🇺',
    Germany = '🇩🇪',
    Italy = '🇮🇹',
    Japan = '🇯🇵',
    Spain = '🇪🇸'
)

A

emojis = dict(
    USA = '🇺🇸',
    UK = '🇬🇧',
    France = '🇫🇷',
    Canada = '🇨🇦',
    Australia = '🇦🇺',
    Germany = '🇩🇪',
    Italy = '🇮🇹',
    Japan = '🇯🇵',
    Spain = '🇪🇸'
)

flag_df = pd.Series(emojis, name='flag').reset_index()
flag_df
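
A shortened sketch of what reset_index produces here: the dictionary keys start out as the Series index, and reset_index moves them into a column named index, next to the flag column. Only two countries are used to keep the output short.

```python
import pandas as pd

emojis = dict(USA="🇺🇸", UK="🇬🇧")

flag_series = pd.Series(emojis, name="flag")  # index: USA, UK
flag_df = flag_series.reset_index()           # index becomes a column called "index"

columns = list(flag_df.columns)
print(columns)
print(flag_df)
```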

9
Q

Create a new variable, movies_by_country_df, holding a DataFrame built from the Series movies_by_country_count. Rename the Title column to Title counts.

A

movies_by_country_df = movies_by_country_count.to_frame().rename(columns={'Title': 'Title counts'})
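
The sketch below walks through the same two steps on made-up data, assuming movies_by_country_count comes from a groupby on Country as in the earlier card. to_frame keeps Country as the index and uses the Series name (Title) as the column name, which rename then changes.

```python
import pandas as pd

# Made-up rows for illustration only.
movies_df = pd.DataFrame({
    "Country": ["USA", "USA", "UK"],
    "Title": ["Film A", "Film B", "Film C"],
})

# Assumed origin of the Series: a per-country count of titles.
movies_by_country_count = movies_df.groupby("Country")["Title"].count()

movies_by_country_df = (
    movies_by_country_count
    .to_frame()                                # one column, named "Title"
    .rename(columns={"Title": "Title counts"})
)
print(movies_by_country_df)
```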

10
Q

Join the two DataFrames movies_by_country_df and flag_df (left key: Country, right key: index).

A

movies_by_country_df.merge(flag_df, left_on='Country', right_on='index')
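
A minimal sketch of this merge on made-up data, assuming Country is the index name of the left frame and flag_df was built with reset_index. By default merge performs an inner join, so a country with no flag entry is dropped; passing how='left' keeps it with a missing flag.

```python
import pandas as pd

# Made-up counts; Brazil has no entry in flag_df.
movies_by_country_df = pd.DataFrame(
    {"Title counts": [3, 2, 1]},
    index=pd.Index(["USA", "UK", "Brazil"], name="Country"),
)
flag_df = pd.Series({"USA": "🇺🇸", "UK": "🇬🇧"}, name="flag").reset_index()

# Inner join (default): only countries present in both frames.
inner = movies_by_country_df.merge(flag_df, left_on="Country", right_on="index")

# Left join: every country from the left frame, flag missing where unmatched.
left = movies_by_country_df.merge(
    flag_df, left_on="Country", right_on="index", how="left"
)

print(inner)
print(left)
```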

11
Q

Join the two DataFrames movies_by_country_df and flag_df on a single shared column (Country).

A

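# flag_df (card 8) keeps the country names in a column called 'index', so rename it to 'Country' first
movies_by_country_df.merge(flag_df.rename(columns={'index': 'Country'}), on='Country')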
