pandas lesson 3 Flashcards

1
Q

How do I show only the first 10 rows of the duration column (use slicing)?

A

data['duration'][:10]

2
Q

How do I check whether a column has NaN values? Do this for the country column in the data df.

A

data.country.isna()
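
A small sketch of what this call returns, using a made-up three-row frame (the values are hypothetical, not taken from the movie dataset). `isna()` gives one boolean per row; chaining `.any()` or `.sum()` reduces that to a single answer.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data df
data = pd.DataFrame({"country": ["USA", np.nan, "UK"]})

mask = data.country.isna()   # boolean Series, True where the value is NaN
has_nan = mask.any()         # True if at least one value is missing
n_missing = mask.sum()       # number of missing values

print(mask.tolist(), has_nan, n_missing)
```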

3
Q

Replace any NaN values in the country column in the data df with an empty string.

A

data.country = data.country.fillna('')

4
Q

Replace any NaN values in the column duration with the mean value.

A

data.duration = data.duration.fillna(data.duration.mean())
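
To see the effect on a toy example (the three durations below are invented for illustration): `mean()` skips NaN by default, so the gap is filled with the average of the values that are present.

```python
import numpy as np
import pandas as pd

# Hypothetical durations with one missing value
data = pd.DataFrame({"duration": [100.0, np.nan, 120.0]})

# mean() ignores NaN, so the fill value here is (100 + 120) / 2
data["duration"] = data["duration"].fillna(data["duration"].mean())

print(data["duration"].tolist())
```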

5
Q

Get rid of any rows that have a missing value.

A

data.dropna()

6
Q

Drop rows in which every value is NA.

A

data.dropna(how='all')

7
Q

Drop rows that have fewer than 5 non-NA values (that is, keep only rows with at least 5 non-NA values).

A

data.dropna(thresh=5)
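
`thresh` is easy to misread: it is the minimum number of non-NA values a row needs in order to be kept, not a count of missing values. A minimal sketch on an invented six-column frame:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 6 columns
data = pd.DataFrame(
    [
        [1, 2, 3, 4, 5, 6],                      # 6 non-NA values: kept
        [1, 2, 3, 4, 5, np.nan],                 # 5 non-NA values: kept
        [1, 2, np.nan, np.nan, np.nan, np.nan],  # 2 non-NA values: dropped
    ],
    columns=list("abcdef"),
)

kept = data.dropna(thresh=5)
print(kept.index.tolist())
```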

8
Q

Drop rows that have an NA value in a particular column. Do this for the column 'title_year'.

A

data.dropna(subset=['title_year'])
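
A short sketch of how `subset` narrows the check, on invented data: only `title_year` is inspected, so a row that is missing something in another column survives.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; row 2 is missing a title but has a year
data = pd.DataFrame({
    "movie_title": ["A", "B", None],
    "title_year": [2009.0, np.nan, 2012.0],
})

# Only NA values in title_year cause a row to be dropped
cleaned = data.dropna(subset=["title_year"])
print(cleaned.index.tolist())
```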

9
Q

Show rows that contain any NA values. Filter, don't drop.

A

data[data.isna().any(axis=1)]
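
The `axis` argument decides what is being tested. With `axis=1` the check runs across each row, so the mask selects rows; with the default `axis=0` it runs down each column. A sketch on an invented frame showing both:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: column "c" is complete, "a" and "b" each have a gap
data = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": ["x", "y", None],
    "c": [1, 2, 3],
})

# Rows that contain at least one NA value
rows_with_na = data[data.isna().any(axis=1)]

# Columns that contain at least one NA value
cols_with_na = data.loc[:, data.isna().any()]

print(rows_with_na.index.tolist(), cols_with_na.columns.tolist())
```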

10
Q

Drop columns that are all na values.

A

data.dropna(axis=1, how='all')

11
Q

Drop columns that have any na values.

A

data.dropna(axis=1, how='any')
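
To compare `how='all'` and `how='any'` on columns side by side, here is a sketch with three invented columns: one complete, one partly missing, one entirely missing.

```python
import numpy as np
import pandas as pd

# Hypothetical columns named after how much data they hold
data = pd.DataFrame({
    "full": [1, 2, 3],
    "partial": [1.0, np.nan, 3.0],
    "empty": [np.nan, np.nan, np.nan],
})

drop_all = data.dropna(axis=1, how="all")  # removes only the entirely-NA column
drop_any = data.dropna(axis=1, how="any")  # removes every column with a gap

print(drop_all.columns.tolist(), drop_any.columns.tolist())
```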

12
Q

Save the results to a csv file

A

data.to_csv(r'C:\Users\User\Documents\CFG_DATA\Data_files\movie_metadata.csv')

13
Q

Read the csv file and ensure that duration is an integer.

A

data = pd.read_csv(r'C:\Users\User\Documents\CFG_DATA\Data_files\movie_metadata.csv', dtype={'duration': int})

14
Q

Read the csv file and ensure that actor_2_facebook_likes is a string.

A

data = pd.read_csv(r'C:\Users\User\Documents\CFG_DATA\Data_files\movie_metadata.csv', dtype={'actor_2_facebook_likes': str})
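
The `dtype` mapping can be tried without touching the real file. This sketch reads a tiny in-memory CSV (its contents are invented) through `io.StringIO` and sets both column types in one call.

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for movie_metadata.csv
csv_text = (
    "movie_title,duration,actor_2_facebook_likes\n"
    "A,120,936\n"
    "B,95,5000\n"
)

data = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"duration": int, "actor_2_facebook_likes": str},
)

print(data.dtypes)
```

If the column still contains missing values, a plain `int` dtype raises an error on read; filling or dropping the gaps first, or using pandas' nullable `"Int64"` dtype, are possible ways around that.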

15
Q

Change all characters in column movie_title to capital letters.

A

data['movie_title'].str.upper()

16
Q

Remove any leading and trailing whitespace from movie_title.

A

data['movie_title'].str.strip()
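
`str.strip()` trims both ends of each string, while `str.rstrip()` trims only the right-hand (trailing) side. A sketch on two invented titles, also chaining `str.upper()`:

```python
import pandas as pd

# Hypothetical titles padded with spaces
data = pd.DataFrame({"movie_title": ["  Avatar ", "Spectre  "]})

both = data["movie_title"].str.strip()    # leading and trailing removed
right = data["movie_title"].str.rstrip()  # trailing only
upper = both.str.upper()                  # string methods can be chained

print(both.tolist(), right.tolist(), upper.tolist())
```

Each of these returns a new Series; assign the result back to `data['movie_title']` to keep it.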

17
Q

Rename columns title_year to release_date and movie_facebook_likes to facebook_likes

A

data.rename(columns={'title_year': 'release_date', 'movie_facebook_likes': 'facebook_likes'})
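
`rename` returns a new DataFrame and leaves the original untouched, so the result needs to be assigned if it should persist. A sketch on a one-row invented frame:

```python
import pandas as pd

# Hypothetical one-row frame with the two columns to rename
data = pd.DataFrame({"title_year": [2009], "movie_facebook_likes": [33000]})

renamed = data.rename(
    columns={"title_year": "release_date", "movie_facebook_likes": "facebook_likes"}
)

print(renamed.columns.tolist())  # new names
print(data.columns.tolist())     # original frame is unchanged
```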

18
Q

Export your df back to a csv file.

A

data.to_csv(r'C:\Users\User\Documents\CFG_DATA\Data_files\cleanfile.csv', encoding='utf-8')
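
By default `to_csv` also writes the row index as an extra first column. The sketch below writes an invented frame to an in-memory buffer instead of a file and passes `index=False` to leave the index out; whether to do that for the real export is a choice, not something the card specifies.

```python
import io

import pandas as pd

# Hypothetical cleaned frame
data = pd.DataFrame({"movie_title": ["Avatar"], "duration": [178]})

buffer = io.StringIO()
data.to_csv(buffer, index=False)  # index=False omits the row-index column
csv_text = buffer.getvalue()

# Reading the text back gives the same columns
round_trip = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
```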