Python Transforming DataFrames Flashcards

Question 1

Q

Print the first four rows of the homelessness DataFrame.

Answer

A

print(homelessness.head(4))
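
Note: with no argument, .head() returns the first five rows, so a request for four rows needs an explicit count. A minimal sketch, assuming a made-up DataFrame named demo (not the real homelessness data):

```python
import pandas as pd

# Invented stand-in data; only the column names mimic homelessness.
demo = pd.DataFrame({"state": list("ABCDEFG"), "individuals": range(7)})

default_rows = demo.head()   # no argument: first 5 rows
four_rows = demo.head(4)     # explicit n: first 4 rows
print(len(default_rows), len(four_rows))
```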

Question 2

Q

Print information about the column types and missing values in homelessness.

Answer

A

homelessness.info()
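
Note: .info() prints its report itself and returns None, so wrapping it in print() adds a stray "None" line after the report. A small sketch, assuming an invented DataFrame named demo:

```python
import pandas as pd

# Invented stand-in data with one missing value.
demo = pd.DataFrame({"state": ["A", "B"], "individuals": [10.0, None]})

returned = demo.info()   # the report is printed here as a side effect
print(returned)          # prints None, because info() has no return value
```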

Question 3

Q

Print the number of rows and columns in homelessness.

Answer

A

print(homelessness.shape)

Question 4

Q

Print some summary statistics that describe the homelessness DataFrame.

Answer

A

print(homelessness.describe())

Question 5

Q

Import pandas using the alias pd.

Answer

A

import pandas as pd

Question 6

Q

Print the column names of homelessness.

Answer

A

print(homelessness.columns)

Question 7

Q

Print the row index of homelessness.

Answer

A

print(homelessness.index)

Question 8

Q

Sort homelessness by the individuals column in ascending order, and save this as homelessness_ind.

Answer

A

homelessness_ind = homelessness.sort_values('individuals')

Question 9

Q

Sort homelessness by the number of homeless family_members in descending order, and save this as homelessness_fam.

Answer

A

homelessness_fam = homelessness.sort_values('family_members', ascending=False)

Question 10

Q

Sort homelessness first by region (ascending), and then by number of family members (descending). Save this as homelessness_reg_fam.

Answer

A

homelessness_reg_fam = homelessness.sort_values(['region', 'family_members'], ascending=[True, False])
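
Note: when sorting on several columns, the ascending list is matched to the column list position by position. A minimal sketch on invented data (the demo DataFrame and its values are hypothetical):

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({
    "region": ["West", "East", "West", "East"],
    "family_members": [10, 30, 40, 20],
})

# One flag per sort key: region A to Z, then family_members high to low.
demo_sorted = demo.sort_values(["region", "family_members"], ascending=[True, False])
print(demo_sorted)
```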

Question 11

Q

Select only the individuals column of homelessness, assigning the resulting Series to individuals.

Answer

A

individuals = homelessness["individuals"]
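
Note: single brackets with one column name return a pandas Series, while double brackets return a DataFrame, even for one column. A short sketch on an invented DataFrame named demo:

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({"state": ["A", "B"], "individuals": [10, 20]})

as_series = demo["individuals"]     # single brackets: Series
as_frame = demo[["individuals"]]    # double brackets: one-column DataFrame
print(type(as_series).__name__, type(as_frame).__name__)
```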

Question 12

Q

Create a DataFrame called state_fam that contains only the state and family_members columns of homelessness, in that order.

Answer

A

state_fam = homelessness[['state', 'family_members']]

Question 13

Q

Create a DataFrame called ind_state that contains the individuals and state columns of homelessness, in that order.

Answer

A

ind_state = homelessness[['individuals', 'state']]

Question 14

Q

Filter homelessness for cases where the number of individuals is greater than ten thousand, assigning to ind_gt_10k.

Answer

A

ind_gt_10k = homelessness[homelessness['individuals'] > 10000]

Question 15

Q

Filter homelessness for cases where the USA census region is “Mountain”, assigning to mountain_reg.

Answer

A

mountain_reg = homelessness[homelessness["region"] == "Mountain"]

Question 16

Q

Filter homelessness for cases where the number of family_members is less than one thousand and the region is “Pacific”, assigning to fam_lt_1k_pac.

Answer

A

fam_lt_1k_pac = homelessness[(homelessness['family_members'] < 1000) & (homelessness['region'] == 'Pacific')]

Question 17

Q

Filter homelessness for cases where the USA census region is “South Atlantic” or “Mid-Atlantic”, assigning to south_mid_atlantic.

Answer

A

south_mid_atlantic = homelessness[(homelessness['region'] == 'South Atlantic') | (homelessness['region'] == 'Mid-Atlantic')]
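
Note: row conditions are combined with & (and) and | (or), and each comparison needs its own parentheses because these operators bind more tightly than < or ==. A minimal sketch, assuming an invented DataFrame named demo:

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain", "South Atlantic"],
    "family_members": [500, 2000, 300, 800],
})

# Naming each boolean mask first keeps the combined filter readable.
small = demo["family_members"] < 1000
pacific = demo["region"] == "Pacific"

both = demo[small & pacific]                              # rows meeting both conditions
either = demo[pacific | (demo["region"] == "Mountain")]   # rows meeting at least one
print(both)
print(either)
```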

Question 18

Q

Filter homelessness for cases where the state is in the list of Mojave Desert states, canu, assigning to mojave_homelessness. The list is given as
canu = ["California", "Arizona", "Nevada", "Utah"]

Answer

A

mojave_homelessness = homelessness[homelessness['state'].isin(canu)]
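
Note: .isin() returns a boolean Series, so it can be stored, reused, and negated with ~. A short sketch on invented data (demo and its values are hypothetical; canu is the list from the card):

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({
    "state": ["California", "Texas", "Utah", "Ohio"],
    "individuals": [4, 3, 2, 1],
})
canu = ["California", "Arizona", "Nevada", "Utah"]

in_mojave = demo["state"].isin(canu)   # True where the state appears in canu
mojave_demo = demo[in_mojave]          # rows whose state is in the list
not_mojave_demo = demo[~in_mojave]     # ~ flips the mask: every other row
print(mojave_demo)
```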

Question 19

Q

Add a column to homelessness, indiv_per_10k, containing the number of homeless individuals per ten thousand people in each state.

Answer

A

homelessness["indiv_per_10k"] = 10000 * homelessness["individuals"] / homelessness["state_pop"]
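
Note: the arithmetic is applied row by row, so each state gets its own rate. A worked sketch with invented numbers (demo is a hypothetical stand-in, not the real data):

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({
    "state": ["A", "B"],
    "individuals": [50, 300],
    "state_pop": [100000, 200000],
})

# 10000 * 50 / 100000 = 5.0 and 10000 * 300 / 200000 = 15.0
demo["indiv_per_10k"] = 10000 * demo["individuals"] / demo["state_pop"]
print(demo)
```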

Question 20

Q

Subset rows where indiv_per_10k is higher than 20, assigning to high_homelessness.

Answer

A

high_homelessness = homelessness[homelessness["indiv_per_10k"] > 20]

Question 21

Q

Sort high_homelessness by descending indiv_per_10k, assigning to high_homelessness_srt.

Answer

A

high_homelessness_srt = high_homelessness.sort_values('indiv_per_10k', ascending=False)

Question 22

Q

Select only the state and indiv_per_10k columns of high_homelessness_srt and save as result. Look at the result.

Answer

A

result = high_homelessness_srt[['state', 'indiv_per_10k']]
print(result)
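
Note: the add-column, filter, sort, and select steps can also be chained into one expression. The sketch below is one possible arrangement on invented data, not the form the cards require:

```python
import pandas as pd

# Invented stand-in data.
demo = pd.DataFrame({
    "state": ["A", "B", "C"],
    "individuals": [50, 300, 900],
    "state_pop": [100000, 100000, 200000],
})
demo["indiv_per_10k"] = 10000 * demo["individuals"] / demo["state_pop"]

# Filter, then sort, then keep two columns; the parentheses allow line breaks.
result = (
    demo[demo["indiv_per_10k"] > 20]
    .sort_values("indiv_per_10k", ascending=False)
    [["state", "indiv_per_10k"]]
)
print(result)
```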

Question 23

Q

Answer

A

(23 cards)