Pandas Flashcards by Nicole Geist

How to check the first 5 rows of df?

df.head()

How to load a CSV file (with ; delimiter)?

df = pd.read_csv("filename.csv", delimiter=";")
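
A minimal sketch of the same call on in-memory data, so it runs without a file on disk. The sample rows and the `raw` buffer are made up for illustration; `sep` is used here because `delimiter` is an alias for it.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data standing in for "filename.csv".
raw = io.StringIO("name;price\nLoft;120\nStudio;80\n")

# sep=";" and delimiter=";" are interchangeable in read_csv.
df = pd.read_csv(raw, sep=";")
print(df.shape)  # (2, 2)
```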

How to get number of rows and columns?

df.shape

How to get summary statistics of numerical variables?

df.describe()

How to check for missing values?

df.isna().sum()
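
A small sketch of what the count looks like, using made-up values. Note that `sum` has to be called with parentheses; without them the expression evaluates to the method itself rather than the per-column counts.

```python
import numpy as np
import pandas as pd

# Made-up data with a few missing values.
df = pd.DataFrame({
    "price": [100.0, np.nan, 80.0],
    "reviews_per_month": [np.nan, np.nan, 1.5],
})

missing = df.isna().sum()   # Series: number of missing values per column
total = int(missing.sum())  # missing values in the whole DataFrame
print(missing)
print(total)  # 3
```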

How to get only rows where the neighbourhood group is Manhattan?

subset = df[df["neighbourhood_group"] == "Manhattan"]

How to get only rows where room type is private and neighbourhood group is Brooklyn?

subset = df[(df["room_type"] == "Private room") & (df["neighbourhood_group"] == "Brooklyn")]
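
The combined filter can be sketched on a tiny made-up table. The column name `neighbourhood_group` is an assumption taken from the other cards. Each comparison needs its own parentheses because `&` binds more tightly than `==` in Python.

```python
import pandas as pd

# Made-up listings for illustration.
df = pd.DataFrame({
    "room_type": ["Private room", "Entire home/apt", "Private room"],
    "neighbourhood_group": ["Brooklyn", "Brooklyn", "Manhattan"],
})

# Build the boolean mask first; & combines the two conditions element-wise.
mask = (df["room_type"] == "Private room") & (df["neighbourhood_group"] == "Brooklyn")
subset = df[mask]
print(subset)
```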

How to sort rows by price in descending order?

sorted_df = df.sort_values(by="price", ascending=False)

How to create a new column price_per_night by dividing price by minimum_nights?

df["price_per_night"] = df["price"] / df["minimum_nights"]

How to convert ‘last_review’ to datetime format?

df["last_review"] = pd.to_datetime(df["last_review"], errors="coerce")
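
A short sketch of what `errors="coerce"` does, using made-up strings: values that cannot be parsed become `NaT` (the datetime equivalent of a missing value) instead of raising an error.

```python
import pandas as pd

# Made-up column with one valid date, one unparseable string, one missing value.
df = pd.DataFrame({"last_review": ["2019-05-21", "not a date", None]})

df["last_review"] = pd.to_datetime(df["last_review"], errors="coerce")
print(df["last_review"])
print(df["last_review"].isna().sum())  # 2
```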

How to extract the year from “last_review”?

df["review_year"] = df["last_review"].dt.year
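
As a quick check, the `.dt` accessor only works once the column holds datetimes. The sketch below assumes two made-up review dates that are already parsed.

```python
import pandas as pd

# Made-up, already-parsed review dates.
df = pd.DataFrame({"last_review": pd.to_datetime(["2019-05-21", "2018-11-03"])})

# .dt exposes datetime components such as year, month, and day.
df["review_year"] = df["last_review"].dt.year
print(df)
```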

How to find the average price of listings per neighbourhood_group?

avg_df = df.groupby("neighbourhood_group")["price"].mean().reset_index()
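
A sketch with three made-up listings shows the shape of the result: `reset_index()` turns the group labels back into an ordinary column, so the output is a two-column DataFrame rather than a Series.

```python
import pandas as pd

# Made-up listings for illustration.
df = pd.DataFrame({
    "neighbourhood_group": ["Brooklyn", "Brooklyn", "Manhattan"],
    "price": [100, 50, 200],
})

avg_df = df.groupby("neighbourhood_group")["price"].mean().reset_index()
print(avg_df)
```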

How to fill missing values in ‘reviews_per_month’ with 0?

df["reviews_per_month"] = df["reviews_per_month"].fillna(0)

How to drop rows where price is missing?

df = df.dropna(subset=["price"])
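
The effect of `subset` can be sketched on made-up data: only missing values in the listed columns cause a row to be dropped, whereas a bare `dropna()` drops any row with a missing value anywhere.

```python
import numpy as np
import pandas as pd

# Made-up data: row 0 lacks reviews_per_month, row 1 lacks price.
df = pd.DataFrame({
    "price": [120.0, np.nan, 60.0],
    "reviews_per_month": [np.nan, 2.0, 1.0],
})

by_price = df.dropna(subset=["price"])  # drops only row 1
any_missing = df.dropna()               # drops rows 0 and 1
print(by_price.index.tolist())     # [0, 2]
print(any_missing.index.tolist())  # [2]
```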

How to group by neighbourhood group and count the listings?

df_counts = df.groupby("neighbourhood_group").size().reset_index(name="num_listings")
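
A sketch on made-up data, assuming the DataFrame is called `df` as in the other cards. `size()` counts rows per group, and `value_counts()` gives the same numbers as a Series sorted by frequency.

```python
import pandas as pd

# Made-up listings for illustration.
df = pd.DataFrame({
    "neighbourhood_group": ["Brooklyn", "Manhattan", "Brooklyn", "Queens"],
})

# DataFrame with one row per group and a named count column.
df_counts = df.groupby("neighbourhood_group").size().reset_index(name="num_listings")

# Same counts as a Series indexed by group.
counts = df["neighbourhood_group"].value_counts()
print(df_counts)
```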

How to find the row with the highest price (returns the row's index label)?

most_exp = df["price"].idxmax()
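
`idxmax()` returns an index label, not the row itself. The sketch below uses made-up listings with hypothetical index labels to show how the label can be passed to `.loc` to fetch the full row.

```python
import pandas as pd

# Made-up listings; the index labels are hypothetical listing IDs.
df = pd.DataFrame(
    {"name": ["Loft", "Penthouse", "Studio"], "price": [120, 900, 80]},
    index=[2595, 3647, 3831],
)

most_exp = df["price"].idxmax()  # index label of the highest price
row = df.loc[most_exp]           # the full row for that label
print(most_exp)     # 3647
print(row["name"])  # Penthouse
```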

How to access multiple columns?

print(df[["var1", "var2", "var3"]])

How to count unique room types?

print(df["room_type"].nunique())

How to see last 5 rows?

print(df.tail())

How to show all rows where the neighbourhood group is Manhattan using .loc?

print(df.loc[df["neighbourhood_group"] == "Manhattan"])

How to retrieve only the name, room_type, and price columns for all listings in Brooklyn?

print(df.loc[df["neighbourhood_group"] == "Brooklyn", ["name", "room_type", "price"]])
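
A sketch of the two-part `.loc` indexer on made-up listings: the part before the comma selects rows (here a boolean mask), the part after it selects columns by name.

```python
import pandas as pd

# Made-up listings for illustration.
df = pd.DataFrame({
    "name": ["Loft", "Studio", "Room"],
    "room_type": ["Entire home/apt", "Entire home/apt", "Private room"],
    "neighbourhood_group": ["Brooklyn", "Manhattan", "Brooklyn"],
    "price": [120, 80, 60],
})

# Rows: Brooklyn only. Columns: the three listed names, in that order.
brooklyn = df.loc[df["neighbourhood_group"] == "Brooklyn", ["name", "room_type", "price"]]
print(brooklyn)
```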

How to get rows 3 to 7 and columns 1 to 4 using .iloc ?

print(df.iloc[3:8, 1:5])
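
`.iloc` slices by 0-based position and excludes the stop value, like ordinary Python slicing. Assuming the ranges in the question are meant inclusively, rows 3 to 7 need a stop of 8 and columns 1 to 4 need a stop of 5. The sketch uses a made-up 10 x 6 grid to make the selected positions visible.

```python
import numpy as np
import pandas as pd

# Made-up 10 x 6 grid with columns a..f.
df = pd.DataFrame(np.arange(60).reshape(10, 6), columns=list("abcdef"))

# Row positions 3, 4, 5, 6, 7 and column positions 1, 2, 3, 4.
block = df.iloc[3:8, 1:5]
print(block.shape)  # (5, 4)
```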

How to get all rows but only selected columns, e.g. name and price?

df.loc[:, ["name", "price"]]

How to sort df by neighborhood ascending and price descending?

df.sort_values(["neighbourhood_group", "price"], ascending=[True, False])
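
A sketch of the multi-key sort on made-up listings. The `ascending` list lines up position by position with the list of column names, so each key gets its own direction.

```python
import pandas as pd

# Made-up listings for illustration.
df = pd.DataFrame({
    "neighbourhood_group": ["Manhattan", "Brooklyn", "Brooklyn", "Manhattan"],
    "price": [150, 80, 120, 300],
})

# Groups A to Z, and within each group the highest price first.
sorted_df = df.sort_values(["neighbourhood_group", "price"], ascending=[True, False])
print(sorted_df)
```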

How to drop rows where price is missing?

df.dropna(subset=["price"], inplace=True)
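
One thing worth checking with `inplace=True` is the return value: the DataFrame is modified directly and the call returns `None`, so the result should not be assigned back to `df`. A sketch on made-up data:

```python
import numpy as np
import pandas as pd

# Made-up listings; the second one has no price.
df = pd.DataFrame({
    "name": ["Loft", "Studio", "Room"],
    "price": [120.0, np.nan, 60.0],
})

# The DataFrame is changed in place; the call itself returns None.
result = df.dropna(subset=["price"], inplace=True)
print(result)   # None
print(len(df))  # 2
```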

How to remove rows with missing values?

df.dropna()

How to create a new column where a price over 100 is labelled "High" and everything else "Low"?

df["price_category"] = df["price"].apply(lambda x: "High" if x > 100 else "Low")
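
As an alternative sketch (not part of the original card), `numpy.where` produces the same labels in a vectorised way. The prices are made up, and a price of exactly 100 falls into "Low" because the comparison is strict.

```python
import numpy as np
import pandas as pd

# Made-up prices, including the boundary value 100.
df = pd.DataFrame({"price": [50, 100, 250]})

# Row-by-row version, as on the card.
df["price_category"] = df["price"].apply(lambda x: "High" if x > 100 else "Low")

# Vectorised alternative that yields the same labels.
df["price_category_np"] = np.where(df["price"] > 100, "High", "Low")
print(df)
```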

How to get the number of values for each neighborhood group?

df["neighbourhood_group"].value_counts()

Pandas Flashcards

(28 cards)