DataFrames Flashcards

Question 1

Q

Array

Answer

A

one-dimensional
ordered collection
contains only one data type

Each item has an index and a value.
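A minimal sketch of these properties, assuming the card refers to a NumPy array (the variable names are illustrative only):

```python
import numpy as np

# A one-dimensional array: each item has a position (index) and a value.
arr = np.array([3, 1, 2])
first_item = arr[0]          # value stored at index 0

# Mixing ints and floats still yields a single data type:
# NumPy upcasts every item to float64.
mixed = np.array([1, 2.5])
mixed_dtype = mixed.dtype
```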

Question 2

Q

Three main components of tables

Answer

A

rows, columns, index

Question 3

Q

Rows of a table

Answer

A

“entry” or “observation”

Question 4

Q

Columns of a table

Answer

A

Each column of a table represents some attribute that entries (rows) have.

Question 5

Q

Index of a table

Answer

A

Displayed as the first (leftmost) column
Meaningful or arbitrary
Unique values
Identify rows
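As a rough illustration, the sketch below contrasts an arbitrary default index with a meaningful one. The `movies` data is hypothetical. Note that pandas does not enforce unique index labels by itself, so `is_unique` is used here to check.

```python
import pandas as pd

# Hypothetical two-row table.
movies = pd.DataFrame({"Title": ["Film A", "Film B"], "Year": [1948, 2004]})

# Arbitrary index: pandas assigns 0, 1, ... by default.
default_index = list(movies.index)

# Meaningful index: use the titles to identify rows.
by_title = movies.set_index("Title")
labels_are_unique = by_title.index.is_unique
year_of_b = by_title.loc["Film B", "Year"]
```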

Question 6

Q

Series

Answer

A

The most basic pandas object
Has two sections: the index and the values
Under the hood, the values of a Series are typically stored in a NumPy array
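A small sketch of the two sections of a Series, using made-up scores and labels:

```python
import numpy as np
import pandas as pd

# Hypothetical data: three scores labeled "a", "b", "c".
scores = pd.Series([90, 85, 77], index=["a", "b", "c"])

labels = list(scores.index)   # the index section
values = scores.to_numpy()    # the values section, as a NumPy array
score_b = scores["b"]         # look up a value by its index label
```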

Question 7

Q

DataFrames

Answer

A

Pandas table object
Contains an Index, Rows, and Columns
Each column is a Series
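To see that each column is a Series, one possible check (the `movies` table and its labels are hypothetical):

```python
import pandas as pd

# Hypothetical table with a custom index.
movies = pd.DataFrame(
    {"Title": ["Film A", "Film B"], "Year": [1948, 2004]},
    index=["m1", "m2"],
)

# Selecting one column returns a Series that shares the DataFrame's index.
year_column = movies["Year"]
n_rows, n_columns = movies.shape
```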

Question 8

Q

How do you read a CSV file into a DataFrame?

Answer

A

pd.read_csv(filepath)
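A self-contained sketch: instead of a file on disk, it reads hypothetical CSV text from an in-memory buffer, which `pd.read_csv` accepts in place of a file path.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would live in a file.
csv_text = "Title,Year,Studio\nFilm A,1948,Fox\nFilm B,2004,Universal\n"

# read_csv accepts a path or any file-like object.
movies = pd.read_csv(io.StringIO(csv_text))
print(movies.head())
```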

Question 9

Q

df.loc[]

Answer

A

Accesses rows/columns by label
loc slicing is right-inclusive (both endpoints are included)
Syntax: df.loc[A:B, C:D]
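A brief sketch of label slicing on a made-up DataFrame, compared with position-based `iloc`, whose slices exclude the right endpoint:

```python
import pandas as pd

# Hypothetical data with string row labels.
df = pd.DataFrame(
    {"x": [1, 2, 3, 4], "y": [10, 20, 30, 40], "z": [100, 200, 300, 400]},
    index=["a", "b", "c", "d"],
)

# Label slices include both endpoints: rows "b" through "c", columns "x" through "y".
subset = df.loc["b":"c", "x":"y"]

# The equivalent position slice stops before the end position.
position_subset = df.iloc[1:3, 0:2]
```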

Question 10

Q

filtering using df.loc[]

Answer

A

Ex: movies.loc[movies["Year"] < 1950]

You can also filter by more than one condition using | (or) and & (and):
condition1 = movies["Year"] >= 2000
condition2 = movies["Studio"] == "Fox"

filtered_or = movies.loc[condition1 | condition2]

filtered_and = movies.loc[condition1 & condition2]
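The filters above can be tried on a small, hypothetical `movies` table. When conditions are written inline, each one needs its own parentheses because `&` and `|` bind more tightly than comparisons.

```python
import pandas as pd

# Hypothetical data for illustration.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Year": [1948, 2004, 1999, 2010],
    "Studio": ["Fox", "Fox", "Fox", "Universal"],
})

old = movies.loc[movies["Year"] < 1950]

condition1 = movies["Year"] >= 2000
condition2 = movies["Studio"] == "Fox"
filtered_or = movies.loc[condition1 | condition2]    # either condition holds
filtered_and = movies.loc[condition1 & condition2]   # both conditions hold

# Inline form: parenthesize each comparison.
inline = movies.loc[(movies["Year"] >= 2000) & (movies["Studio"] == "Fox")]
```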

Question 11

Q

How do you assign columns to a DataFrame?

Answer

A

Using indexing/loc:
- df["column"] = data
Using df.assign():
- new_df = df.assign(label=data)
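A short sketch of the difference between the two approaches, with hypothetical column names: `assign` returns a new DataFrame and leaves the original alone, while bracket assignment modifies the DataFrame in place.

```python
import pandas as pd

df = pd.DataFrame({"price": [10.0, 20.0]})

# assign: returns a copy with the extra column; df itself is unchanged.
new_df = df.assign(with_tax=df["price"] * 1.1)

# Bracket assignment: adds the column to df directly.
df["doubled"] = df["price"] * 2
```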

Question 12

Q

How do you sort DataFrames?

Answer

A

df.sort_values()

Ex: movies.sort_values("Studio", ascending=True)
Ex: movies.sort_values("Year", ascending=False)
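A runnable sketch of these sorts on hypothetical data. It also shows sorting by two columns at once, which goes slightly beyond the card: `sort_values` accepts a list of columns and a matching list of directions.

```python
import pandas as pd

# Hypothetical data for illustration.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Year": [1948, 2004, 1999],
    "Studio": ["Fox", "Universal", "Fox"],
})

by_studio = movies.sort_values("Studio", ascending=True)
newest_first = movies.sort_values("Year", ascending=False)

# Studio A-Z, then newest first within each studio.
multi = movies.sort_values(["Studio", "Year"], ascending=[True, False])
```

`sort_values` returns a sorted copy, so `movies` keeps its original row order.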

Question 13

Q

df.groupby()

Answer

A

Groups rows by the values in certain column(s); returns a GroupBy object that you then aggregate (e.g. with count()) to get a result
Ex: df.groupby([col1, col2, …])
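A minimal sketch with hypothetical data, showing that `groupby` alone does not produce a table; an aggregation such as `count()` does.

```python
import pandas as pd

# Hypothetical data for illustration.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C"],
    "Studio": ["Fox", "Universal", "Fox"],
})

grouped = movies.groupby("Studio")   # a GroupBy object, not a DataFrame
counts = grouped["Title"].count()    # a Series indexed by Studio
```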

Question 14

Q

Ways of grouping by 2 columns

Answer

A

using df.groupby()
Ex: movies.groupby(["Year", "Studio"])["Title"].count().to_frame()
using df.pivot_table()
Ex: pt = movies.pivot_table(values="Title", index="Year", columns="Studio", aggfunc="count")
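Both approaches can be compared on a small hypothetical table. The groupby result is "long" (one row per Year/Studio pair), while the pivot table is "wide" (one row per Year, one column per Studio) and leaves NaN where a combination has no rows.

```python
import pandas as pd

# Hypothetical data for illustration.
movies = pd.DataFrame({
    "Title": ["Film A", "Film B", "Film C", "Film D"],
    "Year": [2004, 2004, 2004, 2010],
    "Studio": ["Fox", "Fox", "Universal", "Fox"],
})

# Long format: a (Year, Studio) MultiIndex with one count per pair.
long_counts = movies.groupby(["Year", "Studio"])["Title"].count().to_frame()

# Wide format: Years as rows, Studios as columns.
pt = movies.pivot_table(values="Title", index="Year", columns="Studio", aggfunc="count")
```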

Question 15

Q

Merging DataFrames

Answer

A

pd.merge()
Inner Join:
- This will only include rows with a match in both DataFrames.
Ex: pd.merge(adf, bdf, how="inner", on="x1")
Outer Join:
- This will retain all rows in both DataFrames.
Ex: pd.merge(adf, bdf, how="outer", on="x1")
Left Join:
- This will use all rows from the first DataFrame.
Right Join:
- This will use all rows from the second DataFrame.
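The four join types can be compared side by side. The contents of `adf` and `bdf` below are made up for illustration, and the left and right calls assume the same `how=` pattern as the inner and outer examples.

```python
import pandas as pd

# Hypothetical inputs: keys "A" and "B" appear in both, "C" and "D" in only one.
adf = pd.DataFrame({"x1": ["A", "B", "C"], "x2": [1, 2, 3]})
bdf = pd.DataFrame({"x1": ["A", "B", "D"], "x3": [True, False, True]})

inner = pd.merge(adf, bdf, how="inner", on="x1")   # keys in both
outer = pd.merge(adf, bdf, how="outer", on="x1")   # keys in either
left = pd.merge(adf, bdf, how="left", on="x1")     # all keys from adf
right = pd.merge(adf, bdf, how="right", on="x1")   # all keys from bdf
```

Rows without a match on the other side get NaN in the columns that come from the other DataFrame.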

Question 16

Q

df.apply()

Answer

A

Applies a function along an axis of a DataFrame, or to each value of a Series
Applying to a column:
- Will return a Series with the function applied to each value in the column
df[column].apply(function)
It is possible to apply a function to a row with axis=1
df.apply(function, axis=1)
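A short sketch of both uses, with hypothetical columns and lambda functions:

```python
import pandas as pd

df = pd.DataFrame({"width": [1, 2], "height": [3, 4]})

# Column-wise: the function receives one value at a time.
doubled = df["width"].apply(lambda value: value * 2)

# Row-wise (axis=1): the function receives each row as a Series.
area = df.apply(lambda row: row["width"] * row["height"], axis=1)
```

For simple arithmetic like this, vectorized expressions such as `df["width"] * df["height"]` are usually faster; `apply` is most useful when the logic cannot be expressed that way.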

(16 cards)