Communicating Results Flashcards

Summarizing descriptive statistics, plotting visualizations, drawing conclusions, and customizing visuals to communicate results

1
Q

What is the .groupby() method?

A

It lets you group rows by the values in one or more columns and aggregate information about each group. Passing numeric_only=True to the aggregation excludes columns that aren't numeric.

df.groupby("column_name").mean(numeric_only=True)
or
df.groupby(["workclass", "race"], as_index=False)["capital-gain"].mean()
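A minimal sketch of both calls on a toy DataFrame. The column names follow the card, but the rows are invented for illustration and are not from the real census data.

```python
import pandas as pd

# Hypothetical toy data: column names mirror the card, values are made up.
df = pd.DataFrame({
    "workclass": ["Private", "Private", "State-gov", "State-gov"],
    "race": ["White", "Black", "White", "White"],
    "capital-gain": [100, 300, 0, 200],
    "age": [30, 40, 50, 60],
})

# Mean of every numeric column per workclass; the text column "race" is skipped.
by_workclass = df.groupby("workclass").mean(numeric_only=True)

# Mean capital-gain per (workclass, race) pair. as_index=False keeps the
# grouping columns as regular columns instead of moving them into the index.
by_pair = df.groupby(["workclass", "race"], as_index=False)["capital-gain"].mean()

print(by_workclass)
print(by_pair)
```

Without as_index=False, the second result would be a Series indexed by (workclass, race) rather than a flat table.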

2
Q

What is summation or .sum()?

A

It aggregates data vertically with .sum(axis=0), which totals each column, or horizontally with .sum(axis=1), which totals each row. axis=0 is the default.

df_census[["capital-gain", "capital-loss"]].sum()
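To see the difference between the two axes, here is a small sketch. The df_census below is a three-row stand-in with invented numbers, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for df_census; the values are made up.
df_census = pd.DataFrame({
    "capital-gain": [100, 0, 250],
    "capital-loss": [0, 40, 10],
})

cols = ["capital-gain", "capital-loss"]

# axis=0 (the default): one total per column.
col_totals = df_census[cols].sum()

# axis=1: one total per row, adding across the selected columns.
row_totals = df_census[cols].sum(axis=1)

print(col_totals)
print(row_totals)
```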

3
Q

Visualize how to get the sum while using .groupby and then sort the values in descending order

A

df.groupby(by="column").sum(numeric_only=True).sort_values(by="column2", ascending=False)
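A quick sketch of the same chain on invented data. The "region", "label", and "sales" columns are hypothetical stand-ins for "column" and "column2".

```python
import pandas as pd

# Hypothetical data: "region" plays the role of "column", "sales" of "column2".
df = pd.DataFrame({
    "region": ["East", "West", "East", "North", "West"],
    "label": ["a", "b", "c", "d", "e"],
    "sales": [10, 50, 20, 5, 15],
})

totals = (
    df.groupby(by="region")                   # one group per region
      .sum(numeric_only=True)                 # total the numeric columns only
      .sort_values(by="sales", ascending=False)  # largest total first
)

print(totals)
```

The column passed to sort_values must be one of the aggregated columns, since the grouping column has become the index.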

4
Q

What are the measures of center?

A

Mean = .mean()
Median = .median()
Mode = .mode()
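The three methods can be tried on a small pandas Series. The ages below are invented for illustration.

```python
import pandas as pd

# Hypothetical values chosen so the three measures differ.
ages = pd.Series([20, 30, 30, 40, 80])

mean_age = ages.mean()      # sum of values divided by the count
median_age = ages.median()  # middle value after sorting
mode_age = ages.mode()      # returns a Series, since ties are possible

print(mean_age, median_age, mode_age.tolist())
```

Note that .mode() returns a Series rather than a single number, so take the first element with mode_age.iloc[0] if only one value is wanted.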

5
Q

What is the mean?

A

It is the average: the sum of all the numbers in a set divided by the number of values in the set.
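The definition can be worked through by hand in plain Python. The numbers are an arbitrary example.

```python
# Arbitrary example values.
values = [2, 4, 6, 8]

total = sum(values)       # 2 + 4 + 6 + 8 = 20
count = len(values)       # 4 values
mean = total / count      # 20 / 4 = 5.0

print(mean)
```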

6
Q

What is the median?

A

The center value in a sorted set. Always sort the values first, then take the middle value; with an even number of values, average the two middle values.
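A hand-rolled sketch of the sort-then-pick-the-middle rule, including the usual convention of averaging the two middle values when the count is even. The function name is illustrative and it assumes a non-empty list; in practice .median() does this for you.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # sorting comes first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median([7, 1, 3])        # sorted: [1, 3, 7]
even_result = median([7, 1, 3, 5])    # sorted: [1, 3, 5, 7]
print(odd_result, even_result)
```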

7
Q

What is the mode?

A

It is the value with the highest frequency in a set
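One way to count frequencies by hand is with collections.Counter from the standard library. This is only a sketch with an illustrative function name; it returns a list because two or more values can tie for the highest frequency.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)                 # value -> number of occurrences
    top = max(counts.values())               # the highest frequency
    return [v for v, c in counts.items() if c == top]

single = modes([1, 2, 2, 3])       # 2 appears most often
tied = modes([1, 1, 2, 2, 3])      # 1 and 2 tie with two occurrences each
print(single, tied)
```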

8
Q
A