Week 2 Summarizing Quantitative Data Flashcards

1
Q

Goal of summarizing quantitative data: we want to know the ____, ____, ____ of the data.
Shape: ____
Center: ____
Spread: ____

A

Goal: We want to describe the shape, center, and spread of the data.
Shape: describes how the ordered values of a variable are distributed (e.g. symmetric, left-skewed, right-skewed).
Center: describes the typical value(s) of a variable (e.g. mean, median).
Spread: describes how concentrated or dispersed the values of a variable are (e.g. standard deviation, range, IQR).
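
As a rough illustration of center and spread, the sketch below computes a few of the measures named above with Python's standard `statistics` module. The sample values are invented for this example and do not come from the course.

```python
import statistics

# Hypothetical sample, invented purely for illustration.
values = [2, 3, 3, 4, 5, 5, 5, 6, 9]

# Measures of center.
center = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
}

# Measures of spread.
spread = {
    "stdev": statistics.stdev(values),      # sample standard deviation
    "range": max(values) - min(values),
}

print(center)
print(spread)
```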

2
Q

Summarizing by visual
____(Graph type 1)
(3 General properties)

It shows shape/center/spread.

____(Graph type 2)
____ (graph name 1), ____ (graph name 2)
(2 properties for graph 1)
(3 properties for graph 2)

____ (Graph type 3)
(3 general properties)
First Quartile (Q1):
Second Quartile (Q2):
Third Quartile (Q3):

It shows shape/center/spread.

A

Summarizing by visual
Histogram
x-axis: the range of values, divided into bins
y-axis: frequency (the number of values in each bin)
Size of the bin: wider bins show the general distribution of the data, narrower bins show more detail.

Shows the shape and spread of the data.
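
The effect of bin size can be sketched by counting the same values with two different bin widths. The helper `bin_counts` and the sample values are hypothetical, written only for this example.

```python
from collections import Counter

def bin_counts(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical sample, invented purely for illustration.
values = [1, 2, 2, 3, 5, 6, 6, 7, 8, 12]

wide = bin_counts(values, 5)    # few bins: general shape only
narrow = bin_counts(values, 2)  # more bins: more detail

print(wide)
print(narrow)
```

Each key is the left edge of a bin, and empty bins are simply absent from the result.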

Alternatives to the histogram
Both show the actual values of the data.

Dot plot
Each dot represents a single data point,
Identical values stack on top of each other to represent frequency.

Stem-and-leaf plot
Each leaf (usually the last digit) represents a single data point,
Each stem represents the leading digit(s) shared by the data points in that row,
Data points within the same bin range (the same stem) line up in one row, so the length of the row represents frequency.
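
A stem-and-leaf plot can be sketched in a few lines. This minimal version assumes non-negative integers, with the ones digit as the leaf and everything above it as the stem. The function name and sample values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by stem (tens and above) and leaf (ones digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    return dict(rows)

# Hypothetical sample, invented purely for illustration.
values = [12, 15, 21, 23, 23, 28, 34]
plot = stem_and_leaf(values)

for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The longest row (stem 2 here) marks the bin with the highest frequency, and every original value can still be read back from its stem and leaf.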

Box plot/box and whiskers plot
Axis: the scale of the variable's values, along which the spread is shown.
Box: interquartile range (IQR), the range of the middle 50% of the data values.
Whiskers: show the range of the lowest and highest 25% of the data values, excluding extreme outliers.

First Quartile (Q1) is the 25th percentile of the data.
Second Quartile (Q2) is the 50th percentile of the data (the median).
Third Quartile (Q3) is the 75th percentile of the data.

Shows the center (median) and the spread of the data.
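
The quantities a box plot draws can be computed directly. The sketch below uses `statistics.quantiles` with the "inclusive" method, and other quartile conventions give slightly different numbers. The 1.5 × IQR rule for flagging outliers is a common convention assumed here, not something stated in these cards, and the sample values are invented.

```python
import statistics

# Hypothetical sample with one extreme value, invented for illustration.
values = [1, 2, 3, 4, 5, 6, 7, 8, 30]

# Q1, Q2 (median), and Q3 under the "inclusive" quartile convention.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1  # width of the box: the middle 50% of the data

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(q1, q2, q3, iqr, outliers)
```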

3
Q

Summarizing by numerics
Measures of the center
Mean: ____
Median: ____

(Mean/median) (is/isn’t) robust to outliers.

Mean ~ median through the shape of the data
If the shape of the data is ____, the mean and median are approximately the same.
If the shape of the data is ____, the (mean/median) is a better measure.
Left-skewed:
(mean ~ median)
Median closer to the (1st/3rd) quartile.
Right-skewed:
(mean ~ median)
Median closer to the (1st/3rd) quartile.

Measure of the spread
Standard deviation: ____
Range: ____
IQR:____

(Standard deviation/range/IQR) (is/isn’t) robust to outliers.

A

Summarizing by numerics
Measures of the center
Mean: average of all measurements of the variable in the data
Median: middle value of the ordered measurements of the variable in the data

Median is robust to outliers, mean is not.
Robust to outliers: not significantly affected by extreme values in the data.
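
A small check of this robustness claim: adding one extreme value to a hypothetical sample moves the mean a lot and the median only slightly. The numbers are invented for illustration.

```python
import statistics

# Hypothetical sample, then the same sample with one extreme value added.
typical = [3, 4, 5, 6, 7]
with_outlier = typical + [100]

mean_shift = statistics.mean(with_outlier) - statistics.mean(typical)
median_shift = statistics.median(with_outlier) - statistics.median(typical)

print(f"mean moved by {mean_shift:.2f}, median moved by {median_shift:.2f}")
```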

Mean ~ median through the shape of the data
If the data is symmetric, the mean and median are approximately the same.
If the data isn’t symmetric, the median is a better measure than the mean.

Right skewed: The tail of the distribution extends to the right toward larger values.
Mean > Median
The median is typically closer to the first quartile (Q1) than to the third quartile (Q3); the median itself is the second quartile.
The few extremely large values pull the mean to the right.
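
The right-skewed case can be checked on a small invented sample with a long right tail. Quartiles here use the "inclusive" method of `statistics.quantiles`, and other conventions may differ slightly.

```python
import statistics

# Hypothetical right-skewed sample: most values small, a few large ones.
right_skewed = [1, 1, 2, 2, 3, 5, 8, 13, 40]

mean = statistics.mean(right_skewed)
median = statistics.median(right_skewed)
q1, q2, q3 = statistics.quantiles(right_skewed, n=4, method="inclusive")

print(f"mean={mean:.2f}, median={median}")
print(f"median - Q1 = {median - q1}, Q3 - median = {q3 - median}")
```

In this sample the large values pull the mean above the median, and the median sits nearer the lower edge of the box than the upper edge.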

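Left skewed: The tail of the distribution extends to the left toward smaller values.
Mean < Median
The median value is closer to the third quartile.
The few extremely small values pull the mean to the left.

Measures of the spread
Standard deviation: the typical difference between the individual measurements and the mean
Range: the difference between the largest and smallest values
IQR: the range of the middle 50% of the data values (Q3 - Q1)

IQR is robust to outliers; standard deviation and range are not.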
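
For the question's prompt on which measures of spread are robust to outliers, the sketch below replaces the largest value of an invented sample with an extreme one and compares standard deviation, range, and IQR before and after. The helper name and the "inclusive" quartile method are assumptions made for this example.

```python
import statistics

def spread_summary(values):
    """Return the sample standard deviation, range, and IQR of the values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "stdev": statistics.stdev(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

# Hypothetical samples: identical except for the largest value.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

before = spread_summary(base)
after = spread_summary(with_outlier)

print(before)
print(after)
```

In this example the IQR is unchanged while the standard deviation and the range both grow sharply.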

4
Q

Interpreting the results
Talk about ____ of this study.
____ can be concluded by ____.

A

Interpreting the results
Talk about the shape, center, and spread of the data in this study.
Shape can be inferred from the histogram or by comparing the mean and median.

5
Q

Lecture check-in question 1
A researcher reports that, on average, participants in a study lost 10.4 pounds after two months on a new diet. A friend of yours comments that she tried the diet for two months and lost no weight. Which of the following statements is correct?

A. Your friend must not have followed the diet correctly since she did not lose weight.

B. Because your friend did not lose weight, there is likely bias in the research findings.

C. It is possible that some of the study participants lost no weight at all, like your friend, since the research findings are based on an average.

D. For the study findings to be trusted, we should add your friend’s results to those of the study and calculate the new average.

A

C
The mean and median describe the typical value of a variable, but they do not describe the spread.

Your friend’s experience is consistent with the variability in the study results: participants can lose 10.4 pounds on average while some individuals lose no weight at all.

So A, B, and D are incorrect.
For B in particular, bias would need to be assessed from the study design, methodology, or sampling; a single anecdotal case does not imply bias in the research.
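
To make this concrete, here is one invented set of individual results whose average is 10.4 pounds even though one participant lost nothing. These numbers are hypothetical and are not the study's data.

```python
import statistics

# Hypothetical weight losses in pounds for five participants.
losses = [0, 5, 8, 12, 27]

average_loss = statistics.mean(losses)
lost_nothing = [x for x in losses if x == 0]

print(f"average loss: {average_loss} lb, participants who lost nothing: {len(lost_nothing)}")
```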

6
Q

There are three children in a room, ages 3, 4, and 5. What will happen to the mean and standard deviation of the ages if a 4-year-old child enters the room?

A. The mean age and standard deviation will increase.

B. The mean age and standard deviation will stay the same.

C. The mean age will stay the same, but the standard deviation will increase.

D. The mean age will stay the same, but the standard deviation will decrease.

A

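D
The mean measures the center of the data: the typical value of the individual measurements. The mean of 3, 4, and 5 is 4, so adding another 4 leaves the mean unchanged.

The standard deviation is the typical difference between the individual measurements and the mean, measuring the spread of the data: how concentrated the individual measurements are. The new child's age equals the mean, so it adds no deviation while increasing the number of measurements, and the standard deviation decreases.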
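
The answer can be verified numerically. The sketch below uses the sample standard deviation (`statistics.stdev`); the population version (`statistics.pstdev`) also decreases.

```python
import statistics

before = [3, 4, 5]       # ages of the three children
after = before + [4]     # a 4-year-old enters the room

mean_before, mean_after = statistics.mean(before), statistics.mean(after)
sd_before, sd_after = statistics.stdev(before), statistics.stdev(after)

print(f"mean: {mean_before} -> {mean_after}")
print(f"standard deviation: {sd_before:.3f} -> {sd_after:.3f}")
```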
