STAT MOD 2: Chapter 3 Flashcards

1
Q

What is the measure of center?

A

where the data distribution is located along the number line

provides information about what is TYPICAL

2
Q

What is the appropriate measure of center if distribution is symmetric?

A

mean/average = symmetric

3
Q

What is the appropriate measure of center if distribution is skewed?

A

median = skewed or has outliers

4
Q

What is the measure of spread?

A

how much variability is in a data distribution

provides information about how much individual values tend to DIFFER from one another

5
Q

What is the appropriate measure of spread if distribution is symmetric?

A

standard deviation = symmetric

if applicable, use empirical rule

6
Q

What is the appropriate measure of spread if distribution is skewed?

A

Interquartile range = skewed or has outliers

7
Q

What does the notation n mean?

A

the number of observations in the data set

8
Q

What is the mean? How do you compute the mean?

A

numerical average

Excel command: =AVERAGE(dataset)

Add up all values then divide by number of values

Hint: might be a list of numbers or stem-and-leaf plot, dotplot display
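
Sketch (Python, made-up data): the "add up, then divide by n" recipe from this card. The list `values` is a hypothetical data set, not one from the course.

```python
# Hypothetical sample data, not taken from the flashcards.
values = [4, 8, 15, 16, 23, 42]

# n is the number of observations in the data set.
n = len(values)

# Add up all values, then divide by the number of values.
mean = sum(values) / n

print(n, mean)
```

This should give the same result as Excel's =AVERAGE on the same cells.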

9
Q

What is the median? How do you compute the median?

A

Middle data value for an odd number of observations, or the average of the two middle values for an even number of observations

Excel command: =MEDIAN(dataset)

Order the values, then find the middle value or the average of the two middle values
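
Sketch (Python): one way to code the order-then-take-the-middle steps. The function name `median` and the sample lists are illustrative assumptions.

```python
def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)          # step 1: order the values
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd n: a single middle value
        return ordered[mid]
    # even n: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 5]))      # odd number of observations
print(median([7, 1, 5, 3]))   # even number of observations
```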

10
Q

Is the mean resistant to outliers?

A

No, mean is a non-resistant measure

Outliers affect the mean because the mean takes every value into account, so an outlier pulls the mean towards it

11
Q

Is the median resistant to outliers?

A

Yes, median is a resistant measure

Outliers have little effect on the median because it depends only on the middle value(s)
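
Sketch (Python, made-up data): adding one extreme value to a small data set shows the difference between a non-resistant and a resistant measure. The numbers are hypothetical.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme value added.
data = [10, 12, 13, 15, 20]
with_outlier = data + [200]

# The mean is pulled strongly towards the outlier ...
print(mean(data), mean(with_outlier))

# ... while the median barely moves.
print(median(data), median(with_outlier))
```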

12
Q

What is the mean/median on the histogram?

A

Mean: the point where the histogram would balance

Median: the point where half the area falls to the left and half to the right (SPLITS in the middle)

13
Q

Where does mean/median fall relative to one another in a SYMMETRIC distribution?

A

mean and median are both in the middle or approximately the same

14
Q

Where does mean/median fall relative to one another in a RIGHT SKEWED distribution?

A

mean > median (more to the right than median)

15
Q

Where does mean/median fall relative to one another in a LEFT SKEWED distribution?

A

mean < median (more to the left than median)

16
Q

How do you compute range?

A

Range = maximum - minimum

17
Q

What is interquartile range? How do you compute interquartile range?

A

How spread out the middle half of the data is

IQR = Q3 - Q1

18
Q

How do you find third quartile?

A

Taking median of upper half of the ordered data values (all numbers larger than median of dataset)

19
Q

How do you find first quartile of the data set?

A

Taking median of lower half of the ordered data values (all numbers smaller than median of dataset)
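
Sketch (Python, made-up data): range, Q1, Q3, and IQR using the median-of-halves method from the cards above. Assumption: when n is odd, the overall median is left out of both halves. Software (including Excel) may use a slightly different quartile rule, so results can differ a little. The name `quartiles` is illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(upper)


# Hypothetical data set.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

q1, q3 = quartiles(data)
iqr = q3 - q1                       # IQR = Q3 - Q1
data_range = max(data) - min(data)  # Range = maximum - minimum

print(q1, q3, iqr, data_range)
```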

20
Q

What is standard deviation?

A

Roughly the average distance from the mean (how many units away from the mean)

21
Q

What is the sample standard deviation?

A

How far away on average are the observations from the mean

  • if given several datasets, know which has largest or smallest standard deviations
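
Sketch (Python, made-up data): comparing the sample standard deviation of two data sets with the same mean. This assumes the usual sample formula, which divides the sum of squared deviations by n - 1. The names `sample_sd`, `tight`, and `wide` are illustrative.

```python
from math import sqrt
from statistics import stdev


def sample_sd(values):
    """Sample standard deviation: sqrt of squared deviations summed, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n                             # sample mean
    squared_devs = sum((x - xbar) ** 2 for x in values)
    return sqrt(squared_devs / (n - 1))


# Two hypothetical data sets, both with mean 10.
tight = [9, 10, 10, 11]   # values close to the mean
wide = [2, 8, 12, 18]     # values far from the mean

print(sample_sd(tight), sample_sd(wide))

# The standard library's stdev uses the same n - 1 formula.
print(stdev(wide))
```

The data set whose values sit farther from the mean has the larger standard deviation.
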
22
Q

What kind of data sets would have larger standard deviations?

A

Data sets with more spread (variability) have larger standard deviations

23
Q

Notation for mean of population and sample?

A

µ - mean of population
x bar - mean of sample

24
Q

Notation for standard deviation of population and sample?

A

σ (sigma) - standard deviation of population
s - standard deviation of sample

25
What are resistant/robust measures?
Median (center)
Interquartile range (spread)
26
What are non-resistant (non-robust) measures?
Mean (center)
Standard deviation (spread)
  • Range
  • Correlation
27
How do you construct a box plot?
Boxplot: visual display of the five-number summary (MIN, Q1, MEDIAN, Q3, MAX)
1) Label the axis and draw the whiskers out to the minimum and maximum of the data
  • if the min/max is an outlier, mark it with an asterisk and end the whisker at the next smallest/largest value that is not an outlier
2) Draw a box with its lower end at Q1 and its upper end at Q3
3) Draw a line through the box at the median
28
What is an outlier?
A value that falls more than 1.5 IQR below Q1 or more than 1.5 IQR above Q3
29
How do you determine if a value is an outlier?
Use the fences method - first find Q1, Q3, and IQR
Lower fence = Q1 - 1.5(IQR) - if a value falls below the lower fence, it is an outlier
Upper fence = Q3 + 1.5(IQR) - if a value falls above the upper fence, it is an outlier
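
Sketch (Python, made-up data): the fences method, plus the whisker endpoints a box plot would use once outliers are set aside. Q1 and Q3 are passed in as already-computed numbers (here from the median-of-halves method). The names `fences` and `outliers` are illustrative.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values, q1, q3):
    """Values that fall below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]


# Hypothetical data set with Q1 = 3.5 and Q3 = 11.0 (median of each half).
data = [1, 3, 4, 6, 7, 8, 10, 12, 40]
q1, q3 = 3.5, 11.0

flagged = outliers(data, q1, q3)

# Whiskers stop at the most extreme values that are NOT outliers.
kept = [x for x in data if x not in flagged]
whiskers = (min(kept), max(kept))

print(fences(q1, q3), flagged, whiskers)
```
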
30
What elements do you have to know about comparative box plots to compare distribution of a variable for different groups?
X indicates the mean, the line inside the box indicates the median
  • know whether each box plot is symmetric, right skewed, or left skewed
  • know which box plot has the bigger/smaller range, the higher IQR, and the larger center and spread
31
What is the empirical rule?
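68-95-99.7 rule, for distributions that are approximately normal (bell-shaped)
  • about 68% of observations fall within 1 standard deviation of the mean
  • about 95% fall within 2 standard deviations of the mean
  • about 99.7% fall within 3 standard deviations of the mean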
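
Sketch (Python): checking the 68-95-99.7 figures against an exact normal curve with the standard library's `statistics.NormalDist` (Python 3.8+). The helper name `share_within` is an assumption for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The rule's percentages are rounded versions of these areas, and they apply only roughly to real data that is approximately bell-shaped.
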
32
What is normal distribution?
A bell-shaped curve with many data points in the middle and few out in the tails
33
What is a z-score?
Measures how far an observation is from the mean in standard deviation units (standardizing the observation)
z = (x - mean) / standard deviation
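
Sketch (Python, made-up numbers): the z-score formula as a small function. The function name `z_score` and the exam numbers are hypothetical.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical exam: mean 70, standard deviation 10.
print(z_score(85, 70, 10))   # above the mean, so positive
print(z_score(60, 70, 10))   # below the mean, so negative
```
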
34
What is the percentile?
For a number r between 0 and 100, the rth percentile is a value such that r percent of the observations fall at or below that value
Ex: 20th percentile means at least 20% of the data fall at or below that value and 80% of the data fall at or above that value
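
Sketch (Python, made-up data): checking what percent of observations fall at or below a given value, which is the idea behind a percentile. The function name `percent_at_or_below` and the scores are hypothetical. Software that reports percentiles may interpolate between values, so this is only the basic counting idea.

```python
def percent_at_or_below(values, x):
    """Percent of observations that are less than or equal to x."""
    count = sum(1 for v in values if v <= x)
    return 100 * count / len(values)


# Hypothetical scores for 10 students.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# 2 of the 10 scores are at or below 60, so 60 sits at the 20th percentile here.
print(percent_at_or_below(scores, 60))
```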