Chapter 3 Flashcards

1
Q

3 main “measures of center”

A
  1. Mean
  2. Median
  3. Mode
2
Q

Mean

A

Obtained by dividing the sum of all values by the number of values in the data set.

3
Q

Median

A

The value that divides a data set that has been sorted in increasing order into two equal halves.

4
Q

Mode

A

The value that occurs with the highest frequency in a data set.

5
Q

Mean for population data

A

μ = (sum of all x’s) / N

6
Q

Mean for sample data

A

x̄ (x-bar) = (sum of all x’s) / n
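A minimal sketch of the two mean cards, assuming made-up data. The calculation is the same in both cases; only the symbol and the count (N for a population, n for a sample) differ. The function name `mean` and both data lists are illustrative.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by how many there are."""
    return sum(values) / len(values)

# Hypothetical data, chosen only for illustration.
population = [2, 4, 6, 8]   # treated as the full population, so N = 4
sample = [2, 8]             # treated as a sample drawn from it, so n = 2

mu = mean(population)       # population mean
x_bar = mean(sample)        # sample mean
print(mu, x_bar)
```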

7
Q

2 steps to calculate the median

A
  1. Sort the data set into increasing order
  2. Find the value that divides the sorted data set into two equal parts.
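The two steps can be sketched in Python as follows. Averaging the two middle values when the count is even is the usual convention and is an assumption here, since the card does not spell it out. The function assumes a non-empty list.

```python
def median(values):
    ordered = sorted(values)              # step 1: sort in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]               # odd count: the single middle value
    # even count: average the two middle values (assumed convention)
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 3]))
print(median([4, 1, 3, 2]))
```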
8
Q

Can there be no mode?

A

Yes

9
Q

Can a mode be found for qualitative data?

A

Yes

10
Q

Can there be more than one mode?

A

Yes
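A small sketch covering the three "Yes" cards on the mode: no mode, a mode of qualitative data, and more than one mode. The helper `modes` is hypothetical and assumes a non-empty list. Treating "every value occurs once" as "no mode" is the convention assumed here.

```python
from collections import Counter

def modes(values):
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                          # every value occurs once: no mode
    # all values tied for the highest frequency, in first-seen order
    return [v for v, c in counts.items() if c == top]

print(modes([1, 2, 3]))                    # no value repeats
print(modes(["red", "blue", "red"]))       # qualitative data
print(modes([1, 1, 2, 2, 3]))              # two values tie for the top count
```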

11
Q

How do the mean, median, and mode compare for a symmetric histogram or distribution curve?

A

Mean = median = mode (for a symmetric curve with a single peak)

12
Q

Mean, median, and mode of a right-skewed histogram

A

Mean > median > mode

13
Q

Left-skewed histogram mean, median, and mode

A

Mean < median < mode
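The skewed orderings can be checked on small made-up data sets. This is only an illustration using Python's standard `statistics` module; the data are hypothetical, and real skewed data will not always follow the ordering this neatly.

```python
import statistics

right_skewed = [1, 2, 2, 3, 4, 5, 20]       # long tail to the right
left_skewed = [-x for x in right_skewed]    # mirror image: long tail to the left

def center(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

r_mean, r_median, r_mode = center(right_skewed)
l_mean, l_median, l_mode = center(left_skewed)
print(r_mean > r_median > r_mode)   # right-skewed ordering
print(l_mean < l_median < l_mode)   # left-skewed ordering
```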

14
Q

Trimmed mean

A

After we drop K% of the values from each end of a ranked data set, the mean of the remaining values is called the K% trimmed mean.
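A minimal sketch of the k% trimmed mean. It assumes k is below 50 and that, when k% of the count is not a whole number, the number of dropped values is truncated; the card does not specify the rounding, so that choice is an assumption.

```python
def trimmed_mean(values, k_percent):
    ordered = sorted(values)                       # rank the data first
    drop = int(len(ordered) * k_percent / 100)     # values dropped from EACH end
    kept = ordered[drop:len(ordered) - drop]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(sum(data) / len(data))      # ordinary mean, pulled up by 100
print(trimmed_mean(data, 10))     # 10% trimmed mean drops 1 and 100
```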

15
Q

Weighted mean

A

The mean used when each value of a data set is assigned its own weight.
Weighted mean = (sum of x × w) / (sum of w)
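A short sketch of the weighted mean, using hypothetical scores and weights (for example, an exam that counts three times as much as a quiz).

```python
def weighted_mean(values, weights):
    # sum of (value * weight) divided by the sum of the weights
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)

scores = [90, 80]     # hypothetical exam and quiz scores
weights = [3, 1]      # the exam counts three times as much
print(weighted_mean(scores, weights))
```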

16
Q

Measures of dispersion tell us…

A

How much variation exists around the “typical value” given by a measure of center

17
Q

3 main measures of dispersion

A
  1. Range
  2. Variance
  3. Standard deviation
18
Q

Range

A

The difference between the largest value and the smallest value.

19
Q

Variance

A

A measure of how far the values in a data set spread out from the mean, computed from the squared deviations from the mean.

20
Q

Standard deviation

A

A measure of the average distance of each data point from the mean; equal to the square root of the variance.
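The three measures of dispersion can be computed with Python's standard `statistics` module. The cards do not give the variance formula; the sketch relies on the standard definitions, in which the population variance divides the sum of squared deviations by N and the sample variance divides by n - 1. The data are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical values, mean = 5

data_range = max(data) - min(data)       # largest value minus smallest value

pop_var = statistics.pvariance(data)     # population variance: divide by N
pop_sd = statistics.pstdev(data)         # square root of the population variance

samp_var = statistics.variance(data)     # sample variance: divide by n - 1
samp_sd = statistics.stdev(data)         # square root of the sample variance

print(data_range, pop_var, pop_sd)
print(samp_var, samp_sd)
```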

21
Q

Range formula

A

Largest value - smallest value

22
Q

Disadvantages of range

A
  1. Only based on 2 values
  2. Affected by outliers
23
Q

Can the variance and the standard deviation be negative?

A

No

24
Q

Units for standard deviation

A

Same as the original units

25
Q

Units for variance

A

The square of the original data’s units

26
Q

Coefficient of variation (CV)

A

A measure of relative variability.
Useful when comparing the variation of two data sets whose values have different units or very different magnitudes.

27
Q

Variance and SD depend on…

A

The units of measurement

28
Q

Coefficient of variation units

A

Has no units: it expresses the standard deviation as a percentage of the mean.

29
Q

Coefficient of variation formula

A

CV = (SD / mean) × 100%
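A small sketch of the coefficient of variation, also showing that it does not change when the units change. Using the population standard deviation (`statistics.pstdev`) is an assumption; the sample version would use `statistics.stdev` instead. The height data are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    # standard deviation as a percentage of the mean (population SD assumed)
    return 100 * statistics.pstdev(values) / statistics.mean(values)

heights_cm = [150, 160, 170]             # hypothetical heights in centimetres
heights_m = [1.5, 1.6, 1.7]              # the same heights in metres

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)
print(cv_cm, cv_m)                       # the SD changes with units, the CV does not
```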

30
Q

Mean of grouped data for population data

A

μ = (sum of frequency × midpoint) / N

31
Q

Mean of grouped data for sample data

A

x̄ = (sum of frequency × midpoint) / n
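A minimal sketch of the grouped-data mean. The class midpoints and frequencies are hypothetical; the total count is taken as the sum of the frequencies, which is N for a population and n for a sample.

```python
def grouped_mean(midpoints, frequencies):
    total = sum(m * f for m, f in zip(midpoints, frequencies))
    return total / sum(frequencies)      # sum of frequencies = number of observations

# Hypothetical classes 0-10, 10-20, 20-30 with their midpoints and counts.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]
print(grouped_mean(midpoints, frequencies))
```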

32
Q

Standard deviation

A

A measure of the average distance of each data point from the mean.

33
Q

Chebyshev’s theorem

A

For any number k greater than 1, at least (1 - 1/k^2) of the data values lie within k standard deviations of the mean.
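The bound is easy to evaluate for a given k. A short sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k))
```

For k = 2 the bound is 0.75, so at least 75% of the values lie within 2 standard deviations of the mean, whatever the shape of the distribution.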

34
Q

Chebyshev’s theorem works for…

A

Any distribution shape

35
Q

Empirical rule

A

Applies when the distribution is bell-shaped (“normal” or “Gaussian”).
About 68% of the observations lie within 1 SD of the mean.
About 95% of the observations lie within 2 SDs of the mean.
About 99.7% of the observations lie within 3 SDs of the mean.
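The three percentages can be reproduced from the normal curve itself. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)   # standard normal: mean 0, SD 1

def share_within(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```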

36
Q

Quartile

A

Three summary measures (Q1, Q2, Q3) that divide a ranked data set into four equal parts.
Q2 is the same as the median.
Each part contains 25% of the observations of a data set.

37
Q

Interquartile range (IQR)

A

The difference between Q3 and Q1
IQR = Q3 - Q1
Another measure of dispersion.
Small IQR = less spread out data
Large IQR = more spread out data
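A sketch of the quartiles and the interquartile range. It assumes the "median of each half" convention: Q1 is the median of the values below the median position and Q3 the median of the values above it. Other conventions exist and can give slightly different quartiles. The function assumes at least two values.

```python
import statistics

def quartiles(values):
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]               # values below the median position
    upper = ordered[half + n % 2:]       # values above it (middle value skipped if n is odd)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

q1, q2, q3 = quartiles([1, 2, 3, 4, 5, 6, 7, 8])   # hypothetical data
spread = q3 - q1                                    # interquartile range
print(q1, q2, q3, spread)
```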

38
Q

Percentiles

A

99 summary measures that divide a ranked data set into 100 equal parts.
Each portion contains 1% of the observations of a data set.

39
Q

The (approximate) value of the kth percentile in a sample of size n is:

A

Pk = value of the (kn / 100)th term in a ranked data set
If kn / 100 is not a whole number, round the position up.
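A minimal sketch of this percentile rule: rank the data, compute the position, round it up, and read off the value. The function name is hypothetical, positions are counted from 1, and the guard for a position of 0 is an added assumption.

```python
import math

def percentile_value(values, k):
    ordered = sorted(values)                          # rank the data
    position = math.ceil(k * len(ordered) / 100)      # round the position up
    position = max(position, 1)                       # guard: position 0 does not exist
    return ordered[position - 1]                      # positions are 1-based

data = list(range(1, 11))            # hypothetical data: 1, 2, ..., 10
print(percentile_value(data, 25))    # position 2.5 rounds up to 3
print(percentile_value(data, 50))    # position 5
```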

40
Q

Given a value x in a data set, find its percentile rank.

A

Percentile rank of x = (number of values less than x / total number of values in the data set) × 100%
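The same calculation as a short Python sketch, with a hypothetical function name and made-up data:

```python
def percentile_rank(values, target):
    below = sum(1 for v in values if v < target)   # count of values less than the target
    return 100 * below / len(values)

scores = [10, 20, 30, 40, 50]          # hypothetical data
print(percentile_rank(scores, 40))     # 3 of the 5 values are below 40
```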

41
Q

Box-and-whisker plot

A

Shows 5 measures:
1. Median
2. Q1
3. Q3
4. Minimum
5. Maximum

Lower inner fence = Q1 - 1.5 × IQR
Upper inner fence = Q3 + 1.5 × IQR

Outliers are plotted outside the fences.
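A sketch of the inner fences and the outlier check. The quartiles are passed in directly because different conventions for computing them exist; the values used here are hypothetical.

```python
def inner_fences(q1, q3):
    iqr = q3 - q1                                  # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr          # (lower fence, upper fence)

def outliers(values, q1, q3):
    low, high = inner_fences(q1, q3)
    return [v for v in values if v < low or v > high]

data = [1, 2, 3, 4, 5, 6, 7, 30]       # hypothetical data
print(inner_fences(2.5, 6.5))          # fences for Q1 = 2.5, Q3 = 6.5
print(outliers(data, 2.5, 6.5))        # values falling outside the fences
```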