Module 3 Notes - Numerical Descriptive Measures Flashcards

1
Q

The _______ ________ is the extent to which the values of a numerical variable group around a typical or central value.

A

central tendency

2
Q

The _________ is the amount of dispersion or scattering away from a central value that the values of a numerical variable show.

A

variation

3
Q

The _____ is the pattern of the distribution of values from the lowest to the highest value.

A

shape

4
Q

Arithmetic Mean

A

\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i

5
Q

Middle value in the ordered array

A

Median

6
Q

Most frequently observed value

A

Mode

7
Q

The __________ ____ (often just called the “mean”) is the most common measure of central tendency.
*For a sample of size n (lowercase n), it is the sum of the values divided by n.

A

arithmetic mean

8
Q

*The most common measure of _______ ________.
*____ = sum of values divided by the number of values
*Affected by extreme values (outliers).

A

Mean

9
Q

*In an ordered array, the ______ is the “middle” number (50% of the values above, 50% below).
*Less sensitive than the mean to extreme values.

A

median
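A small illustration of why the median is described as less sensitive to extreme values than the mean. The data values are made up for this sketch; only the standard-library `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical data: five ordinary values, then the same values plus one outlier.
values = [10, 12, 13, 15, 16]
with_outlier = values + [100]

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean roughly doubles (13.2 -> about 27.7); the median barely moves (13 -> 14).
print(mean_before, median_before)
print(mean_after, median_after)
```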

10
Q

Locating the Median
*The location of the median when the values are in numerical order (smallest to largest):
*If the number of values is odd, the median is the middle number.
*If the number of values is even, the median is the average of the two middle numbers.

A

Median position = (n+1)/2 position in the ordered data
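The position rule on this card can be sketched in Python as follows. The function name `median_by_position` is hypothetical, chosen only for this illustration; it treats the position as 1-based and averages the two neighbors when the position ends in .5.

```python
def median_by_position(values):
    """Locate the median with the (n + 1) / 2 position rule (1-based position)."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2           # e.g. 3.0 for n = 5, 2.5 for n = 4
    lower = ordered[int(position) - 1]
    if position == int(position):    # odd n: the position is a whole number
        return lower
    upper = ordered[int(position)]   # even n: average the two middle values
    return (lower + upper) / 2


print(median_by_position([3, 1, 2]))     # odd count
print(median_by_position([4, 1, 3, 2]))  # even count
```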

11
Q

*Value that occurs most often
*Not affected by extreme values.

A

Mode
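As a quick check of both bullet points, the sketch below finds the most frequent values with `statistics.multimode` (Python 3.8+) and then adds an extreme value. The data are invented for illustration.

```python
from statistics import multimode

# Hypothetical data with two values tied for most frequent.
data = [1, 3, 3, 4, 7, 7, 9]
modes = multimode(data)

# Adding a single extreme value does not change which values occur most often.
modes_with_outlier = multimode(data + [1000])

print(modes, modes_with_outlier)
```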

12
Q

Range, Variance, Standard Deviation, Coefficient of Variation
-Measures of _________ give information on the spread or variability or dispersion of the data values

A

Measures of Variation

13
Q

*Simplest measure of variation.
*Difference between the largest and smallest value

A

Range.

14
Q

*Does not account for how the data are distributed.
*Sensitive to outliers

“Why the _____ can be misleading”

A

range

15
Q

*Approximately the average of the squared deviations of the values from the mean.

A

Sample Variance

16
Q

*Most commonly used measure of variation.
*Shows variation about the mean.
*Is the square root of the variance.
*Has the same units as the original data.

A

Sample standard deviation

17
Q

Steps for computing _________ _________
1. Compute the difference between each value and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample variance.
5. Take the square root of the sample variance to get the sample ________ _________

A

standard deviation.
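The five steps on this card map directly onto code. Below is a minimal sketch; the function name `sample_std` and the example data are assumptions made for illustration.

```python
import math


def sample_std(values):
    """Sample standard deviation, following the five steps one at a time."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")
    mean = sum(values) / n
    differences = [x - mean for x in values]   # step 1: value minus mean
    squared = [d ** 2 for d in differences]    # step 2: square each difference
    total = sum(squared)                       # step 3: add the squared differences
    variance = total / (n - 1)                 # step 4: sample variance
    return math.sqrt(variance)                 # step 5: square root


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared differences sum to 32
s = sample_std(data)
print(round(s, 4))
```

For this data the sample variance is 32/7, so the result agrees with `statistics.stdev(data)` from the standard library.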

18
Q

*Measures relative variation.
*Always in percentage (%)
*Shows variation relative to mean.
*Can be used to compare the variability of two or more sets of data measured in different units.

A

The coefficient of variation: CV = (Standard Deviation / Mean) * 100%
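A short sketch of the coefficient of variation using the sample standard deviation from the `statistics` module. The function name and the data are hypothetical; the second data set is the first one converted from centimeters to meters, to show that the CV does not depend on the unit.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV as a percentage: (sample standard deviation / mean) * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical lengths in centimeters, then the same lengths in meters.
lengths_cm = [200, 400, 400, 400, 500, 500, 700, 900]
lengths_m = [x / 100 for x in lengths_cm]

cv_cm = coefficient_of_variation(lengths_cm)
cv_m = coefficient_of_variation(lengths_m)
print(round(cv_cm, 2), round(cv_m, 2))  # same percentage in either unit
```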

19
Q

Locating Extreme Outliers: _-_____
Z = (X - x̄)/S
Where X represents the data value
x̄ is the sample mean
S is the sample standard deviation

A

Z-score

20
Q

*Suppose the mean math SAT score is 490, with a standard deviation of 100.
*Compute the Z-score for a test score of 620.
Z = (X - x̄)/S = (620 - 490)/100 = 130/100 = 1.3
-A score of 620 is 1.3 standard deviations above the mean and would not be considered an outlier.
*A data value is considered an extreme outlier if its Z-score is less than -3.0 or greater than +3.0

A

Z-score
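The SAT example on this card can be reproduced in a few lines. The helper names `z_score` and `is_extreme_outlier` are hypothetical; the ±3.0 cutoff is the one stated on the card.

```python
def z_score(x, mean, std):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std


def is_extreme_outlier(z, cutoff=3.0):
    """Apply the card's rule: extreme if Z is below -cutoff or above +cutoff."""
    return abs(z) > cutoff


z = z_score(620, mean=490, std=100)
print(z, is_extreme_outlier(z))  # 1.3 False
```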

21
Q

The more data are spread out, the greater the _____, ________, and ________ __________.

A

range, variance, standard deviation

22
Q

The more data are concentrated, the smaller the _____, ________, and ________ _________.

A

range, variance, and standard deviation

23
Q

If the values are all the same (no variation) all these measures will be zero

A

range, variance, and standard deviation

24
Q

None of these measures is ever negative.

A

range, variance, and standard deviation

25
Q

The larger the absolute value of the _-_____, the farther the data value is from the mean.

A

Z-score

26
Q

*Measures the extent to which data values are not symmetrical.

27
Q

*Measures the peakedness of the curve of the distribution, that is, how sharply the curve rises approaching the center of the distribution.

28
Q

Measures the extent to which data is not symmetrical.

29
Q

Mean < Median

A

Left-Skewed

30
Q

Mean = Median

31
Q

Median < Mean

A

Right-Skewed
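The mean-versus-median comparisons on these cards can be turned into a rough check. This is only the rule of thumb from the cards, not a formal skewness statistic; the function name and data are hypothetical, and treating "mean equals median" as symmetric is an assumption of this sketch.

```python
from statistics import mean, median


def skew_direction(values):
    """Classify shape by comparing the mean with the median (rule of thumb only)."""
    m, med = mean(values), median(values)
    if m < med:
        return "left-skewed"
    if m > med:
        return "right-skewed"
    return "symmetric"


print(skew_direction([1, 2, 3, 4, 100]))     # a large value pulls the mean up
print(skew_direction([1, 97, 98, 99, 100]))  # a small value pulls the mean down
print(skew_direction([1, 2, 3]))
```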

32
Q

Sharper peak than bell-shaped (Kurtosis > 0)

A

Leptokurtic

33
Q

Bell-shaped (Kurtosis = 0)

A

Mesokurtic

34
Q

Flatter than bell-shaped (Kurtosis < 0)

A

Platykurtic

35
Q

*Can visualize the distribution of the values for a numerical variable by computing:
*The _________
*The five-number _______.
*Constructing a _______.

A

quartiles, summary, boxplot

36
Q

_________ split the ranked data into 4 segments with an equal number of values per segment.

37
Q

*The first ________, Q1, is the value for which 25% of the values are smaller and 75% are larger.

38
Q

*Q_ is the same as the median (50% of the values are smaller and 50% are larger).

39
Q

*Only 25% of the values are greater than the third quartile.

40
Q

Find a ________ by determining the value in the appropriate position in the ranked data.

41
Q

_____ quartile position: Q1 = (n+1)/4
where n is the number of observed values

42
Q

______ quartile position: Q2=(n+1)/2
where n is the number of observed values

43
Q

_____ quartile position: Q3 = 3(n+1)/4
where n is the number of observed values
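The three position formulas on cards 41 to 43 can be sketched as below. The cards do not say what to do when a position is fractional; this sketch assumes linear interpolation between the two neighboring ranked values, which is one common convention (some textbooks round the position instead). Function names and data are hypothetical.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position; fractional positions are interpolated (assumption)."""
    lower_index = int(position)          # whole part of the position
    fraction = position - lower_index
    lower = ordered[lower_index - 1]
    if fraction == 0:
        return lower
    upper = ordered[lower_index]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Q1, Q2, Q3 from positions (n+1)/4, (n+1)/2 and 3(n+1)/4 in the ranked data."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three values for these position formulas")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]  # n = 9, positions 2.5, 5 and 7.5
print(quartiles(data))
```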

44
Q

The ___ is Q3 - Q1 and measures the spread in the middle 50% of the data.

45
Q

The ___ is also called the midspread because it covers the middle 50% of the data.

46
Q

*A measure of variability that is not influenced by outliers or extreme values.

47
Q

Measures like Q1, Q3, and IQR that are not influenced by outliers are called _________ _______.

A

resistant measures

48
Q

*Range is the difference between the largest and smallest values.
*IQR is the difference between the third and first quartiles (Q3 - Q1).

49
Q

The five numbers that describe center, spread, and shape of data are:
*Xsmallest
*First Quartile (Q1)
*Median (Q2)
*Third Quartile (Q3)
*Xlargest

A

Five number Summary
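A sketch of the five-number summary using the standard library. `statistics.quantiles` with `method="exclusive"` (Python 3.8+) places the quartiles at multiples of (n+1)/4 and interpolates between neighbors; the function name `five_number_summary` and the data are assumptions for illustration.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (Xsmallest, Q1, median, Q3, Xlargest)."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)


data = [11, 12, 13, 16, 16, 17, 18, 21, 22]
summary = five_number_summary(data)
iqr = summary[3] - summary[1]  # interquartile range, Q3 - Q1

print(summary, iqr)
```

These five numbers are also the values a boxplot is drawn from.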

50
Q

The _______: A graphical display of the data based on the five-number summary

A

Boxplot, Xsmallest – Q1 – Median – Q3 – Xlargest

51
Q

*If the data are symmetric around the median, then the box and central line are centered between the endpoints.
*A _______ can be shown in either a vertical or horizontal orientation

52
Q

*The __________ mean is the sum of the values in the population (not the sample) divided by the population size, N (not the sample size)

A

Population mean

53
Q

μ

A

population mean

54
Q

Population mean equation: N

A

Population size (Capital N)

55
Q

Population mean equation: Xi

A

ith value of the variable X

56
Q

Average of squared deviations of values from the population mean.

A

Population variance.

57
Q

*Most commonly used measure of variation.
*Shows variation about the mean.
*Is the square root of the population variance.
*Has the same units as the original data.

A

The Standard Deviation σ
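To contrast the population measures with the sample measures from earlier cards, the sketch below runs both on the same made-up values. The population versions divide by N, the sample versions by n - 1; all four functions come from the standard-library `statistics` module.

```python
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical values, treated first as a whole population and then as a sample.
values = [2, 4, 4, 4, 5, 5, 7, 9]  # mean 5, squared deviations sum to 32

pop_var = pvariance(values)   # 32 / 8
pop_std = pstdev(values)      # square root of the population variance
samp_var = variance(values)   # 32 / 7
samp_std = stdev(values)

print(pop_var, pop_std)
print(round(samp_var, 4), round(samp_std, 4))
```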

58
Q

Mean: μ
Variance: σ^2
Standard Deviation: σ

A

Population Parameter Measure

59
Q

Mean: X̄
Variance: S^2
Standard Deviation: S

A

Sample Statistic Measure

60
Q

*The _________ ____ approximates the variation of data in a symmetric mound-shaped distribution.
*Approximately __% of the data in a symmetric mound-shaped distribution is within 1 standard deviation of the mean, or μ ± 1σ

A

The Empirical Rule

61
Q

Approximately __% of the data in a symmetric mound-shaped distribution lies within two standard deviations of the mean, or μ ± 2σ

62
Q

Approximately __% of the data in a symmetric mound-shaped distribution lies within three standard deviations of the mean, or μ ± 3σ
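The percentages in the Empirical Rule can be checked numerically under the assumption that the distribution is exactly normal; real mound-shaped data will only match approximately. For a normal distribution the share within k standard deviations of the mean is erf(k/√2). The function name `share_within` is hypothetical.

```python
import math


def share_within(k):
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.2%}")
```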

63
Q

______ plots allow you to visually examine the relationship between two numerical variables. Two quantitative measures of such relationships are:
*The Covariance
*The Coefficient of Correlation.

A

Scatter plots
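The card names the covariance but does not give its formula. The sketch below uses the standard sample covariance, the sum of (Xi - X̄)(Yi - Ȳ) divided by n - 1, which is an assumption here because the deck does not state it. The function name and data are hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x) * (y - mean_y), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves with xs: positive covariance
falling = [10, 8, 6, 4, 2]   # moves against xs: negative covariance

print(sample_covariance(xs, rising), sample_covariance(xs, falling))
```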

64
Q

*The ___________ measures the strength of the linear relationship between two numerical variables (X and Y).