5: The Standard Deviation As A Ruler And The Normal Model Flashcards

1
Q

What is the main purpose of using the standard deviation in statistics?

A

To judge how unusual a value is by how far it lies from the mean in standard deviation units.

2
Q

What seven events make up the women’s heptathlon in the Olympics?

A
  • 200 m run
  • 800 m run
  • 100 m high hurdles
  • Shot put
  • Javelin
  • High jump
  • Long jump
3
Q

Which athlete won the 200 m run in the 2008 Olympics and what was their time?

A

Hyleas Fountain with a time of 23.21 seconds.

4
Q

What was Nataliya Dobrynska’s long jump distance in the 2008 Olympics?

A

6.63 metres.

5
Q

How much farther was Dobrynska’s long jump compared to the mean distance for all contestants?

A

0.52 m farther.

6
Q

What is the mean jump distance for the contestants in the long jump event?

A

6.11 m.

7
Q

What was the standard deviation for the long jump event?

A

0.238 m.

8
Q

How many standard deviations better than the mean was Dobrynska’s jump?

A

2.18 standard deviations.
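
As a worked check, using only the figures quoted on the preceding cards (a jump of 6.63 m, a mean of 6.11 m, and a standard deviation of 0.238 m), the arithmetic can be written out as:

```latex
z = \frac{6.63 - 6.11}{0.238} = \frac{0.52}{0.238} \approx 2.18
```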

9
Q

True or False: Shorter times are better in the context of track events.

A

True.

10
Q

Fill in the blank: Statisticians use the standard deviation as a ______ throughout Statistics.

A

ruler

11
Q

What type of display is used to show individual values in the context of the heptathlon?

A

Stem-and-leaf displays.

12
Q

In stem-and-leaf displays for the 200 m race, how are the stems oriented?

A

From faster to slower.

13
Q

In stem-and-leaf displays for the long jump, how are the stems oriented?

A

From longer to shorter.

14
Q

What is the significance of using standard deviation in comparing performances?

A

It allows for comparison across different units and directions of measurement.

15
Q

What does the phrase ‘All models are wrong, but some are useful’ imply in statistics?

A

Models may not perfectly represent reality but can still provide valuable insights.

16
Q

What does a z-score indicate?

A

It tells us how far a value is from its mean in terms of standard deviations.

17
Q

Can you compare z-scores from different data sets?

A

Yes. Z-scores have no units, so values from different data sets, or measured on different scales, can be compared by how many standard deviations each lies from its own mean.

18
Q

How is a z-score calculated?

A

By subtracting the mean from the value and dividing by the standard deviation.

19
Q

What is the formula for calculating a z-score?

A

z = (X - μ) / σ

20
Q

What does a z-score of -1.6 signify?

A

The data value is 1.6 standard deviations below the mean.

21
Q

What does a z-score of 2 indicate?

A

The data value is two standard deviations above the mean.

22
Q

True or False: Z-scores can have negative values.

A

True

23
Q

What are standardized values commonly denoted with?

A

The letter Z

24
Q

What happens to z-scores if the original variable’s units change?

A

Z-scores remain unaffected by changes in units.

25
Fill in the blank: To standardize a value, we subtract the _______ and then divide by the standard deviation.
mean
26
Which performance has a z-score of 2.18?
Dobrynska's long jump
27
What is the z-score for Fountain's 200 m race performance?
-2.14
28
What does a z-score measure?
The distance of a value from the mean in standard deviations.
29
How do z-scores help in comparing performances?
They standardize the performances to allow for comparison across different events.
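The cross-event comparison can be sketched in a few lines of Python. This is only an illustration: the function name `z_score` and the "quality" variables are hypothetical, and the figures are the 2008 heptathlon values quoted elsewhere in this deck (long jump mean 6.11 m and SD 0.238 m; 200 m mean 24.71 s and SD 0.700 s).

```python
def z_score(value, mean, sd):
    """Return how many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

# Figures quoted elsewhere in this deck (2008 Olympic heptathlon).
long_jump_z = z_score(6.63, mean=6.11, sd=0.238)   # Dobrynska, metres
run_200m_z = z_score(23.21, mean=24.71, sd=0.700)  # Fountain, seconds

# Shorter times are better, so flip the sign of the running z-score
# before comparing it with a distance, where longer is better.
long_jump_quality = long_jump_z
run_200m_quality = -run_200m_z

print(round(long_jump_z, 2), round(run_200m_z, 2))
```

Both performances sit a little more than two standard deviations on the good side of their event means, which is what makes them comparable despite the different units and directions.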
30
What is the mean value for Dobrynska's event (the long jump)?
6.11 m
31
What is the standard deviation for Fountain's event (the 200 m run)?
0.700 seconds
32
What is the mean value for Fountain's event (the 200 m run)?
24.71 seconds
33
How do z-scores reflect the unusualness of a performance?
The farther a data value is from the mean, the more unusual it is.
34
What are the two steps to finding a z-score?
  1. Shift the data by subtracting the mean
  2. Rescale the data by dividing by the standard deviation
35
What does shifting to adjust the center involve?
Subtracting the mean from the data
36
Which organization has been collecting health and nutritional information since the 1960s?
Centers for Disease Control and Prevention's National Center for Health Statistics
37
What was the sample size of the National Health Interview Survey (NHIS)?
Nearly 75,000 children
38
What did the National Health And Nutrition Examination Survey (NHANES) 2001-2002 measure?
A wide variety of variables, including:
  • Body measurements
  • Cardiovascular fitness
  • Blood chemistry
  • Demographic information
39
True or False: The NHANES 2001-2002 surveyed more than 11,000 individuals.
True
40
Fill in the blank: The NHIS interviewed nearly _______ children about their health.
75,000
41
What types of information did NHANES collect?
Health and nutritional information
42
What is the conversion factor from kilograms to pounds?
2.2 ## Footnote There are approximately 2.2 pounds in every kilogram.
43
What happens to the measures of position when a constant is added to every data value?
They increase (or decrease) by the same constant. ## Footnote This includes measures like the mean, percentiles, minimum, and maximum.
44
What is the effect of adding or subtracting a constant on the measures of spread?
It does not change the measures of spread. ## Footnote Measures of spread include the range, interquartile range (IQR), and standard deviation.
45
True or False: Adding a constant to each data value changes the shape of the distribution.
False ## Footnote The shape of the distribution remains unchanged; it just shifts.
46
If the mean weight of a group is 82.36 kg, what is the mean weight after subtracting 74 kg from each weight?
8.36 kg ## Footnote This indicates the average weight over the recommended maximum.
47
What are the two types of visual data representations mentioned?
* Histogram * Boxplot ## Footnote These are used to display the weight distributions.
48
Fill in the blank: Rescaling the data by multiplying each value by a constant changes the _______.
measurement units ## Footnote This refers to converting weights from kilograms to pounds.
49
What does adding a constant do to the entire distribution of data?
Shifts the distribution ## Footnote The entire distribution shifts without altering its shape.
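A small Python sketch of the shifting and rescaling rules from the preceding cards. The weights below are invented for illustration and are not the NHANES data; the 74 kg reference and the 2.2 lb/kg factor are the ones quoted in this deck.

```python
from statistics import mean, stdev

# Hypothetical weights in kg, invented for illustration (not the NHANES data).
weights_kg = [68.0, 74.5, 80.0, 85.5, 92.0, 101.0]

# Shifting: subtract the 74 kg reference from every value.
shifted = [w - 74 for w in weights_kg]

# Rescaling: multiply every value by the approximate kg -> lb factor.
weights_lb = [w * 2.2 for w in weights_kg]

# Shifting moves the mean by the constant but leaves the spread alone.
print(mean(weights_kg), mean(shifted))
print(stdev(weights_kg), stdev(shifted))

# Rescaling multiplies both the mean and the standard deviation.
print(mean(weights_lb), stdev(weights_lb))
```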
50
What is the recommended maximum healthy weight according to the National Institutes of Health?
74 kg ## Footnote This weight is used as a baseline for comparing the weights of individuals.
51
What is the primary focus of the histograms shown in the content?
Weight distributions of men ## Footnote The histograms display the distribution of weights across different intervals.
52
What is the minimum score required on the paper-based TOEFL test (PBT) by SGS?
580
53
Which group of applicants must demonstrate English language proficiency through the TOEFL test?
Non-Canadian applicants
54
Standardizing data into z-scores involves which two main operations?
Shifting by the mean and rescaling by the standard deviation
55
What does standardizing into z-scores change about the data?
  • Mean to 0
  • Standard deviation to 1
56
True or False: Standardizing into z-scores changes the shape of the distribution.
False
57
What happens to the shape, centre, and spread of a distribution when we standardize?
  • Shape: unchanged
  • Centre: the mean becomes 0
  • Spread: the standard deviation becomes 1
58
Fill in the blank: When we subtract the mean of the data from every data value, we shift the mean to _______.
zero
59
What happens to the standard deviation when each value is divided by the initial standard deviation?
It becomes 1
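The two standardizing steps can be checked numerically. This is a minimal sketch with made-up values; it uses the sample standard deviation from Python's `statistics` module, which is an assumption about which SD formula is intended.

```python
from statistics import mean, stdev

# Hypothetical data values, invented for illustration.
values = [3.0, 5.0, 6.0, 8.0, 13.0]

m = mean(values)
s = stdev(values)  # sample standard deviation (assumed convention)

# Step 1: shift by subtracting the mean. Step 2: rescale by dividing by the SD.
z = [(v - m) / s for v in values]

print(mean(z))   # essentially 0, up to floating-point rounding
print(stdev(z))  # essentially 1, up to floating-point rounding
```

The ordering and relative gaps between the values are preserved, which is why the shape of the distribution does not change.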
60
What is the effect of linear transformations on the shape of the distribution?
They do not change the shape
61
What type of transformations can change the shape of a distribution to something close to Normal?
Non-linear transformations
62
What is an example of a non-linear transformation?
  • Log
  • Square root
63
Fill in the blank: Non-linear transformations change the measures of centre and spread in a _______ manner.
less predictable
64
What is the mean weight of the 80 NHANES participants before log transformation?
82.36 kg
65
How does taking logs of weights affect the distribution?
It becomes less skewed to the right
66
What must be done after applying non-linear transformations to compute new summary measures?
Re-compute them after transformation
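To see why summaries must be re-computed after a non-linear transformation, here is a short sketch with invented, right-skewed weights (not the 80 NHANES values). It shows that the mean of the logs is not the log of the mean, so the shortcut that works for shifting and rescaling does not apply.

```python
import math
from statistics import mean

# Hypothetical right-skewed weights in kg, invented for illustration.
weights = [55.0, 62.0, 70.0, 78.0, 90.0, 140.0]

log_weights = [math.log10(w) for w in weights]

mean_of_logs = mean(log_weights)         # summary re-computed after transforming
log_of_mean = math.log10(mean(weights))  # the tempting (wrong) shortcut

print(mean_of_logs, log_of_mean)
```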
67
What does a z-score indicate?
A z-score indicates how unusual a value is compared to the mean
68
What is the z-score of a data value that sits right at the mean?
0
69
What does a z-score of 1 indicate?
The data value is 1 standard deviation above the mean
70
What does a z-score of -1 indicate?
The data value is 1 standard deviation below the mean
71
What z-score is considered rare?
A z-score of 3 (plus or minus) or more
72
What percentage of data lies between the quartiles?
50%
73
In symmetric data, how does the standard deviation typically compare to the IQR?
The standard deviation is usually a bit smaller than the IQR
74
What is the purpose of a statistical model?
To model the frequency distribution of a quantitative variable
75
What is an example of a quantitative variable that could be modeled?
Weights of at-term newborns in grams
76
What does the fitted curve in a histogram represent?
The relative frequencies or proportions of the data
77
True or False: Models of data fit every data value exactly.
False
78
Fill in the blank: Without models, our understanding of the world is limited to only what we can say about the _______.
data we have at hand
79
What happens to the fitted curve if we observe more births in the example given?
The heights of the bars would roughly double, changing the fitted curve
80
Why are models considered useful in understanding reality?
They offer a simpler view of reality
81
What does a larger z-score (negative or positive) indicate?
The value is more unusual
82
What can we learn from models despite them not matching reality exactly?
They provide summaries that we can learn from and use
83
What is required to find the area under a curve?
Calculus ## Footnote The area under a curve is typically calculated using techniques from calculus.
84
What are cutpoints in the context of density curves?
A density curve has no fixed cutpoints or bin boundaries ## Footnote Because the curve is smooth, we can cut anywhere on the weight-axis.
85
How can we approximate the area under a fitted curve?
By calculating the area between specified values ## Footnote For instance, between 2450 and 2750 grams.
86
What is the purpose of estimating proportions in data?
To understand data distribution in specified ranges ## Footnote This helps in analyzing the characteristics of the data set.
87
What type of curve is fitted to the data in Figure 5.6c?
Normal density curve ## Footnote This curve is used to represent the underlying distribution of the data.
88
What happens to the shape of the curve as more data is added?
It becomes smoother and closer to the true distribution ## Footnote The overall pattern stays about the same while the total area remains 1.00.
89
What does area equal in the context of relative frequency?
Area equals relative frequency ## Footnote This relationship is crucial for understanding density curves.
90
What is the formula for the area of a rectangle in this context?
Area = Width × Height ## Footnote The area represents the relative frequency of the bin, so the height is the relative frequency divided by the bin width.
91
What changes occur if the bin width is halved?
Relative frequencies decrease by about 50% ## Footnote This alters the visual representation of the data distribution.
92
What is meant by 'removing the dependence on the amount of data'?
Switching to relative frequencies ## Footnote This approach stabilizes the picture of the distribution.
93
Fill in the blank: The histogram height is equal to the _______.
Relative Frequency of Bin ÷ Bin Width ## Footnote This equation helps define how the histogram is constructed.
94
What does the area under the density curve between two numbers a and b represent?
The proportion of the data that lies between a and b ## Footnote This is illustrated in Figure 5.6d.
95
What is the total area under the density curve?
1.00 ## Footnote This indicates that the curve covers 100% of the observations.
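The relationship between bar height, bin width, and area can be sketched as follows. The birth weights and bin settings are invented for illustration; the only rule taken from the cards is that each bar's area equals its relative frequency, so height is relative frequency divided by bin width.

```python
# Hypothetical birth weights in grams, invented for illustration.
weights = [2480, 2610, 2950, 3100, 3150, 3300, 3420, 3500, 3650, 3900]
bin_width = 500
lowest_edge = 2000

# Count how many values fall in each bin, keyed by the bin's left edge.
counts = {}
for w in weights:
    left_edge = lowest_edge + ((w - lowest_edge) // bin_width) * bin_width
    counts[left_edge] = counts.get(left_edge, 0) + 1

n = len(weights)

# Height = relative frequency / bin width, so that area = relative frequency.
heights = {edge: (c / n) / bin_width for edge, c in counts.items()}
areas = {edge: h * bin_width for edge, h in heights.items()}

total_area = sum(areas.values())
print(areas, total_area)
```

Halving `bin_width` would roughly halve each bin's relative frequency, but dividing by the narrower width keeps the heights, and the total area of 1.00, stable.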
96
What conditions must a density curve satisfy?
* Always positive or zero * Total area under the curve above the x-axis equal to 1.00 ## Footnote These conditions ensure that the curve accurately models the distribution of data.
97
What is the shape of the Normal curve?
Bell-shaped ## Footnote The Normal curve is significant in statistical theory.
98
What parameters control the center and dispersion of the Normal distribution?
μ and σ ## Footnote μ sets the centre and σ sets the spread of the Normal curve.
99
True or False: The Normal curve is the only bell-shaped curve used in statistics.
False ## Footnote There are other bell-shaped curves utilized in statistical analysis.
100
Fill in the blank: The area under the density curve can be used to approximate _______.
percentages of the data falling in ranges of interest ## Footnote This application is crucial for statistical analysis.
108
What does the 68-95-99.7 Rule describe?
The distribution of values in a Normal model relative to the mean ## Footnote The rule states that approximately 68% of values fall within one standard deviation, 95% within two, and 99.7% within three standard deviations.
109
Approximately what percentage of values fall within one standard deviation of the mean in a Normal model?
68% ## Footnote This is part of the 68-95-99.7 Rule.
110
Approximately what percentage of values fall within two standard deviations of the mean in a Normal model?
95% ## Footnote This is part of the 68-95-99.7 Rule.
111
Approximately what percentage of values fall within three standard deviations of the mean in a Normal model?
99.7% ## Footnote This is part of the 68-95-99.7 Rule.
112
True or False: The 68-95-99.7 Rule applies only to Normal models.
True ## Footnote This rule is specifically applicable to the Normal distribution.
113
Fill in the blank: In a Normal model, approximately ______ of the values fall within one standard deviation of the mean.
68% ## Footnote This is part of the 68-95-99.7 Rule.
114
Fill in the blank: The 68-95-99.7 Rule states that approximately ______ of the values fall within three standard deviations of the mean.
99.7% ## Footnote This indicates that almost all values are included within this range.
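The three percentages in the rule can be checked against a standard Normal model. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a Normal model lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The exact figures are slightly different from 68, 95, and 99.7, which is why the rule is stated as an approximation.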
115
What does a standard normal distribution represent?
The Normal model for standardized values (z-scores) ## Footnote The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1; tables give the proportion of it falling below any specified z-score.
116
What is the purpose of a table of Normal percentiles (Table Z)?
To find the percentage of individuals in a normal distribution below a specified z-score ## Footnote Table Z allows for quick reference to the cumulative probabilities associated with z-scores.
117
How do you calculate a z-score?
Z = (X - μ) / σ ## Footnote Where X is the value, μ is the mean, and σ is the standard deviation.
118
What is the z-score for a TOEFL score of 648 if the mean is 540 and the standard deviation is 60?
1.80 ## Footnote Calculation: (648 - 540) / 60 = 1.80.
119
If a TOEFL score of 600 is one standard deviation above the mean, what is the implication?
It indicates that the score is better than about 84% of the scores ## Footnote About 68% of scores fall within one standard deviation of the mean, leaving about 16% in each tail, so 68% + 16% = 84% of scores fall below 600.
120
What percentage of people scored better than a TOEFL score of 660?
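No more than 2.5% ## Footnote With a mean of 540 and a standard deviation of 60, a score of 660 is two standard deviations above the mean; about 95% of scores lie within two standard deviations, leaving about 2.5% in the upper tail.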
121
True or False: A z-score of 1.80 indicates that the score is below the mean.
False ## Footnote A z-score of 1.80 indicates that the score is above the mean.
122
Fill in the blank: A score of _____ is easy to assess as one standard deviation above the mean.
600 ## Footnote With a mean of 540 and a standard deviation of 60, 600 = 540 + 60.
123
What is the general rule for interpreting z-scores in terms of standard deviations?
Scores can be categorized as within one, two, or three standard deviations above or below the mean ## Footnote This categorization helps in understanding the relative performance of scores.
124
What is the range of scores for the middle 68% in a normal distribution?
Between one standard deviation below and one standard deviation above the mean ## Footnote This range includes approximately 68% of the data in a normal distribution.
125
What is one important model for variables that can take on only a few values?
Binomial and Poisson models ## Footnote These models are used for discrete data.
126
What type of data does the Normal model not account for?
Skewed data ## Footnote Skewed data requires different modeling approaches.
127
What technology tools can be used to find Normal percentiles?
Calculator, computer, smartphone, or tablet ## Footnote These tools facilitate calculations and data analysis.
128
What is a 'desert island' method in the context of finding Normal percentiles?
Using a Normal probability table when no technology is available ## Footnote This method might be necessary in extreme situations.
129
What percentage of test takers scored better than 648 on the TOEFL?
3.6% ## Footnote A score of 648 has z = 1.80; about 96.4% of a Normal model lies below that z-score, leaving about 3.6% above.
130
How do you find the z-score using a Normal probability table?
Look down the left column for the first two digits and across the top row for the third digit ## Footnote This method provides a systematic way to locate z-scores.
131
Fill in the blank: The Normal model is not the only model for _______.
data ## Footnote There are various models depending on data characteristics.
132
True or False: Most technology tools that find Normal percentiles can draw the distribution.
True ## Footnote Visual representation helps in understanding the distribution.
133
What is the percentile corresponding to a z-score of 1.8?
96.4% ## Footnote This means that 96.4% of z-scores are less than 1.8.
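The TOEFL percentile can be reproduced without a table. A minimal sketch, assuming the Normal model with mean 540 and standard deviation 60 quoted on the earlier card, and using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

# Normal model for TOEFL scores, using the mean and SD quoted in this deck.
toefl = NormalDist(mu=540, sigma=60)

z = (648 - toefl.mean) / toefl.stdev  # z-score for a score of 648
below = toefl.cdf(648)                # proportion scoring below 648
above = 1 - below                     # proportion scoring above 648

print(z, round(100 * below, 1), round(100 * above, 1))
```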
143
What is another name for a Normal quantile plot?
Normal probability plot
144
What do the two trailing low values in a Normal probability plot indicate?
They correspond to values in the histogram that are lower than expected
145
What is the significance of a straight line in a Normal probability plot?
It indicates that the data is roughly Normal
146
What does a curved plot in a Normal probability plot suggest?
The data does not follow a Normal distribution
147
What rule may not be very accurate for certain data distributions?
68-95-99.7 Rule
148
What does systematic deviation from a straight line in a Normal probability plot imply?
The distribution is not Normal
149
How can one determine if a Normal model is appropriate for a dataset?
By checking the histogram and the Normal probability plot
150
What type of data distribution was assumed in the examples discussed?
Roughly unimodal and symmetric
151
Fill in the blank: A Normal probability plot is usually more effective at showing deviations from Normality than a _______.
histogram
152
True or False: A Normal probability plot can help decide whether a Normal model might work for a given dataset.
True
153
What does a Normal probability plot typically display?
It displays data on the X-axis and the corresponding Normal model on the Y-axis.
154
What is the purpose of making a histogram when analyzing a Normal probability plot?
To understand how the data differs from data expected from a Normal model.
155
What indicates that data may not be normally distributed on a Normal probability plot?
Clear curving patterns or random-looking squiggles.
156
When is it best to use technology for Normal probability plots?
When you can't easily look up values in tables.
157
What are the expected values in a Normal probability plot called?
Normal scores.
158
What does it indicate if the line in a Normal probability plot is straight?
The values match up well with the Normal distribution.
159
What happens to the plot if the distribution is skewed?
The plot will bend or show some type of curvature.
160
What is the significance of surprising points in a Normal probability plot?
They indicate values that do not line up well with the Normal model.
161
What does a Normal probability plot plot against the z-score?
Each data value.
162
Fill in the blank: If the distribution were perfectly Normal, the line would be _______.
straight.
163
What is the z-score of the fourteenth smallest fuel efficiency in the example?
-1.08.
164
What does the Normal model tell us about data values?
What value to expect for a given sample.
165
What was the smallest z-score for the author's Nissan car's fuel efficiency?
-3.16.
166
How does a Normal probability plot help in analyzing data?
By comparing observed data values against expected Normal scores.
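The pairing behind a Normal probability plot can be sketched as below. The fuel-efficiency values are invented, and the plotting positions (i - 0.5)/n are one common convention assumed here; statistics packages and calculators may use slightly different positions.

```python
from statistics import NormalDist

# Hypothetical fuel efficiencies, invented for illustration.
data = [21.4, 19.8, 23.1, 22.0, 20.5, 24.7, 18.2, 22.6]

ordered = sorted(data)
n = len(ordered)

# Normal scores: the z-scores a Normal model would expect for the
# smallest, second smallest, ... of n values (assumed (i - 0.5)/n positions).
normal_scores = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# Each point of the plot pairs an expected Normal score with an observed value.
points = list(zip(normal_scores, ordered))
for score, value in points:
    print(round(score, 2), value)
```

If the points fall close to a straight line, the Normal model is plausible; systematic bending suggests skewness or other departures.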
167
What is the first step to create a 'Normal Percentile Plot' on the TI-83?
Set up a STAT PLOT using the last of the Types.
168
What command is used to find the percentage of a Normal model between two z-scores?
normalcdf(zLeft, zRight)
169
What is the command to find the z-score corresponding to a given percentile in a Normal model?
invNorm(percentile)
170
How do you indicate 'infinity' when finding Normal percentages on the calculator?
Use a very large z-score, such as 99.
171
What command would you use to evaluate the percentage of a Normal model over 2 standard deviations above the mean?
normalcdf(2, 99)
172
What do you need to do to make a Normal Probability plot on the TI-83?
Turn a STAT PLOT On.
173
Which icon do you choose to create a Normal probability plot?
Choose the last of the icons.
174
What is the purpose of the ZoomStat function when creating a Normal Probability plot?
It does the rest of the plotting automatically.
175
Fill in the blank: To find the percentage of a Normal model from a certain z-score to infinity, use the command _______.
normalcdf(zScore, 99)
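For readers without a TI-83, the two calculator commands can be imitated in Python. The functions below are hypothetical stand-ins named after the calculator commands, not part of any library; they are built on `statistics.NormalDist` and work in z-scores only.

```python
from statistics import NormalDist

_STANDARD = NormalDist()  # mean 0, standard deviation 1

def normalcdf(z_left, z_right):
    """Proportion of a standard Normal model between two z-scores."""
    return _STANDARD.cdf(z_right) - _STANDARD.cdf(z_left)

def inv_norm(percentile):
    """z-score with the given proportion (between 0 and 1) below it."""
    return _STANDARD.inv_cdf(percentile)

# More than 2 standard deviations above the mean, with 99 standing in for infinity.
print(round(normalcdf(2, 99), 4))

# z-score that cuts off the lowest 96.4% of the model.
print(round(inv_norm(0.964), 2))
```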
176
True or False: You can find Normal model values in a table instead of using a calculator.
True