3: Displaying And Summarizing Quantitative Data Flashcards

1
Q

What is the primary purpose of a histogram?

A

To display the distribution of a quantitative variable by representing counts as bars against bin values.

2
Q

How are bins used in the context of histograms?

A

Bins slice up all possible values of the quantitative variable and count the number of cases in each bin.

3
Q

What does the height of each bar in a histogram represent?

A

The number of cases that fall within the corresponding bin.
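A minimal sketch of how bins turn raw values into bar heights. The magnitudes, the starting point of 5.0, and the bin width of 1.0 are made up for illustration and are not the textbook's data.

```python
import math
from collections import Counter

def bin_counts(values, start, width):
    """Count how many values fall in each bin [start + k*width, start + (k+1)*width)."""
    counts = Counter()
    for v in values:
        k = math.floor((v - start) / width)  # index of the bin that contains v
        counts[k] += 1
    return dict(sorted(counts.items()))

# Hypothetical magnitudes, invented for this example
magnitudes = [5.6, 6.1, 6.4, 7.0, 7.0, 7.2, 7.8, 8.3, 9.1]
counts = bin_counts(magnitudes, start=5.0, width=1.0)

# Each bar's height is simply the count for that bin
for k, n in counts.items():
    print(f"[{5.0 + k:.1f}, {6.0 + k:.1f}): {'#' * n}")
```

Changing `width` and re-running is a quick way to see features appear or disappear as the bins change.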

4
Q

What is the typical number of bins recommended for a histogram?

A

Between 5 and 30 bins.

5
Q

What might affect the features of a distribution in a histogram?

A

The choice of bin width.

6
Q

True or False: In modern software, the user must always choose the bin width.

A

False.

7
Q

Fill in the blank: A histogram displays the _______ of earthquake magnitudes.

A

distribution.

8
Q

What is a common characteristic of the earthquake magnitudes displayed in the histogram?

A

Most are between 5.5 and 8.5.

9
Q

What are the values of the earthquakes associated with the Sumatra and Japan tsunamis?

A

9.1 and 9.0.

10
Q

What is the significance of the earthquake magnitudes of 9.1 and 9.0?

A

They are among the largest on record.

11
Q

What is the unit of measurement for the Richter scale?

A

Log dyne-cm.

12
Q

True or False: The units of the Richter scale are often explicitly stated in modern contexts.

A

False.

13
Q

What is a key consideration when choosing bin boundaries for a histogram?

A

They should be aesthetically pleasing.

14
Q

What is the typical magnitude shown in the histogram?

A

Typically around 7.

15
Q

What can be a consequence of choosing an inappropriate bin width?

A

It can lead to misinterpretation of the data.

16
Q

What might help a user avoid being misled by errors in data when creating a histogram?

A

Imagining what the distribution might look like before making the display.

17
Q

Fill in the blank: A histogram is a common type of display for the _______ of a quantitative variable.

A

distribution.

18
Q

What happens to features of the distribution as the bin width changes?

A

Some features appear or disappear

19
Q

What does a histogram with a smaller bin width of 0.2 reveal about tsunami data?

A

It shows a spike around magnitude 7.0

20
Q

What historical values are included in the tsunami earthquake magnitudes?

A

Values more than 2000 years old

21
Q

Why might there be an overabundance of values of 7.0 in the tsunami data?

A

Experts may have rounded values to 7

22
Q

What does a gap in a histogram indicate?

A

There are no values in that bin

23
Q

What was the range of values with no earthquakes in the histogram?

A

Between 4.6 and 4.8

24
Q

What is the purpose of a relative frequency histogram?

A

To replace counts with percentages of total cases

25
How does the shape of a relative frequency histogram compare to a standard histogram?
The shape is exactly the same; only the vertical scale differs
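A small sketch of the conversion from counts to relative frequencies. The bin counts are hypothetical. Because every count is divided by the same total, the bars keep their relative heights and only the vertical scale changes.

```python
def relative_frequencies(counts):
    """Convert bin counts to percentages of the total number of cases."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Hypothetical bin counts, invented for this example
counts = [1, 2, 4, 1, 2]
percents = relative_frequencies(counts)
print(percents)  # [10.0, 20.0, 40.0, 10.0, 20.0]
```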
26
What do histograms help summarize?
The distribution of a quantitative variable
27
What limitation do histograms have regarding data values?
They do not show the data values themselves
28
What is a stem-and-leaf display also known as?
Stemplot
29
Who devised the stem-and-leaf display?
John Tukey
30
What is a stem-and-leaf display?
A stem-and-leaf display shows the shape of a distribution, much like a histogram, while also showing the individual data values.
31
How is a stem-and-leaf display constructed?
By taking the tens place of a number as the 'stem' and the ones place as the 'leaf'.
32
What does the line 5|6 in a stem-and-leaf display represent?
It represents a pulse rate of 56 beats per minute (bpm).
33
In the stem-and-leaf display, what does the line 6|0444 indicate?
It indicates four pulse rates: one of 60 bpm and three of 64 bpm.
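The construction described in these cards can be sketched as follows, assuming non-negative two-digit whole numbers. The function name `stem_and_leaf` is hypothetical, and the list holds only the five pulse rates the cards mention, not the full data set.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by tens digit (stem) and list the ones digits (leaves) in order."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)  # tens place is the stem, ones place is the leaf
    return {stem: "".join(str(leaf) for leaf in leaves)
            for stem, leaves in sorted(stems.items())}

# Only the pulse rates named in the cards: one 56, one 60, three 64s
pulses = [56, 60, 64, 64, 64]
display = stem_and_leaf(pulses)
for stem, leaves in display.items():
    print(f"{stem}|{leaves}")
```

This prints one row per stem, with the stem to the left of the bar and one leaf digit per case to the right.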
34
Why are stem-and-leaf displays particularly useful?
They are useful for batches of fewer than a few hundred data values and allow for quick display and recording of numbers.
35
What unusual observation can be made about the leaves in the pulse data?
All the leaves are even, and all the pulse rates are multiples of four.
36
True or False: A histogram shows individual data values.
False.
37
Fill in the blank: The nurse likely took the pulses by counting beats for a full minute or counting for ______ seconds and multiplying by four.
15
38
What is a key advantage of using a stem-and-leaf display over a histogram?
It allows for the visibility of individual values in the data.
39
What is the general shape of a stem-and-leaf display when viewed sideways?
It resembles the histogram of the same data.
40
What is a dotplot?
A simple display that places a dot along an axis for each case in the data. ## Footnote Dotplots are particularly useful for visualizing small data sets.
41
How does a dotplot compare to a stem-and-leaf display?
It is similar to a stem-and-leaf display but uses dots instead of digits for all the leaves. ## Footnote This makes dotplots easier for those who may forget how to write digits.
42
What is the primary purpose of a dotplot?
To display a small data set visually. ## Footnote Dotplots help in quickly understanding the distribution of data points.
43
What is the definition of mode in statistics?
The mode is the value that appears most frequently in a data set. ## Footnote Modes can be single (unimodal), two (bimodal), or multiple (multimodal).
44
What type of histogram has one peak?
Unimodal. ## Footnote An example would be the histogram of earthquake magnitudes showing a single mode.
45
What do we call histograms with two peaks?
Bimodal. ## Footnote A bimodal histogram indicates two different modes in the data.
46
What is the importance of the shape of a histogram?
The shape helps describe the distribution, its center, spread, and any unusual features. ## Footnote Understanding the shape aids in explaining data to others.
47
What condition must be checked before displaying data in a histogram?
Quantitative Data Condition. ## Footnote This ensures that the data represents values of a quantitative variable.
48
True or False: A bar chart can be used to display categorical data.
True. ## Footnote Bar charts are suitable for categorical data, while histograms are for quantitative data.
49
What is a stem-and-leaf display used for?
To visualize quantitative data while preserving the original data values. ## Footnote It allows for an easy way to see the shape of the distribution.
50
What do dotplots show about distribution?
Basic facts such as the slowest and quickest values in a data set. ## Footnote Dotplots can reveal clusters of data points.
51
Fill in the blank: A histogram with three or more peaks is called _______.
Multimodal. ## Footnote Multimodal distributions can indicate diverse data patterns.
52
What are pictographs?
Pictures used to represent data instead of dots. ## Footnote They can provide a visual summary of data but may lack precision.
53
What unusual feature was observed in the Kentucky Derby times?
Two clusters of winning times were found. ## Footnote One cluster was just below 160 seconds, and the other was around 122 seconds.
54
What type of data can be displayed in a histogram?
Quantitative data. ## Footnote Histograms are not suitable for categorical data.
55
What should be done before making a histogram?
Check the Quantitative Data Condition. ## Footnote This ensures the data type is appropriate for a histogram.
56
What is a histogram where all bars are approximately the same height called?
Uniform histogram ## Footnote This type of histogram does not have a mode.
57
What does it mean for a histogram to be symmetric?
Values match closely when folded along a vertical line through the middle ## Footnote The edges of the histogram should align closely if it is symmetric.
58
What are the ends of a distribution called?
Tails ## Footnote The tails can indicate skewness in the distribution.
59
What does it mean if a histogram is skewed?
One tail stretches out farther than the other ## Footnote This indicates that the distribution is not symmetric.
60
How is the mode defined for categorical variables?
The single value that appears most often ## Footnote This is determined by counting the number of cases for each category.
61
Why is the mode for quantitative variables considered ambiguous?
It may not represent a single summary value ## Footnote For quantitative data, it is better understood as the peak of the histogram.
62
What was significant about the Kentucky Derby race times regarding the mode?
There were two distinct modes representing different versions of the race ## Footnote This suggests that the two versions should be considered separately.
63
Fill in the blank: The mode is sometimes defined as the _______ that appears most often.
single value ## Footnote This definition applies primarily to categorical variables.
64
What should you always mention in your data analysis?
Any stragglers or outliers ## Footnote Outliers can provide interesting insights or indicate errors that need addressing.
65
How can outliers affect data analysis?
They can affect almost every method discussed in the course ## Footnote Outliers may be the most informative part of your data or just errors.
66
What is an outlier?
A data point that stands off away from the body of the distribution ## Footnote Outliers should be treated specially and discussed in data presentations.
67
What should you do if you find an outlier?
Try to explain it and set it aside rather than letting it distort analysis ## Footnote Discussing outliers helps provide a clearer understanding of the true data story.
68
What is a gap in the distribution?
A space between groups of data points ## Footnote Gaps can indicate multiple modes and suggest the presence of different sources or groups in the data.
69
What rule of thumb will be learned later in the chapter?
A handy rule for deciding when a point might be considered an outlier ## Footnote This will assist in identifying significant deviations in data.
70
In the context of the Kentucky Derby data, what does a large gap indicate?
It indicates the presence of multiple groups of times ## Footnote This highlights the need to notice when data may come from different sources.
71
True or False: Outliers should always be discarded from data analysis.
False ## Footnote Outliers may contain valuable information and should be discussed.
80
What is the average tsunami-causing earthquake magnitude?
7.08
81
What Greek letter is used to denote 'sum'?
Sigma (Σ)
82
What is the formula for calculating the mean?
Add all values of the variable and divide by the number of data values, n
83
What is the value calculated when averaging data called?
Mean, written ȳ (read "y-bar")
84
How is the mean described in relation to a histogram?
It is the point at which the histogram balances
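A short sketch of the mean as a balancing point: the deviations above the mean exactly offset the deviations below it, so they sum to zero. The seven values are borrowed from the standard deviation cards later in this deck.

```python
def mean(values):
    """Add all the values and divide by the number of values, n."""
    return sum(values) / len(values)

values = [4, 3, 10, 12, 8, 9, 3]
y_bar = mean(values)                      # 49 / 7
deviations = [y - y_bar for y in values]  # distance of each value from the mean

print(y_bar)            # 7.0
print(sum(deviations))  # 0.0, because the mean balances the deviations
```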
85
What happens to the mean when data is skewed or has outliers?
The mean may not represent the typical value well
86
What was the mean salary for the Toronto Maple Leaf players in the 2012-2013 season?
$1,996,143
87
True or False: The mean is always a good representation of a typical value in skewed distributions.
False
88
Fill in the blank: The mean is often denoted by _______.
ȳ (read "y-bar")
89
What does the variable x typically represent in statistical models?
Variables used to explain, model, or predict y
90
What is a common misconception about the term 'average'?
It may refer to the mean, but we don't average people
91
What is the significance of the balancing point in a histogram?
It indicates the mean value of the data distribution
92
What is the definition of the mean in a dataset?
The mean is the average of all values in the dataset.
93
What does it indicate if a distribution is skewed to the right?
It indicates that the mean is higher than the median due to high values on the right side.
94
What is the median in statistical terms?
The median is the value that splits the dataset into two equal halves.
95
How is the median calculated when the number of values (n) is odd?
The median is the middle value in the ordered list.
96
How is the median calculated when the number of values (n) is even?
The median is the average of the two middle values.
97
In the provided example, what is the median salary of the players?
$1,150,000.
98
What is a resistant measure in statistics?
A resistant measure is one that is not significantly affected by outliers.
99
Why might the mean not be a good summary of the center in a skewed distribution?
Because it can be heavily influenced by extreme values or outliers.
100
What is the balancing point in the context of mean and median?
The balancing point refers to the mean, which can be skewed by high or low values.
101
True or False: The median is always higher than the mean in a right-skewed distribution.
False.
102
Fill in the blank: The median is resistant to _______.
outliers.
103
What should be used to summarize salaries in the presence of outliers?
The median salary.
104
What is the ordered list of the given sample values: 14.1, 3.2, 25.3, 2.8, -17.5, 13.9, 45.8?
-17.5, 2.8, 3.2, 13.9, 14.1, 25.3, 45.8.
105
How do you find the median of the values: 14.1, 3.2, 25.3, 2.8, -17.5, 13.9, 35.7, 45.8?
Order the values and average the 4th and 5th values.
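The odd and even cases can be sketched in one function, using the two sample lists from the cards above. The function name is illustrative and not taken from the textbook.

```python
def median(values):
    """Middle value of the ordered list, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the two middle values

seven = [14.1, 3.2, 25.3, 2.8, -17.5, 13.9, 45.8]
eight = seven + [35.7]

print(median(seven))  # 13.9, the 4th of 7 ordered values
print(median(eight))  # average of 13.9 and 14.1, about 14.0
```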
106
If the dataset has extreme values, which measure is generally preferred: mean or median?
Median.
107
What value is considered the 'middle value' when counting from both ends in an ordered dataset?
The median.
108
What does the term 'skewed distribution' refer to?
A distribution where values are not symmetrically distributed around the mean.
109
What is the effect of extreme values on the mean of a distribution?
Extreme values can pull the mean away from the center. ## Footnote This is similar to a see-saw where one heavy child can shift the balance.
110
What is usually a better descriptor of center for skewed distributions?
The median is usually a better descriptor. ## Footnote The median is less affected by extreme values compared to the mean.
111
What are the mean and median values for the 207 recent tsunami-causing earthquakes?
Mean is 7.10 and median is 7.20. ## Footnote This indicates that the data may not be significantly skewed.
112
When should the mean be preferred over the median?
When the distribution is unimodal and symmetric. ## Footnote The mean gives equal weight to all data values.
113
What should you do if a histogram is roughly symmetric and there are no outliers?
Prefer the mean as the measure of center.
114
What should you do if the histogram is skewed or has outliers?
Prefer the median as the measure of center.
115
What should you report if you're unsure whether to use mean or median?
Report both and discuss why they might differ.
116
What is the mean expenditure and median expenditure in the example provided?
Mean expenditure is $478.19 and median expenditure is $216.28.
117
Why is the median a more appropriate measure of center in the expenditure example?
Because the distribution of expenditures is skewed.
118
What does the median represent in a data set?
The middle value, with half the expenditures above and half below. ## Footnote Unlike the mean, the median is not affected by outliers.
119
When can we expect some variables to be skewed?
When values are bounded on one side but not the other.
120
Give examples of variables that are often skewed to the right.
* Incomes
* Waiting times
* Survival times
* Amounts of things
121
What happens to the distribution if a test is too easy?
The distribution will be skewed to the left.
122
What can cause a skewed or bimodal distribution in a hockey team's salaries?
A small group of superstars with inflated salaries. ## Footnote Including these with typical players and minimum salary players can distort the distribution.
123
What is the difference between mean and median salary in the context of a sports team?
The median salary reflects what the 'typical' player earns, while the mean salary represents the average paid out per player, affected by star players. ## Footnote The mean can be significantly higher than the median due to a few high salaries, creating a right-skewed distribution.
124
Why might an investor be more interested in the mean of past returns rather than the median?
The mean of past returns is more relevant for long-term investment returns, as it aligns with the Law of Large Numbers, indicating future returns will approximate the mean. ## Footnote This principle assumes that future returns will be similar to past returns.
125
What is the mode in statistics?
The mode is the value that appears most frequently in a data set. ## Footnote In a bimodal distribution, there are two modes, which can be more informative than mean or median.
126
What is a trimmed mean?
A trimmed mean is calculated by removing a small percentage of the highest and lowest values before averaging the remaining data. ## Footnote This method helps reduce the influence of outliers on the mean.
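A minimal sketch of a trimmed mean, assuming the same proportion is cut from each end and that the proportion is below 0.5. The data and the 10% trimming level are hypothetical.

```python
def trimmed_mean(values, proportion):
    """Drop the given proportion of values from each end, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)   # number of values to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value
data = [2, 3, 3, 4, 4, 5, 5, 6, 7, 100]

print(sum(data) / len(data))    # 13.9, pulled up by the value 100
print(trimmed_mean(data, 0.1))  # 4.625, after dropping 2 and 100
```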
127
How is the range defined in statistics?
The range is the difference between the maximum and minimum values in a data set. ## Footnote Formula: Range = max - min.
128
What does the standard deviation measure?
The standard deviation measures how spread out the values in a data set are around the mean. ## Footnote It provides a more comprehensive understanding of data variability than the range.
129
What is the formula for calculating the range?
Range = max - min. ## Footnote The range is sensitive to outliers, which can skew the interpretation.
130
What is a deviation in statistics?
A deviation is the difference between an individual data value and the mean of the data set. ## Footnote Examining deviations helps understand the spread of data.
131
Why do statisticians square deviations when calculating variance?
Squaring deviations prevents cancellation of positive and negative values and emphasizes larger differences. ## Footnote Squaring ensures all values are positive, which is essential for statistical calculations.
132
What is variance in statistics?
Variance is (almost) the average of the squared deviations from the mean. ## Footnote It is calculated by dividing the sum of squared deviations by n-1, not n, so it is not exactly an average.
133
What is the relationship between variance and standard deviation?
The standard deviation is the square root of the variance, providing a measure of spread in the original units of the data. ## Footnote Variance is in squared units, which can be less intuitive.
134
What does a standard deviation of 0.77 indicate in the context of tsunami-causing earthquakes?
A standard deviation of 0.77 indicates how much individual earthquake magnitudes typically vary from the mean magnitude. ## Footnote This value is in Richter scale units.
135
What does the Empirical Rule state about data distribution?
The Empirical Rule states that:
* About 68% of the data lies within 1 standard deviation of the mean.
* About 95% of the data lies within 2 standard deviations of the mean.
* Virtually all data lies within 3 standard deviations of the mean.
136
True or False: The average deviation is a useful measure of spread.
False. ## Footnote The average deviation is always zero because positive and negative deviations cancel each other out.
137
What does the empirical rule state about data distribution?
About 95% of data falls within 2 standard deviations from the mean.
138
What is the mean mass reported for the subjects in the study?
70 kg.
139
What is the standard deviation of the subjects' mass in the study?
10 kg.
140
According to the empirical rule, how many subjects had a mass less than 50 kg or more than 90 kg?
Likely none.
141
In the context of tsunami-causing earthquakes, what range of magnitudes corresponds to approximately 95% of cases?
Between 5.54 and 8.62.
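The interval in this card is the mean plus or minus two standard deviations. A quick check, using the mean of 7.08 and standard deviation of 0.77 quoted on other cards in this deck:

```python
def two_sd_interval(mean, sd):
    """Interval expected to hold about 95% of cases under the Empirical Rule."""
    return (mean - 2 * sd, mean + 2 * sd)

low, high = two_sd_interval(7.08, 0.77)
print(round(low, 2), round(high, 2))  # 5.54 8.62
```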
142
True or False: The standard deviation is resistant to changes in small portions of the data set.
False.
143
What effect does an outlier have on the standard deviation?
It can greatly inflate its value.
144
What is the first step in finding the standard deviation?
Calculate the mean.
145
Fill in the blank: To find the variance, you divide the sum of squared deviations by _______.
n - 1.
146
What are the squared deviations for the values 4, 3, 10, 12, 8, 9, and 3?
9, 16, 9, 25, 1, 4, 16 (in the same order as the values; the mean is 7)
147
What is the final standard deviation calculated from the values 4, 3, 10, 12, 8, 9, and 3?
3.65.
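The steps in these cards (find the mean, square each deviation, divide the sum by n - 1, take the square root) can be sketched as below. The function name is illustrative.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    y_bar = sum(values) / n
    squared_devs = [(y - y_bar) ** 2 for y in values]
    return sum(squared_devs) / (n - 1)

values = [4, 3, 10, 12, 8, 9, 3]
variance = sample_variance(values)  # 80 / 6
sd = math.sqrt(variance)            # back in the original units

print(round(variance, 2))  # 13.33
print(round(sd, 2))        # 3.65
```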
148
What rule can be used to make statements about an arbitrary distribution of any shape?
Chebyshev's Rule.
149
What is the formula for calculating the squared deviation for each value?
(y - ȳ)^2, where ȳ is the mean.
150
How does the size of the data set affect the impact of an outlier on the standard deviation?
A larger data set dilutes the effect of an outlier.
151
What is the purpose of the Normal model in statistics?
It provides a complete description using mean and standard deviation.
152
What happens to the contribution of a squared deviation in a large data set?
It is swamped by other terms.
153
What is the third quartile (Q3) value for the earthquake magnitudes?
7.6 ## Footnote Q3 represents the 75th percentile of the data.
154
What does IQR stand for?
Interquartile Range ## Footnote IQR is a measure of statistical dispersion.
155
What is the median value of the earthquake magnitudes?
7.2 ## Footnote The median is the middle value in a data set.
156
How is the 5-number summary defined?
It reports the median, quartiles, and extremes of a distribution ## Footnote The 5-number summary includes minimum, Q1, median, Q3, and maximum.
157
What is the lower quartile (Q1) for the earthquake magnitudes?
6.7 ## Footnote Q1 represents the 25th percentile of the data.
158
What is the maximum value of the earthquake magnitudes given?
9.1 ## Footnote The maximum is the highest value in the data set.
159
What is the interquartile range (IQR) calculated for the data?
0.9 ## Footnote IQR is calculated as Q3 - Q1.
160
True or False: The lower quartile is the median of the upper half of the data.
False ## Footnote The lower quartile is the median of the lower half of the data.
161
Fill in the blank: The difference between the upper and lower quartiles is called the _______.
Interquartile Range ## Footnote IQR helps to understand the spread of the middle half of the data.
162
What happens when the data set is small regarding quartile calculations?
Different methods may yield differing quartile values ## Footnote Small data sets can lead to less consensus among statistical methods.
163
What is the significance of the quartiles in a data distribution?
They help to define the spread of the middle 50% of the data ## Footnote Quartiles provide a summary of the data's distribution.
164
In the context of quartiles, what does Q2 represent?
The median ## Footnote Q2 is the 50th percentile of the data.
165
What is the importance of reporting both quartiles and IQR?
They provide a comprehensive view of data spread ## Footnote This is particularly useful for understanding skewed distributions.
166
What does a skewed distribution imply about the quartiles?
The IQR may not be particularly useful as a descriptor ## Footnote Skewness can affect the interpretation of spread.
167
When finding quartiles, how is the median handled if the number of values is odd?
Exclude the median in both halves ## Footnote This method ensures accurate quartile calculations.
168
What should be included in a summary of earthquake magnitudes?
Minimum, Q1, median, Q3, maximum ## Footnote This forms the complete 5-number summary.
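A sketch of a 5-number summary that takes the quartiles as medians of the lower and upper halves and leaves the median out of both halves when n is odd, as an earlier card describes. Other quartile methods exist and can give slightly different values, especially for small data sets. The sample values are reused from the standard deviation cards.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values below the median position
    upper = ordered[half + n % 2:]   # skips the median itself when n is odd
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

data = [4, 3, 10, 12, 8, 9, 3]
summary = five_number_summary(data)
print(summary)  # (3, 3, 8, 10, 12)
```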
169
What is meant by 'the middle half' of the data?
The range between the lower and upper quartiles ## Footnote This represents the interquartile range.
170
What is the purpose of a boxplot?
To visually display the distribution of a quantitative variable using a 5-number summary ## Footnote A boxplot highlights key features such as quartiles and potential outliers.
171
What is the median magnitude of the tsunami-causing earthquakes discussed?
7 ## Footnote The median is the middle value in a sorted list of numbers.
172
What does the IQR stand for?
Interquartile Range ## Footnote The IQR is the difference between the upper quartile (Q3) and lower quartile (Q1).
173
How is the upper fence for a boxplot calculated?
Upper fence = Q3 + 1.5 × IQR ## Footnote The upper fence helps identify potential outliers in the data.
174
What was the calculated upper fence for the earthquake magnitude data?
8.95 ## Footnote This was calculated using Q3 (7.6) and IQR (0.9).
175
What is the lower fence for a boxplot?
Lower fence = Q1 - 1.5 × IQR ## Footnote The lower fence is used to identify outliers below the lower quartile.
176
What was the calculated lower fence for the earthquake magnitude data?
5.35 ## Footnote This was calculated using Q1 (6.7) and IQR (0.9).
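Both fences follow from the quartiles alone. A quick check of the numbers on these cards, using Q1 = 6.7 and Q3 = 7.6 (rounding only tidies floating-point noise):

```python
def fences(q1, q3):
    """Lower and upper fences: 1.5 IQRs beyond the quartiles."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

lower, upper = fences(6.7, 7.6)
print(round(lower, 2), round(upper, 2))  # 5.35 8.95

# A value beyond a fence is a suspect outlier, e.g. the maximum magnitude of 9.1
print(9.1 > upper)  # True
```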
177
What does a boxplot represent when the median is centered between the quartiles?
The middle half of the data is roughly symmetric ## Footnote This indicates that the distribution is balanced.
178
What does it indicate if the median is not centered in a boxplot?
The distribution is skewed ## Footnote Skewness refers to the asymmetry of the distribution of values.
179
What should be included in a boxplot to represent potential outliers?
Special symbols for data values lying beyond the fences ## Footnote These values are referred to as suspect outliers.
180
How many earthquakes were analyzed in the data?
207 ## Footnote The total number of cases provides context for the analysis.
181
What is the range of magnitudes for the middle half of the earthquakes?
Between 6.7 and 7.6 ## Footnote This range is determined by the quartiles.
182
Fill in the blank: The boxplot is constructed using _______ to represent the quartiles.
horizontal lines ## Footnote These lines are drawn at the lower and upper quartiles and at the median.
183
True or False: The whiskers in a boxplot connect to all data values.
False ## Footnote Whiskers only connect to the most extreme data values within the fences.
184
What does the height of the box in a boxplot represent?
The IQR ## Footnote The height indicates the range of the middle 50% of the data.
185
What does a boxplot visually highlight about the data?
Central tendency, spread, and potential outliers ## Footnote These features help in understanding the distribution of the data.
186
What do boxplots display about data?
Boxplots display the distribution of data, including median, quartiles, and potential outliers. ## Footnote Boxplots help visualize the spread and symmetry of data distributions.
187
What is indicated by the shape of the central box in a boxplot?
The shape of the central box indicates the symmetry of the data distribution. ## Footnote A roughly symmetrical box suggests a balanced distribution around the median.
188
What does a longer lower whisker in a boxplot suggest?
A longer lower whisker suggests that the distribution stretches out slightly at the lower end. ## Footnote This can indicate a skewness in the data towards lower values.
189
What range of earthquake magnitudes does the central box represent in the tsunami-causing earthquake data?
The central box represents earthquakes with magnitudes between 6.7 and 7.6 on the Richter scale. ## Footnote This range captures the central tendency of the earthquake data.
190
True or False: Boxplots are ineffective at identifying outliers.
False ## Footnote Boxplots are particularly good at pointing out possible outliers.
191
Fill in the blank: Boxplots encourage you to give special attention to _______.
Outliers ## Footnote Outliers may represent errors or significant cases within the dataset.
192
What does the presence of very small magnitude earthquakes in the data suggest?
The presence of very small magnitude earthquakes indicates variability in the dataset. ## Footnote These small values can provide context for the range of earthquake activity.
193
What are outliers in the context of boxplots?
Outliers are data points that fall outside the typical range of the dataset. ## Footnote They can be either errors or noteworthy cases that deserve further investigation.
194
What event caused a significant shutdown of air travel in September 2001?
The attacks of September 11
195
What is a fundamental concept in statistics that relates to variation?
Spread
196
How do measures of spread help in statistics?
They help to be precise about what we don't know.
197
What happens to the IQR and standard deviation if many data values are scattered far from the center?
They will be large.
198
What happens to measures of spread if data values are close to the center?
They will be small.
199
If all data values are exactly the same, what would the measures of spread be?
Zero.
200
Why do we always report a spread along with any summary of the center?
Measures of spread tell how well other summaries describe the data.
201
What was the median used to describe in the example regarding credit card expenditures?
The center of the distribution.
202
What are the quartiles mentioned in the example for credit card expenditures?
$73.84 and $624.80
203
What is the IQR and why is it a suitable measure of spread?
The IQR is the difference between the quartiles: the range within which the middle 50% of data values lie. It is suitable here because it is resistant to skewness and outliers.
204
What is the first step to analyze a quantitative variable?
Make a histogram or stem-and-leaf display and discuss the shape of the distribution.
205
What should be discussed after the shape of the distribution?
The centre and spread.
206
What should be reported if the shape of the distribution is skewed?
The 5-number summary, the median, and the IQR.
207
When discussing the mean and median, what should be pointed out?
Why the mean and median differ.
208
What should be reported for symmetric shapes?
The mean and standard deviation, and possibly the median and IQR.
209
In unimodal symmetric data, how does the IQR usually compare to the standard deviation?
The IQR is usually a bit larger than the standard deviation.
210
What should be done if the IQR is not larger than the standard deviation?
Check if the distribution is skewed and if there are any outliers.
211
What pairs should always be reported together?
Median with IQR and mean with standard deviation.
212
Why is it dangerous to report a centre without a spread?
It gives a false sense of knowledge about the distribution.
213
What should be discussed regarding unusual features in the data?
Any unusual features should be pointed out.
214
What should be done if there are multiple modes in the data?
Try to understand why and consider splitting the data into separate groups.
215
When reporting mean and standard deviation, what should be included regarding outliers?
Report them with the outliers present and with the outliers omitted.
216
How are the median and IQR affected by outliers?
They are not affected very much by the outliers.
217
What is the first step to create a histogram on a TI-83/84 PLUS?
Turn a STATPLOT on.
218
What must you choose to create a histogram after turning on STATPLOT?
Choose the histogram icon and specify the List where the data are stored.
219
What command do you use to adjust the viewing window after setting up a histogram?
ZoomStat, then adjust the WINDOW appropriately.
220
Which option from the STAT CALC menu is used to calculate summary statistics?
1-VarStats.
221
What must you specify to calculate summary statistics?
The List where the data are stored.
222
How can you view the 5-number summary when calculating summary statistics?
Scroll down.
223
What icon do you use to create a boxplot on a TI-83/84 PLUS?
Box plot icon.
224
If the data are stored as a frequency table, what do you set for Xlist and Freq?
Xlist: L1 and Freq: L2.