CHAPTER 15 Turn Statistics into Substance Flashcards
What is a common issue with how statistics are reported?
Statistics are often reported or presented in ways that are misleading or unhelpful for decision making.
What is necessary to translate a statistic into useful information?
Think clearly about the question at hand.
What does Bayes’ rule help us with?
Bayes’ rule helps us update beliefs in response to new information.
What must be combined with evidence-based beliefs for decision making?
Your values.
What is the key to turning statistics into substance?
Ask and answer the question you really care about.
How can the presentation of information affect its perceived meaning?
Changing scales can dramatically alter whether a relationship seems large or small, important or unimportant.
What is the difference between miles-per-gallon and gallons-per-mile?
Miles-per-gallon tells how far a car drives given how much gasoline is burned, while gallons-per-mile indicates how much gasoline is burned given how far it is driven.
In terms of reducing gas consumption, which vehicle type should be prioritized?
Gas-guzzling SUVs.
What is one consequence of using miles-per-gallon as a metric?
It may mislead consumers and regulators into making bad decisions.
What is a better measure of fuel efficiency than miles-per-gallon?
Gallons-per-hundred-miles.
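A minimal sketch of why gallons-per-hundred-miles is the more informative scale. The two vehicles and their mileage figures are hypothetical, chosen only to illustrate the point that the same 2-miles-per-gallon gain saves far more gasoline on a gas guzzler.

```python
def gallons_per_100_miles(mpg: float) -> float:
    """Convert miles-per-gallon into gallons burned per 100 miles driven."""
    return 100.0 / mpg

# Hypothetical vehicles: each one improves by 2 miles-per-gallon.
suv_saving = gallons_per_100_miles(10) - gallons_per_100_miles(12)
sedan_saving = gallons_per_100_miles(30) - gallons_per_100_miles(32)

print(f"SUV saves {suv_saving:.2f} gallons per 100 miles")
print(f"Sedan saves {sedan_saving:.2f} gallons per 100 miles")
```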
What confusion can arise from percent changes versus percentage point changes?
A percent change is a ratio of the percentage point change to the initial value, while percentage point change is the numerical difference between two percentages.
What does a 44% reduction in heart-related risks imply without context?
It does not necessarily indicate a significant number of lives saved.
What is the baseline heart attack risk in the studied population?
About 2.8%.
What is the actual reduction in heart attacks with a 44% decrease?
About one percentage point reduction.
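The arithmetic behind the last two cards can be sketched as follows, using only the 2.8% baseline and the 44% reduction stated above. The variable names are illustrative.

```python
baseline_risk_pct = 2.8     # baseline heart attack risk, in percent
relative_reduction = 0.44   # the reported 44% reduction

# A percent change applies to the baseline, so the absolute drop is small.
reduction_pp = baseline_risk_pct * relative_reduction
new_risk_pct = baseline_risk_pct - reduction_pp

print(f"Reduction: {reduction_pp:.2f} percentage points")
print(f"Risk after treatment: {new_risk_pct:.2f}%")
```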
Why are visual presentations of data important?
They help in accurately and informatively displaying quantitative information.
What is essential when creating data visualizations?
Choosing the scale on which to present data.
What can a seemingly innocuous change of scale in a graph do?
Transform a graph to make a relationship look enormous or inconsequential.
What should be questioned when viewing a data visualization?
The underlying data and analyses, assumptions, and whether the findings answer the question being asked.
Fill in the blank: Quantitative evidence, on its own, can’t tell you what to ______.
do.
Fill in the blank: The same 2-miles-per-gallon improvement has a much larger effect on gas consumption when applied to a ______ vehicle.
gas-guzzling.
What is the significance of choosing the scale for data visualization?
Choosing the scale can significantly alter the perception of the data, making relationships appear large or small.
What effect does changing the vertical axis scale have in a bar graph comparing numbers 89 and 90?
It can make the two numbers appear vastly different or nearly identical.
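A rough sketch of the effect, without drawing anything: it compares how tall the two bars would be drawn under two choices of axis minimum. The axis minimum of 88 is a hypothetical choice for illustration.

```python
def drawn_height_ratio(a: float, b: float, axis_min: float) -> float:
    """Ratio of the drawn bar heights when the vertical axis starts at axis_min."""
    return (b - axis_min) / (a - axis_min)

# Axis starting at 0: the bars look nearly identical.
ratio_from_zero = drawn_height_ratio(89, 90, axis_min=0)

# Axis starting at 88 (hypothetical): the second bar is drawn twice as tall.
ratio_from_88 = drawn_height_ratio(89, 90, axis_min=88)

print(f"Axis from 0:  second bar is {ratio_from_zero:.3f}x the first")
print(f"Axis from 88: second bar is {ratio_from_88:.3f}x the first")
```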
What should you consider when interpreting a graph?
Carefully read the axes and consider what the numbers mean substantively.
True or False: There is always a correct scale for data visualization.
False
In what context might the difference between 89 and 90 be substantively significant?
If you are chaperoning schoolchildren, the difference between 89 and 90 students returning home safely matters a great deal.
What happens if a graph is on a scale too large?
Important information may be hidden, making substantively meaningful differences difficult to see.
If a 1-point difference is substantively negligible, what scale would appropriately reflect that?
A scale of 0 to 100.
What can changing the scale of axes in a graph influence?
It can make correlations look strong or weak and affect the appearance of linear relationships.
What did Achen and Bartels argue regarding voters’ policy views and political behavior?
They argued that policy views have little relationship to political behavior, which is driven by non-policy concerns.
In their analysis, what did Achen and Bartels use to support their claim about party affiliation?
A visual representation of data showing trends in party identification among white Southerners.
What was the trend in party identification for white Southerners from 1960 to the end of the twentieth century?
They shifted from overwhelmingly Democratic to overwhelmingly Republican.
What does the vertical axis in the visualization of party identification measure?
The Democratic margin, calculated as the percent identifying as Democratic minus the percent identifying as Republican.
What did the figure’s large vertical axis scale potentially obscure?
Substantively meaningful differences in party identification trends.
How did the trends in partisanship differ between those who opposed and did not oppose integration?
Those who opposed integration switched partisan affiliation at a faster rate.
What was the margin change for white Southerners who opposed integration from 1962 to 2000?
From a 48-point Democratic margin to an 18-point Republican margin.
What was the change in the Democratic margin for those who did not oppose integration during the same period?
From a 32-point Democratic margin to a 1-point Republican margin.
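The size of each swing can be worked out directly from the margins on the two cards above. This sketch treats a Republican margin as a negative Democratic margin.

```python
# Democratic margin = % Democratic minus % Republican.
# A Republican margin is recorded as a negative number.
opposed_1962, opposed_2000 = 48, -18
not_opposed_1962, not_opposed_2000 = 32, -1

opposed_swing = opposed_1962 - opposed_2000              # 66 points
not_opposed_swing = not_opposed_1962 - not_opposed_2000  # 33 points

print(f"Opposed integration:         {opposed_swing}-point swing")
print(f"Did not oppose integration:  {not_opposed_swing}-point swing")
```

A 33-point gap between the two swings is easy to miss on a vertical axis scaled to fit the whole shift.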
Fill in the blank: To effectively convey information in data visualizations, one must keep it _______.
simple
What should be the primary focus when creating data visualizations?
Conveying substantive information clearly.
When is it appropriate to use a figure instead of a table?
When the figure conveys more information than a table would.
What is one way to convey uncertainty in data visualizations?
By showing distributions, standard errors, or confidence intervals.
What is Bayes’ rule used for?
To integrate new quantitative information into existing knowledge.
In the example of Juanita Brooks, what evidence was used to charge the Collins couple?
Eyewitness testimony about their characteristics and the yellow car.
What probability of innocence did the mathematician calculate for the Collins couple?
About a 1 in 12 million chance that they were innocent.
How do you calculate the probability of multiple independent events occurring together?
By multiplying the probabilities of each event occurring individually.
How many cards are in a standard deck?
52 cards
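A small sketch of the multiplication rule using a standard deck. The scenario (drawing an ace twice, replacing the card and reshuffling between draws so the draws are independent) is a hypothetical example, not one taken from the cards above.

```python
from fractions import Fraction

DECK_SIZE = 52
ACES = 4

# Probability of drawing an ace on a single draw.
p_ace = Fraction(ACES, DECK_SIZE)

# With replacement, the two draws are independent, so probabilities multiply.
p_two_aces = p_ace * p_ace

print(f"P(ace) = {p_ace}, P(ace twice) = {p_two_aces}")
```

The rule only holds when the events are independent, which is exactly what fails for the characteristics in the Collins case.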
What was the prosecutor’s argument regarding the Collins couple’s characteristics?
The chance that a randomly selected person would have specific characteristics is the product of the probabilities of those characteristics.
What probability did the prosecutor initially calculate for the Collins couple?
1 in 12 million
What did the prosecutor’s analysis underestimate regarding the Collins couple’s probability of innocence?
The probability was likely closer to 1 in 1 billion.
True or False: The characteristics used in the prosecutor’s argument are independent.
False
What is the correct probability to consider when evaluating the Collins couple’s guilt?
The probability that the Collins couple is innocent, given the evidence.
What is conditional probability?
The probability of one event occurring given that another event has occurred.
In the context of the Collins case, what does ‘P(innocent | evidence)’ represent?
The probability that the Collins couple is innocent given that they match the eyewitness description.
What is Bayes’ Rule?
A mathematical formula for calculating the probability of a claim being true, given available evidence.
What does the prior belief represent in Bayes’ Rule?
The probability of a claim being true before considering new evidence.
What does the posterior belief represent in Bayes’ Rule?
The probability of a claim being true after incorporating new evidence.
How did the prosecutor ignore an important factor in the Collins case?
He focused only on the new evidence, neglecting the prior probability of innocence.
What was the prior belief regarding the Collins couple’s innocence?
Very close to 1, since almost all couples in LA were innocent.
What is the significance of the probability 1 in 1,000,000 in the Collins case?
It represents the likelihood that an innocent couple matches the eyewitness description.
According to the analysis, how many innocent couples in Los Angeles matched the eyewitness description?
2 innocent couples
Fill in the blank: The probability the Collins couple is guilty given that they match the eyewitness description is ______.
1 in 3
Why does the Collins couple have a higher probability of being innocent than guilty?
Because out of the three couples matching the description, two are innocent.
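The counting argument can be sketched as below. The figure of 2 million couples is an assumption, inferred from the 1-in-1,000,000 match rate and the two innocent matching couples stated above.

```python
from fractions import Fraction

couples_in_la = 2_000_000                      # ASSUMPTION: implied, not stated
p_match_given_innocent = Fraction(1, 1_000_000)

# Innocent couples expected to match the eyewitness description.
innocent_matching = couples_in_la * p_match_given_innocent
guilty_matching = 1                            # the guilty couple matches

p_guilty_given_match = guilty_matching / (guilty_matching + innocent_matching)

print(f"Innocent couples matching: {innocent_matching}")
print(f"P(guilty | match) = {p_guilty_given_match}")
```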
What was the main error in the prosecutor’s argument?
He answered the wrong question regarding the guilt of the Collins couple.
What was the false negative rate for Test 1 regarding celiac disease?
5 percent
What was the false positive rate for Test 2 regarding celiac disease?
50 percent
What percentage of kids without celiac tested positive on Test 2?
50 percent
If 10,000 kids are tested, how many would be expected to have celiac disease based on the prior belief?
100 kids
What is the probability that a kid tests positive on Test 2, given that they have celiac disease?
80 percent
What is the probability Abe has celiac, given that he was small in stature and tested positive on Test 2?
Approximately 1.6 percent
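The 1.6 percent figure can be reproduced by counting kids, using the 1 percent prior and the Test 2 rates from the cards above. Variable names are illustrative.

```python
kids = 10_000
prior_celiac = 0.01            # prior belief: 1 percent have celiac
p_pos_given_celiac = 0.80      # Test 2 positive rate for kids with celiac
p_pos_given_no_celiac = 0.50   # Test 2 false positive rate

# Count the two groups of kids who test positive.
celiac_positive = kids * prior_celiac * p_pos_given_celiac
no_celiac_positive = kids * (1 - prior_celiac) * p_pos_given_no_celiac

# Posterior: share of positive tests that come from kids with celiac.
posterior = celiac_positive / (celiac_positive + no_celiac_positive)

print(f"P(celiac | positive on Test 2) = {posterior:.3f}")
```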
What is the significance of the false positive and false negative rates being independent?
It allows the probabilities of the separate test results to be multiplied together.
How does the story of Abe’s diagnosis illustrate the use of Bayes’ Rule?
It shows how to update beliefs based on new evidence and prior probabilities.
What is the probability of a false negative on Test 1 for a kid with celiac?
5 percent
This means that Test 1 returns a negative result for a child with celiac disease 5% of the time.
What is the probability of a positive result on Test 2 for a kid with celiac?
80 percent
This indicates that Test 2 returns a positive result for a child with celiac disease 80% of the time.
What is the prior belief regarding the probability that a kid has celiac disease?
1 percent
This is the initial assumption about the prevalence of celiac disease among the kids in question.
What are the two types of kids who might get a negative on Test 1 and a positive on Test 2?
- Kid with celiac disease
- Kid without celiac disease
The first type has celiac and experiences a false negative on Test 1, while the second type does not have celiac and experiences a false positive on Test 2.
What was the probability that Abe had celiac given the two test results?
Approximately 1 in 1,000
This illustrates how Bayesian reasoning can lead to surprising conclusions based on test results.
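A sketch of the two-test update. The cards above do not give the probability that a kid without celiac tests negative on Test 1, so the value of 0.80 used here is an assumption for illustration only. The sketch also treats the two tests as independent given the child's true status.

```python
prior_celiac = 0.01
t1_false_negative = 0.05   # P(negative on Test 1 | celiac)
t2_true_positive = 0.80    # P(positive on Test 2 | celiac)
t2_false_positive = 0.50   # P(positive on Test 2 | no celiac)
t1_true_negative = 0.80    # ASSUMPTION: P(negative on Test 1 | no celiac)

# Two ways to see "negative on Test 1, positive on Test 2".
celiac_path = prior_celiac * t1_false_negative * t2_true_positive
no_celiac_path = (1 - prior_celiac) * t1_true_negative * t2_false_positive

posterior = celiac_path / (celiac_path + no_celiac_path)

print(f"P(celiac | both results) = {posterior:.4f}")
```

Under this assumption the posterior lands close to 1 in 1,000, and it stays in that neighborhood for other plausible values of the assumed rate.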
What was one major new program implemented by the U.S. government in airport security after 9/11?
Screening of Passengers by Observation Techniques (SPOT)
This program aimed to catch potential terrorists using behavioral cues.
What do Behavior Detection Officers look for during airport security screenings?
Indicators of nervousness or suspicious behavior
Different suspicious behaviors were assigned different points to determine if a traveler should be further questioned.
What percentage of the TSA’s annual budget was allocated to the SPOT program by 2010?
5 percent
This amounted to hundreds of millions of dollars per year.
What crucial information does the TSA need to determine the likelihood of a traveler being a terrorist?
- Likelihood of a random traveler being a terrorist
- Likelihood of a terrorist appearing suspicious
- Likelihood of a non-terrorist appearing suspicious
These factors are necessary to form accurate posterior beliefs about a traveler’s risk.
What did the Government Accountability Office (GAO) find regarding the TSA’s knowledge about the SPOT program?
The TSA does not know the answers to key questions about terrorist behavior
This lack of knowledge undermines the efficacy of the SPOT program.
What was the most common reason for detaining individuals identified by a Behavior Detection Officer?
Undocumented immigration status
This indicates a failure of the SPOT program to catch actual terrorists.
How many passenger trips occur through U.S. airports each year according to the GAO?
Approximately 2 billion
This figure highlights the scale of air travel and the challenge of identifying potential terrorists.
How many would-be terrorists were assumed to be in U.S. airports each year for the SPOT program analysis?
100
This is a generous estimate used in the analysis to evaluate the SPOT program.
What is the probability that a person who behaves suspiciously is actually a terrorist, according to the SPOT program assumptions?
Approximately 1 in 200,000
This reflects the very low likelihood of suspicious behavior indicating terrorism even under favorable assumptions.
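A sketch of how a figure of this size can arise. The trip and terrorist counts come from the cards above. The two behavioral rates are assumptions chosen to be favorable to the program: every terrorist looks suspicious and only 1 percent of other travelers do.

```python
trips = 2_000_000_000
terrorists = 100

p_suspicious_given_terrorist = 1.0    # ASSUMPTION: every terrorist is flagged
p_suspicious_given_innocent = 0.01    # ASSUMPTION: 1% of others are flagged

suspicious_terrorists = terrorists * p_suspicious_given_terrorist
suspicious_innocents = (trips - terrorists) * p_suspicious_given_innocent

posterior = suspicious_terrorists / (suspicious_terrorists + suspicious_innocents)

print(f"P(terrorist | suspicious) is about 1 in {1 / posterior:,.0f}")
```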
What does Bayes’ rule help us understand regarding scientific hypotheses?
It helps assess confidence about the truth of a hypothesis based on new evidence
Bayes’ rule provides a structured way to update beliefs in light of new data.
At what value is the significance level in hypothesis testing often set?
0.05
This threshold means a 5% chance of rejecting the null hypothesis when it is actually true.
What is statistical power in hypothesis testing?
The probability of finding a statistically significant result given that a relationship exists
High statistical power indicates a greater chance of detecting an effect when it truly exists.
What is the impact of low prior beliefs on posterior beliefs after obtaining statistically significant results?
They lead to low posterior beliefs about the effect being real
This illustrates the importance of prior beliefs in interpreting statistical results.
What role do prior beliefs play in posterior beliefs?
Prior beliefs are crucial for shaping posterior beliefs, especially in studies with low prior probabilities like ESP.
If prior beliefs are low, new evidence may not significantly affect beliefs.
How does the strength of prior beliefs affect belief change in response to new evidence?
Stronger prior beliefs (close to 0 or 1) make it harder to change beliefs in response to new evidence, while moderate priors (around 0.2) allow for larger shifts.
This is illustrated in Figure 15.7.
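A sketch of how the prior shapes the posterior after a statistically significant result. The 0.05 significance level comes from the cards above. The statistical power of 0.8 is an assumption, and the 0.001 prior for an ESP-style claim is a hypothetical value.

```python
def posterior_given_significant(prior: float, power: float = 0.8,
                                alpha: float = 0.05) -> float:
    """Posterior belief that an effect is real after a significant result."""
    true_hits = power * prior          # real effect, detected
    false_hits = alpha * (1 - prior)   # no effect, significant by chance
    return true_hits / (true_hits + false_hits)

# How far does belief move for each prior between 0.01 and 0.99?
priors = [i / 100 for i in range(1, 100)]
shifts = {p: posterior_given_significant(p) - p for p in priors}
largest_shift_prior = max(shifts, key=shifts.get)

esp_posterior = posterior_given_significant(0.001)  # hypothetical tiny prior

print(f"Largest shift in belief occurs at prior = {largest_shift_prior}")
print(f"Posterior with a 0.001 prior: {esp_posterior:.3f}")
```

Under these assumptions the shift is largest for a prior of 0.2 and shrinks toward zero as the prior approaches 0 or 1.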
True or False: Two individuals with different prior beliefs will react the same way to the same piece of evidence.
False
Different prior beliefs can lead to different interpretations of the same evidence.
What is Bayesian statistics?
Bayesian statistics involves specifying the whole prior distribution of beliefs about possible relationship sizes and updating these beliefs when new evidence is presented.
This contrasts with frequentist statistics.
What is the difference between percentage point change and percent change?
Percentage point change is the numerical difference between two percentages, while percent change measures the degree of change relative to the original value.
Percent change is sensitive to the original value.
What factors should be considered when evaluating diagnostic tests?
Factors include:
* False positive rates
* False negative rates
* Costs of the tests
* Speed of the tests
These factors are crucial for making informed decisions about testing.
What is the significance of false positive and false negative rates in testing?
Both rates are critical for accurate diagnosis, with low rates being essential for reliable testing outcomes.
Regulatory agencies often require low rates for test approval.
Fill in the blank: Bayesian reasoning can be complicated when assessing the ______ of a relationship.
magnitude
This involves beliefs about how likely each possible relationship size is.
What is a potential drawback of focusing solely on one quantitative statistic like false negative rates?
Focusing on one statistic can lead to poor decision-making, as it may overlook other important factors and trade-offs.
A comprehensive evaluation of costs and benefits is necessary.
Why might paper-strip tests be considered despite having higher false negative rates?
They are cheaper, can be administered at home, and provide faster results, which are critical in controlling the spread of infectious diseases.
Speed and cost are significant in the context of highly infectious diseases like coronavirus.
What is the main benefit of rapid testing in the context of infectious diseases?
The main benefit is to prevent infected individuals from spreading the disease to others quickly.
Rapid testing can significantly reduce transmission rates.
How can combining different testing methods improve diagnostic accuracy?
Using a combination of tests allows for quick initial screening with cheaper tests and follow-up with more accurate tests to confirm results.
This approach can minimize the impact of false positives.
What is the significance of understanding the costs and benefits in decision-making?
Understanding costs and benefits is essential for making informed decisions based on quantitative evidence, rather than just focusing on statistical outcomes.
Personal values play a role in how different costs and benefits are weighed.
What is the definition of percentage point change?
The simple numerical difference between two percentages.
Define percent change.
The new value minus the initial value, divided by the initial value (multiplied by 100).
How does percent change differ from percentage point change?
Percent change is highly sensitive to the original value.
What is conditional probability?
The probability of an event conditional on some other information, written as Pr(C | E).
What is prior belief?
Your belief about something before learning new evidence.
Define posterior belief.
Your belief about something after incorporating new evidence.
What is Bayes’ rule?
A formula for calculating your posterior belief conditional on new evidence and your prior belief.
What is statistical power?
The probability of finding a statistically significant result in the data given that the relationship really exists in the world.
If GDP growth in Country B was 10 percent, what was GDP growth in Country A if it was reported as 20 percent higher?
12 percent.
If GDP growth in Country B was 0.1 percent, what was GDP growth in Country A if it was reported as 20 percent higher?
0.12 percent.
What is one way to avoid misleadingly masking differences in economic growth scenarios?
Provide absolute values or context rather than just percentage change.
What is the percent difference in growth if Country C is at 1 percent and Country D is at 0.1 percent?
900 percent.
What is the percentage point difference in growth if Country C is at 1 percent and Country D is at 0.1 percent?
0.9 percentage points.
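The two definitions can be sketched as small helper functions and checked against the growth figures on the cards above. The function names are illustrative.

```python
def percentage_point_change(old_pct: float, new_pct: float) -> float:
    """Simple numerical difference between two percentages."""
    return new_pct - old_pct

def percent_change(old: float, new: float) -> float:
    """Change relative to the original value, expressed in percent."""
    return (new - old) / old * 100

# Country D grows at 0.1 percent, Country C at 1 percent.
pp_gap = percentage_point_change(0.1, 1.0)
pct_gap = percent_change(0.1, 1.0)

# "20 percent higher" read as a percent change from Country B's growth.
growth_a_high = 10 * 1.20
growth_a_low = 0.1 * 1.20

print(f"{pp_gap:.1f} percentage points, {pct_gap:.0f} percent")
print(f"Country A: {growth_a_high:.2f} percent or {growth_a_low:.2f} percent")
```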
True or False: A false positive occurs when a person without a condition tests positive.
True.
What is the false positive rate reported for a specific coronavirus test?
1 percent.
What is the false negative rate reported for a specific coronavirus test?
10 percent.
What is the probability a person gets a positive result given that they really do have coronavirus?
90 percent.
What are the two ways to get a positive test result?
- Correct test result for someone with coronavirus
- False positive for someone without coronavirus
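The two routes to a positive result can be combined with Bayes' rule, as sketched here. The error rates are the ones on the cards above. The 1 percent prevalence is a hypothetical prior, since the cards do not state one.

```python
def p_infected_given_positive(prevalence: float,
                              false_positive_rate: float = 0.01,
                              false_negative_rate: float = 0.10) -> float:
    """Posterior probability of infection after a positive test."""
    true_positive_rate = 1 - false_negative_rate          # 90 percent
    correct_positive = prevalence * true_positive_rate    # infected, detected
    false_positive = (1 - prevalence) * false_positive_rate
    return correct_positive / (correct_positive + false_positive)

# ASSUMPTION: 1 percent of the tested population is infected.
posterior = p_infected_given_positive(prevalence=0.01)

print(f"P(coronavirus | positive) = {posterior:.3f}")
```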
Using Bayes’ rule, how do you express the probability of a job given group membership?
P(Job | Group).
If people in a given job are equally likely to be privileged and unprivileged, what does this imply?
P(Group | Job) is equal for both groups.
What additional information is needed to determine if a member of the privileged group is more likely to be hired?
The probability of being hired given group membership.
If the same number of members from both groups applied for the job, what does this imply?
It fixes P(Group), which together with P(Group | Job) may provide sufficient information to determine hiring likelihood.
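A sketch of the Bayes' rule calculation behind the last four cards. All numbers are hypothetical: a 10 percent overall hiring rate, an even split of the job between groups, and two different applicant pools.

```python
def p_job_given_group(p_group_given_job: float, p_job: float,
                      p_group: float) -> float:
    """Bayes' rule: P(Job | Group) = P(Group | Job) * P(Job) / P(Group)."""
    return p_group_given_job * p_job / p_group

P_JOB = 0.10              # hypothetical overall share of applicants hired
P_GROUP_GIVEN_JOB = 0.5   # job holders split evenly between the groups

# Equal applicant numbers: both groups are equally likely to be hired.
equal_privileged = p_job_given_group(P_GROUP_GIVEN_JOB, P_JOB, p_group=0.5)
equal_unprivileged = p_job_given_group(P_GROUP_GIVEN_JOB, P_JOB, p_group=0.5)

# Hypothetical unequal pool: only 20% of applicants are privileged.
skewed_privileged = p_job_given_group(P_GROUP_GIVEN_JOB, P_JOB, p_group=0.2)
skewed_unprivileged = p_job_given_group(P_GROUP_GIVEN_JOB, P_JOB, p_group=0.8)

print(equal_privileged, equal_unprivileged)
print(skewed_privileged, skewed_unprivileged)
```

With an even split among job holders, the group that makes up a smaller share of applicants has the higher hiring probability.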