CHAPTER 7 Over-Comparing, Under-Reporting Flashcards

1
Q

What is the consequence of analysts making numerous comparisons but only reporting statistically significant ones?

A

There will be lots of false positive results and over-estimates.

2
Q

What is p-hacking?

A

A form of nefarious researcher behavior leading to false positives.

3
Q

What is p-screening?

A

A situation where entirely honest researchers also contribute to false positives.

4
Q

What tools do analysts and consumers have to reduce misleading results?

A

There are some tools at their disposal, though no easy solution exists.

5
Q

What creature was known for predicting the outcomes of soccer matches?

A

Paul the Octopus.

6
Q

How did Paul the Octopus predict match outcomes?

A

By choosing between two boxes of food marked with the flags of competing countries.

7
Q

How many predictions did Paul make, and how many did he get right?

A

Paul made 14 predictions and was correct in 12 of them.

8
Q

What is the null hypothesis in the context of Paul’s predictions?

A

That Paul was picking in a completely random fashion.

9
Q

How can we calculate the probability that Paul would do this well by guessing?

A

By summing the probabilities of getting exactly 12, 13, or 14 of the 14 predictions correct under the null hypothesis.

10
Q

What is the probability of Paul getting at least 12 correct predictions if he was guessing randomly?

A

Approximately 1 in 155.
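
As a rough check on this figure, the tail probability can be computed directly. This is a minimal sketch, assuming each of the 14 predictions is an independent coin flip under the null hypothesis; the function name `binomial_tail` is illustrative and does not come from the text.

```python
from math import comb


def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Paul: 14 predictions, at least 12 correct, guessing at random (p = 0.5).
p_value = binomial_tail(14, 12)
print(f"{p_value:.5f}")    # 0.00647
print(round(1 / p_value))  # 155
```

The sum covers the outcomes 12, 13, and 14 correct, which is 106 of the 16,384 equally likely sequences of right and wrong guesses.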

11
Q

What does a p-value represent in hypothesis testing?

A

The probability of observing an outcome at least as extreme as the one observed if the null hypothesis is true.

12
Q

What was the adjusted probability of Paul getting 11 or more predictions right if he was predisposed to pick Germany?

A

About 0.03 or 1 in 33.

13
Q

Why should we be skeptical of Paul’s predictive abilities?

A

Because he was primarily predicting games involving Germany, which he favored.

14
Q

What is the significance of having multiple octopuses making predictions?

A

It raises the likelihood that at least one would achieve a record similar to Paul’s by chance.

15
Q

What is the probability that at least one of ten octopuses generates a p-value as good as Paul’s?

A

About 1 in 4.
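
The arithmetic behind this answer can be sketched as follows. It assumes the ten octopuses predict independently and that "as good as Paul's" means the adjusted p-value of about 0.03 from the earlier card; both are assumptions of this sketch, and the function name is illustrative.

```python
def prob_at_least_one(p_single: float, n_tries: int) -> float:
    """Chance that at least one of n independent tries succeeds."""
    return 1 - (1 - p_single) ** n_tries


# Assumed inputs: p of about 0.03 per octopus, ten octopuses.
p_any = prob_at_least_one(0.03, 10)
print(f"{p_any:.3f}")  # 0.263
```

A result near 0.26 is roughly 1 in 4, which matches the card.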

16
Q

Name some other animals that forecasted soccer match winners around the same time as Paul.

A
  • Leon the Porcupine
  • Petty the Hippopotamus
  • Anton the Tamarin
  • Mani the Parakeet
17
Q

What is the implication of many animals making predictions?

A

Many could be celebrated for their predictions, even if their success was due to chance.

18
Q

Fill in the blank: The product of n and every positive whole number less than n is called n _______.

A

factorial.
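
Factorials matter here because they count the number of ways to get k predictions right out of n. A small sketch, with the helper name `n_choose_k` chosen only for illustration:

```python
from math import comb, factorial


def n_choose_k(n: int, k: int) -> int:
    """Number of ways to choose k items from n, built from factorials."""
    return factorial(n) // (factorial(k) * factorial(n - k))


print(factorial(5))        # 120, i.e. 5 * 4 * 3 * 2 * 1
print(n_choose_k(14, 12))  # 91 ways to get exactly 12 of 14 right
```

The standard-library function `math.comb` returns the same count without writing out the factorials.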

19
Q

What is the issue with only reporting statistically significant results?

A

It leads to publication bias, where the true effects are systematically overestimated

This occurs because only the results that reject the null hypothesis are published.

20
Q

What does the term ‘publication bias’ refer to?

A

The phenomenon where only statistically significant results are reported, leading to a distorted understanding of research findings

This bias can occur even when all studies are well-designed.

21
Q

What is p-hacking?

A

The practice of manipulating data or statistical tests until a desired p-value is achieved

This can involve tweaking experiments or trying different statistical models.

22
Q

What is p-screening?

A

The practice of not publishing studies with p-values above a certain threshold, leading to under-reporting of null results

This can occur even when researchers act honestly.

23
Q

How does over-comparing contribute to publication bias?

A

It increases the likelihood of finding statistically significant results purely by chance, which are then reported while null results are ignored

This can happen when numerous hypotheses are tested without proper correction.

24
Q

True or False: All scientific results are published regardless of their significance.

A

False

Statistically insignificant results are often not published, leading to biased literature.

25
Q

What is the file drawer problem?

A

The tendency for studies with null results to remain unpublished, leading to a lack of awareness about such findings

This contributes to publication bias in scientific literature.

26
Q

What happens to the true estimand when only statistically significant results are reported?

A

The reported estimates systematically overestimate the true estimand

This occurs even if the original estimates are unbiased.
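
A short simulation can illustrate the point. It is only a sketch under invented numbers: the true effect (0.5), the standard error (1.0), and the number of studies are hypothetical, and "published" here simply means significant at the .05 level in a two-sided test.

```python
import random
import statistics


def simulate_significance_filter(true_effect=0.5, std_error=1.0,
                                 n_studies=20000, seed=0):
    """Compare the mean of all estimates with the mean of significant ones."""
    rng = random.Random(seed)
    # Each study's estimate is unbiased: centered on the true effect.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # Keep only estimates whose z-statistic clears the .05 threshold.
    published = [e for e in estimates if abs(e / std_error) > 1.96]
    return statistics.mean(estimates), statistics.mean(published)


mean_all, mean_published = simulate_significance_filter()
print(f"all studies:       {mean_all:.2f}")
print(f"significant only:  {mean_published:.2f}")
```

The average across all studies sits near the true effect, while the average of the significant studies is far larger, even though no individual estimate is biased.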

27
Q

What is the impact of noise on scientific estimates?

A

Noise can cause estimates to differ from the true quantity of interest, affecting the validity of conclusions drawn from them

This is particularly relevant in studies with small sample sizes.

28
Q

Fill in the blank: The act of testing many different outcomes in a study can lead to _______.

A

over-comparing

This increases the chances of finding a statistically significant result by chance.

29
Q

What should researchers do to avoid p-hacking?

A

Adhere to pre-registered study designs and avoid manipulating data post-hoc

Transparency in research practices is crucial.

30
Q

What is the consequence of publication bias on scientific knowledge?

A

It undermines the reliability of scientific consensus and leads to the belief that many accepted facts may be false

This has caused concern among scientists about the integrity of their fields.

31
Q

What did Daryl Bem’s 2010 study claim?

A

It claimed that human beings possess extrasensory perception (ESP)

The study was controversial and sparked debates about the validity of such claims.

32
Q

Why is it difficult to accumulate knowledge in a field affected by publication bias?

A

Because over-comparing and under-reporting distort the average of published estimates, making it hard to get close to the true estimand

This can lead to a misrepresentation of the consensus in scientific literature.

33
Q

What is the relationship between noise and the true estimand?

A

Noise can cause estimates to deviate from the true estimand, complicating the interpretation of results

This can lead to erroneous conclusions if not accounted for.

34
Q

True or False: Publication bias affects only the results of individual studies.

A

False

It affects the overall distribution of estimates in scientific literature.

35
Q

What does the term ‘estimand’ refer to?

A

The true quantity of interest that a study aims to estimate

Understanding the estimand is crucial for interpreting research findings.

36
Q

What is the significance of Daryl Bem’s 2010 study?

A

It claimed that human beings have extrasensory perception (ESP) and reported statistically significant evidence that subjects could predict the location of hidden objects better than chance.

37
Q

What is publication bias?

A

It occurs when studies with statistically significant results are more likely to be published than those without significant results.

38
Q

Define p-hacking.

A

The practice of manipulating data or statistical analyses to obtain a statistically significant result.

39
Q

What did Bem’s study find specifically about the type of objects involved in ESP?

A

Evidence of ESP was only found when the objects were erotic in nature.

40
Q

What was the initial response of the psychological community to Bem’s findings?

A

The community remained skeptical and several follow-up studies failed to replicate the findings.

41
Q

True or False: The Journal of Personality and Social Psychology initially published replication studies of Bem’s claim.

42
Q

What was the average estimated effect of get-out-the-vote interventions according to published studies?

A

About a 3.5 percentage point increase in voter turnout.

43
Q

What was the actual average effect of get-out-the-vote interventions found by Green, McGrath, and Aronow?

A

Half a percentage point.

44
Q

What does the distribution of p-values help to assess?

A

It helps to diagnose whether p-hacking has occurred in a body of literature.

45
Q

In which case would we expect a uniform distribution of p-values?

A

When there is no real relationship in the world and no p-hacking.
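
A simulation can show why p-values come out uniform in this case. This is a sketch, assuming each study reports a z-statistic drawn from a standard normal distribution (no real effect, no selective reporting); the function names are illustrative.

```python
import math
import random


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def null_p_values(n_studies=20000, seed=1):
    """p-values from studies in which the null hypothesis is true."""
    rng = random.Random(seed)
    return [two_sided_p(rng.gauss(0.0, 1.0)) for _ in range(n_studies)]


p_values = null_p_values()
# Share of p-values falling in each tenth of the interval [0, 1].
counts = [0] * 10
for p in p_values:
    counts[min(int(p * 10), 9)] += 1
shares = [c / len(p_values) for c in counts]
print([f"{s:.2f}" for s in shares])
```

Each tenth of the interval holds roughly 10 percent of the p-values, so a histogram of them is flat.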

46
Q

What does it suggest if a literature shows more low p-values than high p-values?

A

It suggests that the literature is detecting a real relationship in the world.

47
Q

What are some signs of p-hacking identified by Simonsohn, Nelson, and Simmons?

A
  • Excluded
  • Transformed
48
Q

What is one proposed solution to reduce publication bias?

A

Reduce the significance threshold for p-values from .05 to a lower value.

49
Q

What is a potential downside to lowering the significance threshold?

A

It could increase incentives for p-hacking by making statistically significant results rarer and more valuable.

50
Q

What does a significance threshold of .005 mean for false positives?

A

It means fewer false positives but at the cost of more false negatives.

51
Q

What is the consequence of lowering the significance threshold?

A

It might increase incentives for p-hacking and make statistically significant results rarer and more valuable

Lowering the threshold could lead to complacency in critical analysis.

52
Q

What is the trade-off when using a significance threshold of .005?

A

It results in fewer false positives at the cost of more false negatives

A false positive is rejecting the null hypothesis when it is true; a false negative is failing to reject the null hypothesis when it is false.
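
The trade-off can be made concrete with a power calculation. This sketch assumes a two-sided z-test and a hypothetical true effect equal to 2.8 standard errors; that effect size is invented for illustration and does not come from the text.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def power(alpha: float, effect_in_se_units: float) -> float:
    """Chance a two-sided z-test rejects the null when the effect is real."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    upper = 1 - STANDARD_NORMAL.cdf(z_crit - effect_in_se_units)
    lower = STANDARD_NORMAL.cdf(-z_crit - effect_in_se_units)
    return upper + lower


effect = 2.8  # hypothetical true effect, measured in standard errors
power_05 = power(0.05, effect)
power_005 = power(0.005, effect)
print(f"threshold .05:  power {power_05:.2f}, false negatives {1 - power_05:.2f}")
print(f"threshold .005: power {power_005:.2f}, false negatives {1 - power_005:.2f}")
```

Under these assumptions the stricter threshold cuts the false-positive rate tenfold, but the chance of missing the real effect rises from about 20 percent to about 50 percent.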

53
Q

How can p-values be adjusted?

A

By correcting for the number of tests run

This helps in better assessing the state of the evidence.
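
The card does not say which correction is meant. One common choice is the Bonferroni correction, which multiplies each p-value by the number of tests run; the sketch below uses it as an assumption, with made-up p-values.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


raw = [0.004, 0.03, 0.20]  # hypothetical p-values from three tests
adjusted = bonferroni(raw)
still_significant = [p < 0.05 for p in adjusted]
print(still_significant)  # [True, False, False]
```

The second result clears the .05 threshold on its own but not once the three tests are taken into account.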

54
Q

What is a limitation of simple p-value corrections?

A

They only work if the tests are truly independent

Related tests may require more complex adjustments.

55
Q

What does the threshold of .05 represent in statistical testing?

A

An arbitrary number for determining statistical significance

It may not reflect the substantive importance of effects.

56
Q

What is pre-registration in research?

A

A commitment to test specific hypotheses before seeing the data

It helps prevent over-comparing and under-reporting.

57
Q

What was the NHLBI’s requirement for clinical trials?

A

Developers must pre-register the goals of the drug or supplement

Success is only declared if there is a statistically significant effect on the pre-registered outcome.

58
Q

What was the success rate of clinical trials after pre-registration according to Kaplan and Irvin’s study?

A

It dropped from 57 percent to 8 percent

This indicates many prior successes were likely due to over-comparing.

59
Q

What is replication in research?

A

Reassessing an estimated effect using new, independently generated data

It helps verify the genuineness of findings.

60
Q

How does the probability of finding a false positive change with replication?

A

It decreases with each independent replication

Multiple replications reduce the likelihood of spurious conclusions.
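
The arithmetic is simple under idealized conditions. This sketch assumes the null hypothesis is true, every study is tested at the same threshold, the studies are independent, and none is p-hacked; the function name is illustrative.

```python
def prob_all_false_positives(alpha: float, n_studies: int) -> float:
    """Chance that every one of n independent studies is a false positive."""
    return alpha ** n_studies


for k in (1, 2, 3):
    print(k, f"{prob_all_false_positives(0.05, k):.6f}")
# 1 0.050000
# 2 0.002500
# 3 0.000125
```

One significant result has a 1 in 20 chance of being spurious under the null; an original study plus one independent replication has a 1 in 400 chance.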

61
Q

What is the significance of testing additional hypotheses related to a finding?

A

It helps assess the underlying mechanisms and validity of the original claim

This method can provide insights even when direct replication isn’t possible.

62
Q

What should raise concerns about a study’s findings?

A

If the study would not have been published had the opposite result been found

This indicates potential issues with over-comparing and under-reporting.

63
Q

What is the power pose hypothesis?

A

Adopting a power pose influences attitudes and behaviors

The study’s underlying science is disputed and replication attempts have failed.

64
Q

What broader issue does the story of Paul the Octopus illustrate?

A

The challenges of over-comparing and under-reporting extend beyond science

This issue can affect everyday decision-making and consumer behavior.

65
Q

What is p-hacking?

A

Searching over lots of different ways to run an experiment, make a comparison, or specify a statistical model until you find one that yields a statistically significant result and then only reporting that one.

66
Q

What is publication bias?

A

The phenomenon whereby published results are systematically over-estimates because there is a bias toward publishing statistically significant results.

67
Q

What is p-screening?

A

A social process whereby a community of researchers, through its publication standards, screens out studies with p-values above some threshold, giving rise to publication bias.

68
Q

How can over-comparing and under-reporting affect scientific findings?

A

They create deep challenges for the scientific community, leading to potentially misleading interpretations of data.

69
Q

What does the efficient-market hypothesis suggest?

A

No fund or investment strategy should be able to systematically beat the market average over the long run.

70
Q

What is the probability of an investor beating the market 15 years in a row by chance?

A

About 1 in 30,000.
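
This figure can be reproduced in one line. The sketch assumes a 50 percent chance of beating the market in any given year and independence across years, which is a simplification.

```python
streak_years = 15
p_beat_market = 0.5  # assumed chance of beating the market in one year

p_streak = p_beat_market ** streak_years
print(round(1 / p_streak))  # 32768
```

One chance in 32,768 rounds to the "about 1 in 30,000" on the card.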

71
Q

True or False: Bill Miller’s success as a fund manager can be solely attributed to his investment strategies.

72
Q

What should you consider when assessing the validity of a claim about an investment strategy’s success?

A

Whether the comparison made is the natural one or if it was chosen to make the strategy look better.

73
Q

Fill in the blank: One of the main reasons to be cautious about superstars in finance is that their success may be due to _______.

A

good luck.

74
Q

What happened to Bill Miller’s fund after his streak of success?

A

His fund lost 55 percent of its value during the 2008 financial crisis and continued to trail the market for several more years.

75
Q

What is one practice that can help mitigate the problem of over-comparing and under-reporting?

A

Thinking clearly about the naturalness of comparisons made.

76
Q

What is the significance of pre-registration in a study?

A

It helps in assessing the confidence in the findings by outlining expected outcomes beforehand.

77
Q

What might indicate that a study’s results are unreliable?

A

If the primary outcome of interest was revised during the study.

78
Q

What are some potential measures of prior exposure to Trump in the 2016 U.S. presidential election?

A

Watching The Apprentice, watching Home Alone 2, or both.

79
Q

Why should investors be skeptical of claims made by successful fund managers?

A

Due to the sheer number of traders and funds, exceptional track records may arise by chance.

80
Q

What is a common outcome of publication bias in scientific literature?

A

Misleadingly high estimates of effect sizes due to the preference for statistically significant findings.

81
Q

What does the term ‘superstars’ refer to in the context of finance?

A

Individuals who achieve remarkable success, often leading to misleading inferences about their abilities.

82
Q

Fill in the blank: The tendency of scientific estimates to shrink over time is explained by _______.

A

reversion to the mean.

83
Q

What is the primary focus of the analysis mentioned in the text?

A

Finding statistically significant relationships suggesting that prior exposure to Trump corresponded to political behaviors in the 2016 presidential election.

84
Q

What types of prior Trump exposure can be tested?

A
  • Having seen The Apprentice
  • Having seen Home Alone 2
  • Having seen both
85
Q

What political behaviors can be measured as outcomes?

A
  • Support for Trump
  • Support for Hillary Clinton
  • Voter turnout in 2016
86
Q

What demographic subgroups can be analyzed for voter behavior?

A
  • Women
  • Blacks
  • Southerners
  • Rich
  • Young
87
Q

What was revealed about the variables related to the respondents’ exposure to Trump?

A

The variables were made up and generated completely at random.
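
The same exercise can be imitated in a few lines. This is a sketch, not the authors' analysis: exposure and outcome are both random coin flips, each comparison is a two-sample test of proportions on 200 invented respondents, and the count of 45 comparisons (3 exposures, 3 outcomes, 5 subgroups) is an assumption based on the cards above.

```python
import math
import random


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def random_comparison_p(rng: random.Random, n: int = 200) -> float:
    """p-value for outcome vs. exposure when both are pure noise."""
    exposure = [rng.random() < 0.5 for _ in range(n)]
    outcome = [rng.random() < 0.5 for _ in range(n)]
    n1 = sum(exposure)
    n0 = n - n1
    y1 = sum(o for e, o in zip(exposure, outcome) if e)
    y0 = sum(outcome) - y1
    pooled = (y1 + y0) / n
    if n1 == 0 or n0 == 0 or pooled in (0.0, 1.0):
        return 1.0  # degenerate sample: no comparison possible
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    return two_sided_p((y1 / n1 - y0 / n0) / se)


def false_positive_share(n_tests: int = 4500, seed: int = 7) -> float:
    """Share of noise-only comparisons that come out significant at .05."""
    rng = random.Random(seed)
    return sum(random_comparison_p(rng) < 0.05 for _ in range(n_tests)) / n_tests


share = false_positive_share()
n_comparisons = 3 * 3 * 5
print(f"share significant: {share:.3f}")
print(f"expected 'findings' among {n_comparisons} comparisons: {share * n_comparisons:.1f}")
```

With roughly 5 percent of noise-only comparisons passing the threshold, an analyst running 45 of them should expect about two "significant" relationships to report.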

88
Q

What should be considered when interpreting the relationships found between variables and political behavior?

A

Consider if the relationships would hold with new data from another set of respondents.

89
Q

What is a concern regarding academic studies as mentioned in the text?

A

Problems of over-comparing and under-reporting.

90
Q

What are some suggested actions for authors to address concerns of over-comparing?

A
  • Disclose additional information
  • Conduct additional analyses
91
Q

What is p-hacking?

A

Undisclosed flexibility in data collection and analysis that allows presenting anything as significant.

92
Q

Who coined the term p-hacking?

A

Joseph Simmons, Leif Nelson, and Uri Simonsohn.

93
Q

What is the significance of the 2016 Cooperative Congressional Election Study in this context?

A

It provided real survey data for analysis.

94
Q

What do the authors suggest regarding statistical significance?

A

Lowering the threshold for statistical significance.

95
Q

What is the purpose of pre-registration in research?

A

To mitigate biases in data analysis and reporting.

96
Q

What was the main finding of the power posing studies?

A

Initial claims of effects on hormones and risk tolerance were not replicated.

97
Q

What is a potential outcome of observing irrelevant events on voter behavior?

A

Voters’ evaluations of government performance may be affected.

98
Q

True or False: The authors initially provided accurate data about respondents’ media exposure.

99
Q

Fill in the blank: The probability that one investor gets 15 years in a row right is _______.

A

About 1 in 30,000 (one-half raised to the 15th power), assuming a 50 percent chance of beating the market each year.

100
Q

What is the significance of the study by Kaplan and Irvin published in 2015?

A

It discussed the likelihood of null effects in large NHLBI clinical trials over time.

101
Q

What is a common issue in observational research mentioned in the text?

A

False-positive results.