Guest Lecture: Replication Crisis Flashcards
reproducibility project in psychology
Massive, multi-lab project that randomly selected articles published in 2008 in 3 top psychology journals and attempted to replicate their findings
when did the reproducibility project occur
November 2011 to 2015
findings of the reproducibility project
97% of original studies had significant results (p < .05), but only 36% of replications did. 39% of effects were subjectively rated as having replicated
what famous studies failed to replicate?
Ego depletion (willpower is a limited resource), power posing (except for self-reported mood), and precognition (psi)
what causes failures to replicate
fraud, file-drawering, innocent errors, p-hacking
Diederik Stapel
Fabricated data in 58 papers going back to at least 2004
p-hacking
Running at least one unplanned analysis that you might report as though its p-value were valid
false positivity
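When we falsely reject the null hypothesis (a Type I error)
what is the false-positive rate in psychology
5% (the nominal rate under the conventional p < .05 threshold, assuming no p-hacking)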
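To make the 5% figure concrete, here is a minimal simulation sketch of a single pre-planned test when the null hypothesis is true. It is an illustration, not the lecture's method: the function names, the 20 observations per cell, and the use of a z-test with known unit variance (chosen to stay within the Python standard library) are all assumptions.

```python
import random
from statistics import NormalDist


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means (unit variance assumed)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def baseline_rate(n_sims=5000, n_per_cell=20, alpha=0.05, seed=0):
    """Share of null experiments that come out significant with one planned test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Both groups come from the same distribution, so any "effect" is noise.
        a = [rng.gauss(0, 1) for _ in range(n_per_cell)]
        b = [rng.gauss(0, 1) for _ in range(n_per_cell)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits / n_sims


print(f"false-positive rate with one planned test: {baseline_rate():.3f}")
```

With no flexibility in the analysis, the printed rate should sit close to the nominal 5%.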
common ways of p-hacking
- Stop data collection if and only if p < .05
- Analyze many measures; report only those that were p < .05.
- Analyze many conditions; report only those that differed at p < .05.
- Use different (combinations of) covariates to try to get p < .05.
- Exclude participants or trials to try to get p < .05.
- Analyze different subgroups to get p < .05.
- Transform the data to try to get p < .05.
Choosing between two dependent variables correlated at r = .5 results in
9% false-positive rate
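A rough simulation sketch of this card: two groups with no true difference are measured on two dependent variables correlated at .5, and the result counts as significant if either variable reaches p < .05. The function names, the 20 observations per cell, and the z-test with known unit variance are assumptions made for illustration.

```python
import random
from statistics import NormalDist


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means (unit variance assumed)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def correlated_pair(rng, r):
    """Draw two standard-normal scores whose correlation is r."""
    x = rng.gauss(0, 1)
    y = r * x + (1 - r * r) ** 0.5 * rng.gauss(0, 1)
    return x, y


def two_dv_rate(n_sims=5000, n_per_cell=20, r=0.5, alpha=0.05, seed=1):
    """Share of null experiments where at least one of two DVs is significant."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = [correlated_pair(rng, r) for _ in range(n_per_cell)]
        b = [correlated_pair(rng, r) for _ in range(n_per_cell)]
        p_dv1 = two_sided_p([s[0] for s in a], [s[0] for s in b])
        p_dv2 = two_sided_p([s[1] for s in a], [s[1] for s in b])
        # The p-hack: report whichever dependent variable "worked".
        if min(p_dv1, p_dv2) < alpha:
            hits += 1
    return hits / n_sims


print(f"false-positive rate when choosing between two DVs: {two_dv_rate():.3f}")
```

Under these assumptions the rate should come out near the 9% on the card, well above the nominal 5%.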
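Collecting 20 observations per cell and, if the result is not significant, running 10 more per cell results in
14% false-positive rate (cumulative: combined with the choice between two dependent variables above)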
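The stopping rule on this card can be sketched in isolation as follows. The function names and the z-test with known unit variance are assumptions for illustration. Simulated on its own, this rule tends to give a rate a few points above 5% rather than 14%, which is consistent with the card's figure being a cumulative one that also includes the choice between two dependent variables.

```python
import random
from statistics import NormalDist


def two_sided_p(group_a, group_b):
    """Two-sided z-test p-value for a difference in means (unit variance assumed)."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    standard_error = (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean_a - mean_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


def optional_stopping_rate(n_sims=5000, first_n=20, extra_n=10, alpha=0.05, seed=2):
    """Share of null experiments that end significant when a second look is allowed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        a = [rng.gauss(0, 1) for _ in range(first_n)]
        b = [rng.gauss(0, 1) for _ in range(first_n)]
        # The p-hack: only collect more data when the first test "failed".
        if two_sided_p(a, b) >= alpha:
            a += [rng.gauss(0, 1) for _ in range(extra_n)]
            b += [rng.gauss(0, 1) for _ in range(extra_n)]
        if two_sided_p(a, b) < alpha:
            hits += 1
    return hits / n_sims


print(f"false-positive rate with one extra look: {optional_stopping_rate():.3f}")
```

Each extra look at the data is another chance for noise to cross the threshold, so the rate rises with the number of looks.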
Analyzing the data with gender and the gender × independent variable interaction as covariates results in
31% false-positive rate (cumulative: combined with the two practices above)
Collecting 3 conditions and allowing yourself to drop one if the result is not significant results in
60% false-positive rate (cumulative: combined with the three practices above)
Running two studies and dropping one if it isn't significant results in
84% false-positive rate (cumulative: combined with the four practices above)
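The step from 60% to 84% on the last two cards matches a simple calculation: if each study independently has a 60% chance of a false positive, the chance that at least one of two studies is significant is 1 - (1 - 0.60)^2. A minimal sketch, with a hypothetical function name and assuming the studies are independent:

```python
def any_significant(rate_per_study, n_studies):
    """Probability that at least one of n independent studies is a false positive."""
    return 1 - (1 - rate_per_study) ** n_studies


# Two studies, each already p-hacked up to a 60% false-positive rate.
print(round(any_significant(0.60, 2), 2))  # 0.84

# For comparison: two honest studies at the nominal 5% rate.
print(round(any_significant(0.05, 2), 4))  # 0.0975
```

The same formula shows why reporting only the study that "worked" inflates error rates even without any other p-hacking.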