Week 1 - P-Hacking and HARKing Flashcards
What is the main factor determining whether a manuscript gets published?
The results (although researchers should have no control over them)
What is best for science?
Publishing rigorous research regardless of whether the results support the hypothesis (null results are still important!)
What is best for scientists?
Publishing lots of results (quantity is rewarded over quality)
Why does Koert van Ittersum and Brian Wansink’s research on food portions and extroversion lack credibility?
-No error bars
-Small sample size (unrepresentative)
-Children aged 6 to 12 are not all alike, so treating them as one group is unrepresentative
What is a median-split?
Finding the median and assigning scores below it to a "low" group and scores above it to a "high" group. (This makes little mathematical sense: scores right next to each other on either side of the median are treated as different, while scores far apart within the same group are treated as alike.)
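A minimal Python sketch of a median split, to make the problem concrete. The scores are made up for illustration, and putting scores equal to the median in the "low" group is an arbitrary assumption.

```python
from statistics import median

def median_split(scores):
    """Label each score 'low' or 'high' relative to the sample median."""
    cut = median(scores)
    # Scores equal to the median go to 'low' here; that choice is arbitrary.
    return ["low" if s <= cut else "high" for s in scores]

# Made-up scores for illustration only.
scores = [1, 49, 50, 52, 53, 99]
labels = median_split(scores)

for score, label in zip(scores, labels):
    print(score, label)
```

Here 50 and 52 end up in different groups although they differ by only 2 points, while 1 and 50 share a group although they differ by 49.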
Give 3 examples of poor research
1.Wansink kept analysing food data until he reached significance.
2.Prof Diederik Stapel made up his own data on a spreadsheet and had 58 papers retracted.
3.Amy Cuddy claimed that power posing (holding certain poses) would increase brain chemicals and hormones and so boost confidence (which just isn’t true).
True or false: The more prestigious the journal, the less likely its results are to be replicated BUT the more likely its papers are to be retracted
True (prestigious science journals struggle to attain average reliability)
What is the file drawer problem?
Researchers conduct a pre-defined analysis and publish the successful findings, but leave the "unsuccessful" ones (null findings) in the file drawer. (This usually happens when p-hacking fails.)
What did John et al. (2012) find when estimating the number of researchers involved in questionable research practices (QRPs)?
The more serious the QRP, the lower the self-admission rate (e.g., falsifying data)
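What does it indicate if lots of observations are just under the 0.05 significance threshold?
Manipulation of the data must have occurred somehow, because mathematically the distribution of p-values should follow a smooth trend with no pile-up just under 0.05. (Probability gives no reason for results to cluster at that threshold.)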
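A small simulation can illustrate this. The sketch below assumes there is no true effect, a known standard deviation of 1, and a simple two-sided z-test; the study count and sample size are arbitrary choices for illustration. Without manipulation, about as many p-values should land just below 0.05 as just above it.

```python
import random
from statistics import NormalDist, mean

def z_test_p(sample, sigma=1.0):
    """Two-sided p-value for H0: mean = 0, assuming a known sd of sigma."""
    z = mean(sample) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def simulate_p_values(n_studies=10000, n=20, seed=1):
    """Run many honest studies in which the null hypothesis is true."""
    rng = random.Random(seed)
    return [z_test_p([rng.gauss(0, 1) for _ in range(n)])
            for _ in range(n_studies)]

p_values = simulate_p_values()

# Compare two equally wide bins on either side of the 0.05 threshold.
just_below = sum(0.04 <= p < 0.05 for p in p_values)
just_above = sum(0.05 <= p < 0.06 for p in p_values)
print("0.04 to 0.05:", just_below)
print("0.05 to 0.06:", just_above)
```

A clear excess in the bin just below 0.05 in a real set of published p-values would therefore be a warning sign.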
Define P-Hacking
The method of manipulating data to achieve significant results
Give 6 P-Hacking methods
1.Multiple analyses
2.Omitting information (removing certain variables from the analysis)
3.Controlling for variables
4.Analysing part way through, then collecting more data and repeating until significance is reached
5.Changing the DV
6.Removing outliers (although sometimes they can drive significant results)
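Method 4 can be sketched as a simulation. This assumes no true effect, a known standard deviation of 1, a two-sided z-test, and hypothetical settings (start at 10 participants, add 10 at a time, stop at 100); none of these numbers come from the lecture.

```python
import random
from statistics import NormalDist, mean

def p_value(sample):
    """Two-sided z-test p-value for H0: mean = 0, assuming a known sd of 1."""
    z = mean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def run_study(rng, peek, start=10, step=10, max_n=100):
    """Return True if the study ends 'significant' although no effect exists."""
    sample = [rng.gauss(0, 1) for _ in range(start)]
    if not peek:
        # Honest version: collect all the data, then test once.
        sample += [rng.gauss(0, 1) for _ in range(max_n - start)]
        return p_value(sample) < 0.05
    while True:
        # P-hacked version: test after every batch and stop at p < 0.05.
        if p_value(sample) < 0.05:
            return True
        if len(sample) >= max_n:
            return False
        sample += [rng.gauss(0, 1) for _ in range(step)]

def false_positive_rate(peek, n_studies=2000, seed=1):
    rng = random.Random(seed)
    return sum(run_study(rng, peek) for _ in range(n_studies)) / n_studies

fixed_rate = false_positive_rate(peek=False)
peeking_rate = false_positive_rate(peek=True)
print("test once at the end:", fixed_rate)
print("test after each batch:", peeking_rate)
```

Testing once keeps the false positive rate near the nominal 5%, whereas repeated peeking gives chance several opportunities to cross the threshold, so the rate rises well above 5%.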
True or false: Nonparametric correlations (which do not assume a normal distribution) give more sensible results when there are outliers
True
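A minimal sketch comparing a parametric correlation (Pearson) with a nonparametric one (Spearman, computed here as Pearson on ranks). The data are made up: nine points with only a weak relationship plus one extreme outlier. The `ranks` helper is a simplification that assumes no tied values.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(values):
    """Ranks from 1 to n, assuming no ties (ties would need average ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: the last pair (50, 60) is an extreme outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
y = [5, 3, 8, 1, 9, 2, 7, 4, 6, 60]

print("Pearson: ", round(pearson(x, y), 2))
print("Spearman:", round(spearman(x, y), 2))
```

The single outlier pulls Pearson close to 1, while Spearman stays modest because ranking limits how far any one point can sit from the rest.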
Why can multiple analyses create problems?
Different methods can lead to different conclusions despite testing for the same thing. (worse with more complex analyses)
What was the PACE trial?
-A randomised trial (White et al., 2011) that compared adaptive pacing therapy, cognitive behaviour therapy and graded exercise therapy with specialist medical care (SMC) for chronic fatigue syndrome (which is hard to treat), suggesting the therapies would do better than SMC
-Cost £5 million
-Still used to inform treatment in the UK