Week 1 - P-Hacking and HARKing Flashcards

1
Q

What is the critical factor that determines whether a manuscript gets published?

A

The results (even though researchers should have no control over what they turn out to be)

2
Q

What is best for science?

A

Publishing rigorous research regardless of whether the results support the hypothesis (null results are still important!)

3
Q

What is best for scientists?

A

Publishing lots of results (the incentives reward quantity over quality)

4
Q

Why does Koert van Ittersum and Brian Wansink’s research on food portions and extroversion lack credibility?

A

- No error bars
- Small sample size (unrepresentative)
- Treats 6- to 12-year-olds as a single group, even though children across that age range differ considerably (unrepresentative)

5
Q

What is a median-split?

A

Splitting a continuous variable at its median, so that scores below the median are classed as "low" and scores above it as "high". (Mathematically this makes little sense: scores just either side of the median are treated as different even though they are nearly identical, while scores far apart within the same group are treated as alike.)
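
A minimal sketch (with made-up scores, not from the lecture) of what a median split does and why it is criticised: 49 and 51 end up in different groups, while 51 and 100 end up in the same one.

```python
import numpy as np

scores = np.array([10, 30, 49, 51, 60, 100])
median = np.median(scores)                    # 50.0

# Median split: everything above the median is "high", the rest "low".
groups = np.where(scores > median, "high", "low")
for s, g in zip(scores, groups):
    print(f"score {s:3d} -> {g}")
# 49 -> low but 51 -> high (nearly identical scores, different groups);
# 51 -> high and 100 -> high (very different scores, same group).
```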

6
Q

Give 3 examples of poor research

A

1. Wansink kept re-analysing food data until he reached significance.

2. Prof. Diederik Stapel fabricated his own data in spreadsheets and had 58 papers retracted.

3. Amy Cuddy claimed that power posing (adopting certain expansive poses) boosts hormones and brain chemistry to increase confidence (which simply isn’t true).

7
Q

True or false: The more prestigious the journal, the less likely its results are to replicate BUT the more likely its papers are to be retracted

A

True (even prestigious science journals struggle to attain average reliability)

8
Q

What is the file drawer problem?

A

Researchers conduct pre-defined analyses, publish the "successful" findings, and leave the "unsuccessful" ones (null findings) in the file drawer, often after p-hacking has failed.

9
Q

What did John et al. (2012) find when estimating how many researchers engage in questionable research practices (QRPs)?

A

The more serious the QRP, the lower the self-admission rate (e.g., falsifying data had the lowest).

10
Q

What does it indicate if many reported p-values fall just under the .05 significance threshold?

A

That the data or analyses have probably been manipulated. Under the null hypothesis p-values are uniformly distributed, and genuine effects pile p-values up at small values, so honest results should not cluster just below .05.
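
A minimal simulation (illustrative, not from the lecture) of why a spike just below .05 is suspicious: when the null hypothesis is true, t-test p-values are uniformly distributed, so no bin should be fuller than its neighbours.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pvals = []
for _ in range(10_000):
    a = rng.normal(0, 1, 30)   # two groups drawn from the same
    b = rng.normal(0, 1, 30)   # population, so the null is true
    pvals.append(stats.ttest_ind(a, b).pvalue)

# Each 0.05-wide bin holds roughly 5% of the p-values: the histogram
# is flat, with no pile-up just below .05.
counts, _ = np.histogram(pvals, bins=np.arange(0, 1.05, 0.05))
print(counts)
```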

11
Q

Define P-Hacking

A

The practice of manipulating data or analysis choices until significant results are achieved

12
Q

Give 6 P-Hacking methods

A

1. Multiple analyses (running many tests inflates false positives; see the sketch after this list)

2. Omitting information (removing certain variables from the analysis)

3. Controlling for variables (adding or dropping covariates until significance appears)

4. Analysing part-way through, then collecting more data and repeating until significance is reached

5. Changing the DV

6. Removing outliers (even though outliers sometimes drive the significant result)
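
A minimal sketch (illustrative, not from the lecture) of the first method: with k independent tests at alpha = .05, the chance of at least one false positive is 1 - 0.95^k, which grows quickly.

```python
# Probability of at least one "significant" result when the null is
# true for every one of k independent tests at alpha = .05.
for k in (1, 5, 10, 20):
    print(f"{k:2d} tests -> {1 - 0.95**k:.0%} chance of a false positive")
# 1 test -> 5%, 5 tests -> 23%, 10 tests -> 40%, 20 tests -> 64%
```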

13
Q

True or false: Nonparametric correlations (which do not assume a normal distribution) give more sensible results when outliers are present

A

True
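
A minimal sketch (made-up values) of the idea: Pearson’s r is dragged towards 1 by a single extreme outlier, while Spearman’s rank-based rho is not.

```python
import numpy as np
from scipy import stats

# Eight essentially uncorrelated points plus one extreme outlier.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 100.0])
y = np.array([5, 3, 8, 1, 7, 2, 6, 4, 100.0])

print(f"Pearson r    = {stats.pearsonr(x, y)[0]:.2f}")   # ~0.99, driven by the outlier
print(f"Spearman rho = {stats.spearmanr(x, y)[0]:.2f}")  # ~0.27, based on ranks only
```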

14
Q

Why can multiple analyses create problems?

A

Different analysis methods can lead to different conclusions despite testing the same thing (and this gets worse with more complex analyses).

15
Q

What was the PACE trial?

A

- A randomised trial (White et al., 2011) comparing adaptive pacing therapy (APT), cognitive behaviour therapy (CBT), and graded exercise therapy (GET), each added to specialist medical care (SMC), for chronic fatigue syndrome (a hard-to-treat condition), with the expectation that the added therapies would outperform SMC alone

- Cost £5 million

- Still used to inform treatment in the UK

16
Q

What did White et al. (2011) CLAIM they found?

A

That adding CBT or GET to SMC significantly improves outcomes

17
Q

What did White et al. (2011) originally design?

A

- Originally the trial was designed around objective outcomes (a 6-minute walking test and aerobic fitness)

- These did not improve (and were published four years after the main paper)

- Other objective measures, e.g. returning to work, also showed no improvement (and were likewise delayed in publication)

- By objective measures the treatment was ACTUALLY ineffective

18
Q

How did White et al. (2011) P-Hack the data?

A

- The PACE trial’s published protocol originally defined ‘recovery’ as requiring a score of at least 85 out of 100 on the SF-36 Physical Functioning (SF36-PF) questionnaire

- The trial’s entry criteria required a score of 65 or under, which was taken to indicate that patients’ fatigue was disabling

- However, the cut-off for an acceptable level of physical functioning was then changed to 60

- A participant could therefore have been recruited with an SF-36 score of 65, finished the trial with a score of 60 (i.e. GOT WORSE), and yet be described as a positive outcome

19
Q

True or false: Mentions of ‘marginal’ or ‘on the edge of’ significance are a red flag.

A

True

20
Q

Why is analysing data part-way through, then collecting more and stopping once significance is reached, a problem?

A

Because it massively inflates the false positive rate (the more interim checks, the higher the rate; it can approach 25% rather than the nominal 5%)
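
A minimal simulation (illustrative scheme, not from the lecture) of optional stopping: peek after every 10 participants per group and stop as soon as p < .05. The null is true throughout, so every "discovery" is a false positive.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims, false_positives = 2_000, 0

for _ in range(n_sims):
    a, b = [], []
    for _ in range(10):                     # up to 10 peeks of 10 each
        a.extend(rng.normal(0, 1, 10))
        b.extend(rng.normal(0, 1, 10))
        if stats.ttest_ind(a, b).pvalue < 0.05:
            false_positives += 1            # stopped early on a fluke
            break

# Far above the nominal 5% - typically in the 15-20% range here.
print(f"False positive rate: {false_positives / n_sims:.1%}")
```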

21
Q

What is wrong with Fadardi and Cox’s use of attentional-bias training for alcohol to reduce consumption?

A

- A 20-minute lab study won’t undo strong, long-established conditioning
- Hawthorne effect: awareness of being observed leads people to record fewer alcohol units

22
Q

Why do control groups need to be equivalent?

A

- A control must match the experimental condition in everything except the active ingredient; otherwise it gives no point of comparison and no statement can be made about the experimental condition

- For example, waiting-list controls (e.g., waiting for a nicotine patch) are NOT a control condition (use a non-active patch instead to make it one)

23
Q

What is the false positive rate when p < .05 is used to claim a discovery?

A

At least 30% (Colquhoun, 2014)
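
A minimal sketch of the reasoning behind figures like this (illustrative prior and power, not Colquhoun’s exact calculation): if only some tested hypotheses are true, a worrying fraction of "significant" results are false. The same arithmetic also previews the next card: alpha = .005 brings the risk down to around 5%.

```python
# False positive risk = significant-when-null / all significant results.
power, prior = 0.80, 0.10   # assumed power and fraction of true hypotheses

for alpha in (0.05, 0.005):
    false_pos = alpha * (1 - prior)   # null true, test significant
    true_pos = power * prior          # effect real, test significant
    risk = false_pos / (false_pos + true_pos)
    print(f"alpha = {alpha}: false positive risk = {risk:.0%}")
# alpha = 0.05 -> ~36%; alpha = 0.005 -> ~5%
```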

24
Q

How can we redefine significance?

A

- Benjamin et al. (2017) argue that new effects should be considered significant only if p < .005

- This corresponds to a Bayes factor of roughly 14-26, which represents ‘substantial’ to ‘strong’ evidence under standard Bayes factor classifications

- As a result, it would reduce the false positive rate to ‘reasonable levels’ (around 5%)

25
Q

According to Lakens et al. (2017), why does lowering the significance threshold to .005 NOT improve replicability?

A

- p < .005 won’t stop p-hacking (it just makes it harder)

- Much larger samples would be needed for adequate power (favouring established research institutions)

- It would reduce the number of replication studies

- It would cost more money (biasing research further towards established scientists)

- It may increase false negatives

- Lakens et al. instead argue that researchers should justify their alpha level when designing the study

26
Q

True or false: Computerised tests have poor reliability

A

True, sometimes to the point that the results should be discarded

27
Q

What’s Preregistration?

A

- Registering your analysis intentions in advance (you will be caught out if you change them, which removes the temptation)

- Justify sample sizes (ideally from power analyses; see the sketch below)
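
A minimal sketch (illustrative parameters) of the kind of power analysis used to justify a preregistered sample size, here with statsmodels:

```python
from statsmodels.stats.power import TTestIndPower

# Participants per group needed to detect a medium effect (d = 0.5)
# with 80% power at alpha = .05 in a two-sample t-test.
n = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required n per group: {n:.0f}")   # ~64
```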

28
Q

How do we preregister?

A

1. What is the main question being asked or hypothesis being tested in this study?

2. Describe the key dependent variable(s), specifying how they will be measured.

3. How many conditions will participants be assigned to?

4. Specify exactly which analyses you will conduct.

5. Any secondary analyses? (Secondary analysis is the use of existing research data to answer a question different from the original work’s.)

6. How many observations will be collected, or what will the sample size be?

7. Anything else you would like to preregister?

8. Has any data already been collected?

29
Q

What is open data?

A

- Sharing your data rather than keeping it to yourself

- Add it to the Open Science Framework (along with your preregistration; openly shared data also deters deliberate manipulation)

- Your ethics approval must state that you can share the data

30
Q

What is Clarke’s third law (updated by Gelman)?

A

“Any sufficiently crappy research is indistinguishable from fraud”