Resampling and Open Science Flashcards
What are resampling methods?
Use repeated sampling from the observed data to set confidence intervals and critical values for tests
Non-parametric
Relatively assumption-free
Computationally demanding
Bootstrapping vs Permutation
Both involve computer simulation
Bootstrapping samples with replacement
Permutation samples without replacement
Simulating the Null Hypothesis Distribution: Permutation Tests
A common statistical question: are two groups different?
Null Hypothesis
States that the group means are equivalent
Alternative Hypothesis
States that the group means are not equivalent
P-value
Indicates the probability of obtaining a difference in means at least as extreme as that observed by chance, assuming the two samples came from the same population
Null Hypothesis Testing using Parametric Methods
Simulate or select theoretical null hypothesis sampling distribution
Determine where our observed test statistic lies within this distribution, and the probability of observing it if the null hypothesis were true
Null Hypothesis Testing using Permutation Tests
- calculate the real difference between the means of the two groups
- simulate the null hypothesis sampling distribution by permutation
- determine where the real difference lies within this distribution
Simulating H0: Random Shuffling of Observations
Pool the samples together, then draw new groups from the pooled sample
Repeat this process many times to see what differences arise by chance alone
This builds the distribution of mean differences expected under the null hypothesis
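The shuffling procedure above can be sketched as follows. This is a minimal illustration with hypothetical data (`a` and `b` are made-up samples, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical groups (illustrative data only)
a = rng.normal(0.5, 1.0, size=30)
b = rng.normal(0.0, 1.0, size=30)

observed = a.mean() - b.mean()       # the real difference in means
pooled = np.concatenate([a, b])      # pool the samples together

n_perm = 10_000
null_diffs = np.empty(n_perm)
for i in range(n_perm):
    shuffled = rng.permutation(pooled)                       # shuffle group labels
    null_diffs[i] = shuffled[:len(a)].mean() - shuffled[len(a):].mean()

# Two-sided p-value: proportion of shuffled differences
# at least as extreme as the one observed
p_value = np.mean(np.abs(null_diffs) >= abs(observed))
print(p_value)
```

Each shuffle produces one mean difference that could have occurred if group membership were arbitrary; together they form the simulated null distribution.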
Bootstrapping
Resampling method for assessing statistical accuracy of an estimate
Treat your own sample as if it were the entire population and resample from it
Does not work well for small samples
Typically used to estimate quantities associated with the sampling distribution of estimates
The original sample is drawn from with replacement; repeating this many times creates bootstrap samples
Basic Ideas of Bootstrapping
Treat particular sample as the entire population
Repeatedly sample with replacement to generate samples
Analyse each resampled dataset to get a distribution of estimates
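The three steps above can be sketched as a percentile bootstrap confidence interval for a mean. A minimal sketch, assuming a hypothetical sample `data` (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(10.0, 2.0, size=100)   # treat this sample as the "population"

n_boot = 10_000
boot_means = np.empty(n_boot)
for i in range(n_boot):
    # Repeatedly sample with replacement from the original sample
    resample = rng.choice(data, size=len(data), replace=True)
    boot_means[i] = resample.mean()      # analyse each bootstrap sample

# Percentile 95% confidence interval from the bootstrap distribution
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(lo, hi)
```

The spread of `boot_means` approximates the sampling distribution of the mean, which is exactly the quantity bootstrapping is typically used to estimate.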
How Science Should Work?
Thought to be a reliable way to answer questions about the world
Start with hypothesis
Collect data
Do statistics to test null hypothesis
Make conclusion based on the data
Reproducibility Crisis in Science
Findings do not replicate
Most significant effects should replicate
97% of 100 papers reported significant findings but only 37% were significant in the replication study
Findings should reproduce: repeating an experiment should replicate the significant effect - in practice this is rare
Publication Bias
Not all research is equally publishable - editorial bias and incentive structures increase the likelihood of false positives being published
Wrong incentive structures - academic success tied to ‘significant’ results, publish or perish
Distorts meta-analyses - bad for estimating effect sizes
Not publishing negative results skews meta-analyses
P-Hacking
Actively searching for ‘something significant’ in data
Cherry-picking - experiments, subjects, stopping rules
Exploiting researcher degrees of freedom in the analysis
Variant of multiple comparisons problem
Focusing on analyses that give significant findings and ignoring those that do not inflates the chance of false positives
Unless controlled for, multiple comparisons inflate the probability of false positives
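The inflation from uncorrected multiple comparisons can be shown with a short simulation. Under a true null hypothesis, p-values are uniformly distributed, so running 20 uncorrected tests per "study" produces at least one "significant" result far more often than 5% of the time (assumed numbers, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n_studies, n_tests, alpha = 10_000, 20, 0.05

# Under H0, p-values are uniform on [0, 1]; every null here is true
p_values = rng.uniform(size=(n_studies, n_tests))

# Did at least one of the 20 tests come out "significant" by chance?
any_hit = (p_values < alpha).any(axis=1)
print(any_hit.mean())   # close to 1 - 0.95**20 ≈ 0.64, not 0.05
```

This is why cherry-picking among many tests, subjects, or stopping rules makes false positives far more likely than the nominal alpha suggests.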
HARKing
Hypothesising After the Results are Known
Related to p-hacking
Hides the reality of multiple comparison problem
Apophenia
Tendency to see patterns in random data
Confirmation Bias
Tendency to focus on evidence that is in line with our expectations or favoured explanation
Hindsight Bias
Tendency to see an event as having been predictable only after it has occurred
Why do Findings Fail to Replicate?
When low-powered studies show significant effects, the estimated effect sizes tend to be overestimates
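This overestimation can be demonstrated by simulation. A minimal sketch, assuming a small true effect of 0.2 and many low-powered studies of n = 20 (all numbers are illustrative assumptions); only estimates that pass a crude z-style significance filter are kept:

```python
import numpy as np

rng = np.random.default_rng(0)
true_effect, n, n_sims = 0.2, 20, 20_000

significant_estimates = []
for _ in range(n_sims):
    sample = rng.normal(true_effect, 1.0, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)
    if abs(sample.mean() / se) > 1.96:       # crude z-style significance test
        significant_estimates.append(sample.mean())

# The average "significant" estimate is well above the true effect of 0.2
print(np.mean(significant_estimates))
```

Because a small study can only reach significance when sampling error happens to exaggerate the effect, conditioning on significance selects inflated estimates, and replications then regress back toward the smaller true effect.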
Open Science - Open Hypothesis Testing and Analysis Decision Making
Pre-registration - publicly commit to your hypothesis and analysis pipeline before conducting study - treats p-hacking and HARKing
Registered reports - pre-registration coupled to publication; can also treat publication bias
Open Science - Open Analysis Tools and Data
Make all materials open access so other researchers can double check conclusions
Treats honest mistakes
Open Science - Open Evaluation: Peer Review Published with Article
More information is always better
Open Science - Open Access
Make the research outcomes (published articles) accessible to everyone