Resampling and Open Science Flashcards
What are resampling methods?
Used to set confidence intervals and critical values for tests
Non-parametric
Relatively assumption-free
Computationally demanding
Bootstrapping vs Permutation
Both involve computer simulation (repeated resampling of the data)
Bootstrapping samples with replacement
Permutation samples without replacement
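A minimal sketch of the difference using NumPy (the `data` array is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
data = np.array([3.1, 4.5, 2.8, 5.0, 3.9, 4.2])

# Bootstrapping: draw WITH replacement, so values may repeat
boot_sample = rng.choice(data, size=len(data), replace=True)

# Permutation: draw WITHOUT replacement, i.e. a reshuffle of the data
perm_sample = rng.permutation(data)

print(boot_sample)  # some values may appear twice, others not at all
print(perm_sample)  # the same six values in a new order
```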
Simulating the Null Hypothesis Distribution: Permutation Tests
A common statistical question: comparing two groups
Null Hypothesis
States that the group means are equivalent
Alternative Hypothesis
States that the group means are not equivalent
P-value
Indicates the probability of obtaining a difference in sample means at least as extreme as the one observed by chance, if the two samples came from the same population
Null Hypothesis Testing using Parametric Methods
Simulate or select a theoretical null hypothesis sampling distribution
Determine where our observed test statistic lies within this distribution, and the probability of observing it if the null hypothesis were true
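For instance, a two-sample t-test compares the observed statistic against a theoretical t distribution. A sketch using SciPy, with simulated groups standing in for real data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(loc=10.0, scale=2.0, size=30)  # simulated group A
group_b = rng.normal(loc=11.0, scale=2.0, size=30)  # simulated group B

# The t statistic is compared against the theoretical t distribution
# that holds under the null hypothesis of equal means.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```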
Null Hypothesis Testing using Permutation Tests
- calculate the real difference between the means of the two groups
- simulate the null hypothesis sampling distribution by shuffling the pooled data
- determine where the real difference lies within this null distribution
Simulating H0: Random Shuffling of Observations
Pool the samples together, then draw new groups from the pooled sample
Repeat the process many times to see what differences arise by chance alone
This builds the distribution of mean differences that would hold under the null hypothesis (see the sketch below)
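A sketch of the shuffling procedure in NumPy (group sizes, means, and the 10,000 permutations are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
group_a = rng.normal(10.0, 2.0, size=30)
group_b = rng.normal(11.5, 2.0, size=30)

observed_diff = group_a.mean() - group_b.mean()

# Pool both groups, then repeatedly shuffle and re-split to
# simulate the distribution of mean differences under H0.
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
n_perms = 10_000
null_diffs = np.empty(n_perms)
for i in range(n_perms):
    shuffled = rng.permutation(pooled)
    null_diffs[i] = shuffled[:n_a].mean() - shuffled[n_a:].mean()

# Two-sided p-value: how often a shuffled difference is at least
# as extreme as the one actually observed.
p_value = np.mean(np.abs(null_diffs) >= np.abs(observed_diff))
print(f"observed diff = {observed_diff:.2f}, p = {p_value:.4f}")
```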
Bootstrapping
Resampling method for assessing statistical accuracy of an estimate
Take our own sample and treat it as if it were the entire population
Does not work well for small samples
Typically used to estimate quantities associated with the sampling distribution of estimates
The original sample is drawn from with replacement; repeating this many times creates the bootstrap samples
Basic Ideas of Bootstrapping
Treat a particular sample as the entire population
Repeatedly sample from it with replacement to generate bootstrap samples
Analyse each bootstrap sample to get a distribution of estimates
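A minimal bootstrap sketch in NumPy, estimating a 95% percentile confidence interval for the mean (the sample data and number of resamples are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
sample = rng.normal(10.0, 2.0, size=50)  # treat this as "the population"

n_boot = 10_000
boot_means = np.empty(n_boot)
for i in range(n_boot):
    # Resample the original sample with replacement
    resample = rng.choice(sample, size=len(sample), replace=True)
    boot_means[i] = resample.mean()

# 95% percentile bootstrap confidence interval for the mean
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {sample.mean():.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```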
How Science Should Work?
Thought to be a reliable way to answer questions about the world
Start with hypothesis
Collect data
Run statistics to test the null hypothesis
Make conclusion based on the data
Reproducibility Crisis in Science
Findings do not replicate
Most significant effects should replicate
In one large replication project, 97% of 100 original studies reported significant findings, but only 36% of the replications did (Open Science Collaboration, 2015)
Findings should reproduce: repeating an experiment should find the same significant effect, but this is rare
Publication Bias
Not all research is equally publishable - editorial bias and skewed incentives increase the likelihood that false positives are published
Wrong incentive structures - academic success tied to ‘significant’ results, publish or perish
Distorts meta-analyses - bad for estimating effect sizes
Not publishing negative results skews meta-analyses
P-Hacking
Actively searching for ‘something significant’ in data
Cherry-picking - experiments, subjects, stopping rules
Exploiting researcher degrees of freedom in the analysis
Variant of multiple comparisons problem
Focusing on analyses that give significant findings and ignoring those that do not inflates the chance of false positives
Multiple comparisons, unless controlled for, inflate the false-positive probability
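A small simulation of this inflation: 20 comparisons per experiment where the null is true for every one (all numbers here are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_experiments, n_tests = 1_000, 20

false_positive = 0
for _ in range(n_experiments):
    # 20 independent comparisons where H0 is true for every one
    p_values = [
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_tests)
    ]
    # "Something significant" in at least one test counts as a hit
    if min(p_values) < 0.05:
        false_positive += 1

# With 20 tests at alpha = 0.05, roughly 1 - 0.95**20 ≈ 64% of
# experiments yield at least one spurious "significant" result.
print(f"family-wise false-positive rate ≈ {false_positive / n_experiments:.2f}")
```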