Sampling - ReSampling Methods Flashcards
what are non-parametric statistics?
Non-parametric statistics are statistical methods that do not require the data to follow a particular distribution, such as the normal distribution. They are often applied to ordinal data, where the analysis relies on the rank or order of the observations rather than on their numerical values.
what are examples of non-parametric alternatives to parametric tests?
Mann-Whitney U / Spearman
And what do these tests do to the data in order to work?
They replace the continuous values with their ranks. Continuous data can always be ranked; the ranks follow a uniform distribution rather than a normal one, but the tests work well on them.
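As a rough illustration of the ranking step, here is a minimal Python sketch. The function name `average_ranks` and the example scores are made up for this card; tied values share the mean of the ranks they span, which is the usual convention but an assumption here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of the ranks they span."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Hypothetical scores with one extreme value (150.0) and one tie (4.1, 4.1).
scores = [3.2, 150.0, 4.1, 4.1, 2.7]
print(average_ranks(scores))  # [2.0, 5.0, 3.5, 3.5, 1.0]
```

Note how the extreme value 150.0 simply becomes rank 5: ranking removes the influence of how far away an outlier is.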
what is the nonparametric alternative to the t test?
Mann-Whitney U, or the log-rank test for survival (time-to-event) data
what is the nonparametric alternative to one way within subjects paired t test?
Wilcoxon signed-rank test
what is the nonparametric alternative to the ANOVA?
Kruskal-Wallis
what is the nonparametric alternative to linear regression?
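Spearman rank correlation (logistic regression is sometimes listed here too, but it is a parametric model for categorical outcomes rather than a rank-based method)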
what are the advantages of non-parametric tests?
- Far fewer and less restrictive assumptions, so they apply more widely
- If the parametric assumptions have been violated, they are often more powerful than the parametric test
what are the disadvantages of non-parametric tests?
- They test more complex null hypotheses (although personally I feel this is an advantage)
- They are less powerful than the parametric test if the parametric assumptions have not been violated
when would we conduct non-parametric tests?
- Small sample
- Data not suited to parametric tests (e.g. ordinal or ranked data)
- Weird distribution
What are Resampling – Randomisation tests?
Instead of deriving a theoretical sampling distribution (as in standard parametric statistics), non-parametric resampling avoids that step.
Randomisation tests use computer power to generate the sampling distribution of the statistic empirically, by randomly and repeatedly re-arranging the data and recomputing the statistic each time.
So, from our original data we generate a sampling distribution by running many pseudo-experiments. The question is: if, under the null, you ran infinitely many experiments like the one you actually ran, how would the statistic be distributed?
what is the sampling distribution ?
Sampling distribution: A sampling distribution is a probability distribution of a statistic obtained through a large number of samples drawn from a specific population. The sampling distribution of a given population is the distribution of frequencies of a range of different outcomes that could possibly occur for a statistic of a population.
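To make the definition concrete, here is a small sketch that draws many samples from a population and collects the mean of each one. The population (10,000 exponential values), the sample size of 30 and the helper name `sampling_distribution_of_mean` are all assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [rng.expovariate(1.0) for _ in range(10_000)]


def sampling_distribution_of_mean(population, sample_size, n_samples, rng):
    """Draw n_samples samples and return the mean of each one."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]


means = sampling_distribution_of_mean(population, 30, 2_000, rng)

# The collected means are the (approximate) sampling distribution of the mean.
print("population mean:", statistics.mean(population))
print("mean of sample means:", statistics.mean(means))
print("spread of sample means:", statistics.stdev(means))
```

The list `means` is what a histogram of the sampling distribution would be drawn from; its centre should sit close to the population mean even though the population itself is skewed.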
what is the difference between randomization tests and permutation tests?
Randomisation test = the statistic is computed on a random subset of all possible permutations of the data
Permutation test = the statistic is computed on every possible permutation of the data
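The contrast can be sketched in Python for a two-group difference in means. The data and the function names `exact_permutation_diffs` and `random_permutation_diffs` are hypothetical; the exact version is only practical for small samples, because the number of possible splits grows very quickly.

```python
import random
from itertools import combinations


def mean(xs):
    return sum(xs) / len(xs)


def exact_permutation_diffs(group_a, group_b):
    """Permutation test: evaluate every possible split of the pooled data."""
    pooled = group_a + group_b
    indices = range(len(pooled))
    diffs = []
    for a_idx in combinations(indices, len(group_a)):
        chosen = set(a_idx)
        a = [pooled[i] for i in a_idx]
        b = [pooled[i] for i in indices if i not in chosen]
        diffs.append(mean(a) - mean(b))
    return diffs


def random_permutation_diffs(group_a, group_b, n_resamples, rng):
    """Randomisation test: evaluate a random subset of the possible splits."""
    pooled = group_a + group_b
    n_a = len(group_a)
    diffs = []
    for _ in range(n_resamples):
        shuffled = rng.sample(pooled, len(pooled))  # shuffle, no replacement
        diffs.append(mean(shuffled[:n_a]) - mean(shuffled[n_a:]))
    return diffs


# Made-up data: two groups of four observations each.
group_a = [12.0, 15.0, 11.0, 14.0]
group_b = [9.0, 10.0, 8.0, 13.0]

exact = exact_permutation_diffs(group_a, group_b)
approx = random_permutation_diffs(group_a, group_b, 1000, random.Random(1))

print(len(exact))   # 70 possible splits of 8 values into 4 + 4
print(len(approx))  # 1000 randomly chosen splits
```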
how do randomization tests and permutation tests draw from the sample in comparison to bootstrapping?
Randomisation and permutation tests sample without replacement
Bootstrapping samples with replacement: each value is returned to the pool after it is drawn
why does bootstrapping replace data after being sampled?
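So that the observed effect is preserved in the resamples: the original sample is treated as a stand-in for the population, and any value may be drawn more than once.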
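A minimal sketch of sampling with replacement, using a bootstrap of the sample mean. The data, the 95% percentile interval and the helper names `bootstrap_means` and `percentile_interval` are assumptions added for illustration, not part of the original cards.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples, rng):
    """Resample WITH replacement and return the mean of each resample."""
    n = len(sample)
    # rng.choices draws with replacement, so a value can appear several times.
    return [statistics.mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, lower=0.025, upper=0.975):
    """Simple percentile interval taken from the sorted bootstrap statistics."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]


# Made-up sample of eight observations.
sample = [4.1, 5.3, 2.8, 6.0, 5.5, 3.9, 4.7, 5.1]

boot = bootstrap_means(sample, 5000, random.Random(42))
low, high = percentile_interval(boot)

print("sample mean:", statistics.mean(sample))
print("approximate 95% interval:", low, high)
```

Because the resamples are drawn from the data as observed (no shuffling of group labels), the bootstrap distribution is centred on the observed statistic rather than on the null value.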
what is the key to the randomisation and sampling without replacement process?
Each resample is obtained by randomising the actual data without replacement - as if H0 were true.
after generating the sampling distribution, we…?
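THEN use the artificially generated sampling distribution to test the statistic: the p-value is the proportion of resampled statistics that are at least as extreme as the one actually observed.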
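Putting the steps together, a two-sided randomisation test for a difference in means might be sketched as follows. The function name `randomisation_test`, the data, the default of 5,000 resamples and the "+1" correction in the p-value are all choices made for this example rather than anything specified in the cards.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def randomisation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Two-sided randomisation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        # Shuffle without replacement: group labels are reassigned as if H0 were true.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point rounding on exact ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Assumed convention: count the observed arrangement itself (+1) so p is never 0.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up data: every treatment value is larger than every control value.
treatment = [23.1, 25.4, 27.8, 24.9, 26.3, 28.0]
control = [20.2, 21.7, 19.8, 22.5, 21.1, 20.9]

observed, p_value = randomisation_test(treatment, control)
print("observed difference:", observed)
print("p-value:", p_value)
```

A small p-value means that few random re-arrangements of the data produce a difference as large as the observed one, which is evidence against the null hypothesis.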
how many times to resample?
Generally 5,000 to 10,000 times. If the null hypothesis is true, there will be no systematic difference in any conceivable numerically generated combination.