Choosing the correct test Flashcards
learn this overview table of common statistical tests
what does outcome variable mean?
the thing you're comparing between different groups
what does the type of statistical test we decide to use depend on?
- is the outcome continuous, binary or a time?
- the relationship between the groups being compared: independent or correlated?
- consider the assumptions that need to be made
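The decision questions above can be sketched as a small lookup function. This is only a summary of the tests named in these notes, not a complete decision tree; the function name `choose_test` and its parameters are hypothetical, and designs the notes do not cover (for example time outcomes) fall through to a default.

```python
def choose_test(outcome, correlated, n_groups, normal=True, min_count=None):
    """Return the test these notes suggest for a given design.

    outcome:    "continuous" or "binary"
    correlated: True if the observations are paired/correlated
    n_groups:   number of groups being compared
    normal:     for continuous outcomes, whether normality is assumed
    min_count:  for binary outcomes, the smallest count in the table
    """
    if outcome == "continuous":
        if n_groups == 2 and not correlated:
            return "two-sample t-test" if normal else "Mann-Whitney U test"
        if n_groups == 2 and correlated:
            return "paired t-test" if normal else "Wilcoxon signed-rank test"
        if n_groups >= 3 and not correlated:
            return "ANOVA" if normal else "Kruskal-Wallis test"
    if outcome == "binary" and not correlated:
        # Counts below 5 are the cut-off used in these notes.
        if min_count is not None and min_count < 5:
            return "Fisher's exact test"
        return "chi-squared test"
    return "not covered in these notes"


print(choose_test("continuous", correlated=False, n_groups=2))
print(choose_test("continuous", correlated=True, n_groups=2, normal=False))
print(choose_test("binary", correlated=False, n_groups=2, min_count=11))
```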
when we have a continuous outcome, how do we translate this into a statistical research question?
what is the outcome variable in this case?
maths score
the outcome variable in this case is maths score. what type of variable is this?
continuous
the outcome variable in this case is maths score, which is a continuous variable
is it normally distributed?
yes - we will assume it is for the purpose of this demo example
the outcome variable in this case is maths score, which is a continuous variable
it is normally distributed
are the observations correlated?
no - as they are randomly selected
the outcome variable in this case is maths score, which is a continuous variable
it is normally distributed
the observations are independent
are groups being compared, if so, how many?
yes, 2
the outcome variable in this case is maths score, which is a continuous variable
it is normally distributed
the observations are independent
2 groups are being compared.
Therefore, which test should we use?
T-test
Example 1: two-sample T-test
what is our first step?
define your hypothesis (null and alternative)
Therefore, what is our hypothesis for this question?
Example 1: two-sample T-test
what is our second step?
look at the observed difference between the groups
Example 1: two-sample T-test
what is our third step?
Example 1: two-sample T-test
what is our fourth step?
Example 1: two-sample T-test
what is our conclusion - do we reject the null hypothesis or not? and why?
Example 1: two-sample T-test
We also look at the confidence interval for the difference, which in this example ranges from a negative value to a positive value, so it covers 0.
if a confidence interval covers 0, is the result significant or not?
is not significant
Example 1: two-sample T-test
if we had a confidence interval set to one side of 0, so it does not cover 0, what does this say about the significance of the results?
that the result is significant, as 0 is not a plausible value for the difference
Example 1: two-sample T-test
however, in this case looking at the confidence intervals and whether or not they cover 0, what does this suggest about the significance of the results?
that it is not significant
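A minimal sketch of the two-sample t-test steps, assuming equal variances in the two groups (pooled variance). The maths scores below are made up for illustration and are not the data from the example; the critical value 2.101 is the tabulated two-sided 5% value for 18 degrees of freedom, and the function name `two_sample_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b, t_crit):
    """Pooled-variance t statistic and confidence interval for mean(a) - mean(b)."""
    n1, n2 = len(a), len(b)
    diff = mean(a) - mean(b)  # the observed difference
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))  # standard error of the difference
    t = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)
    return t, ci


# Hypothetical maths scores for two independent groups of 10 pupils.
group_a = [62, 70, 58, 75, 66, 71, 64, 69, 73, 60]
group_b = [65, 68, 61, 72, 59, 74, 63, 67, 70, 66]

T_CRIT_DF18 = 2.101  # two-sided 5% critical value, df = 10 + 10 - 2 = 18
t_stat, ci = two_sample_t(group_a, group_b, T_CRIT_DF18)
covers_zero = ci[0] <= 0 <= ci[1]

print("t statistic:", round(t_stat, 3))
print("95% CI for the difference:", tuple(round(x, 2) for x in ci))
print("CI covers 0 (not significant):", covers_zero)
```

The interval covers 0 exactly when the absolute t statistic is below the critical value, so the two ways of judging significance always agree.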
Example 2:
For this scenario we are looking at
statistical question: is there a difference in remineralisation effect between the two materials?
what is the outcome variable?
remineralisation
Example 2:
For this scenario we are looking at
statistical question: is there a difference in remineralisation effect between the two materials?
what type of variable is it?
continuous
Example 2:
For this scenario we are looking at
statistical question: is there a difference in remineralisation effect between the two materials?
is it normally distributed?
yes - for the purpose of this example we assume it is
Example 2:
For this scenario we are looking at
statistical question: is there a difference in remineralisation effect between the two materials?
are the observations correlated?
yes - the measurements come from the same patient (same mouth)
Example 2:
For this scenario we are looking at
statistical question: is there a difference in remineralisation effect between the two materials?
how many sites are being compared?
2
Example 2:
in this case what test should we use?
paired t-test
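A minimal sketch of how a paired t-test works: it reduces each pair to a single difference and then tests whether the mean difference is 0. The remineralisation values are invented for illustration, and `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """t statistic and degrees of freedom for paired measurements x and y."""
    diffs = [xi - yi for xi, yi in zip(x, y)]  # one difference per patient
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical remineralisation scores: two materials in the same 5 mouths.
material_a = [5, 7, 6, 8, 9]
material_b = [4, 5, 5, 6, 7]

t_stat, df = paired_t(material_a, material_b)
print("t statistic:", round(t_stat, 3), "with", df, "degrees of freedom")
```

Because each patient acts as their own control, the variation between patients drops out of the comparison, which is why the paired version is used when observations are correlated.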
compare the paired t-test to the t-test
the table has an alternatives column headed "alternatives if the normality assumption is violated (and small n)". what does this mean?
when we use a paired t-test or t-test, we assume that the data we collected are normally distributed. If the data are not normally distributed, or if the sample size is extremely small (less than 10), it is better to use a non-parametric test.
non-parametric means we don't calculate the mean score for each group anymore; we use the median instead.
the mean is a parameter of the distribution of the data, so if the distribution is not normal the mean score may be misleading and it is better to use the median.
the median is not a parameter; it is just the middle value
what do we compare instead of comparing the mean in a non-parametric stats test?
the median
what is the non-parametric version of a paired t-test?
Wilcoxon signed-rank test
what is the non-parametric version of a t-test?
Mann-Whitney U test
every parametric test has a corresponding …..-parametric test alternative
non
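A small illustration of why the mean can mislead for skewed data, plus a sketch of the Mann-Whitney U statistic, which depends only on the ordering of the values. The data are made up, `mann_whitney_u` is a hypothetical helper, and no p-value is computed here.

```python
from statistics import mean, median


def mann_whitney_u(a, b):
    """U statistic: count the pairs where a value from a beats one from b (ties count 0.5)."""
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    u_b = len(a) * len(b) - u_a
    return min(u_a, u_b)


# Hypothetical skewed data: one extreme value in group_a.
group_a = [3, 4, 4, 5, 40]
group_b = [6, 7, 8, 9, 10]

print("means:  ", mean(group_a), mean(group_b))      # the outlier pulls mean(a) above mean(b)
print("medians:", median(group_a), median(group_b))  # the medians show a is typically lower
print("U:", mann_whitney_u(group_a, group_b))
```

The single value of 40 makes the mean of group_a larger than that of group_b, even though four of its five values are smaller than everything in group_b.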
Example 3:
what stats test was used here to compare mean micronutrient intake from lunch?
ANOVA
ANOVA is the analysis of what?
analysis of variance
(don’t worry about the algorithm)
What type of distribution is ANOVA used for?
normally distributed variables
what other test is ANOVA just an extension of?
what is the null hypothesis of anova?
ANOVA compares 3 or more groups
so null = there is no difference between the group means (here, the 3 groups)
what is the alternative hypothesis of anova?
what type of test does ANOVA use?
F-test
what does a statistically significant ANOVA (F-test) tell you about the difference between groups?
does ANOVA tell you which groups differ?
no - just states that they do differ
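A minimal sketch of the F statistic behind one-way ANOVA: the variation between the group means divided by the variation within the groups. The intake values are invented for illustration, and `anova_f` is a hypothetical helper that returns only the statistic, not a p-value.

```python
from statistics import mean


def anova_f(*groups):
    """One-way ANOVA F statistic for two or more groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical micronutrient intakes for three groups.
f_stat = anova_f([1, 2, 3], [2, 3, 4], [3, 4, 5])
print("F statistic:", f_stat)
```

A large F says the group means are more spread out than the within-group noise would explain, but it does not say which groups differ.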
instead of using ANOVA, why can't we just do several pairwise t-tests?
what correction do we do for multiple comparisons?
important to understand what it is and how it is done
what is the name of the easiest way to carry out a correction for multiple comparisons?
Bonferroni correction
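A short sketch of how the Bonferroni correction is done: divide the significance level by the number of comparisons and judge each p-value against that stricter threshold. The p-values are made up, the helper names are hypothetical, and the family-wise error formula assumes the tests are independent.

```python
def familywise_error(alpha, m):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Return the corrected threshold and which p-values pass it."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# Three pairwise comparisons between 3 groups, each tested at 0.05.
print("family-wise error without correction:", round(familywise_error(0.05, 3), 4))

p_values = [0.01, 0.03, 0.20]  # hypothetical p-values
threshold, significant = bonferroni(p_values)
print("corrected threshold:", round(threshold, 4))
print("significant after correction:", significant)
```

Without correction the chance of at least one false positive across the three tests rises to roughly 14%, which is why several uncorrected pairwise t-tests are not a substitute for ANOVA.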
if the normality assumption of ANOVA is violated, we use a non-parametric method.
what is the non-parametric version of ANOVA called?
Kruskal-Wallis test
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
what is the outcome variable?
bad dental health (yes/no)
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
what type of variable is this?
binary
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
are the observations correlated?
no
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
are groups being compared, if so how many?
2
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
are any of the counts smaller than 5? (as with the sample-size cut-off for continuous data, there is a cut-off for using the parametric version; here it is a count of 5)
no, smallest is 11 (current smoker in good dental health)
Tests for Binary/ categorical outcome
Statistical question: does the proportion of people in bad dental health differ in smokers and non-smokers?
which test should we use?
chi-squared
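A minimal sketch of the chi-squared test for a 2x2 table, without a continuity correction. The counts are invented for illustration and are not the table from the example; `chi_squared_2x2` is a hypothetical helper, and the p-value formula used here is valid only for 1 degree of freedom.

```python
from math import erfc, sqrt


def chi_squared_2x2(table):
    """Chi-squared statistic, p-value (df = 1) and expected counts for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    # Expected count under the null = row total * column total / grand total.
    expected = [[r * col / n for col in col_totals] for r in row_totals]
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with 1 df
    return chi2, p_value, expected


# Hypothetical counts: rows = smokers, non-smokers; columns = bad, good dental health.
table = [[30, 20], [20, 30]]
chi2, p_value, expected = chi_squared_2x2(table)
print("chi-squared:", chi2)
print("p-value:", round(p_value, 4))
print("expected counts:", expected)
```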
chi-squared allows you to compare proportions between how many groups?
2 or more groups - so any number of groups
so based on this interpretation and p-value, what is our conclusion?
what is the alternative to chi-squared if any cell has a count of less than 5?
Fisher’s exact test
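A minimal sketch of Fisher's exact test for a 2x2 table, using the common two-sided definition that sums the probabilities of all tables no more likely than the observed one. The counts are made up, and `fisher_exact_two_sided` is a hypothetical helper name.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small tolerance so tables exactly as likely as the observed one are included.
    return sum(
        prob(k) for k in range(lowest, highest + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )


# Hypothetical small table where every count is below 5.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print("p-value:", round(p_value, 4))
```

Because it enumerates every possible table with the same margins, the test does not rely on the large-count approximation that chi-squared needs.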