statistical tests Flashcards
T-tests
a quantitative method: confidence intervals and a hypothesis test for the mean difference between two groups
paired T-test
confidence intervals and a hypothesis test for the mean difference between two paired groups
e.g. participants paired on the same criteria
e.g. measurements taken before and after an intervention
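As a minimal sketch of the paired t-test idea, the statistic can be computed from the within-pair differences. The function name and the before/after scores below are hypothetical, chosen only for illustration; turning the t statistic into a p-value or confidence interval would additionally need the t distribution with the returned degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # The paired t-test works on the within-pair differences.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference: SD of the differences / sqrt(n).
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / se, n - 1


# Hypothetical scores for four participants before and after an intervention.
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]
mean_diff, t_stat, df = paired_t_statistic(before, after)
print(mean_diff, round(t_stat, 2), df)
```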
unpaired t-test
confidence intervals and a hypothesis test for the mean difference between two independent groups
assumptions of unpaired t-tests
- the outcome is normally distributed in each group
- SD is similar in the two groups
- participants are independent between groups
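The similar-SD assumption is what allows the two group variances to be pooled. A rough sketch of the pooled-variance t statistic, with a hypothetical function name and made-up data:

```python
import math
import statistics


def unpaired_t_statistic(group_a, group_b):
    """Return (mean difference, t statistic, degrees of freedom), pooled-variance form."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Pooled variance: weighted average of the two sample variances.
    # This step is only sensible when the SDs are similar in the two groups.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, mean_diff / se, n_a + n_b - 2


# Hypothetical outcome scores for two independent groups.
group_a = [5, 6, 7, 8, 9]
group_b = [2, 3, 4, 5, 6]
mean_diff, t_stat, df = unpaired_t_statistic(group_a, group_b)
print(mean_diff, round(t_stat, 2), df)
```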
Analysis of variance
- A method for hypothesis testing
- Abbreviated as ANOVA
- Unpaired groups
- Provides a global p-value comparing the mean across all groups
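The global p-value comes from a single F statistic that compares variation between group means with variation within groups. The sketch below computes only that F statistic for made-up data; the function name is hypothetical, and the p-value itself would require the F distribution, which is not computed here.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data for three unpaired groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(round(f_stat, 2), df_between, df_within)
```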
repeated measures analysis of variance
hypothesis test for comparing the mean across three or more paired groups
assumptions for repeated measures analysis of variance
- the difference scores between any two groups are normally distributed in the population (or the sample size is large and the difference scores are not too skewed)
- the standard deviation of the difference scores when comparing any two groups should be similar ("sphericity" assumption)
non-parametric methods for comparing groups (4)
Mann-Whitney test, Wilcoxon signed ranks test, Kruskal-Wallis test, Friedman test
what do non-parametric methods do
compare entire distributions rather than means between groups; used when data are not normally distributed
non-parametric methods summarise their groups using
medians and interquartile ranges
which non-parametric test could be used to compare two independent groups
Mann-Whitney test
which non-parametric test could be used to compare two paired groups
Wilcoxon signed ranks test
which non-parametric test could be used to compare three or more independent groups
Kruskal-Wallis test
which non-parametric test could be used to compare three or more paired groups
Friedman test
parametric tests
T-test and analysis of variance
parametric tests make assumptions based on
the distribution of the data, e.g. that the data are normally distributed
what methods are used to summarise the groups in parametric tests
SD and means
when should non-parametric tests be used
when the assumptions that underlie parametric methods don't hold: skewed data, small sample, SDs differ markedly between groups
advantage of non-parametric methods
- Always valid for quantitative data (even skewed data in small samples and ordinal data)
- Where the assumptions of parametric methods are met, non-parametric methods often provide similar p-values
disadvantages of non-parametric methods
- They do not make direct inferences about a parameter, such as the mean difference
- Provide no confidence intervals, only p-values
- Based only on the analysis of ranks, not actual scores
- When the assumptions for parametric methods hold, non-parametric methods can be less sensitive
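To illustrate what "based only on ranks" means, here is a small sketch of the Mann-Whitney U statistic. The helper names and data are hypothetical. Replacing the largest score with an extreme value leaves U unchanged, because only the ordering of the scores matters.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group A, U for group B) from the pooled ranks."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Hypothetical data: the second call swaps 6 for an extreme value of 100.
u_plain = mann_whitney_u([1, 2, 3], [4, 5, 6])
u_extreme = mann_whitney_u([1, 2, 3], [4, 5, 100])
print(u_plain, u_extreme)
```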
box and whisker plot are used to show
- median
- lower quartile
- upper quartile
outliers
extreme observations with very low or very high values
a positively skewed distribution on a box and whisker plot is shown by
the upper part of the box being thicker than the lower part, and the upper whisker being longer than the lower whisker
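A brief sketch of the numbers behind a box and whisker plot, using Python's `statistics.quantiles` (which is one of several quartile conventions; other software may give slightly different values). The data are made up, and the whiskers are taken here as simply the minimum and maximum, an assumption that ignores any outlier rule.

```python
import statistics

# Hypothetical positively skewed data.
data = [1, 2, 2, 3, 4, 7, 15]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
lower_quartile, median, upper_quartile = statistics.quantiles(data, n=4)

lower_box = median - lower_quartile      # thickness of the bottom part of the box
upper_box = upper_quartile - median      # thickness of the top part of the box
lower_whisker = lower_quartile - min(data)
upper_whisker = max(data) - upper_quartile

# Positive skew: top of the box thicker and upper whisker longer.
positively_skewed = upper_box > lower_box and upper_whisker > lower_whisker
print(lower_quartile, median, upper_quartile, positively_skewed)
```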
correlation
is the association between two variables: the extent to which higher values of one variable occur in combination with higher values of another variable
how is correlation presented graphically
scatterplots
how is correlation presented numerically
correlation coefficients
Pearson's correlation coefficient
quantifies the strength of association between two quantitative variables which have a linear relationship. Takes values between -1 and 1.
Spearman's correlation coefficient
quantifies non-linear but monotonic relationships
R²
is the proportion of the variation in one variable that is explained by another variable
R² is known as
the coefficient of determination (r × r)
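A minimal sketch of Pearson's r and the coefficient of determination, with a hypothetical function name and made-up data. Reversing the sign of one variable flips the sign of r but leaves r × r unchanged.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r * r  # proportion of variation in y explained by x
r_negative = pearson_r(x, [-value for value in y])
print(round(r, 3), round(r_squared, 3), round(r_negative, 3))
```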
linear regression
a mathematical equation which describes the linear relationship between a quantitative outcome and quantitative predictor.
in linear regression the predictor is often
assumed to be a potential cause of the outcome
linear regression equation
outcome = a + b × predictor
a and b are regression coefficients
a is called the
constant or intercept (mean value of outcome when predictor is zero)
b is called the
slope
the predicted increase in the outcome for each one-unit increase in the predictor
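The intercept a and slope b can be estimated by least squares. The sketch below uses made-up data and a hypothetical function name; it then uses the fitted line to predict the outcome for a new predictor value.

```python
import statistics


def fit_line(predictor, outcome):
    """Least-squares estimates (a, b) for outcome = a + b * predictor."""
    mean_x = statistics.mean(predictor)
    mean_y = statistics.mean(outcome)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    b = sxy / sxx            # slope: change in outcome per one-unit change in predictor
    a = mean_y - b * mean_x  # intercept: predicted outcome when predictor is zero
    return a, b


# Hypothetical data.
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]

a, b = fit_line(predictor, outcome)
predicted_at_6 = a + b * 6
print(round(a, 2), round(b, 2), round(predicted_at_6, 2))
```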
difference between Pearson correlation and regression slope
- Pearson correlation coefficient (r) quantifies the strength of association
- regression slope (b) describes the relationship and can be used to predict the outcome variable score based on the predictor variable score
assumptions for regression (3)
1) relationship between the outcome and quantitative predictor is linear
2) residuals are normally distributed
3) constant variance (homoscedasticity)
homoscedasticity
the variability in the residuals is the same across the predicted (fitted) values
Spearman correlation
can take value between -1 and 1. Used for non-linear associations provided they are monotonic
monotonic
means that the relationship in the scatterplot never changes direction: the trend is either never decreasing or never increasing
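One way to see the difference between the two coefficients is a relationship that is monotonic but not linear. In this sketch (hypothetical helper names, made-up data, and ranks computed on the assumption that there are no tied values), Spearman's coefficient is Pearson's coefficient applied to the ranks.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks_no_ties(values):
    """Rank values from 1 upwards; assumes no two values are equal."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks_no_ties(x), ranks_no_ties(y))


# Hypothetical data: y always increases with x, but along a curve (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [value ** 3 for value in x]

pearson = pearson_r(x, y)
spearman = spearman_rho(x, y)
print(round(pearson, 3), round(spearman, 3))
```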
Chi-squared
is a parametric method; the test quantifies evidence against the null hypothesis
based on the discrepancy between the numbers observed in each cell of a 2 by 2 table and the numbers expected if the null hypothesis is true
the greater the discrepancy, the smaller the p-value and the greater the evidence against the null hypothesis
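The observed-versus-expected comparison can be sketched as follows for a 2 by 2 table. The counts and function names are hypothetical, no continuity correction is applied, and the p-value formula used here is specific to one degree of freedom (the 2 by 2 case).

```python
import math


def expected_counts(table):
    """Expected cell counts under the null hypothesis: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_squared_2x2(table):
    """Return (chi-squared statistic, p-value) for a 2 by 2 table, no correction."""
    expected = expected_counts(table)
    chi2 = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom the upper tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical table: rows are two groups, columns are outcome yes / no.
observed = [[20, 10], [10, 20]]
chi2, p_value = chi_squared_2x2(observed)
print(round(chi2, 2), round(p_value, 4))
```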
assumptions of Chi-squared
total sample size of at least 40, or, if the sample size is between 20 and 39, the expected value in each cell is at least 5
if the assumptions of chi-squared cannot be met…
Fisher's exact test is used
Fisher's exact test
is the non-parametric alternative to the chi-squared test; used for 2 by 2 contingency tables
when is Fisher's exact test used
- fewer than 20 participants or
- between 20 and 39 participants and the expected value in at least one cell is less than 5
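The sample-size rule on these cards can be written as a small decision function. This is only a sketch of the rule as stated here, with hypothetical function names and made-up tables; it chooses a test and does not run it.

```python
def expected_counts(table):
    """Expected cell counts under the null hypothesis: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table):
    """Pick chi-squared or Fisher's exact test for a 2 by 2 table."""
    n = sum(sum(row) for row in table)
    smallest_expected = min(min(row) for row in expected_counts(table))
    if n >= 40:
        return "chi-squared"
    if n >= 20 and smallest_expected >= 5:
        return "chi-squared"
    # Fewer than 20 participants, or 20 to 39 with an expected count below 5.
    return "Fisher's exact"


# Hypothetical tables covering each branch of the rule.
large_sample = choose_test([[20, 10], [10, 20]])      # n = 60
medium_ok = choose_test([[8, 7], [7, 8]])             # n = 30, all expected 7.5
medium_small_cell = choose_test([[12, 1], [10, 7]])   # n = 30, one expected below 5
small_sample = choose_test([[3, 2], [1, 4]])          # n = 10
print(large_sample, medium_ok, medium_small_cell, small_sample, sep=" | ")
```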