eM2 – Choosing statistics Flashcards
In terms of analysis, what are correlations?
Hypothesis tests to evaluate relationships between variables
What are comparisons?
Hypothesis tests to evaluate differences between groups or populations
What is quantitative data?
Numeric information about quantities, e.g. height, width.
What is qualitative / categorical data?
Information that describes categories rather than measured quantities, e.g. gender, stage of disease.
Give two types of quantitative data and an example for each:
Continuous: age. Counted (discrete): number of people with hypertension.
Give two types of qualitative data and an example for each:
Nominal: gender. Ordinal: fitness (not fit, quite fit, very fit).
What is the difference between continuous and discrete data?
Continuous data can be divided into ever finer, more precise levels. Discrete data take only whole, countable values and cannot be made more precise.
What is nominal data?
Qualitative data containing individual categories that cannot be put in an implicit rank/order
What is ordinal data?
Categories that have an implicit/natural order.
What is normality in terms of statistical analysis?
Normality describes how closely the data follow a normal distribution, judged from their central tendency and dispersion, i.e. a symmetric, bell-shaped distribution with "well behaved tails".
What is meant by left skewness?
Mean to the left of the peak, long tail in negative (decreasing) direction of curve
What is meant by right skewness?
Mean to the right of the peak, long tail in positive (increasing) direction of the curve
What is kurtosis?
The sharpness of the peak of a distribution curve; more precisely, how heavy the tails are compared with a normal distribution.
What two factors do statistical tests rely on?
Symmetry: about 50% of values lie above the mean and 50% below. Normal distribution: about two-thirds (roughly 68%) of the data lie within 1 SD of the mean.
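The two rules of thumb above can be checked directly on a data set. The following is only a rough sketch using simulated values; the helper name `normality_rules_of_thumb` and the sample parameters (mean 100, SD 15) are assumptions made for illustration, and this check does not replace a formal normality test.

```python
import random
import statistics


def normality_rules_of_thumb(values):
    """Return (fraction of values above the mean, fraction within 1 SD of the mean)."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (divides by N)
    n = len(values)
    above = sum(1 for v in values if v > mu) / n
    within = sum(1 for v in values if abs(v - mu) <= sd) / n
    return above, within


random.seed(0)  # fixed seed so the illustration is repeatable
sample = [random.gauss(100, 15) for _ in range(10_000)]  # simulated, roughly normal data

above, within = normality_rules_of_thumb(sample)
print(f"above mean: {above:.2f}, within 1 SD: {within:.2f}")
```

For roughly normal data the first value should sit near 0.5 and the second near 0.68; strongly skewed data will drift away from both.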
How to assess normality of data quantitatively?
Shapiro-Wilk test: n < 50. Kolmogorov-Smirnov test: n > 50.
What is descriptive statistics?
A method of summarising large data sets in a format that is easy to read (tangible).
What is the mean?
μ = ( Σ Xi ) / N
What is the median?
The middle value of the ordered data set: the (n+1)/2-th value (the average of the two middle values when n is even).
What is the mode?
Most frequent data entry.
What is the standard deviation in a data set?
σ = sqrt[ Σ (Xi - μ)^2 / N ]. A measure of how dispersed the data are around the mean.
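The mean, median, mode and standard deviation cards can be tried out with Python's standard `statistics` module. This is a small sketch on made-up values; the list `data` is purely illustrative.

```python
import statistics

data = [2, 4, 4, 5, 7, 9, 11]  # made-up values, N = 7

mean = statistics.fmean(data)        # (Σ Xi) / N
median = statistics.median(data)     # middle value of the sorted data
mode = statistics.mode(data)         # most frequent value
sd_pop = statistics.pstdev(data)     # σ: divides by N, as in the formula above
sd_sample = statistics.stdev(data)   # sample SD: divides by N - 1 instead

# The (n+1)/2-th value of the sorted data (1-based position) is the median for odd n.
position = (len(data) + 1) // 2
median_by_position = sorted(data)[position - 1]

print(mean, median, mode, round(sd_pop, 3), round(sd_sample, 3))
```

Note that `pstdev` matches the population formula on the card, while `stdev` gives the slightly larger sample estimate that is normally reported for study data.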
What is meant by dependent (paired) data?
When repeated measurements are collected from the same subject.
What is meant by parametric statistics?
When the data from the population are well described by the mean and SD - normally distributed.
What is meant by non-parametric statistics?
When the data are not well described by the mean and SD, i.e. non-normally distributed quantitative data. Note: non-parametric tests are also used for qualitative (ordinal) data.
Parametric, 2 groups, paired
Paired t-test
Parametric, 2 groups, unpaired
Independent (unpaired) t-test
Parametric, 3+ groups, paired
Repeated-measures one-way ANOVA
Parametric, 3+ groups, unpaired
One-way ANOVA
Non-parametric, 2 groups, paired
Wilcoxon Signed Rank test
Non-parametric, 2 groups, unpaired
Mann-Whitney U test
Non-parametric, 3+ groups, paired
Friedman test
Non-parametric, 3+ groups, unpaired
Kruskal-Wallis test
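The eight comparison cards form a small decision table, which can be sketched as a lookup. The function name `choose_comparison_test` and its parameters are hypothetical, chosen only for this illustration; the mapping uses the standard pairing, in which the Friedman test handles paired data and the Kruskal-Wallis test handles unpaired data for three or more groups.

```python
def choose_comparison_test(parametric: bool, groups: int, paired: bool) -> str:
    """Look up a comparison test from data type, number of groups and pairing."""
    if groups < 2:
        raise ValueError("a comparison needs at least two groups")
    three_plus = groups >= 3
    table = {
        # (parametric, three or more groups, paired): test name
        (True, False, True): "Paired t-test",
        (True, False, False): "Independent t-test",
        (True, True, True): "Repeated-measures one-way ANOVA",
        (True, True, False): "One-way ANOVA",
        (False, False, True): "Wilcoxon signed-rank test",
        (False, False, False): "Mann-Whitney U test",
        (False, True, True): "Friedman test",
        (False, True, False): "Kruskal-Wallis test",
    }
    return table[(parametric, three_plus, paired)]


print(choose_comparison_test(parametric=True, groups=2, paired=True))
print(choose_comparison_test(parametric=False, groups=4, paired=False))
```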
To test for a linear relationship in a normally distributed population:
Pearson’s Correlation test
To test for a linear relationship in a non-normally distributed population:
Spearman’s rank correlation test (works on ranks, so it detects any monotonic relationship, not only a linear one)
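To see how the two correlation coefficients differ, the sketch below computes both by hand on made-up data. The helper names `pearson`, `ranks` and `spearman` are assumptions for this illustration; Spearman's coefficient is obtained here as Pearson's coefficient applied to the ranks, with tied values sharing their average rank. Only the coefficients are computed, not the p-values of the tests.

```python
import math


def pearson(x, y):
    """Pearson's r: covariance scaled by the spread of x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j across a run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # made-up data: always increasing, but not in a straight line

print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

Because `y` always rises with `x`, the rank-based coefficient is 1, while Pearson's r stays below 1 because the points do not lie on a straight line.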
Where are the mean, median and mode in skewed curves?
The mode sits at the peak. In a right-skewed curve the median and mean lie to the right of the mode (mode < median < mean); in a left-skewed curve they lie to the left (mean < median < mode).
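The position of the mean and median relative to the peak can be checked on a small example. This is a sketch with made-up values; `right_skewed` is an invented data set with a long tail towards high values, and `left_skewed` is simply its mirror image.

```python
import statistics

right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]   # made-up data, long tail to the right
left_skewed = [16 - v for v in right_skewed]     # mirror image, long tail to the left


def centre_measures(values):
    """Return (mode, median, mean) for a list of values."""
    return statistics.mode(values), statistics.median(values), statistics.fmean(values)


r_mode, r_median, r_mean = centre_measures(right_skewed)
l_mode, l_median, l_mean = centre_measures(left_skewed)

print("right skew:", r_mode, r_median, r_mean)
print("left skew: ", l_mode, l_median, l_mean)
```

In the right-skewed list the long tail pulls the mean furthest from the peak, with the median in between; the mirrored list shows the same pattern in the opposite direction.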