Lecture 4: Analysis of Continuous and Categorical Variables Flashcards
Descriptive vs Inferential Statistics
descriptive: describing the central tendency and dispersion of data
inferential: use sample data to draw conclusions about the population that the sample is meant to represent (sampling will naturally involve error)
o Estimate parameters and test hypotheses to make inferences about the population
o Compare means and evaluate relationships
o Test statistics, p-values, confidence intervals
Which test do I apply if I have 2 related samples and parametric data?
paired t-test
Which test do I apply if I have 2 related samples and non-parametric data?
Wilcoxon signed-rank test
Which test do I apply if I have 2 independent samples and parametric data?
Independent t-test
Which test do I apply if I have 2 independent samples and non-parametric data?
Mann-Whitney U test
Which test do I apply if I have 3 or more groups and parametric data?
ANOVA
Which test do I apply if I have 3 or more groups and non-parametric data?
Kruskal-Wallis test
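The six test-selection cards above can be summarised as a small lookup. This is only a sketch: the function name `choose_test` and its parameters are made up for illustration, and it covers only the designs listed on the cards (related designs with 3 or more groups are not covered there, so the function refuses them).

```python
def choose_test(n_groups: int, related: bool, parametric: bool) -> str:
    """Return the test named on the cards for a given study design.

    n_groups   -- number of samples/groups being compared
    related    -- True if the same participants are measured in each sample
    parametric -- True if the data meet parametric assumptions
    """
    if n_groups < 2:
        raise ValueError("need at least 2 samples or groups")
    if n_groups == 2:
        if related:
            return "paired t-test" if parametric else "Wilcoxon signed-rank test"
        return "independent t-test" if parametric else "Mann-Whitney U test"
    if related:
        # Not covered by the cards; refuse rather than guess.
        raise ValueError("related designs with 3+ groups are not covered here")
    return "ANOVA" if parametric else "Kruskal-Wallis test"


print(choose_test(2, related=False, parametric=False))
print(choose_test(4, related=False, parametric=True))
```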
What is Student's t-test?
Used to compare means between two groups
o Related groups: paired t-test (e.g. pre- and post- study measures on the same participants)
o Independent groups: unpaired/independent t-test
What is the null hypothesis in Student's t-test?
The population means of the two groups are equal (the group means are not statistically different)
What are the degrees of freedom in Student's t-test?
The amount of information provided by the data that can be used to estimate population parameters and the variability of the estimates
o df = n - # of estimated parameters
o As df increases, the t-distribution more closely resembles a normal distribution
• E.g. One-sample t-test to estimate the population mean
o Estimates the standard deviation about the mean
o Uses a t-distribution with df = n - 1
o df = n - 1 for the paired t-test as well (n = number of pairs)
• E.g. Two-sample independent t-test to compare two means
o Uses a t-distribution with df = n1 + n2 - 2
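The df rules above are simple enough to write down directly. The sketch below uses a hypothetical helper `t_test_df`; the group sizes of 60 and 58 are assumptions chosen only so that the two-sample result matches the df = 116 quoted on the reporting card further down, since the cards do not give the actual group sizes.

```python
def t_test_df(design: str, n1: int, n2: int = 0) -> int:
    """Degrees of freedom for the t-test designs on the cards.

    design -- "one-sample", "paired", or "two-sample"
    n1     -- sample size (number of pairs for a paired design)
    n2     -- size of the second group (two-sample design only)
    """
    if design in ("one-sample", "paired"):
        return n1 - 1          # one estimated parameter (the mean)
    if design == "two-sample":
        return n1 + n2 - 2     # two estimated parameters (two means)
    raise ValueError(f"unknown design: {design}")


# Hypothetical group sizes (not given on the cards).
print(t_test_df("two-sample", 60, 58))
print(t_test_df("paired", 30))
```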
What are the assumptions of the t-test?
• Samples are independent
• Variable is normally distributed
• Variance homogeneity: the variance within each group is equal
o Levene’s test for equality of variances in SPSS (automatically conducted)
o Informs you whether to use results for pooled or unpooled variance
• T-tests are fairly robust even if assumptions are not perfectly met
What is the t-statistic?
Difference between the means divided by the pooled or unpooled standard error of the difference between the means
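As a worked illustration of the definition above, the sketch below computes the t-statistic both ways using only Python's standard library. The function names and the two small samples are hypothetical, made up for the example. The unpooled version uses the standard unequal-variance (Welch) standard error, which is an assumption about what "unpooled" means on the card.

```python
import math
import statistics


def pooled_t(a, b):
    """Equal-variance t-statistic and its df (n1 + n2 - 2)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # SE of the difference
    return (statistics.mean(a) - statistics.mean(b)) / se, df


def unpooled_t(a, b):
    """Unequal-variance (Welch) t-statistic; each group keeps its own variance."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical data, for illustration only.
group_a = [24.0, 26.0, 22.0, 28.0, 25.0]
group_b = [21.0, 23.0, 20.0, 24.0, 22.0]

t_pooled, df = pooled_t(group_a, group_b)
t_unpooled = unpooled_t(group_a, group_b)
print(round(t_pooled, 2), df, round(t_unpooled, 2))
```

With equal group sizes, as here, the pooled and unpooled standard errors coincide, so the two statistics match; they diverge when the group sizes and variances both differ.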
What is a confidence interval?
A measure of the degree of uncertainty: the range around the sample statistic where the corresponding population parameter is likely to be
True/False
The larger the sample, the narrower the CI
True
There is a greater likelihood that the sample statistic approximates the population parameter
True/False
If the CI for the difference between the means contains 0 (the null value), then the means are not statistically different (non-significant finding)
True
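Both true/false cards above can be checked numerically. The sketch below is a rough illustration, not the SPSS calculation: it uses a normal (z) approximation instead of the t-distribution, so intervals for small samples come out slightly too narrow. The mean and standard deviation are taken from the reporting example on the cards (24.0 and 4.1); the sample sizes, the example interval, and the helper names are hypothetical.

```python
import math
from statistics import NormalDist


def ci_mean_normal(mean, sd, n, level=0.95):
    """Approximate CI for a mean using the normal (z) approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% CI
    half_width = z * sd / math.sqrt(n)          # shrinks as n grows
    return mean - half_width, mean + half_width


def contains_null(ci, null_value=0.0):
    """True if the interval includes the null value (non-significant)."""
    low, high = ci
    return low <= null_value <= high


# Same mean and SD, two hypothetical sample sizes.
small = ci_mean_normal(24.0, 4.1, n=25)
large = ci_mean_normal(24.0, 4.1, n=100)
width_small = small[1] - small[0]
width_large = large[1] - large[0]
print(round(width_small, 2), round(width_large, 2))

# Hypothetical CI for a difference between two means.
print(contains_null((-0.6, 2.8)))
```

Quadrupling the sample size halves the interval width, because the standard error scales with 1 / sqrt(n).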
How to report independent t-test results?
Report the means and standard deviations for both groups, t-value, degrees of freedom, and p-value.
o E.g. Males (mean ± StD): 24.0 ± 4.1, Females (mean ± StD): 22.9 ± 4.0; t = 1.3, df = 116, p = 0.20
P-values from t-tests are often reported in the subject characteristics table when data are compared between two groups (e.g. males vs. females, intervention vs. control group)
Analysis of Variance (ANOVA)
Test to determine if means differ between 3 or more groups
o Unlike t-tests, uses variance to assess differences
• Tests the null hypothesis that the means of all groups are equal
o F-test will result in rejection of the null hypothesis when variability between group means is sufficiently larger than variability within the groups.
F-statistic
Ratio of the variability between group means to the variability within the groups. If the ratio is large, it indicates that not all means are equal -> a significant p-value will result
In an F-test what does it mean to reject the null hypothesis?
variability between group means is sufficiently larger than variability within the groups
Degrees of freedom in ANOVA
• Df1: df associated with the numerator of the F-statistic
o Df = k - 1
o k: # of group means
• Df2: df associated with the denominator of the F-statistic
o Df = n - k
The first one represents between-group variability, so it is calculated as k - 1 (k is the number of groups). The denominator is the within-group variability, which is calculated as n - k (sample size minus the number of group means).
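To make the F ratio and its two df concrete, the sketch below computes a one-way ANOVA F-statistic by hand with the standard library. The function name and the three tiny groups are hypothetical, chosen so the arithmetic is easy to follow; this is a textbook calculation, not a reproduction of SPSS output.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df1, df2) for a one-way ANOVA on a list of groups."""
    k = len(groups)                         # number of groups
    n = sum(len(g) for g in groups)         # total sample size
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: group means vs. the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations vs. their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )

    df1, df2 = k - 1, n - k
    f_value = (ss_between / df1) / (ss_within / df2)
    return f_value, df1, df2


# Hypothetical data, for illustration only.
data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df1, df2 = one_way_anova_f(data)
print(round(f_value, 2), df1, df2)
```

Here the group means (2, 3 and 6) are far apart relative to the spread inside each group, so the ratio is large.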
What does variability look like in SPSS?
If a group has low within-group variability, its dots are close together. If there is high variability between the groups, the groups' dots are further apart from each other, so you are more likely to find a significant difference.
The more spread apart the dots within a group, the more variation there is; less dispersion means more precision. When the differences between the groups are larger than the differences within the groups, you will have a significant finding.
True/False
A low F value (little variability between the groups relative to the variability within them) leads to retaining the null hypothesis.
True
What are the assumptions of ANOVA?
- Variable is normally distributed
- The errors are normally distributed
- The cases are independent from each other
- Variance homogeneity
How to report ANOVA results?
- Report means and standard deviations for all groups, as well as the F value, degrees of freedom, and p-value
- Text for this example:
o Analysis of variance indicated that the different physical activity groups report different levels of caloric intake:
§ Sedentary: 1640.1 ± 516.7
§ Light: 2030.8 ± 570.9
§ Moderate: 1999.4 ± 670.7
§ High: 2005.1 ± 633.5
§ F(3, 144) = 3.3, p = 0.02
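A quick sanity check on a reported F result is to work backwards from the two df values: k = df1 + 1 groups and n = df2 + k participants. The sketch below does this for the example above and formats the result string; the helper names are hypothetical, and the implied total sample size is derived from the df rule rather than stated on the card.

```python
def design_from_df(df1: int, df2: int):
    """Recover (number of groups, total sample size) from ANOVA df."""
    k = df1 + 1      # df1 = k - 1
    n = df2 + k      # df2 = n - k
    return k, n


def format_anova(f_value: float, df1: int, df2: int, p: float) -> str:
    """Format an ANOVA result in the style used on the card."""
    return f"F({df1}, {df2}) = {f_value:.1f}, p = {p:.2f}"


k, n = design_from_df(3, 144)
report = format_anova(3.3, 3, 144, 0.02)
print(k, n)
print(report)
```

The recovered k of 4 matches the four activity groups listed (sedentary, light, moderate, high).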