Biostats-ANOVA Flashcards
ANOVA
- Used for hypothesis testing for more than 2 means
- an extension of the 2 sample independent groups t-test
- Two sources of variability in the outcome variable:
(1) within-group and (2) between-group
- These two sources of variability are compared, which is where the name analysis of variance originates
- The outcome is continuous and assumed to follow a normal distribution
- The independent variable is usually categorical
- The independent samples are usually drawn from at least 2 populations OR, for experimental studies, a single random sample is drawn and participants are randomly assigned to one of the interventions
Hypothesis Testing for more than 2 means
H0: μ1 = μ2 = μ3 = … = μk
H1: Means are not all equal
F test statistic
F = Between-group variability / Within-group variability
- Can use the F table to find the cutoff (critical) values
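As a minimal sketch of how the F statistic is built from the two sources of variability, the pure-Python function below computes the between-group and within-group mean squares and their ratio. The function name `one_way_anova` and the three small groups are made up for illustration and do not come from the cards.

```python
def one_way_anova(groups):
    """Return the pieces of a one-way ANOVA for a list of groups (lists of numbers)."""
    k = len(groups)                                # number of groups
    n_total = sum(len(g) for g in groups)          # total sample size N
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)              # df between = k - 1
    ms_within = ss_within / (n_total - k)          # df within = N - k
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }


# Hypothetical data: three treatment groups of five observations each
example_groups = [
    [4, 5, 6, 5, 5],
    [7, 8, 6, 7, 7],
    [9, 10, 8, 9, 9],
]
anova_result = one_way_anova(example_groups)
print(anova_result)
```

For this made-up data the between-group mean square is 20 and the within-group mean square is 0.5, so F works out to 40.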
ANOVA decision rule
- When using an alpha of 0.05, reject the null hypothesis if F is greater than or equal to the 95th percentile of the F distribution
- F values near 1 support the null
- F values much greater than 1 support the alternative
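The decision rule can be sketched as a simple comparison against a tabled critical value. The helper name `anova_decision` and the example F values are hypothetical. The critical value 3.89 is the 95th percentile of the F distribution with 2 and 12 degrees of freedom as read from a standard F table, which assumes k = 3 groups and N = 15 observations.

```python
def anova_decision(f_stat, f_critical):
    """Apply the decision rule: reject H0 when F >= the tabled critical value."""
    return "reject H0" if f_stat >= f_critical else "fail to reject H0"


# Assumed setting: k = 3 groups, N = 15 observations -> df = (k - 1, N - k) = (2, 12).
# 3.89 is the 95th percentile of F(2, 12) taken from a standard F table (alpha = 0.05).
F_CRITICAL_2_12 = 3.89

decision_large = anova_decision(40.0, F_CRITICAL_2_12)    # F far above 1
decision_near_one = anova_decision(1.1, F_CRITICAL_2_12)  # F close to 1
print(decision_large, "|", decision_near_one)
```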
Parts of the ANOVA Table
- ANOVA divides total variability into different components: Between and Within treatments
- Mean Square: the estimated variance obtained by dividing the sum of squares by the degrees of freedom
Mean squared between groups
- A measure of between-group variability
- It measures how much observations vary when individuals receive different treatments
Mean squared error
- An average measure of within-group variability (a measure of variability in observations when individuals are treated alike)
What statistic provides a measure of how much of the total variation in the response variable can be explained by the independent factor?
R squared
- Values range from 0 to 1, or 0% to 100%
- R squared = Sum of Squares (Between) / Sum of Squares (Total)
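A minimal sketch of the R squared calculation, assuming the data are held as plain lists of numbers, one list per group. The function name `r_squared` and the data are illustrative only.

```python
def r_squared(groups):
    """Share of total variation explained by group membership: SS(Between) / SS(Total)."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    # Total sum of squares: every observation around the grand mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical data: three treatment groups of five observations each
r2_groups = [
    [4, 5, 6, 5, 5],
    [7, 8, 6, 7, 7],
    [9, 10, 8, 9, 9],
]
r2 = r_squared(r2_groups)
print(f"R squared = {r2:.3f}")
```

Here SS(Between) is 40 and SS(Total) is 46, so about 87% of the total variation is explained by the grouping factor.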
Post-Hoc Analysis
- Used when the null hypothesis of the ANOVA F-test is rejected and you want to know more about the relationships among the means (i.e. which is largest and smallest)
- Post-Hoc analysis is done in pairwise manner (comparing two-at-a-time)
Experimentwise Error Rate
- the probability of making at least one Type 1 error across all tests done on the same data
- inflates as more and more tests are done, pushing the overall Type 1 error rate above the nominal alpha
- can adjust pairwise comparisons by dividing the alpha level by the number of comparisons to be made (alpha level / # comparisons)
- This is called the Bonferroni or Dunn test
- Remember when controlling for type 1 error rate, the type 2 error rate will likely increase
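To make the inflation and the adjustment concrete, the sketch below counts the pairwise comparisons among k groups, divides alpha by that count, and estimates the unadjusted experimentwise error rate. The formula 1 - (1 - alpha)^m assumes the m tests are independent, which is an approximation added here rather than something stated on the cards. Function names are hypothetical.

```python
from math import comb


def bonferroni_alpha(alpha, n_groups):
    """Return (adjusted alpha, number of pairwise comparisons) for n_groups means."""
    n_comparisons = comb(n_groups, 2)      # all two-at-a-time pairs
    return alpha / n_comparisons, n_comparisons


def experimentwise_error(alpha, n_comparisons):
    """Chance of at least one Type 1 error, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** n_comparisons


# Hypothetical example: 4 treatment groups tested at alpha = 0.05
adjusted_alpha, n_pairs = bonferroni_alpha(0.05, 4)
unadjusted_rate = experimentwise_error(0.05, n_pairs)
print(n_pairs, round(adjusted_alpha, 4), round(unadjusted_rate, 4))
```

With 4 groups there are 6 pairwise comparisons, so each test would be run at roughly 0.0083 instead of 0.05.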
3 most widely used methods for Post-Hoc analysis
Fisher’s LSD
Tukey’s HSD
Bonferroni’s adjusted t-test
Fisher’s LSD
- Performed for every combination of means
- It is a slight modification to the 2 sample independent groups t-test
- It is an unadjusted two sample t test
- Mean squared error is used as a measure of sampling variability
- Degrees of freedom are N-k (where N = total sample size and k = number of treatment groups)
Post-Hoc Analysis: Fisher’s LSD
- To find out if two means are significantly different, take the absolute difference of the sample means and compare it to the LSD value calculated for that pair of groups.
- If the absolute difference of the sample means is greater than the LSD value, then the difference between the means is statistically significant
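The comparison described above can be sketched as follows, using the common form LSD = t × sqrt(MSE × (1/n_i + 1/n_j)), where t is the two-sided critical value with N-k degrees of freedom. The function name `fisher_lsd`, the group means, and the MSE are hypothetical. The value 2.179 is the two-sided 5% critical t for 12 degrees of freedom from a standard t table.

```python
from itertools import combinations
from math import sqrt


def fisher_lsd(means, sizes, mse, t_critical):
    """Compare every pair of group means against its LSD value.

    Returns {(i, j): (absolute difference, LSD, significant?)}.
    """
    results = {}
    for i, j in combinations(range(len(means)), 2):
        # LSD for this pair uses the pooled MSE from the ANOVA table
        lsd = t_critical * sqrt(mse * (1 / sizes[i] + 1 / sizes[j]))
        diff = abs(means[i] - means[j])
        results[(i, j)] = (diff, lsd, diff > lsd)
    return results


# Hypothetical example: 3 groups of 5 (N = 15, k = 3, df = N - k = 12), MSE = 0.5.
# 2.179 is the two-sided alpha = 0.05 critical t for 12 df from a t table.
lsd_results = fisher_lsd([5.0, 7.0, 9.0], [5, 5, 5], 0.5, 2.179)
for pair, (diff, lsd, significant) in lsd_results.items():
    print(pair, round(diff, 3), round(lsd, 3), significant)
```

Because the group sizes are equal here, every pair shares the same LSD value; with unequal sizes the LSD differs from pair to pair.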
Dunnett’s Test
- Compares each treatment group mean to the control group mean
- For the degrees of freedom, use k-1 and N-k
Two-Factor ANOVA
- Comparing means of a continuous outcome across two grouping variables or factors
- Essentially examining interaction
- Overall test: is there a difference in cell means?
- Factor A: do marginal means of factor A differ?
- Factor B: do marginal means of factor B differ?
- Interaction: do the differences in means across levels of factor B depend on the level of factor A?
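The cell means, marginal means, and interaction questions can be illustrated descriptively with a small 2 x 2 layout. This sketch only computes the means being compared; it does not carry out the F tests. The factor levels ("drug"/"placebo", "low"/"high") and the data are invented, and averaging cell means to get marginal means assumes a balanced design with equal cell sizes.

```python
from statistics import mean

# Hypothetical balanced 2 x 2 design: keys are (level of factor A, level of factor B)
data = {
    ("drug", "low"): [10.0, 12.0],
    ("drug", "high"): [14.0, 16.0],
    ("placebo", "low"): [9.0, 11.0],
    ("placebo", "high"): [10.0, 12.0],
}

cell_means = {cell: mean(values) for cell, values in data.items()}
levels_a = sorted({a for a, _ in data})
levels_b = sorted({b for _, b in data})

# Marginal means: average over the other factor (valid as-is for equal cell sizes)
marginal_a = {a: mean(cell_means[(a, b)] for b in levels_b) for a in levels_a}
marginal_b = {b: mean(cell_means[(a, b)] for a in levels_a) for b in levels_b}

# Interaction check: does the high-minus-low difference in B change with the level of A?
effect_of_b = {a: cell_means[(a, "high")] - cell_means[(a, "low")] for a in levels_a}

print("cell means:", cell_means)
print("marginal means of A:", marginal_a)
print("marginal means of B:", marginal_b)
print("effect of B within each level of A:", effect_of_b)
```

In this made-up data the effect of B is 4 units under "drug" but only 1 unit under "placebo", which is the kind of pattern the interaction test is designed to detect.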