Chapter 11 part a: GLM1 Flashcards
ANOVA
- Linear model to compare several means
Predictor with 2 categories
- b
- represents the difference between the means of the 2 categories
- is that difference statistically significant?
Predictor with more than 2 categories
- b
- we have to create dummy variables so that b compares differences between two means
- each dummy variable will have 2 categories
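A minimal sketch of what this dummy coding looks like, assuming a hypothetical 3-category predictor with made-up group labels ("control", "low_dose", "high_dose"); the baseline category scores 0 on every dummy, and each remaining category gets its own dummy coded 1:

```python
# Dummy coding for a 3-category predictor (made-up group labels).
# "control" is the baseline: it scores 0 on both dummies, so each b
# compares one group's mean with the baseline mean.

def dummy_code(group):
    """Return (Dummy1, Dummy2) for one observation's group label."""
    return (1 if group == "low_dose" else 0,
            1 if group == "high_dose" else 0)

print(dummy_code("control"))    # -> (0, 0)  baseline
print(dummy_code("low_dose"))   # -> (1, 0)
print(dummy_code("high_dose"))  # -> (0, 1)
```

Each dummy is itself a 2-category (0/1) variable, which is why b can again be read as a difference between two means.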
ANOVA vs Regression
- the ANOVA F-test is the same test used to assess the fit of a regression model
- ANOVA: special case of linear model (regression)
- Equation: Same
Outcome=model+error
Important in ANOVA:
- the baseline category should contain a fairly large number of cases so that b estimates are reliable
ANOVA Equation
Outcome = b0 + b1*Dummy1 + b2*Dummy2 + error
- b0: mean of the base (control) category
- b1: difference between the mean of the 1st comparison group and the control mean
- b2: difference between the mean of the 2nd comparison group and the control mean
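A sketch of how b0, b1 and b2 fall out of the group means under dummy coding; the three groups of scores are made-up example data, with "control" as the baseline:

```python
# With dummy coding, b0 is the baseline mean and b1, b2 are each
# comparison group's mean minus that baseline. Scores are made up.

control = [3, 2, 1, 1, 4]
group1  = [5, 2, 4, 2, 3]
group2  = [7, 4, 5, 3, 6]

def mean(xs):
    return sum(xs) / len(xs)

b0 = mean(control)       # mean of the base category
b1 = mean(group1) - b0   # 1st comparison group mean minus control mean
b2 = mean(group2) - b0   # 2nd comparison group mean minus control mean
print(b0, b1, b2)
```

With these scores, b0 = 2.2, b1 = 1.0 and b2 = 2.8: the two comparison groups score on average 1.0 and 2.8 points above the control group.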
ANOVA: F significant
- using group means to predict is better than using overall mean
Logic of F-ratio
- if group means are the same: our model is poor [F small]
- if group means are different: our model is good [F large]
- F: whether group means are different
Logic of ANOVA
- Simplest Model: Grand Mean of outcome: No effect
- Intercepts and parameters describe the model
- Parameters: shape of fitted Model
—> the bigger the coefficients (b): the greater the deviations between model & grand mean
- Experimental research: b represents the difference between group means
- If the difference between groups is large enough then our model is better than grand mean
Total Sum of Squares [SST]
- total amount of variation
- SST=sum(obs. data-grand mean)^2
Grand Variance
- variances between all scores regardless of experimental condition
- grand variance (s^2)=SST/(N-1)
- SST=grand variance x (N-1)
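A quick sketch checking both identities above on made-up scores: SST is the total squared deviation from the grand mean, and it equals the grand variance times (N - 1):

```python
# SST two ways: direct sum of squared deviations from the grand mean,
# and grand variance x (N - 1). Scores are made-up example data.

scores = [3, 2, 1, 1, 4, 5, 2, 4, 2, 3, 7, 4, 5, 3, 6]
N = len(scores)
grand_mean = sum(scores) / N

sst = sum((x - grand_mean) ** 2 for x in scores)
grand_variance = sst / (N - 1)

print(round(sst, 4), round(grand_variance, 4))
```

Here SST comes out to about 43.73, so the grand variance is 43.73 / 14 ≈ 3.12, and multiplying back by (N - 1) recovers SST exactly.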
Model Sum of Squares [SSM]
- variation explained by the regression model
- SSM=sum[nk(group mean-grand mean)^2]
- dfM= k-1
Residual Sum of Squares [SSR]
- variation that can NOT be explained by our model
- variation caused by extraneous factors
—> SSR = SST - SSM
—> SSR = sum(xik - group mean)^2
—> SSR = SSR1 + SSR2 + SSR3...
—> SSR = sum[sk^2 (nk-1)]: variance of each group x (nk-1)
—> dfR = dfT - dfM
—> dfR = N - k
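A sketch verifying the SSR identities above on made-up data: SSR = SST - SSM, and SSR also equals the sum over groups of each group's variance times (nk - 1):

```python
# SSR two ways: as the leftover SST - SSM, and as the sum of each
# group's sample variance x (nk - 1). Groups are made-up scores.

groups = [[3, 2, 1, 1, 4], [5, 2, 4, 2, 3], [7, 4, 5, 3, 6]]
all_scores = [x for g in groups for x in g]
N = len(all_scores)
grand_mean = sum(all_scores) / N

sst = sum((x - grand_mean) ** 2 for x in all_scores)
ssm = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

def sample_variance(g):
    m = sum(g) / len(g)
    return sum((x - m) ** 2 for x in g) / (len(g) - 1)

ssr_from_variances = sum(sample_variance(g) * (len(g) - 1) for g in groups)

print(round(sst - ssm, 4), round(ssr_from_variances, 4))
```

Both routes give the same SSR (23.6 for these scores), which is why reporting either decomposition is equivalent.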
dfT
N-1
dfM
k-1
dfR
N-k
ANOVA
- Omnibus test: whether explained variance is larger than unexplained variance
- significant F: means of categories are not equal (several scenarios possible)
—> the manipulation has had some effect, but F does NOT say what that effect is
Mean Squares
- why do we need them?
- SS are biased by the number of scores they sum over
- MS divide by the degrees of freedom, eliminating this bias
Model Mean Square
- MSM = SSM/dfM
- systematic variation
Residual Mean Square
- MSR=SSR/dfR
- unsystematic variation
F-ratio
- ratio of how good the model is (systematic variation) to how bad it is (unsystematic variation)
- F = MSM/MSR
- F > 1: suggests a good model, but still check its significance
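The full pipeline above (SS → MS → F) can be sketched end to end on made-up data with k = 3 groups of 5 scores each:

```python
# One-way ANOVA by hand: sums of squares, mean squares, then F.
# Groups are made-up example scores, k = 3, n = 5 per group.

groups = [[3, 2, 1, 1, 4], [5, 2, 4, 2, 3], [7, 4, 5, 3, 6]]
all_scores = [x for g in groups for x in g]
N, k = len(all_scores), len(groups)
grand_mean = sum(all_scores) / N

sst = sum((x - grand_mean) ** 2 for x in all_scores)
ssm = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssr = sst - ssm

msm = ssm / (k - 1)   # systematic variation per model df
msr = ssr / (N - k)   # unsystematic variation per residual df
f_ratio = msm / msr

print(round(f_ratio, 2))
```

With these scores F comes out around 5.12, well above 1, so the next step would be checking it against the F distribution with (dfM, dfR) = (2, 12) degrees of freedom.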
Assumptions of One-Way ANOVA
- Linearity
- Normality
- Homoscedasticity: check using Levene's test and correct using Brown-Forsythe or Welch
~ Brown-Forsythe F: replaces SSR with sum[sk^2 (1-nk/N)]
~ both techniques control for Type 1 error
~ Welch: best except when a group has a very large variance
- Independence of Errors
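A sketch of the Brown-Forsythe adjustment named above, assuming the formula from the notes: keep SSM in the numerator but replace SSR with sum[sk^2 (1-nk/N)]. Data are made up; note that with equal group sizes the adjusted F coincides with the ordinary F, so the correction only bites when group sizes differ.

```python
# Brown-Forsythe F: SSM divided by sum of sk^2 * (1 - nk/N),
# instead of the usual MSR denominator. Groups are made-up scores.

groups = [[3, 2, 1, 1, 4], [5, 2, 4, 2, 3], [7, 4, 5, 3, 6]]
all_scores = [x for g in groups for x in g]
N = len(all_scores)
grand_mean = sum(all_scores) / N

ssm = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

def sample_variance(g):
    m = sum(g) / len(g)
    return sum((x - m) ** 2 for x in g) / (len(g) - 1)

denominator = sum(sample_variance(g) * (1 - len(g) / N) for g in groups)
f_bf = ssm / denominator
print(round(f_bf, 2))
```

For these equal-sized groups f_bf matches the ordinary F (about 5.12), as expected.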
Is ANOVA robust?
- controls somewhat for Type 1 error
- controls for skew, kurtosis and non-normality for 2-tailed tests
- less control for 1-tailed tests
- leptokurtic distributions: Type 1 error rate too low
- platykurtic distributions: Type 2 error rate too high
ANOVA Robust to Normality?
- Yes, when group sizes are equal
- When group sizes are unequal: F and power may be affected
ANOVA robust to heteroscedasticity?
- Yes, when sample sizes are equal
- No, when sample sizes are unequal