Data Analysis IIa: ANOVA & Regression (Week 6) Flashcards
What is ANOVA?
A test that compares the means of more than 2 groups (i.e. >2 groups)
E.g. Do groups 1, 2 & 3 have sig. different means?
What is the formula for F-test?
F = Between-grp variance/Within-grp variance
= [Σj nj (x̄j - x̄)^2 / (k-1)] / [Σj Σi (xij - x̄j)^2 / (N-k)]
What are the degrees of freedom for F-test?
df1 = k-1 (k = number of groups)
df2 = N-k (N = total number of observations)
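A minimal sketch of the F-test calculation above, using only the Python standard library. The three groups of scores are made-up illustration data and `one_way_anova_f` is a hypothetical helper name, not part of any library.

```python
# Made-up scores for k = 3 groups (illustration only).
groups = [
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0],
    [9.0, 10.0, 11.0],
]


def one_way_anova_f(groups):
    """Return (F, df1, df2) for a one-way ANOVA on a list of groups."""
    k = len(groups)                                  # number of groups
    n_total = sum(len(g) for g in groups)            # N, total observations
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: nj * (group mean - grand mean)^2
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Within-group sum of squares: (observation - own group mean)^2
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df1, df2 = k - 1, n_total - k
    f_stat = (ss_between / df1) / (ss_within / df2)
    return f_stat, df1, df2


f_stat, df1, df2 = one_way_anova_f(groups)
print(f_stat, df1, df2)  # F = 21.0 with df1 = 2, df2 = 6 for this data
```

The F value is then compared with the critical value from an F table for (df1, df2); a large F means the between-group variance is large relative to the within-group variance.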
What happens when we reject the null hypothesis for the F-test?
Null hypothesis: All groups have the same mean
Reject H0 -> Not all means are the same (at least one differs).
Which one differs? Conduct post-hoc.
What are post-hoc tests, and what is an example?
Tests run after a significant F-test to find out which means differ from each other
E.g. LSD (Fisher's Least Significant Difference)
Comparable to a large set of pairwise t-tests
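A rough sketch of the "large set of t-tests" idea, assuming the LSD approach of comparing every pair of group means with a t statistic based on the pooled within-group variance. The data and the helper name `lsd_t_statistics` are made up for illustration; the critical value is read from a standard t table rather than computed.

```python
from itertools import combinations
from math import sqrt

# Made-up scores for three groups (illustration only).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [5.0, 6.0, 7.0],
    "C": [9.0, 10.0, 11.0],
}


def lsd_t_statistics(groups):
    """Return {(group_a, group_b): t} for every pair of groups."""
    k = len(groups)
    n_total = sum(len(v) for v in groups.values())
    means = {name: sum(v) / len(v) for name, v in groups.items()}

    # Pooled within-group variance (the denominator of the ANOVA F-test).
    ss_within = sum(
        (x - means[name]) ** 2 for name, v in groups.items() for x in v
    )
    ms_within = ss_within / (n_total - k)

    results = {}
    for a, b in combinations(groups, 2):
        se = sqrt(ms_within * (1 / len(groups[a]) + 1 / len(groups[b])))
        results[(a, b)] = (means[a] - means[b]) / se
    return results


# Two-tailed 5% critical t value for df = N - k = 6, from a standard t table.
T_CRIT = 2.447

t_stats = lsd_t_statistics(groups)
for pair, t in t_stats.items():
    verdict = "means differ" if abs(t) > T_CRIT else "no clear difference"
    print(pair, round(t, 3), verdict)
```

With this data, A vs. B stays below the critical value while both comparisons with C exceed it, so C is the group that differs.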
What is regression?
Fit a line through the data: calculate the (vertical) distance from each observation to the fitted line
Regression MINIMISES the sum of the SQUARES of these distances
What is sum of squares?
Sum of squares of distances from data pts to fitted line
We prefer the reg. line that gives the LOWEST sum of squares
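To illustrate, here is a small sketch that computes the sum of squares for a guessed line and for the least-squares line. The data points and the function names `sum_of_squares` and `ols_fit` are made up for illustration; the fit uses the standard closed-form solution for a single x variable.

```python
# Made-up data points (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def sum_of_squares(alpha, beta, xs, ys):
    """Sum of squared vertical distances from the points to y = alpha + beta*x."""
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))


def ols_fit(xs, ys):
    """Intercept and slope of the line with the lowest sum of squares."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_mean - beta * x_mean
    return alpha, beta


alpha_hat, beta_hat = ols_fit(xs, ys)
ss_ols = sum_of_squares(alpha_hat, beta_hat, xs, ys)    # about 2.4
ss_guess = sum_of_squares(1.0, 1.0, xs, ys)             # guessed line y = 1 + x gives 4.0

print(alpha_hat, beta_hat, ss_ols, ss_guess)
```

The fitted line (intercept about 2.2, slope about 0.6) has a lower sum of squares than the guessed line, which is why it is preferred.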
What is the regression equation?
yi = α + β xi + εi
α: intercept/constant
β: slope
ε: disturbance/error term
How do α and β affect the graph?
Higher α -> Parallel shift of graph (affects INTERCEPT)
Higher β -> Steeper graph (affects SLOPE)
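A quick numeric check of both effects, with arbitrary example values for α and β (the numbers and the `predict` helper are assumptions for illustration only):

```python
def predict(alpha, beta, x):
    """Fitted value on the line y = alpha + beta * x."""
    return alpha + beta * x


xs = [0.0, 1.0, 2.0, 3.0]

base = [predict(1.0, 2.0, x) for x in xs]          # alpha = 1, beta = 2
higher_alpha = [predict(3.0, 2.0, x) for x in xs]  # alpha raised to 3
higher_beta = [predict(1.0, 4.0, x) for x in xs]   # beta raised to 4

# Higher alpha: the gap to the base line is the same at every x (parallel shift).
shift = [h - b for h, b in zip(higher_alpha, base)]
# Higher beta: the gap is zero at x = 0 and grows with x (steeper line).
gap = [h - b for h, b in zip(higher_beta, base)]

print(shift)  # [2.0, 2.0, 2.0, 2.0]
print(gap)    # [0.0, 2.0, 4.0, 6.0]
```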
What is simple vs. multiple regression?
Simple: yi = α + β xi + εi
Multiple: yi = α + β1 x1i + β2 x2i + β3 x3i + εi
Why do we not do multiple simple regressions?
We want to test the effect of multiple variables AT THE SAME TIME
What is omitted variable bias?
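Bias in the estimated coefficients when a relevant variable is left out of the model
E.g. Sales_i = α + β1 Price_i + β2 Advertising_i + ε_i
If we omit price (and price is correlated with advertising), β2 also picks up part of the price effect, so the estimated effect of advertising is not clean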
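A sketch of this bias on made-up, noise-free data, assuming the true relation is Sales = 100 - 2 Price + 3 Advertising and that price tends to be lower when advertising is higher. The helpers `solve_linear_system` and `ols` are hypothetical names; they solve the normal equations in plain Python rather than using a statistics library.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """OLS coefficients [intercept, b1, b2, ...] from the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Made-up data: price falls as advertising rises (correlated IVs).
advertising = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
price = [10.0, 9.0, 9.0, 7.0, 7.0, 6.0]
sales = [100.0 - 2.0 * p + 3.0 * a for p, a in zip(price, advertising)]

full = ols([price, advertising], sales)   # both IVs included
short = ols([advertising], sales)         # price omitted

print("with price:    advertising effect =", round(full[2], 3))
print("without price: advertising effect =", round(short[1], 3))
```

With price in the model the advertising coefficient comes out at about 3, the value used to build the data. With price omitted it comes out at about 4.6, because advertising also absorbs the effect of the lower prices that go with it.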
What are the regression coefficients of the regression equation?
α, β1, β2
How do we choose which IVs (independent variables) to include in our regression equation?
- Use theory/intuition
- Do not just include all variables in your dataset
- For exploratory research: use stepwise regression
What are the 3 steps to interpret regression results?
- Model significance: F-test (is the model as a whole significant?)
- Model fit: R^2 (share of the variance in y explained by the model)
- Regression coefficients: Significance, sign, size
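The first two steps can be sketched for a simple regression as follows. The data are made up, `interpret_simple_regression` is a hypothetical helper, and the critical F value is read from a standard F table; coefficient significance (t-tests) is left out to keep the sketch short.

```python
# Made-up data points (illustration only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]


def interpret_simple_regression(xs, ys):
    """Return coefficients, R^2 and the model F statistic for y = alpha + beta*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_mean - beta * x_mean

    ss_total = sum((y - y_mean) ** 2 for y in ys)
    ss_resid = sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))

    r_squared = 1 - ss_resid / ss_total          # model fit
    df_model, df_resid = 1, n - 2                # one IV, n - 2 residual df
    f_stat = ((ss_total - ss_resid) / df_model) / (ss_resid / df_resid)
    return {
        "alpha": alpha,
        "beta": beta,
        "r_squared": r_squared,
        "f_stat": f_stat,
        "df": (df_model, df_resid),
    }


# 5% critical value of F with (1, 3) degrees of freedom, from a standard F table.
F_CRIT = 10.13

result = interpret_simple_regression(xs, ys)
print("Model significant at 5%:", result["f_stat"] > F_CRIT)
print("R^2:", round(result["r_squared"], 3))
print("Slope sign and size:", round(result["beta"], 3))
```

For this small data set R^2 is about 0.6 and F is about 4.5, which is below the critical value, so the model would not be judged significant at the 5% level even though the slope is positive.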