W7: One-Way Between-Subjects ANOVA Flashcards
What is dummy coding?
Dummy coding
- Transforms a categorical variable with g categories into a meaningful set of g - 1 dummy variables, each taking a value of either 0 or 1.
- e.g. 3 categories = 2 dummy variables, each with values of either 0 or 1
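The card above can be sketched in a few lines of plain Python. This is only an illustration: the helper name `dummy_code` and the group labels are made up for the example, and the choice of "control" as the category coded 0 everywhere is an assumption.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per category, skipping the reference category."""
    categories = sorted(set(values))
    non_reference = [c for c in categories if c != reference]
    # Each dummy variable is 1 when the observation is in that category, else 0.
    return {c: [1 if v == c else 0 for v in values] for c in non_reference}


# Hypothetical data: 3 categories, so we expect 3 - 1 = 2 dummy variables.
groups = ["control", "drug_a", "drug_b", "control", "drug_b"]
dummies = dummy_code(groups, reference="control")

for name, column in dummies.items():
    print(name, column)
# drug_a [0, 1, 0, 0, 0]
# drug_b [0, 0, 1, 0, 1]
```

Observations in the "control" group score 0 on both dummy variables.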
Why do we need dummy coding (in a linear regression)?
Without dummy coding (i.e. if the categories are entered as arbitrary numeric codes):
- Sums of squares in the ANOVA table will be incorrect
- Regression coefficients will not be meaningful
- Observed R-square value will differ, depending on which category is assigned to which value
- i.e. depending on which of the codes 1, 2, 3, 4, 5 is assigned to which category
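A small sketch of the R-square point, using made-up scores for three groups A, B and C. It assumes the categories are entered into a simple regression as a single numeric predictor, and the helper `r_squared` is illustrative only. Swapping which category gets which code changes R-square even though the data are identical.

```python
def r_squared(x, y):
    """R-square of a simple linear regression of y on x (squared correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical scores: group A = 1,2,3; group B = 8,9,10; group C = 4,5,6.
labels = ["A"] * 3 + ["B"] * 3 + ["C"] * 3
scores = [1, 2, 3, 8, 9, 10, 4, 5, 6]

# Two arbitrary ways of assigning the numbers 1, 2, 3 to the same categories.
coding_first = {"A": 1, "B": 2, "C": 3}
coding_second = {"A": 1, "C": 2, "B": 3}

r2_first = r_squared([coding_first[g] for g in labels], scores)
r2_second = r_squared([coding_second[g] for g in labels], scores)

print(round(r2_first, 5), round(r2_second, 5))
```

With dummy coding the result does not depend on such an arbitrary assignment.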
What is the reference category?
- The category coded “0” on all dummy variables
- Each dummy variable compares its own category against this one
In a dummy-coding table, which are the dummy variables: rows or columns?
Rows
- Levels of the factor
Columns
- Dummy variables
In the ANOVA and linear regression outputs of R, what is similar/dissimilar? Are dummy variables or contrasts better?
ANOVA is akin to dummy-coded linear regression
Similar
- Both show the relevant F distribution and dfs
- Both give the proportion of observed DV variation explained by the IV
- R² shows the proportion of observed DV variation explained by group differences
- eta² (not examinable) also shows the proportion of observed DV variation explained by group differences
Dissimilar
- Linear regression tells us where the differences lie
However, dummy variables might not always reflect the research questions. Contrasts give more flexibility.
How are group differences, ANOVA and linear regression related?
- Investigate the extent to which variation on DV can be accounted for by variation in group means
- Regression: SStotal = SSreg + SSres
- ANOVA: F = MSbetween / MSwithin
- MSbetween = SSbetween / dfbetween, where dfbetween = k - 1
- MSwithin = SSwithin / dfwithin, where dfwithin = N - k
- A group difference is represented by the change in a dummy variable from “0” to “1”
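The quantities on this card can be worked through by hand on a tiny made-up data set. The sketch below assumes three equal-sized groups and uses an illustrative helper, `one_way_anova`. It relies on the fact that the fitted values of a dummy-coded regression are the group means, so SSreg equals SSbetween and SSres equals SSwithin.

```python
def one_way_anova(groups):
    """Sums of squares, mean squares, F and R-square for a one-way design."""
    all_scores = [s for scores in groups.values() for s in scores]
    n_total = len(all_scores)
    k = len(groups)
    grand_mean = sum(all_scores) / n_total
    means = {name: sum(scores) / len(scores) for name, scores in groups.items()}

    # Variation of group means around the grand mean (SSreg in regression terms).
    ss_between = sum(len(groups[g]) * (means[g] - grand_mean) ** 2 for g in groups)
    # Variation of scores around their own group mean (SSres in regression terms).
    ss_within = sum((s - means[g]) ** 2 for g in groups for s in groups[g])
    ss_total = sum((s - grand_mean) ** 2 for s in all_scores)

    ms_between = ss_between / (k - 1)        # df between = k - 1
    ms_within = ss_within / (n_total - k)    # df within = N - k
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "F": ms_between / ms_within,
        "r_squared": ss_between / ss_total,
    }


# Hypothetical scores for three independent groups.
data = {"A": [1, 2, 3], "B": [8, 9, 10], "C": [4, 5, 6]}
result = one_way_anova(data)
print(result)
```

Here SSbetween plus SSwithin reproduces SStotal, matching the regression decomposition SStotal = SSreg + SSres.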
What is a “one-way” design?
- One-way
- One IV
- One group classification
What is a “between-subjects” design?
- Groups are independent
- Mutually exclusive
What is the null hypothesis in ANOVA? What is it also called?
H0: μ1 = μ2 = μ3 (equivalently, μ1 - μ2 = 0 and μ2 - μ3 = 0)
Omnibus hypothesis.
- Evidence against it does not tell us which groups differ.
- It only tells us that at least one (unidentified) group mean differs from at least one other group mean.
Why is a focused investigation better? Give 3 reasons.
- Often, we are able to propose a priori research questions about the specific ways in which differences may occur
- Provides identifiable differences
- Can explain everything in the omnibus approach (under certain conditions), hence is more informative
When we have k groups, how many fundamental differences can we find? Why?
k-1 fundamental differences
- Degrees of freedom
- Due to transitivity
What is a linear contrast?
- A set of weights that sum to zero is called a linear contrast.
- Net effect
- Difference between the means of the positively-weighted groups and the means of the negatively-weighted groups.
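As a rough illustration of the definition above, the sketch below applies a set of weights to some made-up group means. The helper `contrast_value` and the numbers are hypothetical. The weights compare group 1 against the average of groups 2 and 3.

```python
def contrast_value(weights, means):
    """Weighted sum of group means for a linear contrast."""
    if len(weights) != len(means):
        raise ValueError("need one weight per group")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    return sum(w * m for w, m in zip(weights, means))


# Hypothetical group means for three groups.
means = [2.0, 9.0, 5.0]
# Group 1 (positive weight) versus the average of groups 2 and 3 (negative weights).
weights = [1.0, -0.5, -0.5]

value = contrast_value(weights, means)
print(value)  # 2 - (9 + 5) / 2 = -5.0
```

Flipping every sign gives +5.0 instead, which describes the same comparison seen from the other side.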
Why would you not want to use the in-built contrasts in R?
- Focused research questions may not correspond to the in-built contrasts in R
What are the rules of linear contrasts?
- Individual values in a contrast can be negative, zero, or positive
- Which side is given the positive or the negative sign is arbitrary
- Coefficient values in a contrast sum to 0
- There will be as many contrast coefficients as there are groups
- e.g. 5 groups, 5 coefficients
- Maximum number of contrasts is k-1
- One less than the number of levels of a factor
- e.g. 5 groups, 4 contrasts
With 5 groups, how many contrasts and contrast values should there be?
- 4 contrasts
- Each contrast should contain 5 values (they can be fractions as long as they sum to 0)
What is a useful property of some contrasts?
Orthogonality (being uncorrelated)
Useful for some, NOT ALL, contrasts
- Sum of cross-products of 2 contrasts = 0
- (+2, -1, -1, 0)
- (0, +1, -1, 0)
- (+2)(0) + (-1)(+1) + (-1)(-1) + (0)(0) = 0
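The cross-product check above is easy to automate. This is a minimal sketch that assumes equal group sizes. The helper `is_orthogonal` and the third contrast are added purely for illustration.

```python
def is_orthogonal(contrast_a, contrast_b):
    """True when the sum of cross-products of the two contrasts is zero."""
    if len(contrast_a) != len(contrast_b):
        raise ValueError("contrasts must have one coefficient per group")
    return sum(a * b for a, b in zip(contrast_a, contrast_b)) == 0


first = [2, -1, -1, 0]
second = [0, 1, -1, 0]
third = [1, 0, -1, 0]  # hypothetical extra contrast, not from the card

print(is_orthogonal(first, second))  # True: 0 - 1 + 1 + 0 = 0
print(is_orthogonal(first, third))   # False: 2 + 0 + 1 + 0 = 3
```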