W7: One-Way Between-Subjects ANOVA Flashcards
What is dummy coding?
Dummy coding
- Transforms a categorical variable with g categories into a meaningful set of g - 1 dummy variables that each take values of either 0 or 1.
- e.g. 3 categories = 2 dummy variables, each with values of either 0 or 1
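A minimal R sketch (with a hypothetical 3-level factor) of what dummy coding produces; `model.matrix()` uses treatment (dummy) coding by default:

```r
# Hypothetical 3-level factor (groups A, B, C)
grp <- factor(c("A", "A", "B", "B", "C", "C"))

# Treatment (dummy) coding is R's default: g - 1 = 2 dummy variables,
# with the first level ("A") as the reference category (all dummies = 0)
model.matrix(~ grp)
# Rows for A: (Intercept = 1, grpB = 0, grpC = 0)  <- reference category
# Rows for B: (Intercept = 1, grpB = 1, grpC = 0)
# Rows for C: (Intercept = 1, grpB = 0, grpC = 1)
```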
Why do we need dummy coding (in a linear regression)?
If categories are entered as raw numeric codes (e.g. 1, 2, 3, 4, 5) instead of dummy variables:
- The sums of squares in the ANOVA table will be incorrect
- The regression coefficients will not be meaningful
- The observed R-square value will differ depending on which category is assigned to which numeric code
What is the reference category?
- The category for which all dummy variables take the value 0
- Each dummy variable compares its group against this reference category
In the dummy-coding table, what do rows and columns represent?
Rows
- The factor (its categories)
Columns
- The dummy variables
In the ANOVA and linear regression outputs of R, what is similar/dissimilar? Are dummy variables or contrasts better?
ANOVA is akin to dummy-coded linear regression
Similar
- Both show the relevant F distribution and dfs
- Both give us the proportion of variance in the observed DV explained by the IV
- R² shows the proportion of variance in the observed DV explained by group differences
- η² (not examinable) also shows the proportion of variance in the observed DV explained by group differences
Dissimilar
- Linear regression tells us where the differences lie (via the dummy-variable coefficients)
However, dummy variables might not always reflect the research questions. Contrasts give more flexibility.
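A minimal R sketch (with a hypothetical data frame) showing how the two outputs line up; the fitted model is the same, only the summary differs:

```r
# Hypothetical data: score is the DV, grp is a 3-level between-subjects factor
dat <- data.frame(
  score = c(4, 5, 6, 7, 8, 9, 2, 3, 4),
  grp   = factor(rep(c("A", "B", "C"), each = 3))
)

# ANOVA output: omnibus F test, dfs, sums of squares
summary(aov(score ~ grp, data = dat))

# Dummy-coded regression output: same F and same R^2 (overall fit),
# but the coefficients also show WHERE the differences lie:
# each dummy coefficient = that group's mean minus the reference ("A") mean
summary(lm(score ~ grp, data = dat))
```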
How are group differences, ANOVA and linear regression related?
- Both investigate the extent to which variation in the DV can be accounted for by variation in group means
- Regression: SStotal = SSreg + SSres
- ANOVA: F = MSbetween / MSwithin
- MSbetween = SSbetween / dfbetween
- MSwithin = SSwithin / dfwithin
- Group differences represent the change in the DV from the dummy-coded "0" group to the "1" group
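A sketch of that decomposition using the hypothetical `dat` from the previous block; the F ratio computed by hand matches the one reported in both outputs:

```r
fit <- lm(score ~ grp, data = dat)

SS_total <- sum((dat$score - mean(dat$score))^2)  # SStotal
SS_res   <- sum(residuals(fit)^2)                 # SSres  (= SSwithin)
SS_reg   <- SS_total - SS_res                     # SSreg  (= SSbetween)

df_between <- nlevels(dat$grp) - 1                # k - 1
df_within  <- nrow(dat) - nlevels(dat$grp)        # N - k

(SS_reg / df_between) / (SS_res / df_within)      # F = MSbetween / MSwithin
```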
What is the “one-way” design?
- One-way
- One IV
- One group classification
What is a “between-subjects” design?
- Groups are independent
- Groups are mutually exclusive
What is the null hypothesis in ANOVA? What is it also called?
H0: μ1 = μ2 = μ3 (equivalently, μ1 - μ2 = 0 and μ2 - μ3 = 0)
The omnibus hypothesis.
- Evidence against it does not tell us which groups differ.
- It only tells us that at least one unidentified group mean differs from at least one of the remaining group means.
Why is a focused investigation better? (3 reasons)
- Often, we are able to propose a priori research questions for the specific ways that differences may occur
- Provides identifiable differences
- Can explain everything in the omnibus approach (under certain conditions), hence more informative
When we have k groups, how many fundamental differences can we find? Why?
k - 1 fundamental differences
- Degrees of freedom
- Due to transitivity
What is a linear contrast?
- A set of weights (one per group) that sum to zero is called a linear contrast.
- It represents a net effect:
- The difference between the mean of the positively-weighted groups and the mean of the negatively-weighted groups.
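A one-line R sketch of that "net effect", using hypothetical group means and weights:

```r
# Hypothetical group means and a contrast comparing group 1 vs groups 2 and 3
means <- c(12.4, 10.1, 9.8)
w     <- c(1, -1/2, -1/2)        # weights sum to zero

sum(w * means)   # net effect: 12.4 - (10.1 + 9.8)/2 = 2.45
                 # i.e. mean of positively-weighted group(s) minus
                 # mean of negatively-weighted group(s)
```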
Why would you not want to use the built-in contrasts provided by R?
What are the rules of linear contrasts?
Focused research questions may not correspond to the contrasts built into R
- Individual values in a contrast can be negative, 0, or positive
- Which side is positive or negative is arbitrary
- The coefficient values in a contrast sum to 0
- There will be as many contrast coefficients as there are groups
- e.g. 5 groups, 5 coefficients
- The maximum number of contrasts is k - 1
- One less than the number of levels of the factor
- e.g. 5 groups, at most 4 contrasts
In 5 groups, how many contrasts and contrast values should there be?
- At most 4 contrasts
- Each contrast should contain 5 coefficient values (these can be fractions, as long as they sum to 0); see the R sketch below
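A sketch of defining user contrasts for a hypothetical 5-level factor (all names and values are illustrative, not from the course materials):

```r
# Two illustrative contrasts for a hypothetical 5-level factor:
# each has 5 coefficients (one per group) that sum to 0,
# and at most k - 1 = 4 such contrasts can be tested
c1 <- c(1/2, 1/2, -1/3, -1/3, -1/3)   # first two groups vs last three
c2 <- c(1,  -1,    0,    0,    0)     # group 1 vs group 2

sum(c1)  # 0 -- a valid linear contrast must sum to zero
sum(c2)  # 0

# To use them in lm()/aov(), attach them to the factor
# (hypothetical factor g5; further contrast columns can be added the same way)
g5 <- factor(rep(c("G1", "G2", "G3", "G4", "G5"), each = 4))
contrasts(g5) <- cbind(c1, c2)
```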
What is a useful property of some contrasts?
Orthogonality (being uncorrelated)
Useful for some, NOT ALL, contrasts
- Sum of cross-products of 2 contrasts = 0
- (+2, -1, -1 , 0)
- (0, +1, -1, 0)
- (+2)(0) + (-1)(1) + (-1)(-1) + (0)(0) = 0
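The same cross-product check as a one-liner in R:

```r
c1 <- c(2, -1, -1, 0)
c2 <- c(0,  1, -1, 0)

sum(c1 * c2)   # (2)(0) + (-1)(1) + (-1)(-1) + (0)(0) = 0  -> orthogonal
```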
Why is orthogonality a useful property in some contrasts?
With balanced designs + orthogonality:
- The mean differences captured by each contrast do not overlap and do not contain redundancy.
Are all linear contrasts orthogonal?
No. There may be some instances where linear contrasts are not orthogonal.
What is the function for constructing CIs for user-defined contrasts?
ci.lc.stdmean.bs
- Gives the observed mean contrast
- Gives standardised mean contrasts (g and d)
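A sketch of a call to statpsych's ci.lc.stdmean.bs(). The argument order shown (alpha, group means, group SDs, group sample sizes, contrast coefficients) is an assumption about the interface; check ?ci.lc.stdmean.bs before relying on it. All numbers are hypothetical.

```r
# install.packages("statpsych")   # if not installed
library(statpsych)

# Hypothetical summary statistics for 3 independent groups
m <- c(12.4, 10.1, 9.8)    # group means
s <- c( 3.1,  2.9, 3.4)    # group standard deviations
n <- c(30, 30, 30)         # group sample sizes
v <- c( 1, -1/2, -1/2)     # contrast: group 1 vs average of groups 2 and 3

# Assumed argument order: alpha, means, SDs, ns, contrast coefficients
ci.lc.stdmean.bs(.05, m, s, n, v)
```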
What are the assumptions for an independent-groups design with more than 2 groups? Which is the most important?
- Independence of observations.
- Normality of observed scores.
- Homogeneity of group variances (most important)
- Assessed by Levene's test and/or the Fligner-Killeen test.
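A minimal sketch of the two homogeneity checks in R (reusing the hypothetical `dat` from earlier); leveneTest() is from the car package, fligner.test() is in base R:

```r
library(car)   # for leveneTest()

leveneTest(score ~ grp, data = dat)    # Levene's test
fligner.test(score ~ grp, data = dat)  # Fligner-Killeen test
```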
In calculating the observed mean difference, what does the decision tree look like?
- Balance
- Homogeneity
- Normality
In calculating the standardized mean difference, what does the decision tree look like?
- Balance
- Normality
- Homogeneity
What happens if the design is orthogonal but unbalanced?
SSmodel will not equal the sum of the SScontrast values
i.e. the SScontrast values (variation explained by each contrast) added together will not equal SSmodel (variation explained by the model)