9 - Multiple Regression and Categories Flashcards
When should we consider categories for a regression analysis?
- When there are nominal or ordinal predictor variables
- When a predictor has two or more groups (categories)
What are the requirements to use groups in multiple regression?
All groups must be mutually exclusive and exhaustive.
Why is it important to recode categorical variables before entering them into the regression model?
Category labels have no numeric meaning, so they cannot be entered directly into a linear regression equation.
- Ex. “tall” and “short” mean nothing to the model.
Name 4 coding systems.
- Dummy coding
- Unweighted effects coding
- Weighted effects coding
- Contrast coding
What is dummy coding?
A way of coding categorical variables using 0s and 1s. The reference group is coded 0 on every code variable.
What does every regression coefficient in dummy coding stand for?
The difference between that group’s mean and the reference group’s mean.
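A minimal sketch of this interpretation, using made-up data (the group labels, scores, and the choice of "C" as reference group are all assumptions for illustration). It builds the dummy codes, proposes the coefficients the card describes, and checks them against the normal equations that define the least-squares solution.

```python
from fractions import Fraction

# Hypothetical data: three groups of unequal size.
groups = ["A", "A", "B", "B", "B", "B", "C", "C"]
y = [4, 6, 7, 9, 8, 8, 1, 3]

# Dummy codes: one 0/1 column per non-reference group; reference group "C" is all 0s.
codes = {"A": (1, 0), "B": (0, 1), "C": (0, 0)}
X = [(1,) + codes[g] for g in groups]  # leading 1 is the intercept column


def group_mean(g):
    vals = [yi for gi, yi in zip(groups, y) if gi == g]
    return Fraction(sum(vals), len(vals))


# Candidate coefficients: reference mean, then each group's mean minus the reference mean.
b = [group_mean("C"),
     group_mean("A") - group_mean("C"),
     group_mean("B") - group_mean("C")]

# Least-squares coefficients leave residuals orthogonal to every column of X
# (the normal equations), so all three sums below should be zero.
residuals = [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]
normal_eqs = [sum(row[j] * r for row, r in zip(X, residuals)) for j in range(3)]

print(b)           # intercept, A vs C, B vs C
print(normal_eqs)  # all zeros if b is the least-squares solution
```

In this sketch the intercept is the reference group's mean, which is why each remaining coefficient reads as a comparison with that group.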
What is unweighted effects coding?
A way of coding categorical variables in which the base group, usually the group of least interest, is coded -1 on every code variable.
What does every regression coefficient in unweighted effects coding stand for?
The deviation of that group's mean from the unweighted mean of the group means (the grand mean), which equals the sample mean only when all groups are the same size.
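A minimal sketch with made-up, unequal-sized groups (labels, scores, and the choice of "C" as base group are assumptions). It checks that the least-squares intercept is the unweighted mean of the group means and that each coefficient is a group's deviation from it; with unequal group sizes this differs from the plain sample mean.

```python
from fractions import Fraction

# Hypothetical data: three groups of unequal size.
groups = ["A", "A", "B", "B", "B", "B", "C", "C"]
y = [4, 6, 7, 9, 8, 8, 1, 3]

# Unweighted effects codes: base group "C" gets -1 on every code variable.
codes = {"A": (1, 0), "B": (0, 1), "C": (-1, -1)}
X = [(1,) + codes[g] for g in groups]  # leading 1 is the intercept column


def group_mean(g):
    vals = [yi for gi, yi in zip(groups, y) if gi == g]
    return Fraction(sum(vals), len(vals))


# Unweighted grand mean: every group counts equally, whatever its size.
grand_mean = sum(group_mean(g) for g in "ABC") / 3
sample_mean = Fraction(sum(y), len(y))

# Candidate coefficients: grand mean, then each coded group's deviation from it.
b = [grand_mean, group_mean("A") - grand_mean, group_mean("B") - grand_mean]

# Normal equations: residuals must be orthogonal to every column of X.
residuals = [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]
normal_eqs = [sum(row[j] * r for row, r in zip(X, residuals)) for j in range(3)]

print(b, sample_mean)  # the intercept differs from the sample mean here
print(normal_eqs)      # all zeros if b is the least-squares solution
```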
When is weighted effects coding useful?
When group sizes differ and the proportion of cases in each group reflects that group's proportion in the population.
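A minimal sketch of weighted effects coding, assuming the commonly described construction in which the base group's code on each variable is minus the ratio of that variable's group size to the base group's size. The data, labels, and the choice of "C" as base group are made up. Under that assumption the intercept becomes the sample (weighted) mean and each coefficient is a group's deviation from it.

```python
from fractions import Fraction

# Hypothetical data: three groups of unequal size.
groups = ["A", "A", "B", "B", "B", "B", "C", "C"]
y = [4, 6, 7, 9, 8, 8, 1, 3]
n = {g: groups.count(g) for g in "ABC"}

# Weighted effects codes: base group "C" gets -n_g / n_C instead of -1.
codes = {"A": (1, 0),
         "B": (0, 1),
         "C": (Fraction(-n["A"], n["C"]), Fraction(-n["B"], n["C"]))}
X = [(1,) + codes[g] for g in groups]  # leading 1 is the intercept column


def group_mean(g):
    vals = [yi for gi, yi in zip(groups, y) if gi == g]
    return Fraction(sum(vals), len(vals))


# Weighted grand mean: every case counts equally, so larger groups weigh more.
sample_mean = Fraction(sum(y), len(y))

# Candidate coefficients: sample mean, then each coded group's deviation from it.
b = [sample_mean, group_mean("A") - sample_mean, group_mean("B") - sample_mean]

# Normal equations: residuals must be orthogonal to every column of X.
residuals = [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]
normal_eqs = [sum(row[j] * r for row, r in zip(X, residuals)) for j in range(3)]

print(b)           # intercept is the sample mean
print(normal_eqs)  # all zeros if b is the least-squares solution
```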
What are the requirements for contrast coding?
- Within each code variable, the weights must sum to 0 across groups
- The sum of the products of the weights for each pair of code variables must be 0 (the contrasts are orthogonal)
- Within each code variable, the difference between the positive weight and the negative weight must be 1
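The three rules can be checked mechanically. The sketch below uses a hypothetical helper, `check_contrast_codes`, and made-up weights for a three-group design. It assumes that each code variable uses one shared positive weight and one shared negative weight.

```python
from fractions import Fraction as F
from itertools import combinations


def check_contrast_codes(code_vars):
    """Report which of the three rules a set of code variables satisfies."""
    sums_to_zero = all(sum(c) == 0 for c in code_vars)
    orthogonal = all(sum(a * b for a, b in zip(c1, c2)) == 0
                     for c1, c2 in combinations(code_vars, 2))

    def unit_gap(c):
        pos = {w for w in c if w > 0}
        neg = {w for w in c if w < 0}
        # Assumes one shared positive and one shared negative weight per code variable.
        return len(pos) == 1 and len(neg) == 1 and pos.pop() - neg.pop() == 1

    return {"sums_to_zero": sums_to_zero,
            "orthogonal": orthogonal,
            "unit_gap": all(unit_gap(c) for c in code_vars)}


# Hypothetical design with groups A, B, C; one tuple of weights per code variable.
good = [(F(1, 3), F(1, 3), F(-2, 3)),  # A and B together vs C
        (F(1, 2), F(-1, 2), F(0))]     # A vs B
bad = [(1, 1, -2), (1, -1, 0)]         # first two rules hold, but the gaps are 3 and 2

print(check_contrast_codes(good))
print(check_contrast_codes(bad))
```

Exact fractions are used so the zero and one checks are not thrown off by floating-point rounding.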
When is contrast coding useful?
When there are a priori hypotheses about specific differences between group means (or between sets of group means).
In unweighted effects coding, how is the base group's mean obtained?
Subtract the sum of all the regression coefficients from the intercept. The base group's deviation from the grand mean is the negative of that sum.
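A short derivation of this result, in assumed notation that is not from the cards: $b_0$ is the intercept, $b_1, \dots, b_{k-1}$ are the coefficients of the $k-1$ code variables, and the base group is coded $-1$ on each of them.

```latex
\begin{align*}
\hat{Y}_{\text{base}} &= b_0 + b_1(-1) + b_2(-1) + \dots + b_{k-1}(-1) \\
                      &= b_0 - \sum_{j=1}^{k-1} b_j
\end{align*}
```

The base group's mean is therefore the intercept minus the sum of the coefficients, and its deviation from the grand mean $b_0$ is $-\sum_{j=1}^{k-1} b_j$.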