5. Categorical Predictors Flashcards
What is a categorical variable?
A variable that takes only discrete values (levels) whose codes aren't numerically meaningful
How should categorical variables be coded in R?
As factors (e.g. with factor())
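
A minimal sketch of factor coding, assuming a made-up `condition` variable (the data and names here are illustrative, not from the course material):

```r
# Hypothetical data: a treatment condition stored as plain text
condition <- c("control", "drug", "control", "placebo", "drug")

# Convert to a factor so R treats the values as categories, not text or numbers
condition_f <- factor(condition)

is.factor(condition_f)   # check that the conversion worked
levels(condition_f)      # distinct categories, sorted alphabetically by default
nlevels(condition_f)     # number of categories
```

By default the first level in alphabetical order becomes the reference group in a model, which is why the level order matters later for dummy coding.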
What is meant by a binary variable/dummy variable?
Categorical variable with two levels (coded as 0 and 1)
What is beta 0 hat in dummy coding?
Predicted value of y when x = 0
(mean of group coded 0)
What is beta 1 hat in dummy coding?
Predicted difference between the means of two groups
(Group 1 - Group 0)
What are the prediction equations for a binary variable?
For the group coded 0: predicted y = beta 0 hat
For the group coded 1: predicted y = beta 0 hat + beta 1 hat (because beta 1 hat = Group 1 mean - Group 0 mean)
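
The relationship between the coefficients and the group means can be checked with a small sketch. The data below are invented for illustration, and `group` and `y` are assumed names:

```r
# Hypothetical data: outcome y for two groups coded 0 and 1
group <- c(0, 0, 0, 1, 1, 1)
y     <- c(4, 5, 6, 8, 9, 10)

fit <- lm(y ~ group)
b0 <- unname(coef(fit)["(Intercept)"])   # beta 0 hat
b1 <- unname(coef(fit)["group"])         # beta 1 hat

mean_0 <- mean(y[group == 0])
mean_1 <- mean(y[group == 1])

# Intercept matches the mean of the group coded 0
all.equal(b0, mean_0)
# Slope matches the difference in means (Group 1 - Group 0)
all.equal(b1, mean_1 - mean_0)
# Prediction for the group coded 1 is beta 0 hat + beta 1 hat
all.equal(b0 + b1, mean_1)
```

Each `all.equal()` call compares a model coefficient (or sum of coefficients) against the corresponding group mean computed directly from the data.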
What does each beta coefficient mean in categorical variables with two levels?
Each beta coefficient should represent a specific difference between group means
What is deemed a good baseline in dummy coding?
Control group
Group expected to have lowest score on outcome
Largest group
What is a bad baseline in dummy coding?
Poorly defined level
A group much smaller than the other groups
How many dummy variables will we have in relation to the number of levels of the predictor?
k-1 dummy variables (where k is the number of levels)
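
One way to see the k-1 rule is to inspect the design matrix R builds for a factor. This is a sketch with an invented three-level factor, assuming R's default contrast settings:

```r
# Hypothetical factor with k = 3 levels
condition <- factor(c("control", "drug", "placebo", "drug", "control"))

# Design matrix R would build for a model with this predictor
X <- model.matrix(~ condition)

colnames(X)    # intercept plus one dummy column per non-reference level
ncol(X) - 1    # number of dummy variables, i.e. k - 1
```

The reference level ("control" here, as the first level) gets no column of its own; it is represented by the intercept.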
How is dummy coding specified in R?
Using contr.treatment
- This specifies dummy (treatment) coding, and the reference level can be changed within R (e.g. via its base argument)
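
A short sketch of both ways to set the reference level, using an invented factor. The choice of "placebo" as the new baseline is only an example:

```r
# Hypothetical factor; levels sort as control, drug, placebo
condition <- factor(c("control", "drug", "placebo", "drug", "control"))

# Default dummy coding: the first level (base = 1) is the reference
contr.treatment(levels(condition))

# Option 1: choose the third level ("placebo") as reference via base
ct <- contr.treatment(levels(condition), base = 3)
contrasts(condition) <- ct

# Option 2: reorder the levels so the chosen reference comes first
condition2 <- relevel(
  factor(c("control", "drug", "placebo", "drug", "control")),
  ref = "placebo"
)
levels(condition2)
```

In the contrast matrix, the reference level is the row of all zeros. Option 2 leaves the default contrasts untouched and changes only the level order.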