week 2 - introducing categorical predictors into a multiple regression model Flashcards
1
Q
what goes into a multiple regression model and how are its terms interpreted
A
- continuous or categorical predictors
- Intercept = predicted value of the outcome when all continuous predictors are at zero and all categorical predictors are at their reference level
- Interpretation for continuous coefficients = a one-unit increase in X1 changes Y by b1, holding the other predictors constant
- Interpretation for categorical coefficients = moving from the reference level of X2 to another category changes Y by b2, holding the other predictors constant
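The interpretations above can be sketched as a small prediction function. This is only an illustration: the coefficient values and the names `predict` and `x2_is_other_level` are hypothetical, and the model is assumed to have one continuous predictor and one two-level categorical predictor.

```python
def predict(b0, b1, b2, x1, x2_is_other_level):
    """Predicted outcome for one continuous and one two-level categorical predictor."""
    # The categorical predictor enters as 0 (reference level) or 1 (other level)
    x2 = 1 if x2_is_other_level else 0
    return b0 + b1 * x1 + b2 * x2

# Hypothetical coefficients, chosen only for illustration
b0, b1, b2 = 10.0, 2.0, -3.0

at_baseline = predict(b0, b1, b2, x1=0, x2_is_other_level=False)  # equals the intercept
one_unit_up = predict(b0, b1, b2, x1=1, x2_is_other_level=False)  # differs from baseline by b1
other_level = predict(b0, b1, b2, x1=0, x2_is_other_level=True)   # differs from baseline by b2
```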
2
Q
what are examples of categorical variables
A
- sex, or smoking status (smoker vs non-smoker)
3
Q
how does R manage categorical variables
A
- uses dummy (treatment) coding by default
- the level coded as 0 is the variable's reference level
- the reference level is absorbed into the intercept term
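A minimal sketch of what treatment coding does to a categorical variable, written in plain Python rather than R. It assumes the levels are ordered alphabetically and the first one becomes the reference, which mirrors R's default factor behaviour; the function name `treatment_code` and the data are hypothetical.

```python
def treatment_code(values):
    """Return (reference level, other levels, rows of 0/1 indicator columns)."""
    levels = sorted(set(values))  # alphabetical order; first level is the reference
    reference, others = levels[0], levels[1:]
    # One indicator column per non-reference level; the reference row is all zeros
    rows = [[1 if v == level else 0 for level in others] for v in values]
    return reference, others, rows

smoker = ["no", "yes", "yes", "no"]  # hypothetical data
reference, others, rows = treatment_code(smoker)
```

Here `"no"` gets no column of its own: an observation at the reference level is all zeros, so its predicted value comes from the intercept alone.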
4
Q
why would you change the reference level of a variable
A
- the hypothesis may make a different level the natural baseline, so changing the reference level makes the coefficients easier to interpret
- especially useful when the variable has more than two levels
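To see why the choice of reference level matters, the sketch below assumes a model with a single categorical predictor, where each treatment-coded coefficient equals the difference between a level's mean and the reference level's mean. The group means and the helper `contrasts_vs_reference` are hypothetical.

```python
def contrasts_vs_reference(level_means, reference):
    """Difference of each non-reference level mean from the reference level mean."""
    base = level_means[reference]
    return {lvl: m - base for lvl, m in level_means.items() if lvl != reference}

# Hypothetical mean outcome per group
level_means = {"control": 20.0, "drug_a": 26.0, "drug_b": 23.0}

vs_control = contrasts_vs_reference(level_means, "control")  # drug_a and drug_b vs control
vs_drug_a = contrasts_vs_reference(level_means, "drug_a")    # control and drug_b vs drug_a
```

The fitted values are the same either way; only the comparisons that the coefficients express change.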
5
Q
what does mean centring mean in relation to continuous predictors
A
- subtract the variable's mean from every observation of that variable
- the intercept now represents the predicted outcome when all continuous predictors are at their mean values
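A small sketch of the effect of mean centring, assuming a simple regression with one continuous predictor. The data and the helper `simple_ols` are made up for illustration.

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

age = [20, 30, 40, 50]    # hypothetical predictor
score = [52, 58, 61, 69]  # hypothetical outcome

# Mean centring: subtract the mean from every observation
age_centred = [a - mean(age) for a in age]

raw_intercept, raw_slope = simple_ols(age, score)
centred_intercept, centred_slope = simple_ols(age_centred, score)
```

The slope is unchanged by centring. Only the intercept moves: from the prediction at age 0 to the prediction at the average age, which in this one-predictor case is the mean outcome.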
6
Q
what is meant by standardising
A
- take the centred variable and divide each observation by the variable's standard deviation
- puts predictors on a common scale (standard deviation units), which allows more direct comparison of the coefficients for continuous and binary predictors
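Standardising can be sketched in a few lines. This assumes the sample standard deviation (n - 1 denominator); the data and the function name `standardise` are hypothetical.

```python
from statistics import mean, stdev

def standardise(values):
    """Centre on the mean, then divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

height_cm = [150, 160, 170, 180, 190]  # hypothetical data
height_z = standardise(height_cm)

# After standardising, the variable has mean 0 and standard deviation 1
z_mean, z_sd = mean(height_z), stdev(height_z)
```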
7
Q
what about categorical variables with more than two levels
A
- each level's coefficient is the difference between the reference level and one other level (dummy/treatment coding)
- each level's coefficient is the difference of that level from the intercept, which is the mean of the level means (sum coding)
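The two schemes can be compared using group means alone, under the assumption that the model contains only this one categorical predictor (so the fitted values are just the level means). The level names and means below are hypothetical.

```python
from statistics import mean

# Hypothetical mean outcome for each level of a three-level predictor
level_means = {"low": 10.0, "mid": 14.0, "high": 21.0}

# Treatment coding with "low" as the reference level:
# intercept = reference mean, coefficients = differences from the reference
treatment_intercept = level_means["low"]
treatment_coefs = {lvl: m - treatment_intercept
                   for lvl, m in level_means.items() if lvl != "low"}

# Sum coding: intercept = mean of the level means,
# coefficients = each level's deviation from that intercept
sum_intercept = mean(level_means.values())
sum_coefs = {lvl: m - sum_intercept for lvl, m in level_means.items()}
```

The sum-coded deviations add up to zero, so a fitted model only needs to report all but one of them; the remaining one is the negative of the sum of the others.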
8
Q
what about ordered (ordinal) variables
A
- use an ordinal regression model
- each level's coefficient is the difference between that level and the level or levels below it
9
Q
what is dummy coding
A
- the reference level is coded 0 and becomes hidden in the intercept; each coefficient is the difference between the reference level and one other level
- switching from one level of the categorical predictor to the second level is the same as moving one unit (0 to 1) along a continuous predictor
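A minimal sketch of dummy coding in a fitted model, assuming a single two-level predictor coded 0/1 and ordinary least squares. The data and the helper `simple_ols` are hypothetical.

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

group = ["a", "a", "a", "b", "b", "b"]    # hypothetical two-level predictor
outcome = [4.0, 5.0, 6.0, 8.0, 9.0, 10.0]

# Dummy coding: reference level "a" -> 0, other level "b" -> 1
x_dummy = [0 if g == "a" else 1 for g in group]
intercept, slope = simple_ols(x_dummy, outcome)

mean_a = mean(y for g, y in zip(group, outcome) if g == "a")
mean_b = mean(y for g, y in zip(group, outcome) if g == "b")
```

The intercept recovers the mean of the reference group, and the slope is the difference between the two group means, because going from `"a"` to `"b"` is a one-unit step in the coded variable.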
10
Q
what is sum coding
A
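- levels are coded -1 and +1, so the intercept now reflects 0 (the midpoint between the levels) rather than the reference level
- switching from one level to the other is the same as moving two units (-1 to +1) along a continuous predictor, so the coefficient is half the difference between the levels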
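A minimal sketch of sum coding in a fitted model, assuming a single two-level predictor coded -1/+1 with equal group sizes and ordinary least squares. The data and the helper `simple_ols` are hypothetical.

```python
from statistics import mean

def simple_ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

group = ["a", "a", "a", "b", "b", "b"]    # hypothetical two-level predictor
outcome = [4.0, 5.0, 6.0, 8.0, 9.0, 10.0]

# Sum coding: level "a" -> -1, level "b" -> +1
x_sum = [-1 if g == "a" else 1 for g in group]
intercept, slope = simple_ols(x_sum, outcome)

mean_a = mean(y for g, y in zip(group, outcome) if g == "a")
mean_b = mean(y for g, y in zip(group, outcome) if g == "b")
```

The intercept sits midway between the two group means, and the slope is half of their difference, because the step from -1 to +1 spans two units.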
11
Q
what does standardising variables involve, and how are the coefficients then interpreted
A
- centre a predictor and then divide each observation by the variable's standard deviation
- each standardised coefficient now gives the change in the outcome variable for a one standard deviation increase in that predictor
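The interpretation of a standardised coefficient can be checked with a small sketch. It assumes a simple regression in which only the predictor (not the outcome) is standardised, using the sample standard deviation; the data and the helper `simple_ols` are hypothetical.

```python
from statistics import mean, stdev

def simple_ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

hours = [1, 2, 3, 4, 5]        # hypothetical predictor
score = [52, 55, 61, 64, 68]   # hypothetical outcome

# Standardise the predictor: centre, then divide by its standard deviation
hours_z = [(h - mean(hours)) / stdev(hours) for h in hours]

_, raw_slope = simple_ols(hours, score)
z_intercept, z_slope = simple_ols(hours_z, score)
```

The standardised slope equals the raw slope multiplied by the predictor's standard deviation, which is exactly the change in the outcome for a one standard deviation increase in the predictor.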