week 2 - introducing categorical predictors into a multiple regression model Flashcards

1
Q

what is the raw data of a multiple regression model, and how are its terms interpreted

A
  • predictors can be continuous or categorical
  • Intercept = predicted value of the outcome variable when all continuous predictors are at zero and all categorical predictors are at their reference level
  • Interpretation for continuous coefficients = a one-unit increase in X1 changes Y by b1, holding the other predictors constant
  • Interpretation for categorical coefficients = moving from the reference level of X2 to another category changes Y by b2, holding the other predictors constant
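As a sketch, the notation on this card corresponds to a model of the following form, assuming one continuous predictor X1 and one two-level categorical predictor X2 coded 0 for the reference level and 1 for the other level. The error term is an added assumption, not something the card states.

```latex
\[
Y = b_0 + b_1 X_1 + b_2 X_2 + \varepsilon
\]
```

Setting X1 = 0 and X2 = 0 leaves only b0, which is why the intercept is read as the prediction at zero and at the reference level.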
2
Q

what are examples of categorical variables

A
  • sex, or smoking status (smoker vs non-smoker)
3
Q

how does R manage categorical variables

A
  • uses dummy (also called treatment) coding by default
  • the level coded as 0 is the variable's reference level
  • the reference level is absorbed into the intercept term
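The idea can be sketched outside R as well. The Python snippet below is only an illustration of dummy coding, not R's implementation; the function name `treatment_code` and the `smoker` data are hypothetical, and picking the first level in sorted order as the reference is an assumption that mirrors R's default ordering of factor levels.

```python
def treatment_code(values, reference=None):
    """Return the reference level, the other levels, and 0/1 indicator rows."""
    levels = sorted(set(values))
    if reference is None:
        # Assumption: the first level in sorted order is the reference.
        reference = levels[0]
    others = [lv for lv in levels if lv != reference]
    # One indicator column per non-reference level; the reference is all zeros.
    rows = [[1 if v == lv else 0 for lv in others] for v in values]
    return reference, others, rows

smoker = ["no", "yes", "no", "yes", "yes"]  # hypothetical data
reference, columns, coded = treatment_code(smoker)
print(reference, columns, coded)
```

The reference level gets no column of its own, which is the sense in which it is part of the intercept.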
4
Q

why would you change the reference level of a variable

A
  • your hypothesis may make a different reference level easier to interpret (e.g. comparing every group against a control group)
  • especially useful when the variable has more than two levels
5
Q

what does mean centring mean in relation to continuous predictors

A
  • subtract the mean of a variable from every observation of that variable
  • the intercept now represents the predicted outcome when all continuous predictors are at their mean (and categorical predictors are at their reference level)
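A minimal sketch of the effect on the intercept, using made-up data and a hand-written one-predictor least-squares fit. The names `centre`, `fit_line`, `age` and `score` are hypothetical, chosen only for illustration.

```python
from statistics import mean

def centre(xs):
    """Subtract the mean of xs from every observation."""
    m = mean(xs)
    return [x - m for x in xs]

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

age = [20.0, 30.0, 40.0, 50.0]    # hypothetical predictor
score = [22.0, 27.0, 32.0, 37.0]  # hypothetical outcome
age_c = centre(age)

raw_intercept, raw_slope = fit_line(age, score)
centred_intercept, centred_slope = fit_line(age_c, score)
print(raw_intercept, raw_slope)
print(centred_intercept, centred_slope)
```

The slope is the same in both fits. Only the intercept moves, from the prediction at age 0 to the prediction at the mean age.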
6
Q

what is meant by standardising

A
  • take the centred variable and divide each observation by the standard deviation of that variable
  • puts continuous predictors on a common scale (standard deviation units), which allows more direct comparison of coefficients, including those for continuous and binary predictors
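The two steps can be sketched as follows. The function name `standardise` and the `height` data are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev

def standardise(xs):
    """Centre xs, then divide by the sample standard deviation."""
    m = mean(xs)
    s = stdev(xs)  # sample SD, n - 1 in the denominator
    return [(x - m) / s for x in xs]

height = [150.0, 160.0, 170.0, 180.0, 190.0]  # hypothetical data
height_z = standardise(height)
print(height_z)
```

After this step the variable has mean 0 and standard deviation 1, so one unit of the predictor is one standard deviation.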
7
Q

what about categorical variables with more than two levels

A
  • dummy/treatment coding: each level's coefficient is the difference between that level and the reference level
  • sum coding: each level's coefficient is the difference between that level and the intercept, which is now the grand mean (the mean of the level means)
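To make the contrast concrete, here is a sketch for the special case of a model with a single categorical predictor, where the coefficients can be read straight off the group means. The data and function names are hypothetical. Leaving the last level without its own sum-coded coefficient is an assumption about the convention in use.

```python
from statistics import mean

def group_means(groups, ys):
    """Mean outcome per level, with levels in sorted order."""
    levels = sorted(set(groups))
    return {lv: mean([y for g, y in zip(groups, ys) if g == lv]) for lv in levels}

def treatment_coefficients(means, reference):
    """Intercept is the reference mean; each coefficient is level minus reference."""
    intercept = means[reference]
    coefs = {lv: m - intercept for lv, m in means.items() if lv != reference}
    return intercept, coefs

def sum_coefficients(means):
    """Intercept is the mean of the level means; each coefficient is level minus that."""
    grand = mean(list(means.values()))
    levels = list(means)
    # Assumption: the last level gets no coefficient of its own.
    coefs = {lv: means[lv] - grand for lv in levels[:-1]}
    return grand, coefs

diet = ["a", "a", "b", "b", "c", "c"]        # hypothetical three-level predictor
weight = [10.0, 12.0, 15.0, 17.0, 20.0, 22.0]  # hypothetical outcome

means = group_means(diet, weight)
t_intercept, t_coefs = treatment_coefficients(means, reference="a")
s_intercept, s_coefs = sum_coefficients(means)
print(t_intercept, t_coefs)
print(s_intercept, s_coefs)
```

Both codings describe the same three group means. They differ only in what the intercept stands for and what each coefficient is compared against.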
8
Q

what about ordered (ordinal) variables

A
  • use an ordinal regression model
  • each level's coefficient is the difference between that level and the level or levels below it
9
Q

what is dummy coding

A
  • the reference level becomes hidden in the intercept; each coefficient compares the reference level with one other level
  • switching from the reference level (coded 0) to the second level (coded 1) is the same as moving one unit along a continuous predictor
10
Q

what is sum coding

A
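  • the two levels are coded -1 and +1, so the intercept now reflects 0 (the midpoint between the levels) rather than the reference level
  • switching from one level to the other is the same as moving two units along a continuous predictor, so the coefficient is half the difference between the two levels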
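A small sketch comparing the two codings on the same made-up two-group data, using a hand-written one-predictor least-squares fit. The names `fit_line`, `group` and `outcome` are hypothetical, and which level receives -1 or +1 is a convention assumed here for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

group = ["ctrl", "ctrl", "treat", "treat"]  # hypothetical two-level predictor
outcome = [10.0, 12.0, 18.0, 20.0]          # hypothetical outcome

dummy = [0.0 if g == "ctrl" else 1.0 for g in group]     # 0 = reference level
summed = [-1.0 if g == "ctrl" else 1.0 for g in group]   # -1 / +1 coding

dummy_intercept, dummy_slope = fit_line(dummy, outcome)
sum_intercept, sum_slope = fit_line(summed, outcome)
print(dummy_intercept, dummy_slope)
print(sum_intercept, sum_slope)
```

With dummy coding the intercept is the control mean and the slope is the full difference between groups. With sum coding the intercept is the midpoint of the two group means and the slope is half that difference, because the levels sit two units apart.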
11
Q

what does standardising variables do

A
  • centre a predictor and then divide each observation by the standard deviation of that variable
  • each standardised coefficient now gives the change in the outcome variable for a one standard deviation increase in that predictor