Categorical Predictors Flashcards
What assumption is violated?
Linearity, because a categorical outcome cannot have a linear relationship with the predictors
What does loaded link function mean?
It transforms the model for a categorical outcome so that the relationship with the predictors is linear
What do we predict?
The probability of an outcome occurring
What happens to the linear model?
We transform it so that it predicts something different:
the log odds of the outcome, rather than the outcome itself
What does log odds mean?
The natural logarithm of the odds, where the odds are the probability of an event occurring divided by the probability of it not occurring
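A minimal sketch of how a probability relates to odds and log odds. The function names are illustrative and not taken from any particular package.

```python
import math

def odds(p):
    """Odds of an event: P(event) divided by P(no event)."""
    return p / (1 - p)

def log_odds(p):
    """Natural log of the odds (the logit)."""
    return math.log(odds(p))

def inverse_logit(z):
    """Convert log odds back into a probability."""
    return 1 / (1 + math.exp(-z))

p = 0.8                      # example probability, chosen arbitrarily
print(odds(p))               # roughly 4: the event is about 4 times as likely to occur as not
print(log_odds(p))           # positive, because the odds are above 1
print(inverse_logit(log_odds(p)))  # recovers a value very close to 0.8
```

A probability of 0.5 gives odds of 1 and log odds of 0, which is why 0 is the "no preference" point on the log odds scale.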
What does B0 represent?
The log odds of the outcome when the predictor is 0
What does B1 represent?
The change in the log odds of the outcome associated with a one-unit change in the predictor
exp(B1) bigger than 1 (B1 above 0) = as the predictor increases, the odds of the event occurring increase
exp(B1) less than 1 (B1 below 0) = as the predictor increases, the odds of the event occurring decrease
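A rough sketch of how B0 and B1 combine for a single predictor. The coefficient values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Hypothetical coefficients, for illustration only
B0 = -1.0   # log odds of the outcome when the predictor is 0
B1 = 0.5    # change in log odds per one-unit change in the predictor

def predicted_log_odds(x):
    """Linear part of the model: B0 + B1 * x."""
    return B0 + B1 * x

def predicted_probability(x):
    """Convert the predicted log odds back into a probability."""
    return 1 / (1 + math.exp(-predicted_log_odds(x)))

for x in (0, 1, 2, 3):
    print(x, predicted_log_odds(x), round(predicted_probability(x), 3))
```

Because B1 is positive in this example, each one-unit step in the predictor adds 0.5 to the log odds, and the predicted probability rises with it.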
The odds ratio
This is the same as exp(B)
The odds after a unit change in the predictor divided by the original odds
The change in odds as we move from one condition to another
The odds in one condition divided by the odds in a second condition
If below 1 = the odds of the outcome in the first condition are smaller than the odds in the second
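The following sketch checks numerically that the odds ratio for a one-unit change in the predictor equals exp(B). The coefficients are hypothetical values, not estimates from real data.

```python
import math

# Hypothetical coefficients, for illustration only
B0 = -1.0
B1 = 0.5

def predicted_odds(x):
    """Odds implied by the model: exp(B0 + B1 * x)."""
    return math.exp(B0 + B1 * x)

# Odds after a unit change in the predictor divided by the original odds
odds_ratio = predicted_odds(1) / predicted_odds(0)

print(odds_ratio)
print(math.exp(B1))  # matches the odds ratio up to floating-point rounding
```

The same ratio results from any starting value of the predictor, since B0 and the starting value cancel out in the division.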
How do you calculate odds?
Odds are the number of times an event occurs compared to the number of times it doesn’t occur
Number of times something happened divided by number of times it didn’t happen
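As a small worked example, odds can be computed directly from counts. The counts and condition names below are made up for illustration.

```python
def odds_from_counts(n_event, n_no_event):
    """Odds = times the event happened / times it did not happen."""
    if n_no_event == 0:
        raise ValueError("Odds are undefined when the event always happened")
    return n_event / n_no_event

# Hypothetical counts for two conditions
odds_treated = odds_from_counts(30, 10)   # 30 events, 10 non-events
odds_control = odds_from_counts(20, 20)   # 20 events, 20 non-events

# Odds in one condition divided by odds in the other
odds_ratio = odds_treated / odds_control

print(odds_treated, odds_control, odds_ratio)  # 3.0 1.0 3.0
```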
What can go wrong - things we’ve met before
Linearity - overcome by using logit
Spherical residuals - errors should be independent, so each case still contributes only one set of data
Multicollinearity - predictors shouldn’t be too highly correlated with each other
What can go wrong - unique problems
Incomplete information
Complete separation
What does incomplete information mean?
There are empty cells in the data: combinations of variables for which no cases were observed
This problem escalates with continuous predictors. SPSS still tries to estimate the model but may fail; very large standard errors (SE) are a warning sign
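One way to look for empty cells before fitting a model is to count every combination of the categorical variables. This is a sketch with a hypothetical helper and made-up data, not a feature of SPSS.

```python
from collections import Counter
from itertools import product

def empty_cells(rows, variables):
    """Return every combination of variable levels that has no cases."""
    # Observed levels of each variable
    levels = [sorted({row[v] for row in rows}) for v in variables]
    # How many cases fall into each observed combination
    counts = Counter(tuple(row[v] for v in variables) for row in rows)
    # Combinations that are possible but never observed
    return [combo for combo in product(*levels) if counts[combo] == 0]

# Made-up data: nobody in group "b" has outcome 0
rows = [
    {"group": "a", "outcome": 0},
    {"group": "a", "outcome": 1},
    {"group": "b", "outcome": 1},
]

print(empty_cells(rows, ["group", "outcome"]))  # [('b', 0)]
```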
What does complete separation mean?
No single model fits the data best
The data are completely separated
This happens when the outcome variable can be perfectly predicted by a predictor (or a combination of predictors)
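A simple check for complete separation on one continuous predictor might look like the sketch below. It assumes a binary outcome coded 0/1 and looks at a single predictor only, so it would miss separation that arises from a combination of predictors.

```python
def is_completely_separated(x, y):
    """True if some threshold on x splits the 0s and 1s in y perfectly."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        return False  # only one outcome category observed, nothing to separate
    # Separated if all the 0s lie strictly below all the 1s, or the reverse
    return max(x0) < min(x1) or max(x1) < min(x0)

print(is_completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))  # True
print(is_completely_separated([1, 2, 3, 4], [0, 1, 0, 1]))  # False
```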
What does percentage correct mean?
The percentage of cases that the model has classified correctly
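Percentage correct can be sketched as follows. The cutoff of 0.5 is an assumption made here for illustration, and the probabilities and outcomes are made up.

```python
def percentage_correct(probabilities, outcomes, cutoff=0.5):
    """Percentage of cases whose predicted category matches the observed one."""
    # Classify a case as 1 when its predicted probability reaches the cutoff
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    n_correct = sum(pred == obs for pred, obs in zip(predicted, outcomes))
    return 100 * n_correct / len(outcomes)

# Made-up predicted probabilities and observed outcomes (0/1)
probs = [0.9, 0.2, 0.6, 0.4]
observed = [1, 0, 0, 1]

print(percentage_correct(probs, observed))  # 50.0
```

Here the first two cases are classified correctly and the last two are not, so half of the cases are correct.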
What happens if exp(B) values are close to 1?
Null effect: the odds of the outcome barely change as the predictor changes