L6 - Regression Analysis III Flashcards
What do the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) take into account?
Model complexity (the number of parameters), in addition to model fit.
Are lower or higher values for AIC and BIC better?
Lower values are better.
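A minimal sketch of how both criteria penalize complexity, using the standard definitions AIC = 2k - 2·LL and BIC = k·ln(n) - 2·LL. The function names and the log-likelihood values are made up for illustration.

```python
import math

def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = 2k - 2*LL (k = number of estimated parameters)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """BIC = k*ln(n) - 2*LL (n = number of observations)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical models: the complex one fits slightly better (higher LL)
# but uses four more parameters.
n_obs = 200
aic_simple = aic(-120.0, 2)
aic_complex = aic(-119.5, 6)
bic_simple = bic(-120.0, 2, n_obs)
bic_complex = bic(-119.5, 6, n_obs)

print(aic_simple, aic_complex)    # 244.0 251.0
print(bic_simple < bic_complex)   # True
```

Here the small gain in fit does not pay for the extra parameters, so both criteria prefer the simpler model.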
What do you use if the DV is categorical/binary?
logistic regression
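What does the logistic function look like?
An S-shaped (sigmoid) curve bounded between 0 and 1: p = 1 / (1 + e^-(β0 + β1·X))
What does the regression equation look like in logistic regression?
log(p / (1 - p)) = β0 + β1·X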
In logistic regression, does the slope in the regression equation stand for the change in log odds associated with a one-unit change in X?
Yes
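A small sketch to check this numerically. The coefficients b0 and b1 are hypothetical, not taken from any output in these notes.

```python
import math

def logistic(z: float) -> float:
    """Logistic (sigmoid) function: maps any real number to (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predicted_probability(x: float, b0: float, b1: float) -> float:
    """p = logistic(b0 + b1 * x)."""
    return logistic(b0 + b1 * x)

def log_odds(p: float) -> float:
    """Log odds (logit) of a probability."""
    return math.log(p / (1 - p))

b0, b1 = -3.0, 0.5  # hypothetical coefficients

p_at_4 = predicted_probability(4, b0, b1)
p_at_5 = predicted_probability(5, b0, b1)

# Increasing X by one unit changes the log odds by b1 (up to rounding error).
change = log_odds(p_at_5) - log_odds(p_at_4)
print(round(change, 6))  # 0.5
```

The change in probability itself is not constant across X; only the change in log odds is.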
How does the curve change when you have a smaller value of β0?
It shifts to the right (for a positive slope).
How does the curve change when you have a smaller value of β1?
It becomes flatter as β1 approaches 0; for negative values it slopes downward.
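A short sketch of both effects, assuming the curve p = 1 / (1 + e^-(b0 + b1·x)). The point where p = 0.5 lies at x = -b0 / b1, so it can be used to track the shift. All coefficient values are hypothetical.

```python
import math

def logistic_curve(x: float, b0: float, b1: float) -> float:
    """Predicted probability at x for intercept b0 and slope b1."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

def midpoint(b0: float, b1: float) -> float:
    """x value at which p = 0.5, i.e. where b0 + b1 * x = 0."""
    return -b0 / b1

# Smaller intercept, same positive slope: the p = 0.5 point moves right.
print(midpoint(-1.0, 0.5))  # 2.0
print(midpoint(-3.0, 0.5))  # 6.0

# Negative slope: the probability falls as x increases.
print(logistic_curve(0, 0.0, -1.0) > logistic_curve(1, 0.0, -1.0))  # True
```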
How do you get the probability from the odds?
odds / (1 + odds)
What are the log odds for odds of 1:1?
log(1/1) = 0
How do you get the odds from the log odds?
Exponentiate: odds = e^(log odds)
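The conversions above can be sketched as three small helpers. The function names are illustrative only.

```python
import math

def odds_to_probability(odds: float) -> float:
    """p = odds / (1 + odds)."""
    return odds / (1 + odds)

def odds_to_log_odds(odds: float) -> float:
    """Log odds are the natural log of the odds."""
    return math.log(odds)

def log_odds_to_odds(log_odds: float) -> float:
    """Exponentiate the log odds to get back to the odds."""
    return math.exp(log_odds)

# Odds of 1:1 correspond to a probability of .5 and log odds of 0.
print(odds_to_probability(1.0))  # 0.5
print(odds_to_log_odds(1.0))     # 0.0
print(log_odds_to_odds(0.0))     # 1.0
```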
What does the .357 tell you?
If you have a weight of 70, what is the probability of being male? The odds are .52
You just have to compute odds / (1 + odds) = .52 / 1.52 ≈ .34
p(survival) for a female, 40 years old, 2nd class?
What is the Wald statistic used for?
To see if a regression weight is significantly different from zero.
- you need the regression coefficient and standard error of the coefficient
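A minimal sketch of the computation, assuming the usual form z = b / SE, which is compared against a standard normal distribution. Some software instead reports z squared as the Wald statistic, tested against a chi-squared distribution with 1 df. The coefficient and standard error below are made up.

```python
import math

def wald_z(coef: float, se: float) -> float:
    """Wald z statistic: coefficient divided by its standard error."""
    return coef / se

def two_sided_p(z: float) -> float:
    """Two-sided p-value under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

b, se = 0.8, 0.4          # hypothetical regression weight and standard error
z = wald_z(b, se)
wald_chi2 = z ** 2        # the squared form, chi-squared with 1 df
p = two_sided_p(z)

print(round(z, 3), round(wald_chi2, 3), round(p, 4))
```

A p-value below the chosen alpha level indicates that the regression weight differs significantly from zero.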
What is the log likelihood (LL) used for?
- it expresses how likely the observed data are under the model
- we want the LL to be as high as possible
What does chi-squared measure?
Model fit.
X^2 = (deviance of the baseline model, intercept only) - (deviance of the fitted model)
-> deviance should be as small as possible
-> a high X^2 means that the model is much better than the baseline
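A sketch of the arithmetic, assuming the usual definition deviance = -2·LL. The two log-likelihood values are hypothetical.

```python
def deviance(log_likelihood: float) -> float:
    """Deviance = -2 * log likelihood; smaller means better fit."""
    return -2 * log_likelihood

def model_chi_squared(ll_baseline: float, ll_model: float) -> float:
    """Chi-squared = deviance(baseline) - deviance(model)."""
    return deviance(ll_baseline) - deviance(ll_model)

ll_baseline = -100.0   # hypothetical LL of the intercept-only model
ll_model = -85.0       # hypothetical LL of the model with predictors

print(deviance(ll_baseline))                      # 200.0
print(deviance(ll_model))                         # 170.0
print(model_chi_squared(ll_baseline, ll_model))   # 30.0
```

The degrees of freedom for this test equal the number of predictors added to the baseline model.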
What is Cox & Snell R-squared?
- a measure of model fit
- starts at 0, but its maximum is below 1
- higher is better
What is the Nagelkerke R-squared?
- Cox & Snell R^2 rescaled so that it can reach 1
- therefore larger than Cox & Snell R^2
- a measure of model fit
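A sketch of both pseudo R-squared measures, assuming the standard definitions: Cox & Snell R^2 = 1 - exp((2/n)·(LL_baseline - LL_model)), and Nagelkerke R^2 divides this by its maximum possible value, 1 - exp((2/n)·LL_baseline). The sample size and log likelihoods are made up.

```python
import math

def cox_snell_r2(ll_baseline: float, ll_model: float, n_obs: int) -> float:
    """Cox & Snell R^2 from the baseline and model log likelihoods."""
    return 1 - math.exp((2 / n_obs) * (ll_baseline - ll_model))

def nagelkerke_r2(ll_baseline: float, ll_model: float, n_obs: int) -> float:
    """Cox & Snell R^2 divided by its maximum attainable value."""
    max_r2 = 1 - math.exp((2 / n_obs) * ll_baseline)
    return cox_snell_r2(ll_baseline, ll_model, n_obs) / max_r2

# Hypothetical values
n_obs = 150
ll_baseline, ll_model = -100.0, -85.0

cs = cox_snell_r2(ll_baseline, ll_model, n_obs)
nk = nagelkerke_r2(ll_baseline, ll_model, n_obs)
print(round(cs, 3), round(nk, 3))
```

Because the divisor is always below 1, the Nagelkerke value is always the larger of the two.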
Why are AIC and BIC better than R^2?
Because a high fit is not always good (it can reflect overfitting) -> AIC and BIC let you weigh complexity against model fit
What is a special assumption for logistic regression?
No complete separation.
-> logistic regression does not know how to handle a gap between the two groups. It is better if there is an overlap
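A rough check for a single predictor, assuming a DV coded 0/1: the data are completely separated when every value in one group lies below every value in the other group. The function name and the data are hypothetical, and this does not cover separation by a combination of several predictors.

```python
def is_completely_separated(x: list, y: list) -> bool:
    """True if one predictor perfectly splits the two outcome groups."""
    group0 = [xi for xi, yi in zip(x, y) if yi == 0]
    group1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not group0 or not group1:
        return False  # only one outcome observed, nothing to separate
    return max(group0) < min(group1) or max(group1) < min(group0)

# A gap between the groups: complete separation.
print(is_completely_separated([1, 2, 3, 7, 8, 9], [0, 0, 0, 1, 1, 1]))  # True

# The groups overlap: no complete separation.
print(is_completely_separated([1, 2, 7, 3, 8, 9], [0, 0, 0, 1, 1, 1]))  # False
```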
How many events should you have per predictor in logistic regression?
At least 10.
(events are the cases in the less frequent category of the DV)
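The rule of thumb can be checked with a few lines, assuming a DV with two categories that both occur in the data. The outcome vector and the number of predictors are made up.

```python
from collections import Counter

def events_per_predictor(y: list, n_predictors: int) -> float:
    """Cases in the less frequent DV category divided by the number of predictors."""
    counts = Counter(y)
    events = min(counts.values())  # size of the less frequent category
    return events / n_predictors

# Hypothetical sample: 30 cases in one category, 70 in the other, 3 predictors.
y = [1] * 30 + [0] * 70
epp = events_per_predictor(y, 3)

print(epp)        # 10.0
print(epp >= 10)  # True
```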