LOGISTIC REGRESSION Flashcards
What is the distribution of the outcome in logistic regression?
Bernoulli
What’s the formula that is the essence of logistic regression?
logit(p) = log[p/(1-p)] = b0+bx
where odds = p/(1-p)
What is the range of odds?
0 to infinity
What is the range of log(odds)?
- infinity to + infinity
So why do we use logit?
Because it maps probabilities, which lie in (0,1), to log-odds, which range from -infinity to +infinity, and that is a more natural space for a linear model
(we’re not transforming the outcome, but the probability)
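A minimal sketch of the mapping from probability to odds to log-odds. The helper names `odds` and `logit` are illustrative, not taken from any particular library.

```python
import math

def odds(p):
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds of an event with probability p (requires 0 < p < 1)."""
    return math.log(odds(p))

# Probabilities stay inside (0, 1); odds are positive; log-odds can take any sign.
for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"p={p:.2f}  odds={odds(p):.4f}  logit={logit(p):+.4f}")
```

Note that the logit is symmetric around p = 0.5: logit(0.25) and logit(0.75) have the same size and opposite signs.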
Key assumptions of the logistic model (2)
- Outcome is binary
- Observations are independent
Probability =
exp(b0+bx)/[1+exp(b0+bx)]
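The formula above is the inverse of the logit. A small sketch, assuming a single predictor x and made-up coefficient values (the name `predicted_probability` is hypothetical):

```python
import math

def predicted_probability(x, b0, b):
    """Inverse logit: exp(b0 + b*x) / (1 + exp(b0 + b*x))."""
    z = b0 + b * x
    # Two algebraically equal forms, chosen to avoid overflow in exp().
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Made-up coefficients, for illustration only.
b0, b = -1.0, 0.5
for x in (-10, 0, 2, 10):
    print(f"x={x:>3}  p={predicted_probability(x, b0, b):.4f}")
```

Whatever the value of b0+bx, the result always stays strictly between 0 and 1.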
How do we estimate the coefficients?
By maximum likelihood: we look for the set of coefficients that makes the observed responses maximally likely
Unlike least squares, there is no closed-form solution, so it is found iteratively by numerical optimization (e.g. Newton-Raphson)
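To make the estimation step concrete, here is a sketch of one common iterative scheme, Newton-Raphson, for a model with a single predictor. The data are invented for illustration, and the function names are hypothetical; real software typically uses a closely related algorithm with more safeguards.

```python
import math

def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def score(xs, ys, b0, b):
    """Gradient of the log-likelihood with respect to (b0, b)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        residual = y - sigmoid(b0 + b * x)
        g0 += residual
        g1 += residual * x
    return g0, g1

def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Newton-Raphson for a one-predictor logistic model; returns (b0, b)."""
    b0 = b = 0.0
    for _ in range(max_iter):
        g0, g1 = score(xs, ys, b0, b)
        # Information matrix (negative Hessian), with weights w = p(1-p).
        i00 = i01 = i11 = 0.0
        for x in xs:
            p = sigmoid(b0 + b * x)
            w = p * (1.0 - p)
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Newton step: inverse of the 2x2 information matrix times the gradient.
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0 += step0
        b += step1
        if max(abs(step0), abs(step1)) < tol:
            break
    return b0, b

# Invented data: the outcome becomes more frequent as x grows, with some overlap.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b_hat = fit_logistic(xs, ys)
print(f"b0 = {b0_hat:.4f}, b = {b_hat:.4f}")
```

At the maximum the gradient of the log-likelihood is zero, which is what the loop converges towards. If the outcome were perfectly separated by x, the estimates would diverge and this sketch would not be adequate.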
When does the explanatory variable have a significant effect?
When the beta is more than about 2 standard errors away from 0 (Wald test, roughly the 5% level)
- but since p is not linear in X, the same change in X has a larger impact on p near the center of the p-range than at the extremes
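Both points can be illustrated with a short sketch. The coefficients, the estimate and the standard error below are made-up numbers, and `wald_z` and `change_in_p` are hypothetical helper names.

```python
import math

def sigmoid(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def wald_z(beta, se):
    """Number of standard errors between the estimate and 0."""
    return beta / se

# Made-up estimate and standard error: |z| > 2 suggests a significant effect.
z = wald_z(0.9, 0.3)
print(f"z = {z:.2f}, significant: {abs(z) > 2}")

# Made-up coefficients for the model itself.
b0, b = 0.0, 1.0

def change_in_p(x, dx=1.0):
    """Change in predicted probability when x increases by dx."""
    return sigmoid(b0 + b * (x + dx)) - sigmoid(b0 + b * x)

center = change_in_p(0.0)  # starts at p = 0.5
tail = change_in_p(4.0)    # starts at p close to 1
print(f"change near the center: {center:.4f}")
print(f"change in the tail:     {tail:.4f}")
```

The coefficient is the same in both cases; only the starting point on the probability scale differs.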
The deviance of a model is…
-2*loglikelihood of the data under the model considered
- the smaller the deviance, the better the fit
- decreases (or at least never increases) when we add parameters
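A sketch of the deviance computation for binary outcomes. The predicted probabilities are invented rather than fitted, purely to show that predictions closer to the observed outcomes give a smaller deviance.

```python
import math

def deviance(ys, ps):
    """-2 * Bernoulli log-likelihood of outcomes ys under probabilities ps."""
    loglik = sum(
        y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for y, p in zip(ys, ps)
    )
    return -2.0 * loglik

ys = [0, 0, 1, 1]

# Invented predictions: a model that ignores x versus one that tracks the outcome.
null_ps = [0.5, 0.5, 0.5, 0.5]
better_ps = [0.2, 0.3, 0.7, 0.8]

print(f"null deviance:   {deviance(ys, null_ps):.4f}")
print(f"better deviance: {deviance(ys, better_ps):.4f}")
```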
We can get the RR (relative risk) by…
Predicting the risk of the outcome at each exposure level and taking the ratio of the predicted risks
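One reading of this card, assuming RR means relative risk and a binary exposure x: predict the risk with and without the exposure, then divide. The coefficients below are made up, and the comparison with the odds ratio exp(b) is added only for contrast.

```python
import math

def predicted_risk(x, b0, b):
    """Predicted probability of the outcome: exp(b0+b*x) / (1 + exp(b0+b*x))."""
    z = b0 + b * x
    return math.exp(z) / (1.0 + math.exp(z))

# Made-up coefficients for a binary exposure x (0 = unexposed, 1 = exposed).
b0, b = -2.0, 1.0

risk_unexposed = predicted_risk(0, b0, b)
risk_exposed = predicted_risk(1, b0, b)

relative_risk = risk_exposed / risk_unexposed
odds_ratio = math.exp(b)

print(f"risk unexposed = {risk_unexposed:.4f}")
print(f"risk exposed   = {risk_exposed:.4f}")
print(f"RR = {relative_risk:.4f}, OR = {odds_ratio:.4f}")
```

The coefficient gives the odds ratio directly, while the RR has to go through the predicted risks; the two differ unless the outcome is rare.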