15. Binary Logistic Model and Logistic Regression Flashcards
What is logistic regression?
It is a statistical method for predicting a binary outcome. It models the relationship between the dependent variable and one or more independent variables.
What is a binary outcome variable?
A response (y) that can take only two values, coded 0 or 1 (e.g. no or yes)
Predictors can be continuous or categorical
What is binary logistic regression?
Binary (binomial) logistic regression fits an S-shaped logistic (logit) curve to a dataset in order to calculate the probability that a specific event occurs (the value to predict), based on the values of a set of independent variables.
Why do we not use linear regression with a binary outcome variable?
The distribution of the residuals would be bimodal rather than normal
The variance of the residuals would not be constant
The relationship between X and Y is not linear
Predicted probabilities would not be confined to the range 0 to 1
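The last point can be illustrated with a small simulation. This is a minimal sketch on made-up data (the seed, the range of x and the true slope of 2 are all assumptions chosen for illustration): a straight line fitted to a 0/1 outcome produces fitted values below 0 and above 1.

```r
# Simulated data: all values here are assumptions for illustration only
set.seed(1)
x <- seq(-4, 4, length.out = 200)
y <- rbinom(length(x), size = 1, prob = plogis(2 * x))  # binary outcome

# Ordinary linear regression fitted to the 0/1 outcome
lin_fit <- lm(y ~ x)
lin_pred <- fitted(lin_fit)

# Fitted values are not restricted to [0, 1]
print(range(lin_pred))
out_of_range <- sum(lin_pred < 0 | lin_pred > 1)
print(out_of_range)  # number of fitted "probabilities" outside [0, 1]
```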
How does logistic regression solve the issues that occur in linear regression of a binary outcome?
Logistic regression passes the linear combination of predictors through the logistic (S-shaped) function, so its output always lies between 0 and 1 and can be interpreted as the probability of each example belonging to a particular class.
What function in R would be used to fit a binary logistic regression?
glm(y ~ x1 + x2, data = data, family = binomial)
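To see the call in context, here is a minimal runnable sketch. The data frame `dat`, its columns and the true coefficients used to simulate it are hypothetical; only the `glm()` call itself comes from the card above.

```r
# Simulated data: names and true coefficients are assumptions for illustration
set.seed(42)
n <- 500
x1 <- rnorm(n)                 # continuous predictor
x2 <- rbinom(n, 1, 0.5)        # binary (categorical) predictor
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x1 + 0.8 * x2))
dat <- data.frame(y, x1, x2)

# Binary logistic regression
fit <- glm(y ~ x1 + x2, data = dat, family = binomial)

print(coef(fit))                                # coefficients on the log-odds scale
pred <- predict(fit, type = "response")         # predicted P(y = 1)
print(head(pred))
```

Calling `summary(fit)` additionally reports standard errors, z-tests and the null and residual deviance.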
What does the logistic regression model predict?
The probability that y = 1 (y can only be 0 or 1)
What is the logistic regression model equation?
$P(y_i = 1) = \frac{1}{1 + e^{-(B_0 + B_1 x_{i1})}}$
e = Base of the natural logarithm (the exponential function)
B0 = Intercept
B1 = Coefficient capturing the effect of x1 on the outcome y
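As a worked example of the equation, the sketch below plugs in hypothetical values (B0 = -1, B1 = 0.5 and three values of x1, none of which come from the cards) and checks the result against base R's `plogis()`.

```r
# Hypothetical coefficients chosen only for illustration
b0 <- -1     # intercept
b1 <- 0.5    # effect of x1 on the log odds
x1 <- c(0, 2, 4)

eta <- b0 + b1 * x1          # linear predictor: -1, 0, 1
p <- 1 / (1 + exp(-eta))     # the logistic equation

print(round(p, 3))           # 0.269 0.500 0.731

# plogis() is base R's built-in logistic function and gives the same values
print(all.equal(p, plogis(eta)))
```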
Why are odds and log odds important in logistic regression?
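Probabilities are bounded between 0 and 1 and relate non-linearly to the predictors, whereas log odds range from minus infinity to infinity and are a linear function of the predictors. Working on the log-odds scale therefore lets the model be written as a linear equation whose coefficients can be estimated and interpreted.
Log odds are equal to B0 + B1x1 + …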
What are odds and what is the odds equation?
Odds of an event occurring = Ratio of the probability of the event occurring to the probability of the event not occurring
Odds = P(Y=1) / (1 - P(Y=1)), which can range from 0 to infinity
What are log odds and what is their equation?
Log odds (logits) are the natural logarithm of the odds: log odds = ln(P(Y=1) / (1 - P(Y=1)))
Each logit corresponds to a particular odds and probability (e.g. a logit of about -2.2 corresponds to a probability of 0.1)
Every probability can be easily converted to log odds by finding the odds and taking the natural logarithm.
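The conversion between the three scales can be sketched in a few lines. The probabilities used are arbitrary examples; `qlogis()` is base R's logit function.

```r
p <- c(0.1, 0.5, 0.9)        # example probabilities

odds <- p / (1 - p)          # about 0.111, 1, 9
log_odds <- log(odds)        # about -2.197, 0, 2.197

# Going back: odds -> probability
p_back <- odds / (1 + odds)

print(round(cbind(p, odds, log_odds, p_back), 3))

# qlogis() computes log(p / (1 - p)) directly
print(all.equal(log_odds, qlogis(p)))
```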
How can non-linear data be converted to make it linear?
- Convert to probability
- Get odds
- Take log odds
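The three steps above can be checked numerically. In this sketch the coefficients (B0 = -2, B1 = 0.8) and the x values are hypothetical: the probability changes by a different amount for each one-unit step in x, while the log odds change by a constant B1, which is what makes the relationship linear.

```r
# Hypothetical model, chosen only for illustration
b0 <- -2
b1 <- 0.8
x <- 0:5

p <- plogis(b0 + b1 * x)     # step 1: probability (S-shaped in x)
odds <- p / (1 - p)          # step 2: odds
log_odds <- log(odds)        # step 3: log odds (linear in x)

print(round(diff(p), 3))         # unequal steps: not linear
print(round(diff(log_odds), 3))  # every step equals b1 = 0.8: linear
```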
How are logistic regression coefficients estimated?
Logistic regression models are estimated using maximum likelihood estimation (MLE)
MLE finds the logistic regression coefficients that maximise the likelihood of the observed data having occurred
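Larger log-likelihood values (closer to 0) indicate better-fitting models; equivalently, a larger deviance (-2 × log-likelihood) indicates a poorer-fitting model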
How does the method of MLE differ from the method of least squares used in linear regression?
While least squares estimation minimises the SSE to find the coefficients for the line of best fit, MLE maximises the log-likelihood
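To make MLE concrete, the sketch below writes out the log-likelihood of a one-predictor logistic model and maximises it with `optim()`. Because `optim()` minimises by default, the function returns the negative log-likelihood. The data are simulated and the helper name `neg_log_lik` is hypothetical. This is only an illustration: `glm()` itself uses iteratively reweighted least squares rather than `optim()`, but both arrive at the same maximum likelihood estimates.

```r
# Simulated data: values are assumptions for illustration only
set.seed(7)
n <- 300
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.3 + 1.0 * x))

# Negative log-likelihood of the logistic model for coefficients beta
neg_log_lik <- function(beta) {
  p <- plogis(beta[1] + beta[2] * x)            # P(y = 1) for each case
  -sum(dbinom(y, size = 1, prob = p, log = TRUE))
}

# Minimising the negative log-likelihood = maximising the log-likelihood
opt <- optim(c(0, 0), neg_log_lik, method = "BFGS")

fit <- glm(y ~ x, family = binomial)

print(rbind(optim = opt$par, glm = coef(fit)))                   # estimates agree
print(c(optim = -opt$value, glm = as.numeric(logLik(fit))))      # same log-likelihood
```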
What is the test used to evaluate our overall model in logistic regression?
Likelihood ratio test (also called the chi-squared difference test)
Compare our model to a baseline model with no predictors (null model)
Assess the improvement in fit
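A minimal sketch of this comparison on simulated data follows. The variable names and true effects are assumptions; here x2 deliberately has no true effect. The test statistic is twice the difference in log-likelihood between the two models, compared against a chi-squared distribution with degrees of freedom equal to the number of added predictors.

```r
# Simulated data: names and effects are assumptions for illustration only
set.seed(123)
n <- 400
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.2 + 0.9 * x1))
dat <- data.frame(y, x1, x2)

null_fit <- glm(y ~ 1, data = dat, family = binomial)        # baseline: intercept only
full_fit <- glm(y ~ x1 + x2, data = dat, family = binomial)  # our model

# Likelihood ratio statistic computed by hand
lr_stat <- as.numeric(2 * (logLik(full_fit) - logLik(null_fit)))
df_diff <- attr(logLik(full_fit), "df") - attr(logLik(null_fit), "df")
p_value <- pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
print(c(lr_stat = lr_stat, df = df_diff, p_value = p_value))

# The same test via anova()
print(anova(null_fit, full_fit, test = "Chisq"))
```

The statistic equals the drop from the null deviance to the residual deviance reported by `summary(full_fit)`.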