06.b Logistic Regression Flashcards
What is Logistic Regression
Logistic regression is a supervised predictive analysis technique. It is used to describe data and to explain the relationship between one categorical dependent variable and one or more nominal, ordinal, interval or ratio-level independent variables by estimating probabilities using a logistic function.
What type of output variable comes from Logistic Regression
A categorical outcome variable (most commonly binary, such as yes/no). Logistic regression predicts the likelihood of each outcome based on the input variables.
Name four use cases for Logistic Regression
Medical
Finance
Marketing
Engineering
What shape is the common Logistic Curve
An S-shaped (sigmoid) curve. It approaches zero at the bottom left and one at the top right, with an S shape joining the two corners.
What is the Logistic Function (equation)
f(y) = e^y / (1+e^y) for -infinity < y < +infinity
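As a quick check of the formula, here is a minimal Python sketch that evaluates f(y) at a few points. The function name logistic is chosen for illustration only; the branch for y >= 0 uses the algebraically equivalent form 1 / (1 + e^-y) to avoid overflow for large inputs.

```python
import math


def logistic(y: float) -> float:
    """Logistic function f(y) = e^y / (1 + e^y), defined for any real y."""
    if y >= 0:
        # Same value as e^y / (1 + e^y), rewritten so exp() cannot overflow.
        return 1.0 / (1.0 + math.exp(-y))
    e = math.exp(y)
    return e / (1.0 + e)


# The output moves from near 0, through 0.5 at y = 0, to near 1.
for y in (-5, 0, 5):
    print(y, round(logistic(y), 4))
```

The printed values trace the S shape: close to zero for large negative y, exactly 0.5 at y = 0, and close to one for large positive y.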
What is MLE in terms of Logistic Regression
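MLE stands for Maximum Likelihood Estimation. It is the method used to estimate the model coefficients: it picks the coefficient values that make the observed outcomes most likely.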
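To make the idea concrete, here is a rough Python sketch that maximises the log-likelihood of a one-variable logistic model by plain gradient ascent. The data, the function names and the learning rate are all made up for illustration. Gradient ascent is used only because it is short; R's glm() fits the model with iteratively reweighted least squares instead.

```python
import math


def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of 0/1 outcomes ys under p = logistic(b0 + b1 * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit_by_gradient_ascent(xs, ys, rate=0.1, steps=5000):
    """Climb the log-likelihood surface from (0, 0); illustrative only."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # gradient with respect to the intercept
            g1 += (y - p) * x    # gradient with respect to the slope
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1


# Made-up data: the outcome becomes more likely as x grows.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]

b0, b1 = fit_by_gradient_ascent(xs, ys)
print("start:", log_likelihood(0.0, 0.0, xs, ys))
print("fitted:", log_likelihood(b0, b1, xs, ys))
```

The fitted coefficients give a higher log-likelihood than the starting point, which is all that "maximum likelihood" asks for.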
What does churn mean
Churn refers to the likelihood that a customer will switch to another company
Which function should you use for Logistic Regression in R
The Generalised Linear Model function glm()
OutputDF = glm(Churned ~ Age + Married + Cust_Years + Churned_Contacts, data = churn_input, family = binomial(link = "logit"))
Describe Odds
The Odds of A happening are the chances of A happening divided by the chances of A not happening: odds = p / (1 - p).
Describe Probability
The Probability of A happening is the chances of A happening divided by the chances of all possible results, so it always lies between 0 and 1.
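Odds and probability describe the same chance on two different scales, and each can be converted into the other. A small Python sketch of the conversion follows; the helper names are illustrative.

```python
def odds_from_probability(p: float) -> float:
    """Odds = chance of happening / chance of not happening (requires p < 1)."""
    return p / (1.0 - p)


def probability_from_odds(odds: float) -> float:
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1.0 + odds)


p = 0.75
odds = odds_from_probability(p)
print(odds)                          # 3.0, i.e. odds of 3 to 1
print(probability_from_odds(odds))   # 0.75
```

A probability of 0.75 corresponds to odds of 3 to 1: the event is three times as likely to happen as not.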
Once you have calculated the Generalised Linear Model for y which equation should you use to calculate the probability
p = e^y / (1+e^y)
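A worked example may help here. The Python sketch below builds y from a set of coefficients and then converts it to a probability with the logistic function. The coefficient values and the customer record are hypothetical, invented purely for illustration; real values would come from the fitted model's coefficient table.

```python
import math

# Hypothetical coefficients (NOT fitted to any real data).
coefficients = {
    "(Intercept)": 3.5,
    "Age": -0.15,
    "Married": 0.05,
    "Cust_Years": 0.2,
    "Churned_Contacts": 0.4,
}

# Hypothetical customer to score.
customer = {"Age": 30, "Married": 1, "Cust_Years": 2, "Churned_Contacts": 1}


def linear_predictor(coefs, row):
    """y = intercept + sum of (coefficient * input value)."""
    y = coefs["(Intercept)"]
    for name, value in row.items():
        y += coefs[name] * value
    return y


def probability(y):
    """Logistic transform of the linear predictor (fine for moderate y)."""
    return math.exp(y) / (1.0 + math.exp(y))


y = linear_predictor(coefficients, customer)
p = probability(y)
print(f"y = {y:.2f}, p = {p:.3f}")
```

A negative y gives a probability below 0.5, a positive y gives one above 0.5, and y = 0 gives exactly 0.5.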
What is the Akaike Information Criterion (AIC)
AIC can be viewed as the counterpart of adjusted R-squared in multiple linear regression. It is an important indicator of model fit and follows the rule: the smaller, the better. It is calculated as AIC = 2k - 2 ln(L), where k is the number of estimated coefficients and L is the maximised likelihood. Because AIC penalises an increasing number of coefficients in the model, it helps to avoid over-fitting.
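The penalty is easiest to see with numbers. The Python sketch below applies the standard definition AIC = 2k - 2 ln(L) to two hypothetical models; the log-likelihood values and coefficient counts are made up for illustration.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L); smaller is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical models: the larger one fits only slightly better
# (higher log-likelihood) but uses twice as many coefficients.
aic_small = aic(log_likelihood=-120.0, n_params=3)
aic_large = aic(log_likelihood=-119.5, n_params=6)

print(aic_small)  # 246.0
print(aic_large)  # 251.0
```

Here the smaller model has the lower AIC, so it would be preferred: the tiny gain in likelihood does not justify three extra coefficients.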
In Logistic Regression what is the Null Deviance
The Null Deviance is the deviance (lack of fit) of a model whose likelihood function is based only on the intercept term, that is, a model with no input variables.
In Logistic Regression what is the Residual Deviance
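The Residual Deviance is the deviance (lack of fit) of the model whose likelihood function is based on all the parameters in the specified logistic model. The lower it is relative to the Null Deviance, the more the input variables improve the fit.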
In Logistic Regression how do you calculate a Pseudo - R squared
Pseudo R Squared = 1 - (Residual Deviance / Null Deviance)
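The two deviances and the Pseudo R Squared can be computed by hand for a small example. This Python sketch assumes 0/1 outcomes, for which the deviance equals -2 times the log-likelihood. The outcomes and fitted probabilities are invented for illustration; in practice the fitted probabilities would come from the logistic model.

```python
import math


def binomial_deviance(outcomes, probabilities):
    """Deviance for 0/1 outcomes: -2 * log-likelihood of the predictions."""
    log_lik = sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )
    return -2.0 * log_lik


# Made-up outcomes and hypothetical fitted probabilities.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
fitted = [0.9, 0.2, 0.7, 0.8, 0.3, 0.1, 0.6, 0.4]

# An intercept-only model predicts the overall rate for every row.
base_rate = sum(outcomes) / len(outcomes)
null_deviance = binomial_deviance(outcomes, [base_rate] * len(outcomes))
residual_deviance = binomial_deviance(outcomes, fitted)

pseudo_r_squared = 1.0 - residual_deviance / null_deviance
print(round(null_deviance, 3), round(residual_deviance, 3))
print(round(pseudo_r_squared, 3))
```

A value near 0 means the model barely improves on the intercept alone; a value closer to 1 means the inputs explain much more of the outcome.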