Classification Flashcards
Why doesn’t linear regression work for qualitative data?
In principle, we could assign numerical values to a qualitative response variable, but doing so would impose an ordering on the outcomes and would imply that the gap between each pair of adjacent categories is the same. The resulting estimates would be very difficult to interpret.
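A minimal sketch of the problem, assuming three hypothetical categories "a", "b", and "c" and made-up predictor values (none of this data comes from the source). The same observations are coded numerically in two different ways, and the least-squares slope changes sign:

```python
def least_squares_slope(xs, ys):
    """Slope of the simple least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Made-up predictor values and hypothetical category labels.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["a", "a", "b", "b", "c", "c"]

# Two equally arbitrary numeric codings of the same categories.
coding_1 = {"a": 1, "b": 2, "c": 3}
coding_2 = {"a": 3, "b": 1, "c": 2}

slope_1 = least_squares_slope(xs, [coding_1[label] for label in labels])
slope_2 = least_squares_slope(xs, [coding_2[label] for label in labels])

print(slope_1, slope_2)  # one slope is positive, the other negative
```

Because neither coding is more correct than the other, the fitted relationship depends on an arbitrary choice rather than on the data.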
Logistic Regression
Models qualitative data with a binary (two-category) response. We want the model to produce predictions between 0 and 1, so that they can be read as probabilities.
What will help fit a logistic regression?
Maximum Likelihood
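A rough sketch of what maximum likelihood means here: choose the coefficients that make the observed 0/1 outcomes most probable. The example below uses plain gradient ascent on the log-likelihood, which is one simple way to do this and not necessarily what a statistics package uses. The toy data, learning rate, and step count are all assumptions for illustration.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(xs, ys, b0, b1):
    """Log-likelihood of binary outcomes ys under the logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Maximize the log-likelihood by gradient ascent (illustrative only)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the log-likelihood with respect to b0 and b1.
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        grad_b0 = sum(residuals)
        grad_b1 = sum(r * x for r, x in zip(residuals, xs))
        b0 += learning_rate * grad_b0 / n
        b1 += learning_rate * grad_b1 / n
    return b0, b1

# Made-up toy data; the classes overlap, so the estimates stay finite.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic(xs, ys)
print(b0, b1, log_likelihood(xs, ys, b0, b1))
```

The fitted coefficients give a higher log-likelihood than the starting point (0, 0), which is the sense in which they fit the data better.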
Formula for logistic regression
$P(x) = \dfrac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$
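The formula can be sketched as a small function. The names `logistic_probability`, `b0`, and `b1` are chosen here for illustration, and the coefficient values are hypothetical. Whatever the input, the output stays strictly between 0 and 1:

```python
import math

def logistic_probability(x, b0, b1):
    """Return e^(b0 + b1*x) / (1 + e^(b0 + b1*x))."""
    z = b0 + b1 * x
    # Algebraically equivalent forms, chosen to avoid overflow in exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

# Hypothetical coefficients, used only to show the shape of the curve.
for x in (-10, 0, 10):
    print(x, logistic_probability(x, b0=0.0, b1=1.0))
```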
Z-Statistic
A large absolute value of the z-statistic is evidence that the coefficient is significantly different from zero, so the predictor should be kept in the model.
P-values
Look for p-values below 0.05, the conventional threshold for calling a coefficient statistically significant.
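A small sketch tying the z-statistic and p-value cards together, assuming the usual definitions: the z-statistic is the coefficient estimate divided by its standard error, and the two-sided p-value is the standard normal probability of a value at least that extreme. The estimate and standard error below are made up.

```python
import math

def z_statistic(estimate, standard_error):
    """Coefficient estimate divided by its standard error."""
    return estimate / standard_error

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical coefficient estimate and standard error.
z = z_statistic(estimate=0.8, standard_error=0.25)
p = two_sided_p_value(z)

print(z, p, p < 0.05)
```

A z-statistic of about 1.96 in absolute value corresponds to a two-sided p-value of about 0.05, which is why the two rules of thumb agree.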
Difference between logit and probit models
The logit model uses the cumulative distribution function of the logistic distribution; the probit model uses the cumulative distribution function of the standard normal distribution.
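The two link functions can be compared side by side. This is only a sketch with illustrative function names. Both map any real number to a value between 0 and 1 and both equal 0.5 at zero, but the logistic curve approaches 0 and 1 more slowly (it has heavier tails).

```python
import math

def logistic_cdf(z):
    """CDF of the standard logistic distribution (used by logit)."""
    return 1.0 / (1.0 + math.exp(-z))

def standard_normal_cdf(z):
    """CDF of the standard normal distribution (used by probit)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(z, logistic_cdf(z), standard_normal_cdf(z))
```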
Heteroskedasticity
The variance of the error terms is not constant across observations. A standard probit model assumes constant error variance; a heteroskedastic probit extension models the variance explicitly.
Rand Accuracy
The metric used to evaluate the test-sample performance of logit and probit models: the share of observations classified correctly. It does not make sense to use R^2, since logit and probit models are non-linear.
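A minimal sketch of the calculation, assuming predicted probabilities are turned into class labels with a 0.5 cutoff (the cutoff, the function names, and the toy data are all assumptions for illustration):

```python
def classify(probabilities, threshold=0.5):
    """Convert predicted probabilities to 0/1 labels."""
    return [1 if p >= threshold else 0 for p in probabilities]

def rand_accuracy(y_true, y_pred):
    """Fraction of observations whose predicted label matches the true one."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Made-up test-sample labels and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
probabilities = [0.9, 0.2, 0.4, 0.7, 0.6, 0.1, 0.8, 0.3]

y_pred = classify(probabilities)
accuracy = rand_accuracy(y_true, y_pred)
print(accuracy)  # 6 of 8 correct -> 0.75
```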