Logistic Regression Flashcards
General characteristics of logistic regression
The outcome (a predicted probability) must lie between 0 and 1.
The sigmoid (logistic) function is used to model the dependent variable.
Coefficients are estimated through maximum likelihood estimation (MLE).
The coefficients cannot be interpreted directly as changes in probability, because the model is linear on the log-odds scale, not the probability scale; exponentiating a coefficient gives an odds ratio.
logit = logarithm of the odds (the log-odds); each coefficient is the change in the log-odds per unit increase in its predictor.
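The points above can be sketched numerically. A minimal example, using hypothetical coefficient values (b0, b1 are made up for illustration, not from any fitted model):

```python
import math

def sigmoid(z):
    """Map the linear predictor z onto a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical fitted model: intercept b0, one coefficient b1.
b0, b1 = -1.5, 0.8

# Predicted probability for x = 2.
p = sigmoid(b0 + b1 * 2)

# The logit (log-odds) recovers the linear predictor b0 + b1*x.
log_odds = math.log(p / (1 - p))

# Increasing x by 1 changes the log-odds by exactly b1 ...
p_next = sigmoid(b0 + b1 * 3)

# ... and exp(b1) is the odds ratio per unit increase in x.
odds_ratio = math.exp(b1)
```

This shows why the raw coefficients are hard to read as probabilities: b1 acts additively on the log-odds, and only exp(b1) has a direct multiplicative interpretation on the odds.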
Odds Ratio
Divide the odds of the event in the first group by the odds of the event in the second group.
Example: the odds of being interested in the product if the client is active, divided by the odds of being interested in the product if the client is NOT active.
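A quick sketch of that computation, using a made-up 2x2 table of counts (the numbers are purely illustrative):

```python
# Hypothetical counts: interest in the product by client activity.
#                interested   not interested
# active             40             60
# not active         15             85

odds_active   = 40 / 60   # odds of interest among active clients
odds_inactive = 15 / 85   # odds of interest among inactive clients

# Odds ratio: how many times larger the odds are for active clients.
odds_ratio = odds_active / odds_inactive
```

An odds ratio above 1 means the event is more likely (in odds terms) in the first group; exactly 1 means no association.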
Regularisation
Overfitting is likely when the data is high-dimensional and sparse.
Regularisation penalises large coefficient estimates: if the weights are large, a small change in a feature can lead to a large change in the prediction, so we maximise a penalised log-likelihood instead of the plain log-likelihood.
L1 regularisation
- Lasso. Can also be used for feature selection (automatically selecting the important predictors in the model) because it forces some coefficients to exactly zero.
L2 regularisation
- Ridge. More “soft”: it shrinks coefficients towards zero but does not set them to zero, so no feature selection. If you think all predictors are important, use this one.
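The two penalties can be sketched directly on the negative log-likelihood. A minimal version, assuming a design matrix X, binary labels y, a weight vector w, and a penalty strength lam (all toy values below are made up):

```python
import numpy as np

def penalised_nll(w, X, y, lam, penalty="l2"):
    """Negative log-likelihood plus an L1 (lasso) or L2 (ridge)
    penalty on the weights; larger lam penalises large weights more."""
    p = 1 / (1 + np.exp(-(X @ w)))                    # predicted probabilities
    nll = -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
    if penalty == "l1":
        return nll + lam * np.sum(np.abs(w))          # lasso: sum of |w_j|
    return nll + lam * np.sum(w ** 2)                 # ridge: sum of w_j^2

# Toy data just to evaluate the function.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
w = np.array([2.0, -1.0, 0.5])
```

The L1 term's kink at zero is what pushes some coefficients exactly to zero during optimisation, while the smooth L2 term only shrinks them.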
Setting a threshold
Quantify the costs of the misclassifications (false positives and false negatives); this lets you allocate a misclassification budget for the model and pick the threshold using the ROC curve.
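One way to sketch this: sweep candidate thresholds and keep the one with the lowest total misclassification cost. The cost ratio and the toy probabilities/labels below are hypothetical:

```python
# Hypothetical costs: a false negative is 5x as costly as a false positive.
COST_FP, COST_FN = 1.0, 5.0

# Toy predicted probabilities and true labels from some fitted model.
probs  = [0.1, 0.3, 0.45, 0.6, 0.8, 0.95]
labels = [0,   0,   1,    0,   1,   1]

def total_cost(threshold):
    """Total cost of the classifications made at a given threshold."""
    cost = 0.0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            cost += COST_FP          # false positive
        elif pred == 0 and y == 1:
            cost += COST_FN          # false negative
    return cost

# Sweep candidate thresholds and keep the cheapest one.
best_cost, best_threshold = min(
    (total_cost(t), t) for t in [i / 20 for i in range(1, 20)]
)
```

Because false negatives are costed higher here, the cost-minimising threshold lands below the default 0.5, trading extra false positives for fewer missed positives.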
Techniques for dealing with unbalanced data
Oversampling the minority class, e.g. SMOTE (synthetic minority oversampling).
Undersampling the majority class.
Weighted logistic regression (giving the minority class a larger weight in the loss).
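The weighted option can be sketched from scratch: scale each sample's contribution to the log-likelihood gradient by its class weight, so errors on the minority class count more. A minimal gradient-ascent version (the 9:1 imbalance and weights are illustrative assumptions, and production code would use a library implementation instead):

```python
import numpy as np

def fit_weighted_logreg(X, y, class_weight, lr=0.1, steps=2000):
    """Gradient ascent on the weighted log-likelihood: each sample's
    gradient contribution is scaled by the weight of its class."""
    w = np.zeros(X.shape[1])
    sw = np.where(y == 1, class_weight[1], class_weight[0])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(X @ w)))
        w += lr * (X.T @ (sw * (y - p))) / len(y)
    return w

# Toy imbalanced data: 10 positives per 90 negatives, one noise feature,
# with an intercept column prepended. Weight positives 9x to rebalance.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(100), rng.normal(size=100)])
y = np.concatenate([np.ones(10), np.zeros(90)])
w_weighted = fit_weighted_logreg(X, y, class_weight={0: 1.0, 1: 9.0})
```

Upweighting the positive class pulls the fitted intercept up (towards a balanced baseline), so the model predicts positives more readily than an unweighted fit would.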