Logistic Regression Flashcards

1
Q

Limits of Linear Regression

A
  • Assumes the response variable is normally distributed (given the predictors), so it is a poor fit for binary or other non-normal outcomes
2
Q

In logistic regression, the coefficients are estimated

using a technique called _______________

A

Maximum Likelihood Estimation (MLE)

3
Q

Unlike the _________________ method
used by linear regression, MLE for logistic regression has
no closed-form solution for the coefficients. Instead, the estimation process is _________.

A

Ordinary Least Squares (OLS)

Iterative
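
To make "iterative" concrete, here is a minimal sketch that fits a one-predictor logistic regression by gradient ascent on the log-likelihood. Gradient ascent is only one simple choice of iterative scheme; statistical packages commonly use Newton-type methods instead. The data, the function names, the learning rate `lr`, and the iteration count `n_iter` are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Inverse of the logit link: maps a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(xs, ys, b0, b1):
    """Bernoulli log-likelihood for intercept b0 and slope b1."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Repeatedly step the coefficients uphill on the average log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        # Residuals (observed label minus predicted probability)
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        # Gradient of the average log-likelihood w.r.t. b0 and b1
        g0 = sum(residuals) / n
        g1 = sum(r * x for r, x in zip(residuals, xs)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Hypothetical toy data (not from the flashcards): x is a size, y is a 0/1 label.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

There is no formula that returns `b0` and `b1` in one step, which is why the loop keeps refining an initial guess of zero.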

4
Q

The ___________________________ is an extension of
linear regression that allows linear predictors to be
related to a response variable that is not normally
distributed by using a transformation, or link function.

A

Generalized Linear Model (GLM)
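
As a rough illustration of the idea, the sketch below computes the same linear predictor and then passes it through the inverse of a chosen link function to get the mean response. The `LINKS` table and the `predict_mean` helper are hypothetical names used only for this example, and the coefficients are made up.

```python
import math

# Each entry maps a link name to (link, inverse link).
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),                  # ordinary linear regression
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),           # logistic regression
    "log": (math.log, math.exp),                                   # e.g. count responses
}

def predict_mean(link_name, coefs, features):
    """Linear predictor eta = b0 + sum(b_i * x_i), then apply the inverse link."""
    eta = coefs[0] + sum(b * x for b, x in zip(coefs[1:], features))
    _, inverse_link = LINKS[link_name]
    return inverse_link(eta)

# Same (made-up) coefficients and features, three different links.
coefs, features = [1.0, 2.0], [3.0]
mean_identity = predict_mean("identity", coefs, features)
mean_logit = predict_mean("logit", coefs, features)
mean_log = predict_mean("log", coefs, features)
```

The linear part never changes; only the link decides what range the predicted mean can take.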

5
Q

The link function used for binomial logistic regression is called the _________

A

Logit / Log-odds function

log ( p / ( 1 - p) ) where p is a probability

Maps probabilities (0, 1) to (-inf, +inf)
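
A small sketch of the logit and its inverse, written with only the standard library. The function names `logit` and `inv_logit` are chosen for this example.

```python
import math

def logit(p):
    """Log-odds of a probability p strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse of the logit: maps any real number back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))
```

For example, `logit(0.5)` is 0 because the odds are even, probabilities above 0.5 give positive log-odds, and probabilities below 0.5 give negative log-odds.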

6
Q

In logistic regression:

For every unit increase in
tumor size, the odds of it
being malignant change
by a factor of ___

A

e^Beta, where Beta is the coefficient for tumor size
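
A quick numerical check of this answer, assuming a model with intercept `b0` and tumor-size coefficient `b1`. The coefficient values below are made up for illustration.

```python
import math

def odds(b0, b1, x):
    """Odds p / (1 - p) implied by the logistic model at predictor value x."""
    p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
    return p / (1.0 - p)

def odds_multiplier(b0, b1, x):
    """Factor by which the odds change when x increases by one unit."""
    return odds(b0, b1, x + 1) / odds(b0, b1, x)

# Hypothetical coefficients, not estimated from any real data.
b0, b1 = -3.0, 0.7
multiplier_at_2 = odds_multiplier(b0, b1, 2.0)
multiplier_at_5 = odds_multiplier(b0, b1, 5.0)
```

Both multipliers match `math.exp(b1)` up to floating-point error, whatever the starting tumor size. A negative `b1` gives a multiplier below 1, so the odds shrink as size grows.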

7
Q
If Beta < 0, the odds
that the tumor is
malignant _______ as
tumor size increases.
If Beta > 0, the odds
that the tumor is
malignant ________ as
tumor size increases.

A

decrease

increase

8
Q

An estimate of the relative information lost by a given
model: the less information a model loses, the higher
the quality of the model

A

Akaike Information Criterion (AIC)

AIC = 2k - 2 ln(L)
where L is the maximum value of the likelihood function for the model
and k is the number of estimated parameters of the model

Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value. Thus, AIC rewards goodness of fit (as assessed by the likelihood function), but it also includes a penalty that is an increasing function of the number of estimated parameters. The penalty discourages overfitting, which is desired because increasing the number of parameters in the model almost always improves the goodness of the fit.
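
The formula is simple enough to compute by hand. The sketch below compares two candidate models whose log-likelihoods and parameter counts are invented purely for illustration.

```python
def aic(log_likelihood, k):
    """AIC = 2k - 2 ln(L), where log_likelihood is the maximized ln(L)."""
    return 2 * k - 2 * log_likelihood

# Hypothetical candidates: model B fits slightly better but uses twice the parameters.
aic_a = aic(log_likelihood=-120.0, k=3)
aic_b = aic(log_likelihood=-119.5, k=6)
preferred = "A" if aic_a < aic_b else "B"
```

Here model B's small gain in likelihood does not offset the penalty for its three extra parameters, so model A has the lower AIC and is preferred.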

9
Q

Strengths of Logistic Regression?

A
- Outputs have a nice probabilistic interpretation.
- Can be regularized to avoid overfitting.
- Easy to implement and use.
- Very efficient to train.
10
Q

Weaknesses of Logistic Regression?

A
- Makes strong assumptions about the data (for example, that the log-odds are linear in the predictors).
- Does not do well with missing data.
- Tends to underperform when there are multiple or non-linear decision boundaries.
- Does not naturally capture complex relationships.