Module 11 Flashcards

1
Q

logistic regression model

A

predicts a categorical response variable with 2 levels

2
Q

simple logistic regression model equation

A

phat (probability) = 1 / (1 + e^-(intercept + slope × explanatory variable))
odds = phat / (1 - phat) = e^(intercept + slope × explanatory variable)
log odds = log(phat / (1 - phat)) = intercept + slope × explanatory variable
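A minimal numeric sketch of these three equivalent forms, using made-up intercept and slope values (the coefficients here are illustrative, not from any fitted model):

```python
import math

# Hypothetical fitted coefficients, chosen only for illustration
intercept, slope = -1.5, 0.8
x = 2.0  # value of the explanatory variable

log_odds = intercept + slope * x        # intercept + slope * x
odds = math.exp(log_odds)               # e^(intercept + slope * x)
p_hat = 1 / (1 + math.exp(-log_odds))   # predicted probability

# The forms agree: odds == p_hat / (1 - p_hat)
print(round(p_hat, 3))
```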

3
Q

intercept interpretation

A

“We predict the odds of INCLUDE ALL CHARACTERISTICS FROM VARIABLES being SUCCESSFUL are NUMBER”
Interpret e^intercept: the baseline odds

4
Q

Numerical Explanatory Variable Slope interpretation

A

“All else held equal, if we were to increase the VARIABLE by 1, then we would expect the odds of SUCCESS to increase by a multiplicative factor of NUMBER on average.”
Interpret e^slope: odds multiplier of the explanatory variable
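A quick check of this interpretation with hypothetical coefficients: increasing x by 1 multiplies the odds by exactly e^slope, no matter where x starts:

```python
import math

intercept, slope = -1.5, 0.8  # hypothetical coefficients

def odds(x):
    """Odds of success at a given value of the explanatory variable."""
    return math.exp(intercept + slope * x)

# Ratio of odds one unit apart equals e^slope, regardless of x
multiplier = odds(3.0) / odds(2.0)
print(round(multiplier, 4), round(math.exp(slope), 4))
```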

5
Q

Indicator variable slope interpretation

A

“All else held equal, we expect the odds that LEVEL (INTERACTION TERM INCLUDED) is SUCCESS to be a multiple of ODDS RATIO times higher than the odds that the BASELINE LEVEL is SUCCESS, on average.”

Need to calculate the log of the ratio of these 2 odds:
log(odds_yes / odds_no) = log(odds_yes) - log(odds_no)
Convert back (exponentiate) to get the odds ratio
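A sketch of this calculation with made-up odds for the two levels of the indicator (the values 2.5 and 1.25 are illustrative):

```python
import math

# Hypothetical odds of success at each level of the indicator variable
odds_yes, odds_no = 2.5, 1.25

# Log of the ratio of the two odds equals the difference of the log odds
log_odds_ratio = math.log(odds_yes / odds_no)
# equivalently: math.log(odds_yes) - math.log(odds_no)

# Convert back (exponentiate) to get the odds ratio
odds_ratio = math.exp(log_odds_ratio)
print(round(odds_ratio, 4))  # roughly 2.0: odds twice as high for "yes"
```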

6
Q

Pseudo R^2

A

R^2 = 1 - LLF_full/LLF_null
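A sketch of this formula (McFadden's pseudo R^2) with hypothetical log-likelihood values, assuming LLF_null is the log-likelihood of the intercept-only model:

```python
# Hypothetical log-likelihoods: fitted (full) model vs. intercept-only null
llf_full = -120.0
llf_null = -200.0

# Closer to 1 means the full model improves more over the null model
pseudo_r2 = 1 - llf_full / llf_null
print(pseudo_r2)  # 0.4
```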

7
Q

LLF_full

A

The highest possible log-likelihood value that the model with intercept and slope(s) can achieve on the training data
Ideally LF_full = 1, so LLF_full = 0, because LLF_full = ln(LF_full) and ln(1) = 0
The closer LLF_full is to 0, the better the fit on the training set

8
Q

Classifier model

A

A set of rules that decides which of a set of categories an observation belongs to, on the basis of a training set of data containing observations whose category membership is known

9
Q

Predictive positive

A

True positive: observation predicted to be positive is actually positive
False positive: observation predicted to be positive is actually negative

10
Q

Predictive negative

A

True negative: observation predicted to be negative is actually negative
False negative: observation predicted to be negative is actually positive

11
Q

Confusion matrix

A

Predicted categories across the top (columns) and actual categories down the side (rows)

12
Q

Sensitivity rate

A

(true positive rate): TP / (TP + FN)

13
Q

Specificity rate

A

(true negative rate): TN / (TN + FP)
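Both rates computed from hypothetical confusion-matrix counts (the counts below are made up for illustration):

```python
# Hypothetical confusion-matrix counts
tp, fn = 40, 10   # actual positives: correctly vs. incorrectly predicted
tn, fp = 45, 5    # actual negatives: correctly vs. incorrectly predicted

sensitivity = tp / (tp + fn)   # true positive rate
specificity = tn / (tn + fp)   # true negative rate
print(sensitivity, specificity)  # 0.8 0.9
```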

14
Q

assumptions for logistic regression

A

Response variable needs to be a categorical variable with 2 possible outcomes
The relationship between the log odds of success and the combination of Xs should be linear
The observations need to be independent (inference)
No multicollinearity between the x variables (slope interpretability)
The sample size is large enough to support the normal approximation
No strong outliers or influential points (inference)
