Fitting a Logistic Regression Model Flashcards

1
Q

What method does PROC LOGISTIC use for estimating the unknown parameters in a logistic regression model?

A

PROC LOGISTIC uses the maximum likelihood method for estimating the unknown parameters in a logistic regression model.
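As a rough illustration of the maximum likelihood idea (not PROC LOGISTIC's actual fitting algorithm), the sketch below evaluates the Bernoulli log-likelihood of an intercept-only model on made-up data. In this special case the maximizing intercept is the logit of the sample event proportion. The function name `log_likelihood` and the data are assumptions for illustration only.

```python
import math

def log_likelihood(intercept, outcomes):
    """Bernoulli log-likelihood of an intercept-only logistic model."""
    p = 1.0 / (1.0 + math.exp(-intercept))
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in outcomes)

# Made-up data: 3 events out of 10 observations.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p_hat = sum(outcomes) / len(outcomes)

# For an intercept-only model, the maximum likelihood estimate is logit(p_hat).
mle_intercept = math.log(p_hat / (1.0 - p_hat))

# The log-likelihood is lower on either side of the estimate.
for candidate in (mle_intercept - 0.5, mle_intercept, mle_intercept + 0.5):
    print(round(candidate, 3), round(log_likelihood(candidate, outcomes), 4))
```

With input variables in the model there is generally no closed-form solution, so the estimates are found iteratively.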

2
Q

TRUE or FALSE: Higher percentages of concordant pairs and lower percentages of discordant and tied pairs indicate a more desirable logistic regression model.

A

TRUE

3
Q

PROC LOGISTIC is appropriate for what type of response variable?

A

A binary, ordinal, or nominal response variable

4
Q

What does the maxpoints=none option do when using PROC LOGISTIC?

A

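The MAXPOINTS=NONE option removes the cutoff on the number of observations (5,000 by default) beyond which PROC LOGISTIC suppresses certain diagnostic plots and stops displaying observations in the effect plots, so the plots are produced regardless of the size of the data set.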

5
Q

What PROC LOGISTIC option displays confidence intervals on the plots?

A

CLBAND (a suboption of the EFFECT plot request)

6
Q

What do goodness of fit measures allow you to do?

A

Compare one model to another. The smaller the values of the goodness-of-fit statistics, the better the model fits the data.
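Assuming the statistics meant here are the ones reported in PROC LOGISTIC's Model Fit Statistics table (AIC, SC, and -2 Log L), the sketch below shows how AIC and SC add a penalty for the number of parameters to -2 Log L. The -2 Log L values, parameter counts, and sample size are hypothetical.

```python
import math

def aic(neg2_log_l, n_params):
    """Akaike information criterion: -2 Log L + 2k."""
    return neg2_log_l + 2 * n_params

def sc(neg2_log_l, n_params, n_obs):
    """Schwarz criterion: -2 Log L + k * ln(n)."""
    return neg2_log_l + n_params * math.log(n_obs)

# Hypothetical results for two candidate models fit to the same 1,000 rows.
small = {"neg2_log_l": 1210.0, "n_params": 3}
large = {"neg2_log_l": 1204.0, "n_params": 8}
n_obs = 1000

for name, m in (("small", small), ("large", large)):
    print(name, aic(**m), round(sc(**m, n_obs=n_obs), 2))
```

In this made-up case the larger model has the lower -2 Log L, but the smaller model wins on both AIC and SC once the penalty is applied.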

7
Q

Which PROC LOGISTIC table shows which input variables are statistically significant, controlling for all of the other input variables in the model?

A

The Type 3 Analysis of Effects table

8
Q

What can be used to give an approximate ranking of the relative importance of the input variables on the fitted logistic regression model?

A

The absolute value of the standardized estimates

9
Q

What does the estimate in the Analysis of Maximum Likelihood Estimates table measure?

A

The rate of change in the logit for a one-unit change in the input variable

10
Q

The Association of Predicted Probabilities and Observed Responses table provides what kind of information?

A

Several measures that assess the predictive ability of the model

11
Q

How should you interpret the four rank correlation indexes: Somers’ D, Gamma, Tau-a, and the c statistic?

A

A model with higher values for these indexes (the maximum value is 1) has better predictive ability than a model with lower values.
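To make the pair counting concrete, here is a minimal sketch that pairs every event with every non-event and derives the four indexes from the concordant, discordant, and tied counts. The function name and data are made up, and the sketch compares raw predicted probabilities, so its numbers may differ slightly from software that bins the probabilities first.

```python
from itertools import product

def association_stats(probs, outcomes):
    """Pair percentages and rank-correlation indexes for 0/1 outcomes."""
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = discordant = tied = 0
    for p_event, p_nonevent in product(events, nonevents):
        if p_event > p_nonevent:
            concordant += 1      # event has the higher predicted probability
        elif p_event < p_nonevent:
            discordant += 1      # non-event has the higher predicted probability
        else:
            tied += 1
    pairs = len(events) * len(nonevents)
    n = len(probs)
    diff = concordant - discordant
    return {
        "pct_concordant": 100 * concordant / pairs,
        "pct_discordant": 100 * discordant / pairs,
        "pct_tied": 100 * tied / pairs,
        "c": (concordant + 0.5 * tied) / pairs,
        "somers_d": diff / pairs,
        "gamma": diff / (concordant + discordant),
        "tau_a": diff / (0.5 * n * (n - 1)),
    }

# Made-up predicted probabilities and observed responses.
probs = [0.9, 0.8, 0.4, 0.6, 0.4, 0.1]
outcomes = [1, 1, 1, 0, 0, 0]
stats = association_stats(probs, outcomes)
print(stats)
```

Here 7 of the 9 pairs are concordant, 1 is discordant, and 1 is tied, which gives c = 7.5/9 and Gamma = 0.75.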

12
Q

What is posterior probability?

A

The probability that y equals 1 given the inputs. The term posterior means that the probability is calculated after you provide the input information to the model.
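A minimal sketch of how a fitted model turns inputs into a posterior probability: compute the logit as a linear combination of the inputs, then apply the inverse of the logit transformation. The intercept, coefficients, and input values are hypothetical.

```python
import math

def posterior_probability(intercept, coefficients, inputs):
    """P(y = 1 | inputs) for a logistic regression model."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical fitted model with two inputs.
p = posterior_probability(-1.0, [0.5, 0.25], [2.0, 0.0])
print(p)  # the logit is -1 + 0.5*2 + 0.25*0 = 0, so the probability is 0.5
```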

13
Q

Logistic regression is a special form of what kind of model?

A

A generalized linear model

14
Q

What is an odds ratio?

A

The odds ratio is the ratio of two odds.

The odds ratio compares the odds of the event in one group to the odds of the event in another group.
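For example, with made-up event probabilities of 0.8 in one group and 0.5 in another, the odds are 4 and 1, so the odds ratio is about 4. The helper names below are illustrative.

```python
def odds(p):
    """Odds of an event that has probability p."""
    return p / (1.0 - p)

def odds_ratio(p_group_a, p_group_b):
    """Odds of the event in group A relative to group B."""
    return odds(p_group_a) / odds(p_group_b)

ratio = odds_ratio(0.8, 0.5)
print(round(ratio, 6))
```

An odds ratio of 1 means the odds are the same in both groups; values above 1 mean the event is more likely, in odds terms, in the first group.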

15
Q

What does the logit transformation do?

A

The logit transformation takes the log of the odds.
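A short sketch of the transformation, with an illustrative function name: probabilities between 0 and 1 are mapped onto the whole real line, with 0.5 mapping to 0 and complementary probabilities mapping to values of opposite sign.

```python
import math

def logit(p):
    """Log of the odds for a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

for p in (0.1, 0.5, 0.9):
    print(p, round(logit(p), 4))
```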

16
Q

What does the parameter estimate measure?

A

The parameter estimates measure the rate of change in the logit for a one-unit change in the corresponding input variable.

17
Q

What PROC LOGISTIC statement enables you to obtain an odds ratio estimate for a specified change in a predictor variable?

A

The UNITS statement
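The arithmetic behind such an estimate can be sketched as follows: because the parameter estimate is the change in the logit per unit, the odds ratio for a change of c units is exp(c × estimate). The coefficient and the 10-unit change below are hypothetical, and the function is an illustration rather than anything PROC LOGISTIC exposes.

```python
import math

def odds_ratio_for_change(beta, units):
    """Odds ratio associated with a change of `units` in the predictor."""
    return math.exp(beta * units)

beta_age = 0.05          # hypothetical estimate: change in logit per year of age
per_year = odds_ratio_for_change(beta_age, 1)
per_decade = odds_ratio_for_change(beta_age, 10)
print(round(per_year, 4), round(per_decade, 4))
```

The 10-unit odds ratio equals the 1-unit odds ratio raised to the 10th power, not 10 times the 1-unit odds ratio.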

19
Q

How does the offset correct for oversampling?

A

The intercept is reduced by the offset value
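As a sketch of the arithmetic, assume the usual offset for separate sampling, ln(ρ1·π0 / (ρ0·π1)), where π1 and π0 are the population proportions of events and non-events and ρ1 and ρ0 are the corresponding proportions in the oversampled data. The proportions and function names below are hypothetical.

```python
import math

def oversampling_offset(pop_event_rate, sample_event_rate):
    """ln(rho1 * pi0 / (rho0 * pi1)) for population rate pi1 and sample rate rho1."""
    pi1, rho1 = pop_event_rate, sample_event_rate
    return math.log((rho1 * (1.0 - pi1)) / ((1.0 - rho1) * pi1))

def corrected_intercept(sample_intercept, offset):
    """Intercept adjusted back to the population scale."""
    return sample_intercept - offset

# Hypothetical case: events are 2% of the population but 50% of the sample.
offset = oversampling_offset(0.02, 0.50)

# An intercept-only model fit to the balanced sample has intercept logit(0.5) = 0.
adjusted = corrected_intercept(0.0, offset)
population_probability = 1.0 / (1.0 + math.exp(-adjusted))
print(round(offset, 4), round(population_probability, 4))
```

In this intercept-only case the corrected model recovers the 2% population event rate; the slope estimates are not changed by the correction.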

20
Q

What is the process of detecting nonlinear relationships between the target variable and the input variables referred to as?

A

Variable screening

21
Q

What do we hope to accomplish by variable screening?

A

We hope that variable screening will further reduce the number of inputs by identifying those that are clearly irrelevant.

22
Q

What statistics should you request on a PROC CORR statement as part of variable screening?

A

The Spearman correlation statistic (SPEARMAN option) and Hoeffding’s D statistic (HOEFFDING option)

23
Q

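A high Spearman rank and a low Hoeffding’s D rank suggest what type of relationship with the target?

A

A non-monotonic (nonlinear) relationship. The ranks run from 1 for the strongest association, so a high Spearman rank means a weak monotonic association, while a low Hoeffding’s D rank means a strong general association. When the Spearman association is strong (a low rank), the relationship is monotonic, regardless of whether the Hoeffding value is high or low.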

24
Q

A low Spearman rank and a high Hoeffding’s D rank suggest what type of relationship with the target?

A

A monotonic relationship, which may be linear or curvilinear

25
Q

What is a monotonic relationship?

A

A relationship in which one variable consistently increases or consistently decreases with respect to the other variable

26
Q

TRUE or FALSE: A monotonic relationship can be linear or curvilinear.

A

TRUE

27
Q

TRUE or FALSE: A linear model can adequately model a monotonic relationship, whether it is linear or curvilinear.

A

TRUE

28
Q

Why is the Spearman correlation statistic used in place of the Pearson correlation statistic during variable screening?

A

Spearman is a better choice for this technique because it is less sensitive to nonlinearities and outliers.
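The difference can be seen on a small made-up example in which y rises exponentially with x: the relationship is perfectly monotonic, so the Spearman correlation (Pearson correlation of the ranks) is 1, while the Pearson correlation falls well below 1. The sketch is self-contained and is not how PROC CORR is implemented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [math.exp(v) for v in x]      # monotonic but strongly curved
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```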

29
Q

What type of input variables can be used in variable screening?

A

The variables must be ordinal or you must be able to treat them as ordinal (for example, continuous or binary variables).

30
Q

A high Spearman rank and a high Hoeffding’s D rank (where rank 1 is the strongest association) suggest what type of relationship with the target?

A

A weak association, indicating an irrelevant input