Wk 7 - Logistic Regression Flashcards

1
Q

What is the formula for linear regression? (x1, plus define components)

A

y’ = bx + c

predicted y = slope times x + constant (y intercept)
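
A minimal Python sketch of this formula with made-up numbers (the values of b, c and x below are illustrative assumptions, not from any dataset):

# y' = b*x + c with illustrative slope b = 2 and intercept c = 1
b, c = 2.0, 1.0
for x in [0, 1, 2, 3]:
    y_pred = b * x + c       # each unit increase in x adds b to the prediction
    print(x, y_pred)         # 1.0, 3.0, 5.0, 7.0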

2
Q

What does b signify in regression formulas? (x3)

A

Slope
Coefficient
Amount of change in y for every unit change in x

3
Q

How is the fit of a regression line maximised/evaluated? (x2)

A

Least squares criterion:

Want minimal residuals (diff between scores and line)
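
A short sketch of the least squares criterion on a toy dataset (all values assumed for illustration): the fitted line is the one for which the sum of squared residuals is as small as possible.

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # toy predictor scores
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # toy outcome scores

b, c = np.polyfit(x, y, 1)                # least squares slope and intercept
residuals = y - (b * x + c)               # differences between scores and the line
print(b, c, np.sum(residuals ** 2))       # no other line gives a smaller sum of squares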

4
Q

What 2 questions can we ask of any given regression model?

A

Q1: Does the predictor variable do anything useful?

Q2: Does the model provide a good fit to the data?

5
Q

What can we conclude if b = 0 in a regression model? (x1)

A

Changes in x produce no change/effect in y

6
Q

How do we assess model fit in linear regression? (x2)

A

Calculate r-square (proportion of variance accounted for)

And test for significance
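
A quick sketch showing both pieces on toy data (values assumed for illustration); scipy’s linregress returns r, which we square, plus a p-value testing the slope:

import numpy as np
from scipy.stats import linregress

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # toy predictor
y = np.array([1.2, 2.9, 3.1, 4.8, 5.2, 6.9])   # toy outcome

result = linregress(x, y)
r_square = result.rvalue ** 2                  # proportion of variance accounted for
print(r_square, result.pvalue)                 # p-value tests the slope against b = 0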

7
Q

How are b and r-square related in linear regression? (x3)

A

Generally linked to some degree,
But if all scores move a similar distance away from the line,
b stays the same while r-square drops dramatically

8
Q

What is the major limitation of linear regression method? (x2)

A

Can’t deal with categorical data

‘All or nothing’ scores, rather than continuous predictions/outcomes available

9
Q

How does what we are trying to predict change when using a categorical rather than a continuous DV/y variable? (x2)

A

Want to assess the change in PROBABILITY of y given b change in x
Rather than change in y scores

10
Q

What statistical method enables regression with 2 categorical outcomes? (x1)

A

Binary logistic regression
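
A hedged sketch of fitting a binary logistic regression in Python with statsmodels; the variables hours and passed are made-up toy data, not from the lecture:

import numpy as np
import statsmodels.api as sm

hours = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)   # toy continuous predictor
passed = np.array([0, 0, 0, 1, 0, 1, 1, 1])               # toy binary outcome (2 categories)

X = sm.add_constant(hours)                # add the intercept term
model = sm.Logit(passed, X).fit(disp=0)   # binary logistic regression
print(model.params)                       # b coefficients, on the log odds scale
print(np.exp(model.params))               # Exp(b): multiplicative change in odds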

11
Q
In linear regression:
Predictors are continuous or categorical
Outcome is continuous
Predictors assumed normally distributed
Deals with linear relationships among variables

Whereas in logistic regression? (x4)

A

Predictors are continuous or categorical
Outcome is categorical
Predictors not assumed normally distributed
Deals with non-linear relationships among variables

12
Q

What are the 2 applications/questions of logistic regression? (x1, x2)

A

Predict category people belong to, given predictors
Identify predictors of particular (categorical) outcome variable
*Outcomes are exhaustive and mutually exclusive

13
Q

How does the linear regression model change for logistic regression? (x3)

A

y’ becomes a logistic function (s-shaped curve):
y’ = 1 / (1 + e^v)
where v is the linear equation (bx + c)
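
A minimal sketch of the s-shaped curve using the form on this card, y’ = 1 / (1 + e^v) with v = bx + c; b and c are made-up values, and note that many texts write the exponent as -v, which simply mirrors the curve:

import numpy as np

b, c = 1.0, -4.0                      # illustrative coefficients for v = b*x + c
for x in range(0, 9):
    v = b * x + c
    y = 1.0 / (1.0 + np.exp(v))       # output always stays between 0 and 1
    print(x, round(float(y), 3))      # falls from ~0.98 to ~0.02 in an s-shaped curve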

14
Q

In logistic regression, if we substitute our largest x value for v… (x1)
And if v is very small… (x1)

A

y gets close to zero

y gets large (approaches its maximum of 1)

15
Q

What is the statistical question asked by logistic regression? (x2)

A

How many units of change in x does it take

To shift the odds towards favouring a particular category of y?

16
Q

What are odds? (x1)

A

Expression of relative probability of an event happening vs. not happening

17
Q

How are odds calculated? (x1)

What happens to probability if you double the odds of an event occurring? (x2)

A

odds = p(event) / (1 - p(event))

Increases, but with diminishing returns
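
A quick arithmetic sketch (starting probability assumed) of the diminishing returns: each doubling of the odds raises the probability, but by less each time.

p = 0.5
odds = p / (1 - p)                      # odds = p(event) / (1 - p(event)) -> 1.0
for _ in range(4):
    odds = odds * 2                     # double the odds
    p = odds / (1 + odds)               # convert back to a probability
    print(odds, round(p, 3))            # 2 -> 0.667, 4 -> 0.8, 8 -> 0.889, 16 -> 0.941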

18
Q

Interpreting odds:

If odds = 1… (x1)

A

Two outcomes are equally likely

19
Q

Interpreting odds:

If odds > 1… (x1)

A

Target outcome is more probable than the alternative

20
Q

Interpreting odds:

If odds < 1… (x1)

A

Target outcome is less probable than the alternative

21
Q

What is the convenient way to compare odds of 2 events? (x2)

And what does this tell us? (x1)

A

Take their ratio
ie, divide one by the other

How many times more likely an event is for one group than for another
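
A tiny worked sketch with made-up group probabilities: dividing one group’s odds by the other’s gives the odds ratio, ie how many times more likely the event is (in odds terms) for one group.

p_group_a, p_group_b = 0.8, 0.5        # illustrative event probabilities for two groups
odds_a = p_group_a / (1 - p_group_a)   # ~4.0
odds_b = p_group_b / (1 - p_group_b)   # 1.0
odds_ratio = odds_a / odds_b           # ~4.0: the event is 4 times as likely (in odds) for group A
print(odds_a, odds_b, odds_ratio)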

22
Q

Explain changes in odds related to the b (coefficient) in logistic regression? (x2)

A

In linear regression, b = change in the outcome value for a unit change in x

In logistic regression, b = change in the log odds brought about by a unit change in x

23
Q

Explain how to interpret changes in log odds in logistic regression? (x2)

A

Have to EXPONENTIATE the coefficient

*because exponentiation and log undo each other (just as multiplication and division do)

24
Q

Explain what Exp(b) represents in logistic regression? (x3)

A

Rather than being a multiplier of x (as b is in linear regression),
Exp(b) is the multiplier of, or proportionate change in, the old odds
ie, what we need to multiply the old odds by to get the new odds
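
A brief sketch with an assumed coefficient: exponentiating b gives the factor that turns the old odds into the new odds after a one-unit increase in x.

import numpy as np

b = 0.69                        # illustrative logistic coefficient (log odds scale)
exp_b = np.exp(b)               # ~2.0: the multiplier applied to the odds
old_odds = 1.5
new_odds = old_odds * exp_b     # odds after a one-unit increase in x (~3.0)
print(round(exp_b, 2), round(new_odds, 2))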

25
Q

What is the impact of exp(b) on successive odds calculations? (x1)

Which is handy, as… (x1)

A

For each successive unit change in x, the resulting change in probability shows diminishing returns (even though the odds change by a constant multiple)

This is what gives us the characteristic s-shape curve
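
A sketch using an assumed Exp(b) of 2: the odds keep being multiplied by the same factor, but the probabilities they imply flatten out towards 1, tracing the s-shape.

exp_b = 2.0                      # assumed odds multiplier per unit change in x
odds = 0.1                       # assumed starting odds
for step in range(6):
    p = odds / (1 + odds)        # probability implied by the current odds
    print(step, round(odds, 2), round(p, 3))
    odds = odds * exp_b          # constant multiplicative change in odds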

26
Q

Despite technical changes from direct predictions to changes in odds, linear and logistic regression remain conceptually similar in that… (x1)

A

Coefficients reflect predictive utility of our predictor variables

27
Q

What are the 2 key questions in evaluating a logistic regression model?

Which are answered with which 2 tests?

A

Does the predictor variable(s) do anything useful?
Does the model provide a good fit to the data?

Significance tests on the coefficients (Wald test against b = 0)
Evaluation of R2 via chi-square test against R2 = 0 (model explains zero variance)

28
Q

In what 2 ways can the null hypotheses for testing b (the significance of the coefficient) be expressed in logistic regression?

A

b coefficients = 0: No change in log odds with increases in predictor
Exp b = 1: No proportionate change in odds with increases in predictor

29
Q

What 2 specific tests are used to evaluate coefficients in logistic regression? (plus explain/interpretation, x3, x3)

A

Wald test:
Form of chi-square testing b (change in log odds)
*significance means reject the null - evidence for predictive utility

95% CIs:
Interval we are 95% sure contains true value of Exp b
*If includes the value 1, evidence of no change in odds for predictor change
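
A sketch of both checks from an assumed coefficient and standard error (not real output): the Wald statistic is (b / SE) squared, compared against chi-square with 1 df, and the 95% CI for Exp(b) comes from exponentiating b plus/minus 1.96 standard errors.

import numpy as np
from scipy.stats import chi2

b, se = 0.69, 0.25                  # assumed coefficient and its standard error
wald = (b / se) ** 2                # Wald statistic (chi-square form)
p_value = chi2.sf(wald, df=1)       # test against the null that b = 0
ci_low = np.exp(b - 1.96 * se)      # lower limit of the 95% CI for Exp(b)
ci_high = np.exp(b + 1.96 * se)     # upper limit
print(round(wald, 2), round(p_value, 4), round(ci_low, 2), round(ci_high, 2))
# if the CI for Exp(b) includes 1, there is no evidence of a change in odds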

30
Q

What 2 tests evaluate R-square (model fit) in logistic regression?

And we should… (x1)
Because…(x1)

A

Cox & Snell
Nagelkerke

Report both
As first is conservative and second is liberal
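
A sketch of where the two pseudo R-square values come from, using assumed log-likelihoods; Nagelkerke rescales Cox & Snell by its maximum possible value, which is why it is always the larger (more liberal) of the two.

import numpy as np

n = 100                                  # assumed sample size
ll_null = -65.0                          # assumed log-likelihood of the no-predictor model
ll_model = -50.0                         # assumed log-likelihood of the fitted model

cox_snell = 1 - np.exp((2 / n) * (ll_null - ll_model))
max_possible = 1 - np.exp((2 / n) * ll_null)     # ceiling Cox & Snell can reach
nagelkerke = cox_snell / max_possible            # rescaled so it can reach 1
print(round(cox_snell, 3), round(nagelkerke, 3))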

31
Q

How do we assess model accuracy (ie, mispredictions) in logistic regression? (x2)

A

Omnibus test of model coefficients

Hosmer and Lemeshow test

32
Q

What is involved in omnibus test of model coefficients in logistic regression? (x3)

A

Chi-square test of whether all predictors combined account for any variance

Test against the H0 that R2 = 0

Significant result implies the model does better than a model with no predictors at all

33
Q

What is involved in the Hosmer and Lemeshow test of R-square in logistic regression? (x3)

A

Chi-square test of how closely model predicts outcome categories

Test against the H0 that the model’s predictions match the observed data

Significant result implies discrepancies between model and data

34
Q

How do we assess the proportion of correct classifications by a logistic regression model? (x4)

A

Cases originally all placed in most common category
*% correctly classified reported
Then predicted classifications are compared to empirical data
*% correctly classified reported
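
A sketch of the two percentages with made-up outcomes and predictions: first the baseline accuracy from putting every case in the most common category, then the model’s classifications compared with the observed data.

import numpy as np

observed = np.array([1, 1, 1, 0, 1, 0, 1, 0, 1, 1])    # made-up outcome categories
predicted = np.array([1, 1, 0, 0, 1, 0, 1, 1, 1, 1])   # made-up model classifications

most_common = 1 if observed.mean() >= 0.5 else 0        # baseline: everyone in the modal category
baseline_correct = np.mean(observed == most_common) * 100
model_correct = np.mean(observed == predicted) * 100
print(baseline_correct, model_correct)                  # 70.0 vs 80.0 percent correctly classified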

35
Q

What 3 stages of info does SPSS output give us for logistic regression?

A

Preliminary info
Block 0 output
Block 1+ output

36
Q

What preliminary info should we double check in SPSS output for logistic regression?

A

How outcome categories have been coded

37
Q

What info is given in Block 0 output in SPSS for logistic regression? (x3)

A

All cases placed in most frequent category in data
Gives baseline model (no predictors) for comparison with more complex models
*Not theoretically interesting, but important

38
Q

What info is given in Block 1+ output in SPSS for logistic regression? (x3)

A

Summary and tests of R-square
Tests of variables in Logistic Regression equation (e.g., coefficients)
Info on classification accuracy

39
Q

What are the 3 methods of conducting logistic regression?

A

Direct or Enter method
Sequential logistic regression
Stepwise logistic regression

40
Q

Describe the Direct or Enter method of logistic regression (x3)

A

As in linear MR (multiple regression) - all predictors entered simultaneously (ie, in Block 1)
Used to evaluate relative strength of predictors
Doesn’t test hypotheses about order/importance of each predictor

41
Q

Describe the Sequential logistic regression method (x3)

A

As in HMR (hierarchical multiple regression) - researcher chooses the order of entering predictors in separate blocks
Determines predictive value of each variable in context of whole model
First predictor explains max possible variance, more added if they improve fit

42
Q

Describe the Stepwise logistic regression method (x2)

Which is done in what 2 ways?

A

Predictors entered sequentially, included in model on statistical grounds
Often more exploratory/for hypothesis generation

Forward method - start with no predictors; progressively weaker predictors are added
Backward method - start with the full model; progressively stronger predictors are removed

43
Q

How is logistic model fit evaluated during stages of Stepwise regression? (x3)

A

Does adding this predictor significantly improve model fit?
Does removing this predictor significantly harm model fit?
Evaluated via nested model comparisons
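
A sketch of one nested comparison using assumed log-likelihoods: the change in -2LL between the smaller and larger model is itself chi-square distributed, with df equal to the number of predictors added or removed.

from scipy.stats import chi2

ll_without = -60.0                           # assumed log-likelihood without the predictor
ll_with = -55.5                              # assumed log-likelihood with the predictor
lr_chi_square = 2 * (ll_with - ll_without)   # change in -2LL between the nested models
p_value = chi2.sf(lr_chi_square, df=1)       # df = number of predictors added (1 here)
print(round(lr_chi_square, 2), round(p_value, 4))
# significant -> adding the predictor improves fit / removing it would harm fit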