Wk 7 - Logistic Regression Flashcards
What is the formula for linear regression? (x1, plus define components)
y’ = bx + c
predicted y = slope times x + constant (y intercept)
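A minimal sketch of that prediction formula in Python (the slope b and intercept c values are made up for illustration):

```python
# y' = b*x + c: predicted score = slope * predictor + intercept
b, c = 0.5, 2.0                               # hypothetical slope and intercept
for x in [0, 1, 2, 3]:
    y_pred = b * x + c
    print(f"x = {x}, predicted y = {y_pred}")  # each unit of x adds b to y'
```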
What does b signify in regression formulas? (x3)
Slope
Coefficient
Amount of change in y for every unit change in x
How is the fit of a regression line maximised/evaluated? (x2)
Least squares criterion:
Want minimal residuals (diff between scores and line)
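A sketch of the least squares idea using numpy (the data values are invented; np.polyfit returns the b and c that minimise the sum of squared residuals):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])   # made-up scores

b, c = np.polyfit(x, y, deg=1)            # least squares line y' = b*x + c
residuals = y - (b * x + c)               # differences between scores and the line
print(b, c, np.sum(residuals ** 2))       # the fitted line minimises this sum
```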
What 2 questions can we ask of any given regression model?
Q1: Does the predictor variable do anything useful?
Q2: Does the model provide a good fit to the data?
What can we conclude if b = 0 in a regression model? (x1)
Changes in x produce no change/effect in y
How do we assess model fit in linear regression? (x2)
Calculate r-square (proportion of variance accounted for)
And test for significance
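A sketch of the r-square calculation from the residuals, assuming the same kind of fitted line as in the sketch above (data values are invented):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
b, c = np.polyfit(x, y, 1)

ss_res = np.sum((y - (b * x + c)) ** 2)   # residual (unexplained) variation
ss_tot = np.sum((y - y.mean()) ** 2)      # total variation in y
r_square = 1 - ss_res / ss_tot            # proportion of variance accounted for
print(r_square)
# significance is then tested (e.g. scipy.stats.linregress reports a p-value for the slope)
```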
How are b and r-square related in linear regression? (x3)
Generally linked to some degree,
But if you move all scores similar distances from line,
b stays the same while r-square drops sharply
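A rough demonstration of that point with made-up data: spreading the scores symmetrically away from the line leaves b roughly unchanged while r-square falls.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y_tight = 0.5 * x + 2 + rng.normal(0, 0.2, x.size)   # scores close to the line
y_loose = 0.5 * x + 2 + rng.normal(0, 3.0, x.size)   # scores pushed far from the line

for y in (y_tight, y_loose):
    b, c = np.polyfit(x, y, 1)
    r2 = 1 - np.sum((y - (b * x + c)) ** 2) / np.sum((y - y.mean()) ** 2)
    print(f"b = {b:.2f}, r-square = {r2:.2f}")        # b similar, r-square much lower
```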
What is the major limitation of linear regression method? (x2)
Can’t deal with a categorical outcome (DV)
‘All or nothing’ scores, rather than continuous predictions/outcomes available
How does what we are trying to predict change when using categorical rather than continuous DV/y variable? (x2)
Want to assess the change in PROBABILITY of y given b change in x
Rather than change in y scores
What statistical method enables regression with 2 categorical outcomes? (x1)
Binary logistic regression
In linear regression:
Predictors are continuous or categorical
Outcome is continuous
Predictors assumed normally distributed
Deals with linear relationships among variables
Whereas in logistic regression? (x4)
Predictors are continuous or categorical
Outcome is categorical
Predictors not assumed normally distributed
Deals with non-linear relationships among variables
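A minimal sketch of fitting a binary logistic regression in Python, assuming the statsmodels package and an invented dataset (one continuous predictor, one 0/1 outcome):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=200)                    # continuous predictor
p = 1 / (1 + np.exp(-(0.8 * x - 0.2)))      # true probabilities (for simulation only)
y = rng.binomial(1, p)                      # categorical (0/1) outcome

X = sm.add_constant(x)                      # adds the intercept column
model = sm.Logit(y, X).fit()
print(model.summary())                      # coefficients, Wald tests, CIs, pseudo R-square
```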
What are the 2 applications/questions of logistic regression? (x1, x2)
Predict category people belong to, given predictors
Identify predictors of particular (categorical) outcome variable
*Outcomes are exhaustive and mutually exclusive
How does the linear regression model change for logistic regression? (x3)
y’ becomes a logistic function (s-shaped curve):
y’ = 1 / (1 + e^(-v))
where v is the linear equation (bx + c)
In logistic regression, if we substitute our largest x value for v… (x1)
And if v is very small… (x1)
y gets close to 1
y gets close to 0
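A small sketch of the logistic function itself (pure numpy, values made up) showing its behaviour at the extremes of v:

```python
import numpy as np

def logistic(v):
    return 1 / (1 + np.exp(-v))       # s-shaped curve, bounded between 0 and 1

for v in [-10, -2, 0, 2, 10]:
    print(v, round(logistic(v), 4))   # very negative v -> near 0, large v -> near 1, v = 0 -> 0.5
```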
What is the statistical question asked by logistic regression? (x2)
How many units change in x does it take
To shift the odds towards favouring a particular category of y?
What are odds? (x1)
Expression of relative probability of an event happening vs. not happening
How are odds calculated? (x1)
What happens to probability if you double the odds of an event occurring? (x2)
odds = p(event) / (1 - p(event))
Increases, but with diminishing returns
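A sketch of the odds calculation and the ‘diminishing returns’ point, using made-up probabilities:

```python
p = 0.5
odds = p / (1 - p)                        # odds = p(event) / (1 - p(event)) = 1.0

for _ in range(4):
    odds *= 2                             # double the odds each time
    p_new = odds / (1 + odds)             # convert the odds back to a probability
    print(f"odds = {odds:>4.1f}, p = {p_new:.3f}")
# p climbs 0.667 -> 0.800 -> 0.889 -> 0.941: each doubling adds less and less
```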
Interpreting odds:
If odds = 1… (x1)
Two outcomes are equally likely
Interpreting odds:
If odds > 1… (x1)
Target outcome is more probable than the alternative
Interpreting odds:
If odds < 1… (x1)
Target outcome is less probable than the alternative
What is the convenient way to compare odds of 2 events? (x2)
And what does this tell us? (x1)
Take their ratio
ie, divide one by the other
How many times greater the odds of the event are for one group compared to another
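A sketch of comparing two groups’ odds by taking their ratio (the probabilities are invented):

```python
p_group_a = 0.60                       # probability of the event in group A
p_group_b = 0.25                       # probability of the event in group B

odds_a = p_group_a / (1 - p_group_a)   # 1.5
odds_b = p_group_b / (1 - p_group_b)   # 0.333...

odds_ratio = odds_a / odds_b           # 4.5: the odds are 4.5 times greater in group A
print(odds_ratio)
```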
Explain changes in odds related to the b (coefficient) in logistic regression? (x2)
In linear, b = change in outcome value
In logistic, b = change in log odds brought by unit change in x
Explain how to interpret changes in log odds in logistic regression? (x2)
Have to EXPONENTIATE the coefficient
*as exp/log undo each other (in the same way multiply/divide do)
Explain what Exp(b) represents in logistic regression? (x3)
Rather than a multiplier of x (as in linear regression)
Exp(b) is the multiplier/proportion of change on the old odds
ie, what we need to multiply old odds by to get new odds
What is the impact of exp(b) on successive odds calculations? (x1)
Which is handy, as… (x1)
For each unit change in x, the odds are multiplied by the same amount, but the resulting change in probability shows diminishing returns
This is what gives us the characteristic s-shaped curve
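A sketch of Exp(b) as a multiplier on the old odds, and of why repeated multiplication traces out the s-shaped probability curve (b and the starting odds are made up):

```python
import numpy as np

b = 0.7                                   # hypothetical logistic coefficient
exp_b = np.exp(b)                         # multiplier applied to the odds per unit of x

odds = 0.05                               # odds at some starting value of x
for step in range(10):
    p = odds / (1 + odds)                 # probability implied by the current odds
    print(f"x + {step}: odds = {odds:7.3f}, p = {p:.3f}")
    odds *= exp_b                         # new odds = old odds * Exp(b)
# the printed p values trace out the characteristic s-shaped curve
```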
Despite technical changes from direct predictions to changes in odds, linear and logistic regression remain conceptually similar in that… (x1)
Coefficients reflect predictive utility of our predictor variables
What are the 2 key questions in evaluating a logistic regression model?
Which are answered with which 2 tests?
Does the predictor variable(s) do anything useful?
Does the model provide a good fit to the data?
Significance tests on the coefficients (Wald test against b = 0)
Evaluation of R2 via chi-square test against R2 = 0 (model explains zero variance)
In what 2 ways can the null hypotheses for testing b (the significance of the coefficient) be expressed in logistic regression?
b coefficients = 0: No change in log odds with increases in predictor
Exp b = 1: No proportionate change in odds with increases in predictor
What 2 specific tests are used to evaluate coefficients in logistic regression? (plus explain/interpretation, x3, x3)
Wald test:
Form of chi-square testing b (change in log odds)
*significance means reject the null - evidence for predictive utility
95% CIs:
Interval we are 95% sure contains true value of Exp b
*If includes the value 1, evidence of no change in odds for predictor change
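A sketch of pulling the Wald tests and the exponentiated 95% CIs out of a fitted statsmodels model (data simulated, as in the earlier sketch):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * x - 0.2))))

result = sm.Logit(y, sm.add_constant(x)).fit(disp=0)

print(result.pvalues)              # Wald tests of each b against 0
print(np.exp(result.params))       # Exp(b) for each predictor
print(np.exp(result.conf_int()))   # 95% CIs for Exp(b): evidence of an effect if 1 is excluded
```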
What 2 tests evaluate R-square (model fit) in logistic regression?
And we should… (x1)
Because…(x1)
Cox & Snell
Nagelkerke
Report both
As first is conservative and second is liberal
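SPSS reports both directly; as a sketch of where they come from, here is the calculation from the null-model and full-model log-likelihoods (the LL values below are invented):

```python
import numpy as np

n = 200            # sample size
ll_null = -135.0   # log-likelihood of the baseline (no-predictor) model - invented value
ll_model = -110.0  # log-likelihood of the fitted model - invented value

cox_snell = 1 - np.exp(2 * (ll_null - ll_model) / n)      # conservative: maximum is below 1
nagelkerke = cox_snell / (1 - np.exp(2 * ll_null / n))    # rescaled so the maximum is 1
print(round(cox_snell, 3), round(nagelkerke, 3))
```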
How do we assess model accuracy (ie, mispredictions) in logistic regression? (x2)
Omnibus test of model coefficients
Hosmer and Lemeshow test
What is involved in omnibus test of model coefficients in logistic regression? (x3)
Chi-square test of whether all predictors combined account for any variance
Test against the H0 that R2 = 0
Significant result implies the model does better than the baseline (no-predictor) model
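A sketch of that omnibus (likelihood-ratio) chi-square using the same kind of log-likelihood values (again invented; k is the number of predictors):

```python
from scipy import stats

ll_null, ll_model, k = -135.0, -110.0, 2    # invented log-likelihoods, 2 predictors

chi_square = 2 * (ll_model - ll_null)       # improvement of the model over the baseline
p_value = stats.chi2.sf(chi_square, df=k)   # significant -> predictors explain some variance
print(chi_square, p_value)
```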
What is involved in the Hosmer and Lemeshow test of R-square in logistic regression? (x3)
Chi-square test of how closely model predicts outcome categories
Test against the H0 that the model’s predicted frequencies match the observed data (good fit)
Significant result implies discrepancies between model and data
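SPSS computes this as part of its output; a rough hand-rolled sketch of the idea (group cases by predicted probability, then chi-square the observed vs expected counts) might look like the following. The grouping and degrees-of-freedom choices follow the usual convention but are simplified.

```python
import numpy as np
from scipy import stats

def hosmer_lemeshow(y, p_hat, groups=10):
    """Rough sketch: compare observed and model-expected outcome counts
    within groups of cases sorted by predicted probability."""
    order = np.argsort(p_hat)
    y, p_hat = np.asarray(y)[order], np.asarray(p_hat)[order]
    chi_square = 0.0
    for idx in np.array_split(np.arange(len(y)), groups):   # roughly equal-sized groups
        obs_events, exp_events = y[idx].sum(), p_hat[idx].sum()
        obs_non, exp_non = len(idx) - obs_events, len(idx) - exp_events
        chi_square += (obs_events - exp_events) ** 2 / exp_events
        chi_square += (obs_non - exp_non) ** 2 / exp_non
    return chi_square, stats.chi2.sf(chi_square, df=groups - 2)

# e.g. hosmer_lemeshow(y, result.predict(X)) with a fitted statsmodels model
```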
How do we assess the proportion of correct classifications by a logistic regression model? (x4)
Cases originally all placed in most common category
*% correctly classified reported
Then predicted classifications are compared to empirical data
*% correctly classified reported
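A sketch of that classification-accuracy comparison with made-up outcomes and model probabilities:

```python
import numpy as np

y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])                 # observed categories (invented)
p_hat = np.array([.1, .2, .3, .2, .6, .4, .7, .8, .4, .9])   # model's predicted probabilities

baseline = max(np.mean(y == 0), np.mean(y == 1))   # put everyone in the most common category
predicted = (p_hat >= 0.5).astype(int)             # classify at the 0.5 cut-off
model_accuracy = np.mean(predicted == y)

print(f"baseline % correct: {baseline:.0%}")       # Block 0 figure
print(f"model % correct:    {model_accuracy:.0%}") # Block 1 figure, compared to baseline
```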
What 3 stages of info does SPSS output give us for logistic regression?
Preliminary info
Block 0 output
Block 1+ output
What preliminary info should we double check in SPSS output for logistic regression?
How outcome categories have been coded
What info is given in Block 0 output in SPSS for logistic regression? (x3)
All cases placed in most frequent category in data
Gives baseline model (no predictors) for comparison with more complex models
*Not theoretically interesting, but important
What info is given in Block 1+ output in SPSS for logistic regression? (x3)
Summary and tests of R-square
Tests of variables in Logistic Regression equation (e.g., coefficients)
Info on classification accuracy
What are the 3 methods of conducting logistic regression?
Direct or Enter method
Sequential logistic regression
Stepwise logistic regression
Describe the Direct or Enter method of logistic regression (x3)
As linear MR - all predictors entered simultaneously (ie, Block 1)
Used to evaluate relative strength of predictors
Doesn’t test hypotheses about order/importance of each predictor
Describe the Sequential logistic regression method (x3)
As HMR - researcher chooses order of entering predictors in separate blocks
Determines predictive value of each variable in context of whole model
First predictor explains max possible variance, more added if they improve fit
Describe the Stepwise logistic regression method (x2)
Which is done in what 2 ways?
Predictors entered sequentially, included in model on statistical grounds
Often more exploratory/for hypothesis generation
Forward method - start with no predictors; strongest entered first, then progressively weaker predictors added
Backward method - start with the full model; weakest removed first, then progressively stronger predictors removed
How is logistic model fit evaluated during stages of Stepwise regression? (x3)
Does adding this predictor significantly improve model fit?
Does removing this predictor significantly harm model fit?
Evaluated via nested model comparisons
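A sketch of the nested model comparison used at each stepwise stage: fit the model with and without the candidate predictor and compare log-likelihoods with a chi-square (data simulated, predictor names invented):

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(2)
x1, x2 = rng.normal(size=(2, 300))                  # two invented predictors
y = rng.binomial(1, 1 / (1 + np.exp(-(1.0 * x1 + 0.3 * x2))))

smaller = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)                         # without x2
larger = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0)   # with x2

lr_chi_square = 2 * (larger.llf - smaller.llf)      # improvement from adding x2
p_value = stats.chi2.sf(lr_chi_square, df=1)        # 1 extra parameter
print(lr_chi_square, p_value)                       # significant -> adding the predictor improves fit
```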