LOGISTIC REGRESSION Flashcards

1
Q

What is logistic regression?

A
  • Logistic regression is used to predict non-continuous (categorical) outcome variables
    • Also known as Logit Analysis
    • Can be multinomial, ordinal or binary
    • Similar to discriminant analysis (in MANOVA) but differs in its assumptions, so the two are not interchangeable
  • Logistic regression doesn’t try to predict an outcome score. Rather, it predicts the probability that an event will occur given the predictor values (see the sketch below).
  • Predicts the outcome by creating a variate composed of the IVs
    • Variate = a measure composed of 2+ variables
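A minimal sketch in Python with statsmodels (simulated data; the cards themselves use SPSS) showing that the model returns probabilities rather than outcome scores:

```python
# Hypothetical data: fit a binary logistic regression and predict
# probabilities, not raw outcome scores.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)                  # one continuous predictor
p = 1 / (1 + np.exp(-(0.5 + 1.2 * x)))    # true probability of the event
y = rng.binomial(1, p)                    # binary (non-continuous) outcome

X = sm.add_constant(x)                    # the variate: intercept + IV(s)
model = sm.Logit(y, X).fit(disp=0)
print(model.params)                       # b0, b1 on the log-odds scale
print(model.predict(X)[:5])               # predicted probabilities
```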
2
Q

What are the advantages of logistic regression?

A
  • Doesn’t require many assumptions to be met
    • Doesn’t require:
      • Normality
      • Linearity
      • Homoscedasticity
    • Although meeting them does increase predictive power
  • Can be interpreted in a similar way to multiple regression
  • Forced-entry, hierarchical and stepwise methods are all available
3
Q

What are the disadvantages of logistic regression?

A
  • Still has some assumptions
    • Independence of errors
    • Linearity of the logit
    • Absence of outliers
  • Needs strong theoretical justification for the predictors
  • Causality cannot be established
  • Requires a large sample size
  • Prone to problems with model overfitting/complete separation
4
Q

What is the odds ratio?

A
  • The odds ratio (Exp(B)) tells you how a one-unit change in the predictor affects the odds of the outcome occurring (see the sketch below)
    • Ratio = (odds after unit change) / (original odds)
    • >1 = odds of outcome increase; <1 = odds of outcome decrease
    • If the confidence interval crosses 1, the ratio isn’t statistically significant
  • Unadjusted vs Adjusted Odds Ratio:
    • UOR: not adjusted for the presence of other predictors
    • AOR: represents the association when the other variables are held constant
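A minimal sketch of the Exp(B) arithmetic, with a hypothetical coefficient and standard error:

```python
# Exp(B) is exp(b): the multiplicative change in the odds of the outcome
# for a one-unit change in the predictor (hypothetical values).
import numpy as np

b = 0.85                          # a logistic coefficient (log-odds scale)
se = 0.30                         # its standard error
odds_ratio = np.exp(b)            # Exp(B) ~ 2.34: odds multiply by ~2.34
lo, hi = np.exp(b - 1.96 * se), np.exp(b + 1.96 * se)  # 95% CI for Exp(B)
print(odds_ratio, (lo, hi))       # significant only if the CI excludes 1
```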
5
Q

What are Model Parsimony and Linearity of the Logit?

A
  • Model Parsimony: a parsimonious model is one that uses the minimal set of predictor variables that together maximally explain the outcome variable
    • Select and use only those predictors that are likely to explain the outcome
  • Linearity of the Logit: a linear relationship between the continuous predictors and the log transformation of the outcome variable (written out below)
    • The logistic transformation keeps the predicted probabilities between 0 and 100%
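Written out (a standard formulation of the logit link, not taken from the cards):

```latex
% The logit link: a linear model on the log-odds scale
\operatorname{logit}(p) = \ln\!\frac{p}{1-p} = b_0 + b_1 X_1 + \dots + b_k X_k

% Inverting it keeps every predicted probability between 0 and 1 (0-100%)
p = \frac{1}{1 + e^{-(b_0 + b_1 X_1 + \dots + b_k X_k)}}
```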
6
Q

What are log likelihood and deviance?

A
  • The log-likelihood is the logistic regression analogue of the SSR (sum of squared residuals)
    • The log-likelihood compares the predicted and actual probabilities (sketched below)
      • Large -2LL = poor fit, small -2LL = good fit
  • The Deviance score (-2LL) is -2 × the log-likelihood, giving it a chi-square distribution
    • Used to compare model parsimony and to calculate R2
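A sketch of the log-likelihood calculation by hand, with hypothetical outcomes and predicted probabilities:

```python
# Compute the log-likelihood and deviance (-2LL) for a set of predictions.
import numpy as np

y = np.array([1, 0, 1, 1, 0])            # actual outcomes (hypothetical)
p = np.array([0.8, 0.3, 0.6, 0.9, 0.2])  # model's predicted probabilities

log_lik = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
deviance = -2 * log_lik                  # -2LL: larger = worse fit
print(log_lik, deviance)
```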
7
Q

What are the different versions of R2 in logistic regression?

A
  • R2 is a measure of variance explained; all versions are derived from the deviance statistic
    • In logistic regression you cannot simply square the R statistic
  • Hosmer and Lemeshow: orders the data by group and compares it to the prediction using a chi-square distribution
  • Cox and Snell: uses the sample size; reported by SPSS (see the sketch below)
    • Never reaches its theoretical maximum, so it is limited at the high end
  • Nagelkerke: rescales Cox and Snell to fix the upper limit at 1
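A sketch of how the two SPSS-reported R² values come from deviances, assuming hypothetical -2LL values (formulas as usually stated for Cox & Snell and Nagelkerke):

```python
# Cox & Snell and Nagelkerke R² from the null and model deviances.
import numpy as np

n = 100                 # sample size (hypothetical)
dev_null = 130.0        # -2LL of the intercept-only (Block 0) model
dev_model = 100.0       # -2LL of the fitted model

r2_cs = 1 - np.exp((dev_model - dev_null) / n)   # Cox & Snell
r2_max = 1 - np.exp(-dev_null / n)               # its theoretical maximum
r2_nagelkerke = r2_cs / r2_max                   # rescaled so 1 is reachable
print(r2_cs, r2_nagelkerke)
```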
8
Q

What is the Wald Statistic?

A
  • Wald Statistic (z statistic)
    • Logistic regression equivalent of the t statistic
    • SPSS reports z2 so that it follows a chi-square distribution (sketched below)
  • Tells us whether a predictor’s contribution is significant
    • Be cautious: when b is large, the SE becomes inflated
    • More accurate to add predictors hierarchically and examine the change in the likelihood statistics
    • Check whether the CI for b crosses 0 (for Exp(B), whether it crosses 1)
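A minimal sketch of the Wald calculation, assuming a hypothetical b and SE:

```python
# The Wald statistic is b divided by its standard error; SPSS reports the
# square, which follows a chi-square distribution with 1 df.
from scipy import stats

b, se = 0.85, 0.30          # hypothetical coefficient and standard error
z = b / se                  # Wald z statistic
wald = z ** 2               # what SPSS reports as "Wald"
p_value = stats.chi2.sf(wald, df=1)
print(z, wald, p_value)
```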
9
Q

How is logistic regression accomplished in SPSS?

A
  • Correlate -> Bivariate -> add all variables
    • Select potential predictors
    • Be careful with negative predictors (they can cancel out positive predictors)
  • Analyze -> Regression -> Binary Logistic (a rough Python analogue is sketched below)
    • Outcome in Dependent, predictors in Covariates
    • Choose the ‘Enter’ method (unless hierarchical entry is warranted)
    • If a categorical predictor is present -> Categorical -> move the predictor into the box
    • Save: group membership
    • Options: Hosmer-Lemeshow, CIs, classification plots
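For comparison outside SPSS, a rough sketch of the same workflow in Python with statsmodels; the file and column names (study.csv, outcome, pred1, pred2, group) are hypothetical:

```python
# Screen predictors with bivariate correlations, then fit the logistic
# model with a dummy-coded categorical predictor via a formula.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study.csv")                     # hypothetical data file
print(df[["outcome", "pred1", "pred2"]].corr())   # bivariate screen

# C(...) dummy-codes the categorical predictor, like SPSS's Categorical
# dialog; fitting all predictors at once corresponds to 'Enter'.
fit = smf.logit("outcome ~ pred1 + pred2 + C(group)", data=df).fit()
print(fit.summary())
```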
10
Q

How do you interpret logistic regression output in SPSS?

A
  • Check which cases are included under Case Processing
  • Block 0: the null-hypothesis (intercept-only) model
    • Probability without any predictors
  • Variables not in the Equation: shows prediction outside the model
  • Block 1: the simultaneous model
    • Omnibus test compares it to Block 0 (p < .05 = the predictors improve the model)
    • Nagelkerke = variance explained
    • Hosmer-Lemeshow p > .05 = good fit (a non-significant test means the model fits the data)
    • Exp(B) = odds ratios
  • Contingency tables: show how many cases were correctly predicted
  • Classification tables: % correctly predicted (rebuilt by hand below)
    • Compare to the null model
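A sketch of how the classification table is built, with hypothetical outcomes and predicted probabilities:

```python
# Threshold predicted probabilities at .5 and cross-tabulate against the
# observed outcomes, as the SPSS classification table does.
import numpy as np
import pandas as pd

y = np.array([1, 0, 1, 1, 0, 0, 1, 0])                 # observed
p = np.array([0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3]) # predicted probs

predicted = (p >= 0.5).astype(int)
table = pd.crosstab(y, predicted,
                    rownames=["observed"], colnames=["predicted"])
print(table)
print("% correct:", 100 * (predicted == y).mean())  # compare to null model
```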
11
Q

How is model parsimony tested in SPSS?

A
  • Model Parsimony: tested during the main analysis
    • Add the different predictors in steps (blocks)
    • Under Categorical: tick Change and choose the contrast
    • Under Omnibus Tests: compare the blocks, then rerun only the best model (see the sketch below)
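A sketch of the block comparison as a likelihood-ratio (chi-square) test, assuming hypothetical deviances:

```python
# Compare nested blocks via the change in deviance, as the omnibus table
# does in SPSS.
from scipy import stats

dev_block1 = 120.0   # -2LL with the first set of predictors (hypothetical)
dev_block2 = 105.0   # -2LL after adding 2 more predictors (hypothetical)

chi_sq = dev_block1 - dev_block2        # change in deviance
p_value = stats.chi2.sf(chi_sq, df=2)   # df = number of added predictors
print(chi_sq, p_value)  # p < .05: the added block improves the model
```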
12
Q

What are some common problems in logistic regression?

A
  • Overdispersion: the variance is larger than the model expects
    • Makes SEs/CIs too small
    • Caused by violating the independence-of-errors assumption
    • Present if the dispersion parameter is greater than 1 (a big problem if over 2)
  • Incomplete Information from Predictors:
    • Ideally, you should have some data for every possible combination of predictors (essential for categorical predictors)
    • Violation causes large SEs
  • Complete Separation: when the outcome can be perfectly predicted by one or more predictors (demonstrated below)
    • The model collapses, with huge SEs
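A small demonstration of complete separation on simulated data (statsmodels versions differ in whether they raise an error or merely warn, hence the try/except):

```python
# The outcome flips perfectly at x = 3.5, so the maximum-likelihood
# estimate does not exist: the fit either fails or its estimates diverge.
import numpy as np
import statsmodels.api as sm

x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([0, 0, 0, 1, 1, 1])        # perfectly separated outcome
X = sm.add_constant(x)

try:
    fit = sm.Logit(y, X).fit(disp=0)
    print(fit.params, fit.bse)          # if it runs: huge b's and SEs
except Exception as e:                  # some versions raise on separation
    print("fit failed:", e)
```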