M2 - Logistic Regression Flashcards
What differentiates LR from MR? Pick the best answer.
- LR is used for predicting group membership.
- LR only uses binary independent variables.
- LR has a dependent variable that is binary.
- LR output is a graph with a straight line of best fit.
- LR has a dependent variable that is binary.
Part B - Question 2: The Independent variables in a Logistic Regression should be:
- Binary.
- Continuous or binary.
- Ordinal.
- Continuous or metric.
- Metric.
- Continuous or Binary
Part B - Question 3: What are key differences between ANOVA and logistic regression?
- The DV is binary for LR and not for ANOVA.
- The DV is binary for both LR and ANOVA.
- The DV is continuous for ANOVA and not for LR.
- The IV and DVs are binary for both LR and ANOVA
- The DV is binary for LR and not for ANOVA.
Part B - Question 4: Is it possible to use a continuous variable for use in Logistic regression?
- Continuous dependent variables can be used in a regression analysis if the metric scale includes the values 1 and 0.
- No, a continuous dependent variable can only be used in a multiple regression analysis.
- Only if the continuous variable has more than 2 values.
- A cut-point can be identified on a continuous variable, and this can be used to form a binary variable.
- A cut-point can be identified on a continuous variable, and this can be used to form a binary variable.
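The cut-point idea in the answer above can be sketched in a few lines of Python. This is only an illustration: the scores, the cut-point of 10 and the function name `dichotomise` are hypothetical and are not taken from the lecture.

```python
def dichotomise(scores, cut_point):
    """Code each score 1 if it is at or above the cut-point, otherwise 0."""
    return [1 if score >= cut_point else 0 for score in scores]


# Hypothetical continuous scores and a hypothetical cut-point of 10.
scores = [3, 9, 10, 14, 7, 21]
binary_dv = dichotomise(scores, cut_point=10)
print(binary_dv)  # [0, 0, 1, 1, 0, 1]
```

The resulting 0/1 variable can then serve as the binary DV, with 1 as the target category.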
Part C - Question 1: What is the general shape of a plotted logistic regression formula?
- A straight line.
- A parabola.
- An S shape.
- A U shape
- An S shape.
Part C - Question 2: What is the scale of the DV for the logistic regression model?
A parabolic scale.
A decimal scale.
A hexadecimal scale.
A logarithmic scale
A logarithmic scale
Part C - Question 3: Is binary logistic regression a linear model?
Yes, it has the function of y=c+mx.
Yes, it has the formula of y=bx+c
Both a and b.
No, it is a non-linear function.
No, it is a non-linear function.
Part D - Question 1: Can the approach used for MR model building be used for LR?
No, it can only use standard model building.
No, it must use non-linear strategies.
Yes, it can use standard, sequential and statistical model building.
Yes, it must use ordinal strategies
Yes, it can use standard, sequential and statistical model building.
Part D - Question 2: How many outcome categories can a multinomial logistic regression predict?
One continuous category.
Three or more categories.
One multivariate variable.
Two categories
Three or more categories.
Part D - Question 3: Which type of LR is the 4th year lecture covering?
Multinomial.
Ordinal.
Binary.
Binary
Part D - Question 4: Sequential logistic regression is what?
Binary variables are randomly entered in blocks.
Binary and continuous variables are entered in blocks and pre-specified by the researcher.
Binary variables are entered in blocks and pre-specified by the researcher.
Binary and continuous variables are randomly entered in blocks
Binary and continuous (predictor) variables are entered in blocks and pre-specified by the researcher.
Part E - Question 1: What does target variable mean?
This is the independent variable with the most number of categories.
This is the outcome category of the dependent variable that is the focus of the research question.
This is the dependent variable with the least number of categories.
This is the variable that can be used as an independent or dependent variable
This is the outcome category of the dependent variable that is the focus of the research question.
Part E - Question 2: What is the reference category in logistic regression analyses?
It is the reference category used to interpret the categories of a categorical variable.
It is the reference for logistic regression.
It is the same as the Target category.
The variable that the research question is focussed on
It is the reference category used to interpret the categories of a categorical variable.
Part E - Question 3: How does SPSS choose the target category for an outcome variable in a logistic regression?
It chooses the highest numeric value.
It chooses the lowest numeric value.
SPSS does not choose the DV target category, the analyst always needs to select this to run the analysis
It chooses the highest numeric value.
Part E - Question 4: Can a categorical DV have more than one category in logistic regression?
Categorical DVs in a logistic regression must be continuous.
Categorical DVs in logistic regression can only have two values.
Categorical DVs in a logistic regression must be ordinal.
Categorical DVs in a logistic regression can have two or more values.
Categorical DVs in logistic regression can only have two values.
Part E - Question 5: Can a categorical IV have more than one category?
Categorical IVs in a logistic regression can have two or more values.
Categorical IVs in a logistic regression must be ordinal.
Categorical IVs in a logistic regression must be continuous.
Categorical IV variables in logistic regression can only have two values.
Categorical IVs in a logistic regression can have two or more values.
Part E - Question 6: For an IV with more than one category, can any category be the reference category?
Yes, any category can be the reference variable.
No, SPSS will force a reference category, and this cannot be changed.
No, the lowest category should be the reference category.
No, the highest category should be the reference category
Yes, any category can be the reference variable.
Part G - Question 1: When checking the linearity assumption in logistic regression we are doing the following:
There is linear relationship between the independent variables.
There is a log-linear relationship between continuous independent variables and the dependent variable.
There is a log linear relationship between categorical variables.
There is a linear relationship between all independent variables and the dependent variable
There is a log-linear relationship between continuous independent variables and the dependent variable.
Part G - Question 2: What statistic is recommended to interpret model fit and can also be used as a pseudo measure of R²?
The Cox and Snell statistic.
The Chi-square classification.
The Wald statistic.
The -2 Log Likelihood
The -2 Log Likelihood
Part G - Question 3: An odds ratio > 1 means what?
Outcome is more likely for target level of IV.
No difference between groups.
Outcome is less likely for target level of IV
Outcome is more likely for target level of IV.
Part G - Question 4: An odds ratio < 1 means what?
Outcome is more likely for target level of IV.
Outcome is less likely for target level of IV.
No difference between groups
Outcome is less likely for target level of IV.
Part G - Question 5: An odds ratio = 1 means what?
No difference between groups.
Outcome is more likely for target level of IV.
Outcome is less likely for target level of IV
No difference between groups.
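The three odds ratio cases above (> 1, < 1, = 1) can be checked with a small worked example. This is a minimal sketch that computes the odds ratio from a 2 x 2 table of counts; the counts and the function names `odds_ratio` and `interpret` are hypothetical, not from the lecture.

```python
import math


def odds_ratio(target_yes, target_no, referent_yes, referent_no):
    """Odds of the outcome in the target group divided by the odds in the referent group."""
    odds_target = target_yes / target_no
    odds_referent = referent_yes / referent_no
    return odds_target / odds_referent


def interpret(or_value):
    """Map an odds ratio onto the three flashcard interpretations."""
    if math.isclose(or_value, 1.0):
        return "no difference between groups"
    if or_value > 1:
        return "outcome is more likely for target level of IV"
    return "outcome is less likely for target level of IV"


# Hypothetical counts: 30 of 100 target cases and 15 of 100 referent cases have the outcome.
example = odds_ratio(30, 70, 15, 85)
print(round(example, 2), "-", interpret(example))
```

Swapping the two groups in the call gives the reciprocal odds ratio, which falls below 1.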
Part G - Question 6: The odds ratio is also referred to as what?
Tetrachoric correlation.
Exp(B).
The standard error.
Wald statistic
Exp(B).
Part G - Question 7: Odds ratios should always be interpreted in the context of what?
Sample size.
Number of correct classifications.
Number of false classifications.
Prevalence
Prevalence
In what circumstances should logistic regression be used?
- predicting group membership
- DV is binary / categorical
- continuous variables can be converted to binary using cut offs
- Multiple IVs can be categorical or continuous
Outline the differences between logistic regression and multiple regression in terms of
- types and number of IVs and DVs
LR
IVs - multiple continuous or categorical
DVs - single binary or categorical (binomial, multinomial or ordinal)
MR
IVs - multiple continuous or categorical
DVs - single continuous
Outline the differences between logistic regression and multiple regression in terms of
- approach for entering variables
LR
Standard (entered altogether)
Sequential (entered in blocks) - theory directed
Statistical (forward or backward)
MR
Standard (forced entry)
Hierarchical (entered in blocks) - theory directed
Stepwise (statistics based, forward or backward) - only used for exploration
Outline the differences between logistic regression and multiple regression in terms of
- assumptions
LR
- Independence of errors - errors should not be correlated (clustered data)
- linearity - continuous IVs should have a linear relationship with the log odds (logit) of the DV
- distribution normality - distribution should be normal and outliers should be dealt with (transformed or removed); identify outliers using standardised residuals and Cook's distance
- sample size - > 5 cases per possible combination required
- singularity and multicollinearity
How does the logistic function differ compared to other functions
logistic function = log(Y / (1 - Y)) = b0 + b1x1 + b2x2 + e
logistic function is based on the exponential function and is shaped like an S curve
+ve coefficients will increase Y
-ve coefficients will decrease Y
interpret using the odds ratio
linear function = Y = bx + c
linear function is a straight line
as the IV increases by 1 unit, the DV increases by b units
quadratic function = Y = ax^2 + bx + c
quadratic function is a parabola
+ve a is a happy face (opens upward)
-ve a is a sad face (opens downward)
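The three function shapes on this card can be compared numerically. The sketch below is illustrative only: the coefficient values are hypothetical defaults, and `logistic` is written as the inverse of the log odds equation, so it returns the probability of Y = 1.

```python
import math


def linear(x, b=1.0, c=0.0):
    """Straight line: Y = bx + c."""
    return b * x + c


def quadratic(x, a=1.0, b=0.0, c=0.0):
    """Parabola: Y = ax^2 + bx + c."""
    return a * x ** 2 + b * x + c


def logistic(x, b0=0.0, b1=1.0):
    """S curve: probability of Y = 1 when the log odds equal b0 + b1 * x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))


def logit(p):
    """Log odds of a probability p: log(p / (1 - p))."""
    return math.log(p / (1 - p))


# Tabulate the three functions over a few x values to see the shapes.
for x in (-4, -2, 0, 2, 4):
    print(x, linear(x), quadratic(x), round(logistic(x), 3))
```

The logistic values stay between 0 and 1 and flatten at both ends, which gives the S shape. Applying `logit` to them recovers the straight line b0 + b1 * x.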
When is categorical coding useful?
Useful to deal with categorical variables with 2 or more outcomes
What is binomial, multinomial and ordinal categorical coding?
Binomial is 2 (Yes/No)
Multinomial is for nominal groups eg brown = 1, blue = 2, green = 3
Ordinal is multinomial with categories in a progressive order eg education level achieved 1 = high school, 2 = grad school, 3 = postgrad
Which group of DV category should be the referent in binomial logistic regression and how should it be coded?
Referent group should be coded lower and should be the group that is not the target of interest ie the control group
For IVs with 2 and 3 categories, how should they be coded?
Binomial categorical IV - SPSS will assign automatically
- target should be the group of interest
- referent should be the group not of interest
Multinomial categorical IVs
- can be coded in any order
- needs to make sense
- set the referent as the category you are most interested in comparing against, so other groups can be compared directly to that one
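The effect of choosing a referent for a multinomial IV can be sketched with indicator (dummy) coding. This assumes indicator coding is the scheme in use, which these cards do not state explicitly. The function name `dummy_code` and the eye colour data are hypothetical.

```python
def dummy_code(values, referent):
    """Create one 0/1 indicator per non-referent category.

    Cases in the referent category score 0 on every indicator, so each
    indicator compares its category directly with the referent.
    """
    categories = sorted(set(values) - {referent})
    return [
        {f"is_{category}": int(value == category) for category in categories}
        for value in values
    ]


# Hypothetical eye colour IV with brown chosen as the referent category.
eye_colour = ["brown", "blue", "green", "brown"]
for colour, row in zip(eye_colour, dummy_code(eye_colour, referent="brown")):
    print(colour, row)
```

Changing `referent` changes which comparisons the coefficients express, which is why the choice needs to make sense for the research question.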
What does the choice of referent group in categorical coding impact?
interpretation of the DV
interpretation of the coefficients of the IVs
Name the overall model approaches for logistic regression interpretation
Model Improvement
-2LL change
-2LL proportion (% improvement of model fit)
Classification Accuracy
% correct
% improvement relative to baseline
Describe model improvement methods for interpreting LR
-2LL change = -2LLbase - (-2LLnew) (Omnibus test)
used for nested models only
significant at p = .05 if the change > 3.84 with 1 df
-2LL proportion
= χ2 model / -2LLbase
then convert to a % improvement of model fit
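The two model improvement calculations can be worked through as follows. This is a minimal sketch: the -2LL values are hypothetical, the function names are made up for illustration, and the p-value shortcut is only valid for 1 df, where the chi-square tail probability equals erfc(sqrt(x / 2)).

```python
import math


def neg2ll_change(neg2ll_base, neg2ll_new):
    """Drop in -2LL from the baseline model to the new (nested) model."""
    return neg2ll_base - neg2ll_new


def p_value_1df(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 df only."""
    return math.erfc(math.sqrt(chi_square / 2))


def proportion_improvement(chi_square_model, neg2ll_base):
    """Model chi-square as a proportion of the baseline -2LL."""
    return chi_square_model / neg2ll_base


# Hypothetical values: baseline -2LL of 200 and new model -2LL of 180.
change = neg2ll_change(200.0, 180.0)
print("change in -2LL:", change)
print("p (1 df):", p_value_1df(change))
print("% improvement:", 100 * proportion_improvement(change, 200.0))
```

A change of about 3.84 gives p of about .05 with 1 df, which matches the critical value on the card.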
Describe classification accuracy methods for interpreting LR
Classification accuracy
1. % correct
= the # of cases correctly predicted to be in one group and not in another group, as a % of the sample
2. % improvement
= the # of cases correctly predicted relative to if everyone was predicted to be in the category with the most cases
= (hits + correct rejections - nmax) / (sample n - nmax)
then transform to % for model improvement over baseline
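The two classification accuracy measures can be computed as below. This is a sketch with hypothetical counts: `n_max` stands for nmax, the number of cases in the largest outcome category, and the function names are illustrative.

```python
def percent_correct(hits, correct_rejections, n):
    """Percentage of all cases classified into the correct group."""
    return 100 * (hits + correct_rejections) / n


def percent_improvement(hits, correct_rejections, n, n_max):
    """Improvement over the baseline of predicting the largest category for everyone."""
    return 100 * (hits + correct_rejections - n_max) / (n - n_max)


# Hypothetical sample: 100 cases, 60 of them in the largest outcome category,
# with 30 hits and 50 correct rejections from the model.
print(percent_correct(30, 50, 100))
print(percent_improvement(30, 50, 100, 60))
```

In this example the model classifies 80% of cases correctly, against a baseline of 60% from always predicting the largest category, which is half of the possible improvement.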
Name and explain the individual predictors of LR interpretation
b weight = change in log odds of Y = 1 for a 1 unit change in the IV
Odds Ratio = multiplicative change in the odds of Y = 1 for a 1 unit change in the IV
< 1 = less likely, > 1 = more likely, 1 = no difference
Significance test - Wald’s test
Where do you find the log odds and odds ratio in SPSS output?
log odds - b weight = B
For every one-unit increase in the IV (or move from the referent to the target group), the log odds of being in the outcome category change by B units
Odds ratio - Exp(B)
the odds of the outcome are multiplied by Exp(B) relative to the referent group
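The link between B and Exp(B) can be verified directly. The sketch below uses hypothetical coefficients (b0 = -1.0, b = 0.7) and shows that a one-unit increase in the IV multiplies the odds of Y = 1 by Exp(B). The function names are illustrative and this is not SPSS output.

```python
import math


def odds_ratio_from_b(b):
    """Exp(B): the odds ratio that corresponds to a b weight."""
    return math.exp(b)


def odds(b0, b, x):
    """Odds of Y = 1 implied by log odds = b0 + b * x."""
    return math.exp(b0 + b * x)


# Hypothetical intercept and b weight.
b0, b = -1.0, 0.7
exp_b = odds_ratio_from_b(b)

# Ratio of the odds at x = 3 to the odds at x = 2 (a one-unit increase).
ratio = odds(b0, b, 3) / odds(b0, b, 2)
print(round(exp_b, 3), round(ratio, 3))
```

A b weight of 0 gives Exp(B) = 1 (no difference), a positive b gives Exp(B) > 1, and a negative b gives Exp(B) < 1.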