Binary Logistic Regression Flashcards
What is a contingency table?
Summarises frequency distribution of each of two categorical variables and the association between them
What can a contingency table allow us to check?
▪️How each potential explanatory variable is related to the DV
▪️The categories for explanatory variables are large enough (at least 5)
▪️How many missing cases for each
What is the risk of an outcome?
The number of times the outcome of interest occurs divided by the total number of possible outcomes
How do you calculate the risk ratio (relative risk)?
Risk of one outcome divided by risk of the other outcome
What does a risk ratio of 1 indicate?
No difference between the groups
How do you calculate the odds of an outcome?
The number of times it occurs divided by the number of times it doesn’t occur
How do you calculate the odds ratio?
Odds of outcome in one group compared to odds of outcome in the other group
What does an odds ratio of 1 mean?
▪️Outcome occurs half of the time
▪️Exposure doesn’t affect the odds (no difference between groups)
What does an odds ratio <1 indicate?
Outcome occurs less than half the time
Indicates exposure is associated with reduced risk of developing outcome
What does an odds ratio >1 indicate?
Outcome occurs more than half the time
Indicates exposure is associated with increased risk of developing outcome
Which ratio can be used for case-control studies?
Odds ratio
When can you NOT use risk ratios and why?
In case control studies where selection of subjects is based on the outcome
What happens to RR and OR in rare outcomes?
They will be approximately equal
What can you assume about a relationship where the outcome is binary?
It is nonlinear, S-shaped (sigmoid) relationship
Linear cannot make sense!
▪️Outcome is limited to 0,1
▪️Goal is to separate groups, not minimise mean squared error
▪️Linear regression is highly sensitive to influential cases
▪️Assumptions of linear are violated, particularly homogeneity of variance
What is the link function?
A way of relating x and Y by way of a function in a logistic regression
To model the relationship between left and right hand side of the equation
g(y) = β0 + β1x1 + ε