Ordinal Logistic Regression Flashcards

1
Q

What is ordinal logistic regression (OLR)?

A

An umbrella term for several models for ordinal outcomes
- A regression model used when the dependent variable has three or more ordered categories
- Estimates the relationship between predictors and an ordinal outcome using log odds

2
Q

Examples of ordinal outcomes

A
  • Life satisfaction (low, medium, high)
  • Self-confidence levels (not at all, no more than usual, rather more than usual, much more than usual)
    E.g., moving from 2 to 3 on the life satisfaction scale is not necessarily the same as moving from 4 to 5 on the same scale, even though the numeric difference is the same
3
Q

Difference between nominal and ordinal variables

A
  • Nominal: unordered categories (e.g., diet type: healthy, high-fat, high-sugar)
  • Ordinal: ordered categories (e.g., subjective health: poor, fair, good, very good)
4
Q

Proportional Odds Model (POM)

A

Assumes that ORs are the same across all cut-off points of the ordinal outcome, i.e., the observed ORs are estimates of the same “true” OR.
Expressed as:
logit[P(Y > j)] = cj + β1x1 + β2x2 + … + βkxk
- The estimated coefficients are the slopes (β1, β2, …, βk)
- The OR for a unit increase in x1 is OR = exp(β1) if x1 is continuous. For a binary/dummy predictor, the OR compares a specific group to the reference group.
Also known as the parallel regression assumption
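The proportional odds property above can be checked numerically. A minimal Python sketch with hypothetical intercepts and slope (c1, c2, b1 are assumed values, not from the notes) shows that the OR for a unit increase in x1 equals exp(β1) at every cut-off:

```python
import math

# Hypothetical POM with two cut-offs (J = 3) and one predictor x1
c = {1: 0.40, 2: -1.10}   # assumed intercepts c1, c2
b1 = 0.65                 # common slope under proportional odds

def p_higher(j, x1):
    """P(Y > j) via the inverse logit of cj + b1*x1."""
    lp = c[j] + b1 * x1
    return math.exp(lp) / (1 + math.exp(lp))

def odds(p):
    return p / (1 - p)

# The OR for a unit increase in x1 is exp(b1) at every cut-off
or_cut1 = odds(p_higher(1, 1)) / odds(p_higher(1, 0))
or_cut2 = odds(p_higher(2, 1)) / odds(p_higher(2, 0))
# both equal exp(0.65), regardless of the intercept
```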

5
Q

Stata commands for OLR

A

gologit2 <outcome> <predictor(s)>, pl or - proportional odds model
gologit2 <outcome> <predictor(s)>, or - non-proportional odds model
ologit <outcome> <predictor(s)> - for Brant's test

5
Q

Non-Proportional Odds Model

A
  • Does not assume equal ORs across different dichotomisations
  • Stata models each dichotomisation separately. This gives the same results as if we performed multiple BLRs (one for each possible dichotomisation)
6
Q

Brant’s test for proportional odds assumption:

A

H0: The proportional odds assumption holds
If p > 0.05, we assume proportional odds
Run in Stata using:
ologit <outcome> <predictor(s)>
brant, detail

7
Q

How would you transform a continuous predictor for non-linearity?

A

Centring at the mean: gen <varname> = <predictor> - <mean>
A quadratic term may be added for a potential U-shaped relationship: gen <varname> = centred_predictor^2

8
Q

LRT for model comparison:

A

Compares a simpler model (e.g., linear age effect) with a more complex model (e.g., quadratic age effect)
If p < 0.05, the more complex model fits significantly better; if p > 0.05, prefer the simpler model

9
Q

Partial proportional odds model

A

Relaxes the proportional odds assumption for specific variables while keeping it for others
In Stata:
gologit2 <outcome> <predictor(s)>, or pl(<predictors>)

10
Q

Model selection consideration

A
  • Use Brant’s test for proportional odds assumption
  • Use LRT to compare nested models
  • Consider adding interaction terms or nonlinear transformations
11
Q

How is OLR related to BLR?

A
  • A logit transformation (log odds) is used (on the left-hand side of the equation)
  • The measure of effect size is the OR
12
Q

What possible ways can depression be dichotomised if categorised as ‘none’, ‘moderate’ or ‘severe’?

A
  • Cut-off 1: None / Moderate or severe
  • Cut-off 2: None or moderate / severe
    2 dichotomisations
13
Q

How many ways can depression be dichotomised if there are four categories: ‘none’, ‘mild’, ‘moderate’, ‘severe’?

A

Three ways:
- Mild/moderate/severe vs none
- Moderate/severe vs mild/none
- Severe vs none/mild/moderate

14
Q

In general, what is the number of possible dichotomisations equal to?

A

The number of categories minus one

15
Q

What does OLR do with all the dichotomisations of an outcome?

A

OLR dichotomises the ordinal outcome in all possible ways, and models the log odds of being in a higher outcome category
I.e., in the context of depression, it compares ‘none’ to ‘moderate’/’severe’ (higher categories) or ‘none’/’moderate’ to ‘severe’ (higher categories)
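The J − 1 binary splits can be generated mechanically. A minimal Python sketch with a hypothetical depression sample (codes and values are assumed for illustration):

```python
# Hypothetical ordinal responses coded 0 = none, 1 = moderate, 2 = severe
depression = [0, 2, 1, 1, 0, 2]
categories = [0, 1, 2]

# One binary indicator "Y > j" per cut-off; J categories give J - 1 cut-offs
splits = {j: [int(y > j) for y in depression] for j in categories[:-1]}
# splits[0] compares moderate/severe vs none
# splits[1] compares severe vs none/moderate
```

OLR models the log odds of each of these indicators, either with a common slope (proportional odds) or separate slopes (non-proportional odds).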

16
Q

What is the proportional odds assumption?

A

If we can assume the ORs are the same at all possible cut-offs, we only need to estimate one (common) OR for all cut-offs
E.g., with depression, the assumption holds if the ORs are the same for cut-off 1 (‘moderate’ or ‘severe’ vs. ‘none’) and cut-off 2 (‘severe’ vs. ‘none’ or ‘moderate’)

17
Q

If the ORs are different across dichotomisations, does this necessarily mean they are not proportional?

A

No; the sample ORs may differ due to sampling variability while still estimating the same population OR

18
Q

When using a non-proportional odds model for modelling the log-odds of the outcome depression (three categories: ‘none’, ‘moderate’, ‘severe’), how do we interpret the output?

A

Output is split into two tables, labelled ‘none’ and ‘moderate’
The first table models the odds of being in a category higher than ‘none’ (‘moderate’/’severe’ depression vs. ‘none’)
The second table models the odds of being higher than ‘moderate’ (‘severe’ depression vs. ‘none’/’moderate’)

19
Q

What does Stata display in the output in a proportional odds model?

A

At the top, the constraint (“let the two ORs be the same”)
The OR estimates would be the same - due to the decision to constrain them to be equal

20
Q

Unlike BLR, what does OLR estimate?

A

The odds of being in a higher outcome category (e.g., higher than ‘none’), rather than the odds of a single binary event

21
Q

How would we report ORs if we thought the non-proportional odds model was true?

A

Separately

22
Q

What’s the equation for a proportional odds model?

A

Consider an ordinal outcome, y, with J categories, labelled j = 1, 2, …, J
Let pj = P(y > j) be the probability of being in a category higher than j
The proportional odds model is:
logit(pj) = cj + β1x1 + β2x2 + … + βkxk

23
Q

How does the proportional odds equation differ from that of the BLR?

A

We now have ‘pj’ in the logit transformation.
There is one coefficient associated with each predictor (as in logistic regression). However, in logistic regression we have only one intercept term (β0), whereas in the proportional odds model we have several intercepts, cj, corresponding to all possible cut-offs. The other coefficients are the same under the proportional odds assumption

24
Q

In a proportional odds model, what would the separate equations be if J = 3?

A

logit(p1) = c1 + β1x1 + β2x2 + … + βkxk
logit(p2) = c2 + β1x1 + β2x2 + … + βkxk
The only difference between the right-hand sides of the equations is the intercept (c1 and c2). All slope coefficients β1, β2, etc. are the same in both equations
- The coefficients estimated are the slopes β1, β2, …, βk and the cut-offs c1, c2, …, cJ-1
- As in BLR, the OR for a unit increase in x1 is ORx1 = exp(β1), i.e., you get the OR by exponentiating the β coefficients

25
Q

What would be the equation for proportional odds model estimating the log-odds of depression with three categories: ‘none’, ‘moderate’, ‘severe’?

A

logit[P(Depression > ‘none’)] = c1 + β1 x Female
logit[P(Depression > ‘moderate’)] = c2 + β1 x Female

26
Q

How do you obtain coefficient estimates in OLR in a proportional odds model?

A

gologit2 <outcome> <predictor(s)>, pl
The log-odds coefficients are the same in all parts of the table
The intercepts for each dichotomisation vary
'pl' stands for parallel lines

27
Q

Calculating predicted probabilities in OLR:

A

Just like in BLR, we can use the estimated coefficients to calculate predicted probabilities for sample members with any combination of covariate values
In OLR (proportional and non-proportional odds models) we can calculate the probability of being in any one of the outcome categories
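The category probabilities follow from the cumulative probabilities P(Y > j). A minimal Python sketch with assumed intercepts for a null model (c1, c2 are hypothetical values):

```python
import math

def inv_logit(lp):
    """Reverse logit: convert a log-odds value to a probability."""
    return math.exp(lp) / (1 + math.exp(lp))

# Hypothetical cut-offs for J = 3 (low / medium / high), no predictors
c1, c2 = 0.85, -0.60          # assumed intercepts
p_gt_low = inv_logit(c1)      # P(Y > low)
p_gt_med = inv_logit(c2)      # P(Y > medium)

# Probabilities of each category from the cumulative probabilities
p_low = 1 - p_gt_low
p_medium = p_gt_low - p_gt_med
p_high = p_gt_med
# the three category probabilities sum to 1
```

With predictors, the same calculation applies after adding βx terms to each linear predictor.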

28
Q

What is the difference in the equations for non-proportional odds model compared to proportional odds model e.g., if J = 3?

A

Two different dichotomisations, with different slope coefficients, probabilities and intercepts in each equation
logit(p1) = c1 + β11x1 + β21x2 + … + βk1xk
logit(p2) = c2 + β12x1 + β22x2 + … + βk2xk
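Because each cut-off now has its own slope, each dichotomisation gets its own OR. A minimal sketch with hypothetical slopes for one predictor:

```python
import math

# Hypothetical non-proportional slopes for one predictor (J = 3):
# the slope differs by cut-off j
b1 = {1: 0.80, 2: 0.35}

# Each dichotomisation has its own OR, reported separately
or_by_cutoff = {j: math.exp(b) for j, b in b1.items()}
```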

29
Q

What do we omit in the Stata command to get a non-proportional odds model?

A

‘pl’ option

30
Q

What do we do with the outcome variable before computing any OLR model?

31
Q

What would the equation for an OLR predicting life satisfaction without any predictors look like?

A

Null model:
logit[P(Lifesat > j)] = cj
j = {low, medium}
We just have the intercept cj for dichotomisation j

32
Q

What is the utility of a null model?

A

We’re not usually interested in the null model for its own sake, but it serves as a comparison point for other models

33
Q

What is given in the output for a non-proportional odds model (null model)? E.g., considering life satisfaction with J = 3 (‘low’, ‘medium’, ‘high’)

A

Table 1: Baseline log odds of being in a higher than ‘low’ category
Table 2: Baseline log odds of being in a higher than ‘medium’ category

34
Q

In the null model, what are the estimated intercepts (cut-offs) equal to?

A

The log odds of the observed proportions in the dataset. E.g., with life satisfaction (J = 3), taking the intercept for ‘higher than low’:
P(Lifesat > “Low”) = exp(c1) / (1 + exp(c1))
The result of the reverse logit transformation is the proportion of the sample reporting higher than low life satisfaction
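This can be verified directly: the null-model intercept for a cut-off is the logit of the observed proportion, and reverse-transforming it recovers that proportion. A sketch with a hypothetical life satisfaction sample:

```python
import math

# Hypothetical sample: 0 = low, 1 = medium, 2 = high life satisfaction
lifesat = [0, 1, 2, 1, 2, 2, 0, 1, 2, 2]
prop_gt_low = sum(y > 0 for y in lifesat) / len(lifesat)   # observed proportion

# The null-model intercept for this cut-off is the logit of that proportion
c1 = math.log(prop_gt_low / (1 - prop_gt_low))

# The reverse logit transformation recovers the observed proportion
back = math.exp(c1) / (1 + math.exp(c1))
```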

35
Q

How should you initially judge linearity between continuous predictors and an ordinal outcome?

A

Graphically illustrate the relationship between the continuous predictor and the ordinal outcome, for example by plotting predicted probabilities; curvature indicates non-linearity.
The continuous predictor (e.g., age with a mean of 50) should also be centred (normally around the mean) so the intercepts can be interpreted as the odds for those at the mean (aged 50)

36
Q

In a proportional odds model, how many interpretations are there per independent variable?

A

One interpretation per independent variable
If there are other covariates in the model, we would also be controlling for other independent variables

37
Q

Why is it worth checking the sample sizes before doing an LRT?

A

To ensure missing data do not affect number of observations

38
Q

What is log likelihood?

A

The log of the likelihood: the (log) probability of the observed data under the fitted model. Stata reports it for each model, and the LRT is computed from the log likelihoods of two nested models (e.g., the current model vs. the null model)

39
Q

What are the hypotheses for a LRT in OLR?

A

H0: None of the independent variables in the current model predicts the DV
H1: At least one of the independent variables predicts the DV
Under H0, the LRT follows a chi-squared distribution with df equal to the number of independent variables in the model

40
Q

What is the LRT statistic?

A

LRT = -2 x (LLnull - LLcurrent model)
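The statistic is a one-line calculation from the two log likelihoods. A sketch with assumed values (the log likelihoods are hypothetical, as Stata would report them):

```python
# Hypothetical log likelihoods from two nested models
ll_null = -1200.5
ll_current = -1185.2

lrt = -2 * (ll_null - ll_current)   # LRT statistic
# Under H0, lrt follows a chi-squared distribution with df equal to the
# number of IVs in the current model; compare to the critical value
# (e.g., 3.84 for df = 1 at the 5% level)
```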

41
Q

LRT in OLRs:

A

LRTs in OLR are analogous to those in BLR
By default, Stata displays the LRT comparing the log likelihood (LL) of the estimated model with the LL of the null model

42
Q

How is the LRT statistic calculated?

A

From the log-likelihoods of the null model and the current model

43
Q

Assumptions of OLR:

A
  • The DV is ordinal
  • For numeric/continuous predictors, the relationship between the IV and the log odds of the outcome is linear
  • Proportional odds: The coefficient for every IV is assumed to be the same for any dichotomisation of the DV. This is sometimes called the “parallel regression” assumption. The proportional odds assumption can be relaxed in a non-proportional odds model
44
Q

How does Brant’s test work in testing the proportional odds assumption?

A
  • Dichotomising the DV in all possible ways
  • Fitting a BLR on each dichotomisation
  • Comparing the estimated coefficients from each of these BLRs
    If the ORs are different in each dichotomisation, there may be evidence that the odds are not proportional.
45
Q

What would be included in the output of a Brant’s test if the outcome was life satisfaction? (J = 3: ‘low’, ‘moderate’, ‘high’)?

A

BLR for “Y > 0” (lifesat > ‘low’)
BLR for “Y > 1” (lifesat > ‘moderate’)
BLR coefficients for each dichotomisation
This is the test for each individual IV

46
Q

What is the omnibus test in a Brant’s test?

A

In the first row of the output, Stata displays an omnibus test. This tests the H0 that the odds are proportional for all IVs.
A good strategy is to first look at the omnibus test (“All”). If the result is not significant, we may assume proportional odds. If the result is statistically significant, we look at the individual tests to find out which variable may be problematic

47
Q

What would lead us to conclude no strong evidence against proportional odds assumption in the Brant’s test?

A
  • The coefficients are reasonably similar in the BLRs (indicating that the population ORs may be equal)
  • The Brant test statistics all have large p-values
48
Q

What are some things to be aware of in the Brant’s test?

A
  • With very small samples, the test may lack power and fail to detect important departures from the proportional odds assumption
  • With very large samples, the test may be overly sensitive and detect unimportant departures from the proportional odds assumption
    Therefore, you should always inspect the estimated coefficients in the top part of the output as well as looking at the Brant test p-values themselves. Use your judgement in deciding whether the assumption is reasonable
49
Q

What is a way to account for non-linearity in the model?

A

Adding a quadratic term. This would have its own coefficient in the model (must be included with the centred variable)

50
Q

Why is it important to centre continuous variables before beginning analysis?

A

To make interpretation and statistical analysis easier by reducing the chance of multicollinearity
E.g., if age runs from 20-70 and is then squared, the age and age-squared variables may be highly correlated. Shifting age down to the mean (50) produces negative and positive values, so the centred variable is much less correlated with its squared term
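The effect of centring on the correlation with the squared term can be demonstrated directly. A sketch with hypothetical ages (the data and Pearson helper are illustrative, not from the notes):

```python
import math

ages = list(range(20, 71))            # hypothetical ages 20-70
mean_age = sum(ages) / len(ages)
centred = [a - mean_age for a in ages]

def pearson(x, y):
    """Pearson correlation computed from scratch."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

r_raw = pearson(ages, [a ** 2 for a in ages])            # close to 1
r_centred = pearson(centred, [a ** 2 for a in centred])  # near 0 for a symmetric range
```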

51
Q

How can we decide whether or not a quadratic term improves the model?

A

We can use the LRT - one model with the quadratic term and one without
H0: Model 2 does not add predictive power compared to Model 1
H1: Model 2 predicts the DV better than Model 1 (including a squared term for the IV, alongside the linear effect, improves the prediction)

52
Q

What may happen to the predicted probabilities if we add a squared term?

A

The relationships are allowed to be curved

52
Q

Why does the non-proportional odds model equation have the additional subscript j?
As in: logit[P(Lifesat > j)] = cj + β1jx1 + β2jx2 + … + βkjxk

A

To indicate that the coefficients are free to differ between equations

53
Q

What are some disadvantages for the non-proportional odds model?

A
  • Inefficient if proportional odds could safely be assumed for some variables (extra parameters are estimated unnecessarily)
  • More complicated to interpret than a proportional odds model, since there is a larger number of parameters (coefficients, ORs)
54
Q

What’s the equation for a partial proportional odds model predicting life satisfaction (predictors: sex (female); age centred; age centred squared; and number of friends)

A

logit[P(Lifesat > j)] = cj + β1 x Female + β2 x Agecentred + β3 x Agecentred² + β4j x Friends
- The coefficients β1, β2, and β3 are the same across equations (proportional odds assumed for female, age, and age²)
- The subscript j in β4j indicates that the slope coefficient of friends is free to vary across equations (proportional odds is not assumed for friends)

55
Q

What is the code for storing model results for an LRT to compare all three types of models?

A

gologit2 <outcome> <predictor(s)>, or pl
est store propodds

gologit2 <outcome> <predictor(s)>, or pl(<predictors>)
est store partial

gologit2 <outcome> <predictor(s)>, or
est store noprop

lrtest partial propodds
lrtest noprop partial

56
Q

When comparing all three model types using an LRT, which tests are nested in each other?

A

The proportional odds model is nested within the partial proportional odds model, so we can test partial vs. proportional.
The partial proportional odds model is in turn nested within the non-proportional odds model, so we can also test non-proportional vs. partial.

57
Q

What would be the output of comparing all three model types? E.g., if the p-values were p = 0.038 for the partial proportional odds model and p = 0.503 for the non-proportional odds model

A

Stata will give a table with p-values for the proportional odds, partial proportional odds, and non-proportional odds models, comparing the latter two to the first.
- There is evidence that the partial proportional odds model fits the data better than the full proportional odds model (p = 0.038). The partial proportional odds model is best supported by the data.
- There is no strong evidence that the non-proportional odds model improves the fit compared to the partial proportional odds model (p = 0.503).

58
Q

What needs to be balanced when choosing between models?

A
  • Model fit: the model should predict the outcome reasonably well; measured by the log likelihood.
  • Parsimony: a smaller, simpler model is preferred to a larger, more complicated model; measured by the number of parameters (fewer parameters = simpler model)
    We can use LRTs to compare models. In general, it’s advised to choose the simplest plausible model that fits the data reasonably well
59
Q

What factors determine simplicity in OLR?

A
  • Assuming proportional odds leads to a simpler model than non-proportional odds
  • Assuming linearity (in the log odds) is simpler than using non-linear terms (e.g., age2)
    But we should allow for non-proportional odds (e.g., via a partial proportional odds model) and/or non-linearity if we think that this improves the model fit