Models for Count Data II Flashcards

1
Q

When is Poisson regression appropriate?

A

When the number of events (counts) follows a Poisson distribution, conditional on the predictors

2
Q

What are the ways in which the assumptions for a Poisson regression can be violated?

A
  • Overdispersion (variance > mean)
  • Excess zeroes (more zeroes than in a Poisson distribution)
  • No zeroes
3
Q

What is equidispersion?

A

Assumption for Poisson regression - variance = mean
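A quick, informal way to look at equidispersion is to compare the sample mean and variance of the outcome. The sketch below uses made-up counts, and the function name `dispersion_ratio` is only illustrative. A ratio well above 1 hints at overdispersion, but this is a rough guide: the assumption concerns the counts conditional on the predictors, not the raw outcome.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Sample variance divided by sample mean (about 1 under equidispersion)."""
    return variance(counts) / mean(counts)

# Hypothetical counts, invented for illustration only.
counts = [0, 0, 1, 1, 2, 3, 8, 9]
ratio = dispersion_ratio(counts)
print(f"mean={mean(counts):.2f}, variance={variance(counts):.2f}, ratio={ratio:.2f}")
```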

4
Q

In social science, medicine, and health, in what way do count data violate assumptions for Poisson regression?

A

Overdispersion (variance > mean)

5
Q

Models for count data: equidispersion and zeroes as expected

A

Poisson

6
Q

Models for count data: equidispersion and excess zeroes

A

Zero-inflated Poisson

7
Q

Models for count data: Equidispersion and no zeroes

A

Zero-truncated Poisson

8
Q

Models for count data: Overdispersion and zeros as expected

A

Negative binomial

9
Q

Models for count data: Overdispersion and excess zeroes

A

Zero-inflated negative binomial

10
Q

Models for count data: Overdispersion and no zeroes

A

Zero-truncated negative binomial

11
Q

What about underdispersion?

A

This can occur in principle, but is rare in practice (variance < mean)

12
Q

What happens if you use the Poisson distribution even though your data are overdispersed? Or use a model that doesn’t consider excess or no zeroes when it should?

A

Coefficient estimates may be biased and/or misleading (i.e., slope coefficients may not be a good estimate of relationship between predictor(s) and outcome)

13
Q

What are the implications for SEs when not considering overdispersion?

A

They may be underestimated. This implies that your p-values would be too small and your CIs too narrow, increasing the risk of Type I error

14
Q

What choices do you have when outcome is overdispersed?

A
  • Negative binomial regression (or other models accounting for overdispersion)
  • Poisson regression with robust SEs
15
Q

What are robust SEs?

A

SEs adjusted so that they are robust to violations of the assumptions of Poisson regression (e.g., equidispersion). Robust SEs are usually larger than those from a typical Poisson regression. Considered a more cautious way of analysing the data
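To see why robust SEs tend to be larger when counts are overdispersed, consider the simplest case: a Poisson regression with only an intercept, where the estimate is log(mean). The sketch below is an illustration under that assumption, using made-up counts and a basic sandwich formula without any small-sample correction, so software output may differ slightly. The function name is hypothetical.

```python
import math

def intercept_only_poisson_ses(counts):
    """Model-based and robust (sandwich) SEs for the intercept of an
    intercept-only Poisson regression, where the estimate is log(mean)."""
    n = len(counts)
    mu = sum(counts) / n                        # fitted mean for every observation
    info = n * mu                               # information: sum of fitted means
    model_se = math.sqrt(1 / info)              # assumes variance = mean
    meat = sum((y - mu) ** 2 for y in counts)   # observed squared residuals
    robust_se = math.sqrt(meat) / info          # sandwich: info^-1 * meat * info^-1
    return model_se, robust_se

counts = [0, 0, 1, 1, 2, 3, 8, 9]  # hypothetical, overdispersed counts
model_se, robust_se = intercept_only_poisson_ses(counts)
print(f"model-based SE = {model_se:.3f}, robust SE = {robust_se:.3f}")
```

In this special case the squared ratio of the two SEs equals the variance-to-mean ratio of the counts, so the robust SE exceeds the model-based SE exactly when the data are overdispersed.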

16
Q

What is the most commonly used overdispersed distribution?

A

Negative binomial

17
Q

What are the parameters of a negative binomial distribution?

A

Mean, µ, and a dispersion parameter α
The mean and variance are related (as opposed to in the normal distribution where they are independent): var(Y) = µ + αµ^2
In Poisson, we just have one parameter (mean) as variance is equal to the mean

18
Q

What values can the dispersion parameter α take?

A

Values of 0 or larger (can never be negative)
- if α = 0, we have a Poisson distribution (with equidispersion)
- if α > 0, we have an overdispersed distribution
The larger the α, the larger the variance relative to the mean
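The role of α can be checked numerically. The sketch below writes the negative binomial probabilities in terms of the mean µ and dispersion α (with variance µ + αµ^2), then confirms that the variance grows with α and that a very small α gives nearly Poisson probabilities. The function names and the chosen values are illustrative assumptions.

```python
import math

def negbin_pmf(y, mu, alpha):
    """P(Y = y) for a negative binomial with mean mu and dispersion alpha > 0
    (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha                 # "size" parameter
    p = r / (r + mu)
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_pmf(y, mu):
    """P(Y = y) for a Poisson distribution with mean mu."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

mu = 3.0  # hypothetical mean
for alpha in (1.0, 0.5, 0.001):
    probs = [negbin_pmf(y, mu, alpha) for y in range(400)]
    m = sum(y * p for y, p in enumerate(probs))
    v = sum((y - m) ** 2 * p for y, p in enumerate(probs))
    print(f"alpha={alpha}: mean={m:.3f}, variance={v:.3f}, P(Y=0)={probs[0]:.3f}")
print(f"Poisson: P(Y=0)={poisson_pmf(0, mu):.3f}")
```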

19
Q

Are there other ways of relating the mean to the variance in negative binomial regression?

A

Yes. The variance can be related to the mean in different ways (e.g., var(Y) = µ + αµ instead of var(Y) = µ + αµ^2). The choice can slightly change the estimates and sometimes slightly improves the model

20
Q

In the negative binomial distribution, what does larger dispersion imply?

A

Larger variance

21
Q

What is the shape of the negative binomial distribution?

A

Tails are much heavier than when the dispersion is equal to 0 (Poisson)

22
Q

Overall comparison of properties of Poisson and negative binomial distributions:

A

Poisson:
- Equidispersed
- One parameter (µ = mean = variance)
- Var(Y) = µ
Negative binomial:
- Overdispersed
- Two parameters (µ = mean; α = dispersion)
- Var(Y) = µ + αµ^2

23
Q

In the negative binomial distribution, what is this way of specifying the variance called? Var(Y) = µ + αµ^2

A

NB2-parameterisation
There are other options e.g., the NB1-parameterisation: var(Y) = µ + αµ
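The two variance functions can be compared side by side. This is a small sketch with a made-up dispersion value; the function names are illustrative.

```python
def nb2_variance(mu, alpha):
    """NB2: variance grows quadratically with the mean."""
    return mu + alpha * mu ** 2

def nb1_variance(mu, alpha):
    """NB1: variance is a constant multiple (1 + alpha) of the mean."""
    return mu + alpha * mu

alpha = 0.5  # hypothetical dispersion value
for mu in (1, 5, 20):
    print(mu, nb2_variance(mu, alpha), nb1_variance(mu, alpha))
```

With α = 0 both functions return µ, which is the Poisson case.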

24
Q

Negative binomial regression equation:

A

log(µi) = β0 + β1X1i + β2X2i + … + βkXki
yi ~ NegBin(µi, α), var(yi) = µi + αµi^2
Where ‘i’ represents each observation
- This looks similar to a Poisson regression. Again, we model the log of the expected count (a log link) rather than log-transforming the outcome itself. The difference is that we now have an additional parameter in the model, the dispersion, α, which we need to estimate. The dispersion parameter governs the extent of overdispersion

25
Q

On what scale are the coefficients from the negative binomial regression?

A

Log-scale. As with Poisson, they can be exponentiated to get the IRRs and 95% CIs for IRRs
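Exponentiating by hand looks like the sketch below. The coefficient and standard error are invented for illustration, the function name is hypothetical, and the interval is a simple Wald interval computed on the log scale and then exponentiated.

```python
import math

def irr_with_ci(coef, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald 95% CI limits."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)

# Hypothetical coefficient and standard error, not taken from any real model.
irr, lower, upper = irr_with_ci(coef=-0.12, se=0.05)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A coefficient of 0 gives an IRR of exactly 1, and a negative coefficient gives an IRR between 0 and 1.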

26
Q

Why exponentiate coefficients from the log scale to IRRs?

A

IRRs are more interpretable than log-scale coefficients, as they describe multiplicative changes on the scale of the count variable

27
Q

If coefficient on the log scale is negative, what will the IRR be?

A

Less than one (between 0 and 1)

28
Q

In an output from negative binomial regression, what does /lnalpha mean?

A

The log of the dispersion parameter α. It only needs interpreting if using predictors to predict dispersion rather than assuming it’s constant

29
Q

How can we get the output on the count scale?

A

Using Stata to exponentiate the coefficients

30
Q

Do Poisson and negative binomial regression estimated on the same data give the same results?

A

No - neither estimated coefficients nor SEs will be the same

31
Q

What does zero-truncation often result from?

A

Zeroes being unobservable or ‘impossible’ e.g.,:
- Number of days in hospital for hospitalised stroke patients
- Number of appointments with a psychotherapist

32
Q

How do regression models for zero-truncated data work?

A

Essentially in the same way as ordinary regression models
- There is zero-truncated Poisson and zero-truncated negative binomial regression. In either of these two models, predicted values will have a minimum value of 1. Otherwise, the interpretation of coefficients is the same as in ordinary Poisson or negative binomial regression
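The idea behind zero-truncation can be sketched as follows: the ordinary Poisson probabilities for 1, 2, 3, ... are rescaled so that they sum to 1 once zero is ruled out. The value of `mu` and the function name are assumptions made for illustration.

```python
import math

def zt_poisson_pmf(y, mu):
    """P(Y = y | Y > 0) for a Poisson count with untruncated mean mu."""
    if y < 1:
        return 0.0
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    return poisson / (1 - math.exp(-mu))   # rescale by P(Y > 0)

mu = 1.5  # hypothetical value
probs = [zt_poisson_pmf(y, mu) for y in range(100)]
truncated_mean = sum(y * p for y, p in enumerate(probs))
print(f"sum={sum(probs):.4f}, mean of truncated counts={truncated_mean:.3f}")
```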

33
Q

Does absence of zeroes necessarily imply zero-truncation?

A

No - we may not have observed zeroes ‘by chance’, even though zeroes are possible

34
Q

On what basis is the decision to use a zero-truncated model made?

A

Knowledge about how the data were collected, rather than based on noticing that there are no zeroes

35
Q

With excess zeroes, what are the theoretical bases about the origins of the zeroes?

A
  • Zero-inflation: where zeroes can come about in two different ways
  • Hurdle models: Where the zeroes and the non-zero counts are caused by separate processes
36
Q

In what two ways can zeroes come about?

A

Structural zeroes vs sampling zeroes

37
Q

Consider a research example of a zero-inflated distribution: “how many joints (cannabis) did you smoke last week?” The answer is zero for:

A
  • Structural zeroes: non-smokers of joints
  • Sampling zeroes: cannabis users who happened to not smoke last week
38
Q

What are hurdle models?

A

All zeroes are assumed to be structural, and there are no sampling zeroes
The distribution of non-zero counts is zero-truncated, and the zeroes are governed by a totally different process
E.g., number of appointments with a psychotherapist after GP referral:
- Structural zeroes: Some patients never go to see a therapist
- Zero-truncated counts: Those who go to see a therapist have at least one appointment
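A Poisson hurdle distribution can be sketched by combining a probability of zero with zero-truncated Poisson probabilities for the positive counts. The parameter values and the function name below are hypothetical.

```python
import math

def hurdle_poisson_pmf(y, p_zero, mu):
    """Poisson hurdle: zeroes occur with probability p_zero; positive counts
    follow a zero-truncated Poisson with parameter mu."""
    if y == 0:
        return p_zero
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    return (1 - p_zero) * poisson / (1 - math.exp(-mu))

# Hypothetical values: 40% never attend; attenders have Poisson parameter 2.5.
probs = [hurdle_poisson_pmf(y, p_zero=0.4, mu=2.5) for y in range(100)]
print(f"P(Y=0)={probs[0]:.2f}, P(Y=1)={probs[1]:.3f}, total={sum(probs):.4f}")
```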

39
Q

What is a mixture distribution?

A

One distribution governs zeroes and another governs the counts (zero-truncated Poisson counts)

40
Q

Models for excess zeroes:

A

Equidispersion:
- Zero-inflated Poisson
- Poisson hurdle model
Overdispersion:
- Zero-inflated negative binomial
- Negative binomial hurdle model

41
Q

How do you know which model for excess zeroes to use?

A

Often depends on knowledge/theory of how zeroes come about
Sometimes the zero-generating process is unknown; then a pragmatic decision might be made (e.g., based on model fit)

42
Q

Relative to the mean, what indicates overdispersion?

A

Counts that are large relative to the mean

43
Q

How can a zero-inflated model be expressed mathematically?

A

P(Yi = 0) = πi + (1 - πi)e^(-μi)
P(Yi = y) = (1 - πi)μi^y e^(-μi) / y!, y ≥ 1
The first equation describes the probability of observing zero events. This probability is the sum of the probability of a structural zero (πi) and the probability of a sampling zero [(1 - πi)e^(-μi)]
The second equation describes the probability of observing 1, 2, 3 or more events
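The two equations translate directly into a short function. This is a sketch with made-up values of π and μ; the function name is illustrative.

```python
import math

def zip_pmf(y, pi, mu):
    """Zero-inflated Poisson: pi is the probability of a structural zero,
    mu the Poisson mean for everyone else."""
    poisson = math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))
    if y == 0:
        return pi + (1 - pi) * poisson      # structural + sampling zeroes
    return (1 - pi) * poisson

pi, mu = 0.3, 2.0  # hypothetical values
probs = [zip_pmf(y, pi, mu) for y in range(100)]
zip_mean = sum(y * p for y, p in enumerate(probs))  # equals (1 - pi) * mu
print(f"P(Y=0)={probs[0]:.3f} vs plain Poisson {math.exp(-mu):.3f}")
print(f"mean={zip_mean:.3f}")
```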

44
Q

How many parameters in a zero-inflated model?

A

Two - π and μ, each of which appears in both equations. But we model these parameters separately
The probability π of having a structural zero is modelled via a logistic regression:
logit(πi) = γ0 + γ1X1i + γ2X2i + … - here, the coefficients are labelled ‘γ’ to make clear they are not the same coefficients as those in the Poisson part of the model
The mean μ of the counts that are not structural zeroes is modelled via a Poisson regression:
log(μi) = β0 + β1X1i + β2X2i + …
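For a single observation, the two linear predictors turn into π and μ as sketched below. All coefficients and predictor values are invented, and the function name is hypothetical. For simplicity both parts use the same predictors here; setting a coefficient to 0 drops a predictor from that part.

```python
import math

def zip_parameters(x, gammas, betas):
    """Return (pi, mu) for one observation with predictor values x.
    gammas: logistic-part coefficients; betas: Poisson-part coefficients.
    The first entry of each is the intercept."""
    eta_zero = gammas[0] + sum(g * xi for g, xi in zip(gammas[1:], x))
    eta_count = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    pi = 1 / (1 + math.exp(-eta_zero))   # inverse logit
    mu = math.exp(eta_count)             # inverse of the log link
    return pi, mu

# Hypothetical coefficients and predictor values.
pi, mu = zip_parameters(x=[1.0, 0.5], gammas=[-1.0, 0.8, 0.0], betas=[0.2, -0.3, 0.4])
print(f"pi={pi:.3f}, mu={mu:.3f}, expected count={(1 - pi) * mu:.3f}")
```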

45
Q

In a zero-inflated model, do the predictors need to be in both model parts (logistic + Poisson)?

A

No - you can choose to have different predictors in each part of the model, or to use some predictors in both model parts, and other predictors only in one of them

46
Q

In a ZIP model, how are structural zeroes modelled?

A

Using a logistic regression

47
Q

In a ZIP model, if a coefficient is positive in the logistic part, what does that indicate?

A

More zeroes, i.e., a smaller expected count of the outcome variable (the same direction of effect as a negative coefficient in the Poisson part)

48
Q

In a ZIP model, how are both model parts related?

A

They are both dependent on one another - changing something in the zero-inflation part will change the estimates in the Poisson part and vice versa

49
Q

How does ZI negative binomial regression work in relation to a ZI poisson model?

A

In essentially the same way, except that the counts are assumed to follow a negative binomial distribution

50
Q

In a ZIP model, how are coefficients interpreted? Consider an example IRR of 0.89 (95% CI 0.80-0.98) corresponding to a 10 percentage point difference in the proportion of lower class citizens, in relation to police operations

A

As in an ordinary Poisson regression, but being mindful that our estimates are conditional on how we adjust for zero-inflation (e.g., if we change predictors in the ZI part, the coefficient estimates in the Poisson part will also change). For example: “Our model estimates that a 10 percentage point difference in the proportion of lower class citizens is associated with fewer police operations by a factor of 0.89, adjusting for zero-inflation, where zeroes are predicted by lower10, vendors, and population.”

51
Q

How do you interpret the logistic part in a ZIP? Interpret in relation to police operations

A

The logistic part predicts zeroes. Thus:
- An OR > 1 indicates that a predictor is associated with more zeroes (fewer police operations)
- An OR < 1 indicates a predictor is associated with fewer zeroes (more police operations)
Interpretation needs to consider how we model the non-zero counts, i.e., the estimates from the logistic part are adjusted for the Poisson part.

52
Q

How are count regression models estimated?

A

By maximum likelihood

53
Q

What models can be compared using LRTs?

A

Models of the same type (e.g., negative binomial, ZIP) can be compared using LRTs
But:
- models without zero-inflation are not nested within zero-inflated models
- although the Poisson model is nested within the negative binomial model (as it is a special case of negative binomial regression), the ordinary LRT cannot be applied

54
Q

Is a Poisson model nested within a negative binomial regression?

A

Yes - it is nested within a negative binomial regression with the same predictors, because if the dispersion α = 0, then the two models are identical
But the ordinary LRT would give misleading results (the p-value would be too large and we would reject H0 too rarely)

55
Q

Why can we not use an ordinary LRT with a Poisson and negative binomial regression model?

A

α cannot be negative - so the test value 0 is “on the boundary of the parameter space.”
We therefore need to use a special test called the boundary LRT

56
Q

Hypotheses for the boundary LRT to compare Poisson and negative binomial models:

A

H0: α = 0
Or: “There is no overdispersion”
Or: “The NegBin model is no better than the Poisson model.”
H1: α > 0
Or: “There is overdispersion”
Or: “The NegBin model is better than the Poisson model”
A small p-value indicates evidence in favour of overdispersion, i.e., evidence against the Poisson model

57
Q

Where is the boundary LRT displayed in Stata?

A

In the output of a NegBin model alongside <chibar2(01)> for α = 0
It is displayed by default (compares negative binomial regression with Poisson regression model with the same predictors)

58
Q

What statistic does boundary LRT use?

A

Same as ordinary LRT, but calculates p-value in a different way
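The difference in the p-value calculation can be sketched as follows. The reference distribution chibar2(01) is a 50:50 mixture of a point mass at 0 and a chi-square distribution with 1 degree of freedom, so for a positive test statistic the boundary p-value is half the ordinary one. The log likelihoods below are invented, and the function names are hypothetical.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2))

def boundary_lrt_pvalue(ll_poisson, ll_negbin):
    """p-value for H0: alpha = 0, using a 50:50 mixture of a point mass at 0
    and a chi-square(1) as the reference distribution."""
    lr = max(0.0, 2 * (ll_negbin - ll_poisson))   # same statistic as the ordinary LRT
    if lr == 0:
        return lr, 1.0
    return lr, 0.5 * chi2_1_sf(lr)

# Hypothetical log likelihoods, invented for illustration.
lr, p_boundary = boundary_lrt_pvalue(ll_poisson=-520.4, ll_negbin=-518.5)
print(f"LR = {lr:.2f}, ordinary p = {chi2_1_sf(lr):.3f}, boundary p = {p_boundary:.3f}")
```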

59
Q

If one variable is overdispersed, does that necessarily mean that overdispersion also exists in models of the same data?

A

No - sometimes additional predictors explain away the overdispersion

60
Q

Why can LRTs not be used to assess whether zero-inflation improves a model?

A

Models without zero-inflation are not nested in models with zero-inflation
We can use Akaike’s Information Criterion (AIC) instead

61
Q

What is AIC?

A

Can be used to compare nested and non-nested models
Is an information criterion, not a statistical test

62
Q

How is AIC expressed?

A

AIC = -2 x LL + 2 x k
Where:
- LL: Log likelihood
- k: number of parameters
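The formula is simple enough to compute by hand. The sketch below uses made-up log likelihoods and parameter counts for three models assumed to be fitted to the same data, and lists them from smallest to largest AIC.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: -2 * LL + 2 * k."""
    return -2 * log_likelihood + 2 * n_params

# Hypothetical log likelihoods and parameter counts for models fitted to the same data.
models = {"Poisson": (-520.4, 3), "NegBin": (-518.5, 4), "ZIP": (-510.2, 6)}
for name, (ll, k) in sorted(models.items(), key=lambda item: aic(*item[1])):
    print(f"{name}: AIC = {aic(ll, k):.1f}")
```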

63
Q

What does AIC do?

A

Weighs model fit (represented by the LL) against parsimony (represented by k)

64
Q

How to interpret AIC:

A

A smaller AIC indicates a better model
The numeric value of the AIC has no meaning in itself; it is meaningful only when comparing the AICs of different models estimated on the same data (with the same sample size)

65
Q

How many additional parameters does negative binomial regression have in relation to the other models encountered?

A

One (dispersion)

66
Q

What sentence needs to be included when interpreting zero-inflated models (logistic part)?

A

“adjusted for all other variables in the model, including both the Poisson and the logit parts”

67
Q

What are differences in the estimates from ZINB and ZIP model?

A

Estimates from ZIP and ZINB models are similar but not identical; different assumptions -> different results
SEs are larger in the ZINB model, leading to wider CIs

68
Q

AIC - notes:

A
  • AIC has general applicability beyond count regression
  • Can be used to compare nested or non-nested models
  • AIC is one of several information indices: different information indices vary in the way they weight model fit and parsimony
69
Q

Zero-inflated models - summary:

A
  • Excess zeroes can sometimes be accounted for by explanatory variables
  • Poisson and negative binomial models are not nested in their zero-inflated counterparts
  • Use AIC (or other information criteria) for comparison