Module 7: Count Data Flashcards
What is count data?
- Dependent variable is a nonnegative integer
- Often takes only a few small discrete values
- Zeros present
- Right skew
- Heteroskedastic
Ex. Fertility, airline safety, visits to the doctor, days in hospital
List the types of count data models.
- Poisson
- Negative binomial
- Zero-inflated (both Poisson and negative binomial)
Define equidispersion.
When the mean of a distribution is equal to the variance.
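A quick way to see whether a sample looks equidispersed is to compare its variance with its mean. The sketch below uses made-up counts (the `children` list is hypothetical, not course data); a ratio near 1 is consistent with equidispersion, and a ratio well above 1 suggests overdispersion.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Return the sample variance divided by the sample mean."""
    return variance(counts) / mean(counts)

# Hypothetical counts, invented for illustration only
children = [0, 1, 1, 2, 2, 2, 3, 3, 4, 7]

ratio = dispersion_ratio(children)
print(f"mean = {mean(children):.2f}, variance = {variance(children):.2f}, ratio = {ratio:.2f}")
```

This is only a descriptive check on the raw y; the formal comparison is made after fitting the models.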
Given the following Poisson output, interpret age.
y | coeff. | std. err.
age | 0.0676 | 0.001
educ | -0.0509 | 0.002
1.urban | -0.099 | 0.0217
1.usemeth | 0.517 | 0.026
Where the dependent variable is number of children.
Each additional year of age is associated with about 6.8% more children (100 x coefficient).
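The percentage reading of a Poisson coefficient comes from exponentiating it. A minimal sketch, using the coefficients from the output above (the function name `pct_change` is just an illustrative helper), compares the exact value with the 100 x coefficient shortcut:

```python
import math

def pct_change(coef):
    """Exact percent change in the expected count for a 1-unit increase in x."""
    return (math.exp(coef) - 1) * 100

age_exact = pct_change(0.0676)   # exact formula: (e^b - 1) * 100
age_approx = 0.0676 * 100        # shortcut, reasonable only for small coefficients

print(f"age: exact {age_exact:.1f}%, shortcut {age_approx:.1f}%")
print(f"usemeth: exact {pct_change(0.517):.1f}%")
```

The shortcut and the exact value stay close for small coefficients such as age, but drift apart for larger ones such as usemeth, which is why the exact formula is used there.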
Given the following Poisson output, interpret usemeth.
y | coeff. | std. err.
age | 0.0676 | 0.001
educ | -0.0509 | 0.002
1.urban | -0.099 | 0.0217
1.usemeth | 0.517 | 0.026
Where y is number of children and usemeth is a dummy for birth control.
Using birth control is associated with 68% more children.
(e^0.5171)-1 = .68
How do you choose between the Poisson model and the negative binomial model?
H0: Poisson, equidispersion (alpha = 0)
H1: negative binomial, overdispersion (alpha > 0)
Use the chi^2 test on alpha: if p < 0.05, reject the null and the negative binomial is preferred; otherwise the Poisson model is adequate.
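The test on alpha can be sketched as a likelihood-ratio test between the two fitted models. The log-likelihood values below are hypothetical placeholders, not results from the course data, and the halved p-value assumes the usual boundary adjustment because alpha cannot be negative.

```python
import math

def lr_test_alpha(ll_poisson, ll_negbin):
    """Likelihood-ratio test of H0: alpha = 0 (Poisson) vs H1: alpha > 0 (NB)."""
    stat = 2 * (ll_negbin - ll_poisson)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2))
    p_chi2 = math.erfc(math.sqrt(max(stat, 0.0) / 2))
    # alpha = 0 lies on the boundary of the parameter space, so halve the p-value
    return stat, p_chi2 / 2

# Hypothetical log-likelihoods, for illustration only
stat, p_value = lr_test_alpha(ll_poisson=-2100.0, ll_negbin=-2050.0)

if p_value < 0.05:
    print(f"chi2 = {stat:.1f}: reject H0, negative binomial preferred")
else:
    print(f"chi2 = {stat:.1f}: fail to reject H0, Poisson is adequate")
```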
When do you use zero-inflated models?
- Have excess zeros in y
- Poisson and NB under-predict the zeros (e.g., nonuse)
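To see why a plain Poisson model can under-predict zeros, compare the zero probability it implies with the one from a zero-inflated Poisson. In this sketch `lam` (the Poisson mean) and `pi` (the share of "always zero" observations) are hypothetical values chosen for illustration.

```python
import math

def poisson_p0(lam):
    """P(y = 0) under a Poisson model with mean lam."""
    return math.exp(-lam)

def zip_p0(lam, pi):
    """P(y = 0) under a zero-inflated Poisson: always-zero share plus Poisson zeros."""
    return pi + (1 - pi) * math.exp(-lam)

lam, pi = 2.5, 0.2  # hypothetical parameter values
print(f"Poisson P(0) = {poisson_p0(lam):.3f}")
print(f"ZIP     P(0) = {zip_p0(lam, pi):.3f}")
```

With no inflation (`pi = 0`) the two probabilities coincide, so the extra zeros come entirely from the inflation part.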
How do you interpret a coefficient for a zero-inflated Poisson model, inflated on the constant?
Each 1-unit increase in x is associated with ___% more y.
Given the following zero-inflated Poisson output, interpret age.
y | coef. | std. err.
age | 0.0561 | 0.0013
educ | -0.043 | 0.002
inflated
age | -0.8027 | 0.067
educ | 0.209 | 0.0518
(Inflated on the x-variables)
Each additional year of age is associated with about 5.6% more children.
Each additional year of age is associated with a 55% decrease in the odds of having no children, i.e. of being in the always-zero group. (exact formula: e^-0.8027 - 1 = -0.55)
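The 55% figure comes from exponentiating the inflation coefficient. The sketch below assumes the inflation equation is a logit (a common default, but an assumption here), in which case the exact formula describes a change in the odds of being an "always zero" observation; the implied change in probability depends on the starting point. The 0.5 starting probability is a hypothetical value.

```python
import math

coef_age_inflate = -0.8027  # age coefficient in the inflated equation above

# Multiplicative change in the odds of "always zero" per extra year of age
odds_ratio = math.exp(coef_age_inflate)
pct_change_odds = (odds_ratio - 1) * 100

# Translate to a probability from a hypothetical starting probability of 0.5
p_start = 0.5
odds_new = (p_start / (1 - p_start)) * odds_ratio
p_new = odds_new / (1 + odds_new)

print(f"odds change: {pct_change_odds:.0f}%")
print(f"probability moves from {p_start:.2f} to {p_new:.2f}")
```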
How do you choose between each of the count data models?
- Look at histograms
- Significance of alpha (the overdispersion parameter)
- Correlation between y and predicted y
- Predicted probabilities of various y outcomes
- AIC/BIC
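For the AIC/BIC comparison, both criteria are computed from each model's log-likelihood, and the model with the smaller value is preferred. A minimal sketch follows; the log-likelihoods, parameter counts, and sample size are hypothetical numbers used only to show the arithmetic.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

n = 1000  # hypothetical sample size
# Hypothetical (log-likelihood, number of parameters) per model
models = {"Poisson": (-2100.0, 5), "Negative binomial": (-2050.0, 6)}

scores = {name: (aic(ll, k), bic(ll, k, n)) for name, (ll, k) in models.items()}
for name, (a, b) in scores.items():
    print(f"{name}: AIC = {a:.1f}, BIC = {b:.1f}")

best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

BIC penalizes extra parameters more heavily than AIC once n is moderately large, so the two criteria can disagree when models differ in complexity.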
T/F: Linear models have probabilities and AIC/BIC
False: They do not have these.