Count Data Flashcards
1
Q
Problem with count data
A
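- The normality assumption is violated
- The normal distribution is only for continuous variables; counts are discrete
- The normal distribution allows values below 0; counts cannot be negative
- The normal distribution is symmetrical; count distributions usually are not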
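The last two points can be checked with a minimal Python sketch (standard library only). The mean of 2 is an arbitrary illustrative value, not something from the cards. It compares a Poisson distribution with a normal distribution of the same mean and variance.

```python
import math

def poisson_pmf(k, mu):
    """P(Y = k) for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

mu = 2.0  # illustrative mean count (assumption)

# A Poisson variable has variance equal to its mean, so match the normal to that.
p_negative = normal_cdf(0.0, mean=mu, sd=math.sqrt(mu))

# Asymmetry: probabilities at equal distance below and above the mean differ.
p_below = poisson_pmf(0, mu)  # two below the mean
p_above = poisson_pmf(4, mu)  # two above the mean

print(f"normal with same mean/variance: P(X < 0) = {p_negative:.3f}")
print(f"Poisson: P(Y=0) = {p_below:.3f}, P(Y=4) = {p_above:.3f}")
```

The normal distribution puts noticeable probability on impossible negative counts, while the Poisson probabilities on either side of the mean are unequal.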
2
Q
GLM
A
- link function = log
- the expected value is log-transformed
-> expected value is always positive
But: the definition of a residual is no longer clear -> autoplot() automatically picks the right ones
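Why the log link keeps the expected value positive can be sketched in plain Python. The coefficients and the helper name `expected_count` are hypothetical and chosen only for illustration; a real GLM would estimate the coefficients from data.

```python
import math

def expected_count(intercept, slope, x):
    """Inverse of the log link: E[y] = exp(intercept + slope * x)."""
    linear_predictor = intercept + slope * x  # can be any real number
    return math.exp(linear_predictor)         # exp() is always > 0

# Hypothetical coefficients (assumption, not fitted values).
intercept, slope = 0.5, -1.2

for x in [-3, 0, 3, 10]:
    eta = intercept + slope * x
    mu = expected_count(intercept, slope, x)
    print(f"x={x:>3}  linear predictor={eta:7.2f}  expected count={mu:.6f}")
```

Even when the linear predictor is strongly negative, the expected count only approaches 0 and never drops below it.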
3
Q
Overdispersion
A
If Residual Deviance >> Df
-> too small p-values
-> probably some x are missing
-> p-value of the chi-squared test for overdispersion (in anova) will be very small
-> use Quasipoisson
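How Quasipoisson corrects the too-small p-values can be sketched in plain Python. Assumptions: made-up counts, an intercept-only model (so the fitted mean is simply the sample mean), and a dispersion parameter estimated from Pearson residuals. This is an illustration of the idea, not a replacement for fitting the model in R.

```python
import math

y = [0, 1, 0, 12, 3, 0, 7, 1, 0, 15]  # made-up counts (assumption)
n = len(y)
mu = sum(y) / n   # fitted mean of an intercept-only Poisson model
df = n - 1        # residual degrees of freedom

# Pearson estimate of the dispersion parameter
phi = sum((yi - mu) ** 2 / mu for yi in y) / df

# Standard error of the intercept on the log scale
se_poisson = 1.0 / math.sqrt(sum(y))     # Poisson fixes the dispersion at 1
se_quasi = se_poisson * math.sqrt(phi)   # quasi-Poisson scales by sqrt(phi)

print(f"dispersion estimate: {phi:.2f}")
print(f"SE Poisson: {se_poisson:.3f}, SE quasi-Poisson: {se_quasi:.3f}")
```

The coefficient estimate itself does not change; only the standard error grows, which makes the p-values larger.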
4
Q
Underdispersion
A
If Residual Deviance < Df
-> too large p-values
-> use Quasipoisson
5
Q
Zero-inflation
A
Overrepresentation of 0
-> use a different model that accounts for the excess zeros (e.g. a zero-inflated model)
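A rough first check for zero-inflation is to compare the observed share of zeros with the share a Poisson distribution with the same mean would predict. The sketch below uses made-up counts and plain Python; it is a quick diagnostic under these assumptions, not a formal test.

```python
import math

y = [0, 0, 0, 0, 0, 0, 4, 5, 6, 5]  # made-up counts with many zeros (assumption)
n = len(y)
mu = sum(y) / n

observed_zero_share = y.count(0) / n
expected_zero_share = math.exp(-mu)  # Poisson: P(Y = 0) = exp(-mu)

print(f"observed share of zeros: {observed_zero_share:.2f}")
print(f"expected under Poisson:  {expected_zero_share:.2f}")
```

If the observed share is far above the expected one, the zeros are overrepresented.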
6
Q
Dispersion
A
Dispersion = Residual Deviance / Df
> 1: Overdispersion
< 1: Underdispersion
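The ratio can be computed by hand for a very simple case. The following Python sketch assumes an intercept-only Poisson model (fitted mean = sample mean) and two made-up data sets; the function names are illustrative.

```python
import math

def poisson_deviance(y, fitted):
    """Residual deviance of a Poisson model with the given fitted means."""
    total = 0.0
    for yi, mi in zip(y, fitted):
        # y * log(y / mu) is taken as 0 when y == 0
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

def dispersion_ratio(y):
    """Residual deviance divided by residual df for an intercept-only model."""
    mean_y = sum(y) / len(y)
    fitted = [mean_y] * len(y)
    df = len(y) - 1  # one estimated parameter (the intercept)
    return poisson_deviance(y, fitted) / df

spread_out = [0, 1, 0, 12, 3, 0, 7, 1, 0, 15]  # made-up, highly variable counts
tight = [2, 3, 1, 4, 2, 3, 2, 3]               # made-up, very regular counts

ratio_over = dispersion_ratio(spread_out)
ratio_under = dispersion_ratio(tight)
print(f"spread-out counts: ratio = {ratio_over:.2f}")
print(f"tight counts:      ratio = {ratio_under:.2f}")
```

The first data set gives a ratio well above 1 (overdispersion), the second a ratio below 1 (underdispersion).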
7
Q
Link function
A
Used to transform the expected values of the response variable (y)
NOT of explanatory variables (x)
NOT of observed values of response variable (y)
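The difference between transforming the expected value and transforming the observed values can be seen in a small Python sketch with made-up counts. It uses the log as the link function.

```python
import math

y = [0, 1, 2, 5]  # made-up observed counts; note the zero
mean_y = sum(y) / len(y)

# What the log link does: transform the expected value.
log_of_mean = math.log(mean_y)

# Transforming the observations instead fails as soon as a count is 0.
try:
    mean_of_logs = sum(math.log(v) for v in y) / len(y)
except ValueError:
    mean_of_logs = None  # log(0) is undefined

# Even without zeros the two are not the same quantity.
positive = [1, 2, 5]
log_of_mean_pos = math.log(sum(positive) / len(positive))
mean_of_logs_pos = sum(math.log(v) for v in positive) / len(positive)

print(log_of_mean, mean_of_logs)
print(log_of_mean_pos, mean_of_logs_pos)
```

This is why a GLM with a log link is not the same as fitting a linear model to log-transformed counts.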