W5: GLM 2 Flashcards
What is the y variable for poisson distribution?
Discrete, whole, non-negative numbers (counts: 0, 1, 2, ...)
No negative values and no decimals
What type of distribution should be used for this RQ:
“Examining risk factors for the number of accidents someone gets into over a 12 month period”
Poisson
What type of distribution should be used for this RQ:
“Evaluating whether an intervention reduced the number of times someone missed their medication in the last month”
Poisson
What type of distribution should be used for this RQ:
“Testing whether the total number of health care appointments over six months can be lowered by treating mental health”
Poisson
How many parameters does the Poisson distribution have and what are they called?
1 parameter: Lambda
Lambda is both the mean AND the variance
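A quick numerical check of this card, as a minimal sketch in base R. The mean and variance are computed directly from the Poisson probabilities; the value lambda = 3 and the cutoff of 200 are illustrative assumptions, not values from the course.

```r
# Check that a Poisson distribution's mean and variance both equal lambda.
lambda <- 3                  # illustrative value (assumption)
k <- 0:200                   # counts; probabilities beyond 200 are negligible here
p <- dpois(k, lambda)        # P(Y = k) for each count

pois_mean <- sum(k * p)                   # expected value
pois_var  <- sum((k - pois_mean)^2 * p)   # variance

print(c(mean = pois_mean, variance = pois_var))
```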
What does the Poisson distribution look like when lambda gets higher (e.g., when lambda = 10)?
More like a normal distribution
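One way to see this is through skewness, which is 0 for a normal distribution. The sketch below uses a hypothetical helper, pois_skewness(), written only for this illustration, to compute skewness from the Poisson probabilities for a few arbitrary lambda values.

```r
# Skewness of a Poisson distribution, computed from its probabilities.
# pois_skewness() is a hypothetical helper for illustration only.
pois_skewness <- function(lambda, k_max = 1000) {
  k <- 0:k_max
  p <- dpois(k, lambda)
  mu <- sum(k * p)                        # mean
  sigma <- sqrt(sum((k - mu)^2 * p))      # standard deviation
  sum(((k - mu) / sigma)^3 * p)           # third standardized moment
}

skew_1   <- pois_skewness(1)
skew_10  <- pois_skewness(10)
skew_100 <- pois_skewness(100)

# Skewness shrinks toward 0 (more symmetric, more normal-looking) as lambda grows
print(c(lambda_1 = skew_1, lambda_10 = skew_10, lambda_100 = skew_100))
```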
What linear regression assumption is violated for both Poisson (count) and logistic (binary) outcomes?
Normality assumption
Why can’t we use linear regression for count outcomes, and why use Poisson instead?
A straight line is a bad fit for outcomes that cannot be negative (it can predict counts below 0)
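A small sketch of the problem, using made-up data in which the count doubles with each unit of x. The data frame d and its columns are invented for illustration.

```r
# Made-up count data: the outcome doubles for every 1 unit increase in x
d <- data.frame(x = 0:5, y = c(1, 2, 4, 8, 16, 32))

m_lin  <- lm(y ~ x, data = d)                          # straight line
m_pois <- glm(y ~ x, data = d, family = poisson())     # log link

pred_lin  <- predict(m_lin)                       # can go below 0
pred_pois <- predict(m_pois, type = "response")   # always above 0

print(round(cbind(x = d$x, y = d$y, linear = pred_lin, poisson = pred_pois), 2))
```

With these data the straight line predicts a negative count at the low end of x, while the Poisson predictions stay positive.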
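What is the link function for the Poisson distribution and what does it do?
Natural log: eta = ln(lambda)
Transforms lambda so the linear predictor (eta) is unbounded, and back-transformed predictions never go below 0
Unbounds the lower end of lambda on the y axis (left side of the graph)
As lambda approaches 0, ln(lambda) approaches negative infinity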
After the link transformation, what range do the values fall between for Poisson and logistic regression?
Negative infinity to positive infinity
i.e., a continuous, unbounded outcome that a linear model can be applied to
What is the variance if lambda is 0?
0
What does the inverse link function do for Poisson and logistic distribution?
Poisson: y axis (left side of graph) falls back to the original count scale (between 0 and positive infinity)
Logistic: y axis falls back to probability scale (between 0 and 1)
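The link and inverse link round trip can be sketched in base R, using exp() and log() for Poisson and plogis() and qlogis() for logistic. The eta values are arbitrary numbers chosen for illustration.

```r
# Values on the link scale can be anything from -Inf to +Inf
eta <- c(-10, -1, 0, 1, 10)

# Inverse links: back to the original scale
lambda <- exp(eta)       # Poisson: counts scale, always above 0
prob   <- plogis(eta)    # logistic: probability scale, between 0 and 1

# Links: back to the unbounded scale
eta_from_lambda <- log(lambda)     # natural log
eta_from_prob   <- qlogis(prob)    # logit (log odds)

print(cbind(eta, lambda, prob))
print(log(0))   # the lower bound of lambda maps to negative infinity
```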
What are the 3 assumptions of Poisson and logistic regression?
- Errors must be independent
- Assumes linear relationship on the link (natural log / logit) scale
- Requires large sample size (no dfs, so a large sample is needed for the parameter estimates to be approximately normally distributed)
What argument must be added to glm() and testDistribution() for Poisson and logistic regression?
glm(y ~ x, data = d, family = poisson())
or family = binomial()
testDistribution(d$awards, distr = "poisson")
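A minimal runnable sketch of the family argument, using simulated data. The data frame d, the variables math, num_awards and passed, and all simulation settings are assumptions made up for illustration. testDistribution() is left out so that the example needs only base R.

```r
set.seed(1)   # make the simulated data reproducible

# Simulated data (illustrative only)
d <- data.frame(math = round(rnorm(200, mean = 50, sd = 10)))
d$num_awards <- rpois(200, lambda = exp(-3 + 0.05 * d$math))            # count outcome
d$passed <- rbinom(200, size = 1, prob = plogis(-5 + 0.1 * d$math))     # binary outcome

# Count outcome: Poisson regression (log link by default)
m_pois <- glm(num_awards ~ math, data = d, family = poisson())

# Binary outcome: logistic regression (logit link by default)
m_logit <- glm(passed ~ math, data = d, family = binomial())

print(family(m_pois))
print(family(m_logit))
```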
How do you interpret the estimate for the predictor using Poisson regression?
glm(num_awards ~ math, data = d, family = poisson())
Each 1 unit higher math score is associated with x high…
log awards
Instead of interpreting Poisson regressions on log scale, what should we use instead and how do we get that?
Incidence rate ratios (IRRs), obtained by exponentiating the regression coefficients.
What do IRRs indicate (Poisson)?
How many times higher the expected count of y is for a 1 unit increase in x
i.e., the ratio (multiplicative change) in the expected count of y
If IRR = 4 and base rate = 2, what is the expected outcome after a 1 unit increase in the predictor?
2 * 4 = 8 (4 times the base rate)
What does it mean if IRR or OR = 1?
What would the coeff value be on link (log / log odds) scale?
The expected count does not change (base rate x 1 = base rate)
or the odds of the outcome do not change (base odds x 1 = base odds)
Coeff of 1 on IRR or OR scale = coeff of 0 on link scale
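The arithmetic behind the last two cards, as a short base R sketch. The variable names are made up for illustration; the numbers are the ones on the cards.

```r
base_rate <- 2   # expected count at the reference value of the predictor
irr <- 4         # incidence rate ratio for a 1 unit increase

after_one_unit  <- base_rate * irr     # multiply once per unit
after_two_units <- base_rate * irr^2   # IRRs multiply, they do not add

coef_link_scale <- log(irr)   # the same effect on the link (log) scale
no_effect_ratio <- exp(0)     # a coefficient of 0 gives an IRR (or OR) of 1

print(c(one_unit = after_one_unit, two_units = after_two_units))
print(c(coef = coef_link_scale, ratio_when_coef_is_0 = no_effect_ratio))
```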
What 3 things should you not exponentiate?
p-values, z values, standard errors
What are 2 things you can exponentiate for poisson regression?
regression coefficients : exp(coef)
confidence intervals : exp(confint)
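A sketch of both steps on made-up data in which the count doubles per unit of x, so the IRR for x should come out at about 2. The data frame d and the model name m are assumptions for illustration.

```r
# Made-up data: the count doubles for every 1 unit increase in x
d <- data.frame(x = rep(0:3, each = 5))
d$y <- 2^d$x

m <- glm(y ~ x, data = d, family = poisson())

irr    <- exp(coef(m))                         # coefficients -> IRRs
irr_ci <- exp(suppressMessages(confint(m)))    # CI limits -> IRR scale

# p-values, z values and standard errors are left on their original scale
print(round(cbind(IRR = irr, irr_ci), 2))
```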
What argument do you have to add to visreg when you want to plot poisson or binary logistic regression on the original scale?
scale = "response"
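The difference between the two scales can be sketched without visreg, using base R predict() as a stand-in: type = "link" gives predictions on the log scale and type = "response" gives them on the original count scale. The data and object names are made up for illustration.

```r
# Made-up data: the count doubles for every 1 unit increase in x
d <- data.frame(x = rep(0:3, each = 5))
d$y <- 2^d$x
m <- glm(y ~ x, data = d, family = poisson())

new_x <- data.frame(x = seq(0, 3, by = 0.5))

pred_link <- unname(predict(m, newdata = new_x, type = "link"))       # log scale
pred_resp <- unname(predict(m, newdata = new_x, type = "response"))   # count scale

# Exponentiating the link-scale predictions gives the response-scale ones
print(round(cbind(x = new_x$x, link = pred_link, response = pred_resp), 2))
```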