Generalized Linear Models Flashcards
What is the formula for the Poisson distribution?
What is the relationship between the expected value of Y, E(Y), and the variance of Y, Var(Y)?
Pr(Y = k) = e^(−λ) * λ^k / k! for k = 0, 1, 2, ...
Pr(Y = k) = probability that outcome variable Y = some integer value k
e^(−λ) = e raised to the power −λ, where λ = E(Y) is the expected value of Y
k! = k factorial
For a Poisson variable, the expected value of Y equals the variance of Y: E(Y) = Var(Y) = λ. Thus, the larger the mean of Y, the larger the variance.
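A minimal sketch in Python (standard library only) that evaluates the formula above and checks numerically that the mean and the variance both come out equal to λ. The function name poisson_pmf and the value λ = 3.5 are illustrative assumptions, and the infinite sum over k is truncated at k = 100.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Pr(Y = k) = e^(-lam) * lam^k / k! for a Poisson variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 3.5           # illustrative value of lambda = E(Y)
ks = range(0, 101)  # truncate the infinite support; the tail beyond 100 is negligible here
probs = [poisson_pmf(k, lam) for k in ks]

# Mean and variance computed directly from the probabilities.
mean = sum(k * p for k, p in zip(ks, probs))
variance = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))

print(f"total probability: {sum(probs):.6f}")
print(f"mean: {mean:.6f}, variance: {variance:.6f}")
```

Both printed values should agree with λ up to floating-point error.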
How is the expected value of Y, E(Y), modeled in Poisson regression?
log(λ(X1, ..., Xp)) = β0 + β1X1 + ... + βpXp or equivalently…
λ(X1, ..., Xp) = e^(β0 + β1X1 + ... + βpXp).
Notice that in the top equation, the expected value of Y (λ) is transformed using the log transformation. This guarantees that the modeled mean λ is positive for every value of the predictors, which is required because Y is typically a count variable and cannot be negative.
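A small Python sketch of why the log transformation keeps the mean positive: whatever the sign of the linear predictor, exponentiating it gives a positive λ. The function name poisson_mean and the coefficient and predictor values are hypothetical, chosen so that the linear predictor is negative.

```python
import math

def poisson_mean(beta0: float, betas: list[float], xs: list[float]) -> float:
    """lambda(X1, ..., Xp) = exp(beta0 + beta1*X1 + ... + betap*Xp)."""
    linear_predictor = beta0 + sum(b * x for b, x in zip(betas, xs))
    return math.exp(linear_predictor)

# Hypothetical coefficients and predictor values; the linear predictor is -4.5.
lam = poisson_mean(beta0=-2.0, betas=[0.5, -1.5], xs=[1.0, 2.0])

print(lam)            # positive even though the linear predictor is negative
print(math.log(lam))  # taking the log recovers the linear predictor
```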
What method is used to estimate the coefficients in Poisson regression?
Maximum likelihood estimation (MLE)
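As a rough illustration of maximum likelihood in this setting, the sketch below fits a one-predictor model log(λ) = b0 + b1*x by Newton's method on the Poisson log-likelihood. This is only one way to compute the MLE and is not necessarily what any particular software package does. The function name fit_poisson and the tiny data set are made up for the example.

```python
import math

def fit_poisson(xs, ys, iterations=25):
    """Maximize the Poisson log-likelihood for log(lambda) = b0 + b1*x by Newton's method."""
    b0 = math.log(sum(ys) / len(ys))  # start at the intercept-only MLE
    b1 = 0.0
    for _ in range(iterations):
        lams = [math.exp(b0 + b1 * x) for x in xs]
        # Score: gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(y - lam for y, lam in zip(ys, lams))
        g1 = sum((y - lam) * x for x, y, lam in zip(xs, ys, lams))
        # Fisher information (negative Hessian), a 2x2 matrix [[a, b], [b, c]].
        a = sum(lams)
        b = sum(lam * x for x, lam in zip(xs, lams))
        c = sum(lam * x * x for x, lam in zip(xs, lams))
        det = a * c - b * b
        # Newton step: add (information inverse) times (score) to the coefficients.
        b0 += (c * g0 - b * g1) / det
        b1 += (a * g1 - b * g0) / det
    return b0, b1

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # made-up predictor values
ys = [1, 2, 2, 4, 7, 11]             # made-up counts
b0, b1 = fit_poisson(xs, ys)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```

At the maximum, the score equations sum(y − λ) = 0 and sum((y − λ) * x) = 0 hold, which is a convenient check on the fitted values.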
Interpret the coefficients of this Poisson regression output. What is the relationship between cloudy/misty weather and the number of bikers?
Number of bikers = outcome variable
Clear skies is the baseline (reference) level for weathersit
Variable                        Coefficient   Std. error   z-statistic   p-value
Intercept                              4.12         0.01        683.96      0.00
workingday                             0.01         0.00          7.5       0.00
temp                                   0.79         0.01         68.43      0.00
weathersit [cloudy/misty]             -0.08         0.00        -34.53      0.00
weathersit [light rain/snow]          -0.58         0.00       -141.91      0.00
weathersit [heavy rain/snow]          -0.93         0.17         -5.55      0.00
An increase in Xj by one unit is associated with a change in E(Y) = λ by a factor of exp(βj), holding the other predictors fixed.
For example, a change in weather from clear skies to cloudy/misty weather is associated with a change in mean bike usage by a factor of exp(−0.08) = 0.923. This means that, on average, only 92.3% as many people use bikes when it is cloudy/misty as when it is clear.
This interpretation follows from the formula λ(X1, ..., Xp) = e^(β0 + β1X1 + ... + βpXp): increasing Xj by one unit multiplies λ by e^(βj), so each coefficient is interpreted through exp(βj).
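The same arithmetic can be applied to every coefficient in the output. The sketch below, using only the Python standard library, exponentiates the coefficients copied from the table above. The dictionary layout and variable names are assumptions made for the example.

```python
import math

# Coefficients copied from the regression output (clear skies is the baseline for weathersit).
coefs = {
    "workingday": 0.01,
    "temp": 0.79,
    "weathersit [cloudy/misty]": -0.08,
    "weathersit [light rain/snow]": -0.58,
    "weathersit [heavy rain/snow]": -0.93,
}

# exp(beta_j) is the multiplicative change in E(Y) for a one-unit increase in X_j.
rate_ratios = {name: math.exp(beta) for name, beta in coefs.items()}

for name, ratio in rate_ratios.items():
    print(f"{name:30s} exp(coef) = {ratio:.3f}  ({(ratio - 1) * 100:+.1f}% change in mean)")
```

A factor below 1 (all three weathersit levels) means fewer expected bikers than under clear skies, and a factor above 1 means more.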
What can Poisson regression models do in terms of the mean-variance relationship that linear regression can’t?
Poisson models can handle mean-variance relationships in which the variance changes as the mean changes (under the Poisson model, the variance equals the mean). Linear regression assumes the variance is constant.
What family of distributions do linear, logistic, and Poisson regression follow?
Linear - Gaussian
Logistic - Bernoulli
Poisson - Poisson
What is a link function, and what are the link functions for linear, logistic, and Poisson regression?
A link function applies a transformation to the expectation of Y, (E(Y | X1, . . . ,Xp)), so that the transformed mean is a linear function of the predictors.
Linear regression - η(μ) = μ (identity function)
Logistic regression - η(μ) = log(μ/(1 − μ)) (logit function)
Poisson regression - η(μ) = log(μ) (log function)
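The three link functions can be written out directly. The Python sketch below pairs each link with its inverse (the map from the linear predictor back to the mean) and checks that the round trip returns the starting mean. The dictionary layout and the test value μ = 0.25 are illustrative assumptions.

```python
import math

# Each entry pairs a link function eta(mu) with its inverse, which maps the
# linear predictor back to the mean.
links = {
    "linear (identity)": (lambda mu: mu, lambda eta: eta),
    "logistic (logit)": (lambda mu: math.log(mu / (1 - mu)),
                         lambda eta: 1 / (1 + math.exp(-eta))),
    "Poisson (log)": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}

mu = 0.25  # illustrative mean, valid for all three links
for name, (link, inverse) in links.items():
    eta = link(mu)
    print(f"{name:18s} eta = {eta:+.4f}, inverse(eta) = {inverse(eta):.4f}")
```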
What is a generalized linear model (GLM)?
Any regression approach that models the response Y as coming from a particular member of the exponential family, and then transforms the mean of the response so that the transformed mean is a linear function of the predictors.
Exponential family - Gaussian, Bernoulli, Poisson, gamma, negative binomial - these are all distributions
Thus linear, logistic, and Poisson regression are three examples of GLMs.