13 - Generalized Linear Models Flashcards
What are Generalized Linear Models (GLMs)?
A family of regression models that covers continuous, count (discrete numeric), and binary response variables.
Each type of response variable is connected to the linear predictor through its own link function.
What is the linear predictor in a multiple regression model?
The sum β0 + β1x1 + β2x2 + ⋯ + βpxp.
It can be abbreviated as Xβ.
What is the link function in GLMs?
A function that connects the linear predictor to the mean μ of the response variable, denoted as g(μ).
Different link functions correspond to different types of regression models.
What is the identity link function used for?
It is used in linear regression when the response variable has a Normal distribution.
In this case, Xβ = g(μ) = μ.
What does logistic regression predict?
It predicts a binary response variable, such as whether a customer has a store credit card.
The response variable takes values 1 (Yes) or 0 (No).
What is the link function for a binary response variable in logistic regression?
g(μ) = ln(μ/(1-μ)).
This ensures the mean value μ will always be between 0 and 1.
What is the formula to isolate μ in logistic regression?
μ = e^(Xβ) / (1 + e^(Xβ)).
This model estimates the probability that y = 1.
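As a quick check of the two formulas above, here is a minimal sketch of the logit link and its inverse. The function names and the sample linear predictor values are illustrative assumptions, not part of the chapter.

```python
import math

def logit(mu):
    """Logit link g(mu) = ln(mu / (1 - mu)); maps a mean in (0, 1) to any real number."""
    return math.log(mu / (1.0 - mu))

def inverse_logit(eta):
    """Inverse link mu = e^eta / (1 + e^eta); maps any linear predictor into (0, 1)."""
    return math.exp(eta) / (1.0 + math.exp(eta))

# Hypothetical linear predictor values (X beta), chosen only for illustration.
etas = [-5.0, -1.0, 0.0, 1.0, 5.0]
mus = [inverse_logit(eta) for eta in etas]

for eta, mu in zip(etas, mus):
    print(f"X beta = {eta:5.1f}  ->  mu = {mu:.4f}")
```

However large or small the linear predictor gets, the resulting μ stays strictly between 0 and 1, which is why it can be read as a probability.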
How do you interpret the coefficient of a binary predictor variable in logistic regression?
It is the estimated change in the log-odds of the response when the predictor changes from 0 to 1. Exponentiating it gives the odds ratio.
For example, a coefficient of 1.254 gives e^1.254 ≈ 3.5, so the odds of having a store credit card are about 3.5 times higher for a customer with a web account than for one without.
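The factor of 3.5 comes from exponentiating the coefficient, and it applies to the odds rather than to the probability itself. The sketch below shows the arithmetic; the 20% baseline probability is a made-up value used only to show how the odds ratio and the probability ratio differ.

```python
import math

coef_web_account = 1.254                  # coefficient quoted in the card
odds_ratio = math.exp(coef_web_account)   # roughly 3.5

# Hypothetical baseline: 20% of customers without a web account hold a card.
p_without = 0.20
odds_without = p_without / (1 - p_without)

# The coefficient multiplies the odds, not the probability.
odds_with = odds_without * odds_ratio
p_with = odds_with / (1 + odds_with)

print(f"odds ratio:        {odds_ratio:.2f}")
print(f"probability ratio: {p_with / p_without:.2f}")
```

Under this assumed baseline the probability ratio comes out smaller than the odds ratio, which is why "3.5 times the odds" is the safer wording.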
What does the coefficient for Days between Purchases indicate in logistic regression?
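For every additional day between purchases, the odds that the customer has a store credit card are about 0.4% lower.
Over 30 days the effect compounds: the coefficient is multiplied by 30 before exponentiating (equivalently, the daily odds ratio is raised to the 30th power), so every 30 additional days between purchases lowers the odds by about 11%.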
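The 30-day figure can be reproduced with a few lines of arithmetic. The card does not state the fitted coefficient, so the sketch assumes a daily odds ratio of 0.996, which is what a 0.4% daily decrease implies.

```python
import math

# Assumption: a 0.4% daily decrease corresponds to a daily odds ratio of 0.996.
daily_odds_ratio = 0.996
daily_coef = math.log(daily_odds_ratio)   # implied coefficient on the log-odds scale

# Scale the coefficient by 30 days, then exponentiate.
odds_ratio_30 = math.exp(30 * daily_coef)
decrease_30 = 1 - odds_ratio_30           # about 0.11

# Multiplying the percentage itself by 30 overstates the effect slightly.
naive_decrease = 30 * (1 - daily_odds_ratio)

print(f"30-day odds ratio: {odds_ratio_30:.3f}")
print(f"30-day decrease:   {decrease_30:.1%}")
print(f"naive 30 x 0.4%:   {naive_decrease:.1%}")
```

The multiplication by 30 happens on the coefficient (log-odds) scale, not on the percentage, which is why the result is about 11% rather than 12%.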
What command is used to perform logistic regression in Python?
sm.Logit(y, X).fit().
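Here sm is statsmodels.api, y is the binary response variable, and X holds the predictor variables. statsmodels does not add an intercept automatically, so X should include a constant column (for example via sm.add_constant) if the model needs β0.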
What is Poisson regression used for?
It is used to predict a count of events, such as the number of customer service contacts.
The response variable is a count with a minimum value of zero.
What is the link function for a count response variable in Poisson regression?
g(μ) = ln(μ).
This connects the linear predictor to the mean of the count response.
How is the Poisson regression model expressed in parametric form?
μ = e^(β0 + β1x1 + β2x2 + ⋯ + βpxp), where μ is the predicted mean count.
This can also be written in descriptive form.
How do you interpret the coefficient in Poisson regression?
When used as the exponent of e, it describes the estimated multiplicative change in the response variable when the predictor increases by one.
For example, a coefficient of 0.4305 gives e^0.4305 ≈ 1.538, so the predicted number of calls is 53.8% higher for a churning customer than for a non-churning one.
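The 53.8% figure follows directly from exponentiating the coefficient. In the sketch below, the baseline of 1.5 calls is a hypothetical value added only to show the multiplicative effect.

```python
import math

coef_churn = 0.4305                  # coefficient quoted in the card
multiplier = math.exp(coef_churn)    # about 1.538
percent_increase = (multiplier - 1) * 100

# Hypothetical baseline: a non-churning customer is predicted to make 1.5 calls.
baseline_calls = 1.5
churn_calls = baseline_calls * multiplier

print(f"multiplier:       {multiplier:.3f}")
print(f"percent increase: {percent_increase:.1f}%")
print(f"predicted calls:  {baseline_calls} -> {churn_calls:.2f}")
```

Because the effect is multiplicative, the same coefficient adds more calls in absolute terms when the baseline count is higher.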
What command is used to perform Poisson regression in Python?
sm.GLM(y, X, family=sm.families.Poisson()).fit().
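Here sm is statsmodels.api, y is the count response variable, and X holds the predictor variables. As with sm.Logit, X should include a constant column (for example via sm.add_constant) if the model needs an intercept.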
What are the three cases of regression response variables discussed in this chapter?
- Binary response variable
- Count response variable
- Continuous response variable
What category of regression models includes all three cases of response variables?
Generalized Linear Models (GLMs)
What do we call the sum β0 + β1x1 + β2x2 + ⋯ + βpxp?
Linear predictor
How do we write the linear predictor in its abbreviated form?
Xβ (also commonly written η)
The link function connects what two things?
The linear predictor and the mean μ of the response variable
How do we write the link function in its abbreviated form?
g(μ)
What is the link function for linear regression?
Identity link function
What kind of regression should we use when trying to predict a binary response variable?
Logistic regression
What is the link function for logistic regression?
Logit link function