Chapter 9: Multiple & Logistic Regression Flashcards
What is the form of a multiple regression model?
ŷ = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ
What does each coefficient in a multiple regression represent?
The expected change in the response for a one-unit increase in that variable, holding the other variables constant.
What is the purpose of adjusted R²?
To penalize the number of predictors, so that adding a variable raises the value only if it improves the fit enough to justify the extra complexity.
What is the formula for adjusted R²?
Adjusted R² = 1 - [(SSE/SST) × (n-1)/(n-k-1)], where n is the number of observations and k is the number of predictors.
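The formula can be sketched as a small function, with k denoting the number of predictors as in the model equation. The function name and the numbers are hypothetical, chosen only to show that the adjusted value falls below the plain R².

```python
def adjusted_r2(sse: float, sst: float, n: int, k: int) -> float:
    """Adjusted R² from the error (SSE) and total (SST) sums of squares.

    n is the number of observations, k the number of predictors.
    """
    if n - k - 1 <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1 - (sse / sst) * (n - 1) / (n - k - 1)


# Hypothetical values, for illustration only.
r2 = 1 - 20 / 100                              # plain R² = 0.8
adj = adjusted_r2(sse=20, sst=100, n=51, k=5)  # about 0.778
```

With 51 observations and 5 predictors the penalty factor is 50/45, so the adjusted value is slightly smaller than R².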
When is a variable considered significant in regression?
When its p-value is below the chosen significance level, commonly 0.05.
What does a large p-value suggest about a predictor?
That it may not contribute meaningfully to the model once the other predictors are included.
What is collinearity?
Correlation between two or more predictors that complicates model estimation.
Why is collinearity a problem?
It makes it difficult to separate the individual effects of the correlated predictors, and it inflates the standard errors of their coefficients.
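One common way to quantify collinearity, not covered on these cards, is the variance inflation factor (VIF). A minimal sketch for the special case of exactly two predictors, where the VIF reduces to 1 / (1 - r²) with r the correlation between them. The function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / sqrt(saa * sbb)


def vif_two_predictors(a, b):
    """VIF when the model has exactly two predictors: 1 / (1 - r²).

    Fails with a division by zero if the predictors are perfectly correlated.
    """
    r = pearson_r(a, b)
    return 1 / (1 - r ** 2)


# Hypothetical predictors: x2 nearly doubles x1, x3 is uncorrelated with x1.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]
x3 = [1, -1, 0, -1, 1]

vif_high = vif_two_predictors(x1, x2)  # far above 1: strong collinearity
vif_none = vif_two_predictors(x1, x3)  # equal to 1: no collinearity
```

A VIF of 1 means the predictor's coefficient variance is not inflated at all; larger values mean the other predictor explains much of it.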
What is backward elimination?
A model selection method that starts with the full model and removes predictors one at a time, based on adjusted R² or p-values, until no removal improves the model.
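The adjusted R² version of backward elimination can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: it fits least squares through the normal equations in pure Python to stay self-contained, and the helper names and toy data are hypothetical. In practice a statistics library would do the fitting.

```python
def solve_linear(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_adjusted_r2(rows, y, cols):
    """Fit least squares on the chosen columns plus an intercept; return adjusted R²."""
    X = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    coefs = solve_linear(xtx, xty)
    fitted = [sum(c * xi for c, xi in zip(coefs, r)) for r in X]
    n = len(y)
    ybar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1 - (sse / sst) * (n - 1) / (n - len(cols) - 1)


def backward_eliminate(rows, y):
    """Start from all predictors; drop one at a time while adjusted R² improves."""
    cols = list(range(len(rows[0])))
    best = fit_adjusted_r2(rows, y, cols)
    while cols:
        candidates = [
            (fit_adjusted_r2(rows, y, [c for c in cols if c != d]), d)
            for d in cols
        ]
        score, drop = max(candidates)
        if score <= best:  # no single removal helps, so stop
            break
        best = score
        cols.remove(drop)
    return cols, best


# Toy data (hypothetical): y depends on column 0 only; column 1 is irrelevant.
rows = [[1, 1], [1, -1], [-1, 1], [-1, -1]] * 2
y = [5.5, 4.5, -1.5, -0.5] * 2
kept, score = backward_eliminate(rows, y)  # keeps only column 0
```

Forward selection runs the same loop in the other direction: start with no predictors and add the one that raises adjusted R² the most, stopping when no addition helps.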
What is forward selection?
A model selection method that starts with no predictors and adds them one at a time, based on adjusted R² or p-values, until no addition improves the model.
What does the intercept represent in multiple regression?
The predicted response when all predictors are 0 (may not be meaningful).
What is the interpretation of a binary categorical predictor’s coefficient?
The difference in the predicted response between that category and the reference group, holding the other predictors constant.
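In the simplest case, a model with a single 0/1 predictor and nothing else, the least-squares slope equals the difference between the two group means. The sketch below checks this on made-up data; the function name and values are hypothetical. With other predictors in the model, the coefficient is an adjusted difference rather than the raw difference in means.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical data: x = 0 marks the reference group, x = 1 the other group.
x = [0, 0, 0, 1, 1]
y = [10, 12, 14, 20, 22]

intercept, slope = simple_ols(x, y)
mean_ref = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
mean_other = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)
difference = mean_other - mean_ref  # matches the slope; intercept matches mean_ref
```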
What does R² tell us in multiple regression?
The proportion of variance in the response explained by the model.
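As a rough sketch, R² can be computed as 1 - SSE/SST from the observed responses and the model's fitted values. The function name and the numbers are hypothetical, and the fitted values are assumed to come from a least-squares fit with an intercept.

```python
def r_squared(observed, fitted):
    """R² = 1 - SSE/SST for observed responses and model-fitted values."""
    ybar = sum(observed) / len(observed)
    sse = sum((y - f) ** 2 for y, f in zip(observed, fitted))  # unexplained
    sst = sum((y - ybar) ** 2 for y in observed)               # total
    return 1 - sse / sst


# Hypothetical responses and fitted values.
observed = [3.0, 5.0, 7.0, 9.0]
fitted = [3.5, 4.5, 7.5, 8.5]
r2 = r_squared(observed, fitted)  # SSE = 1, SST = 20
```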
Why use adjusted R² over regular R²?
Adjusted R² accounts for the number of predictors, avoiding inflation from adding irrelevant variables.
What does a p-value less than 0.05 indicate for a coefficient?
That the predictor is statistically significant at the 0.05 level, given the other predictors in the model.
What is the goal of model selection?
To find the simplest model that adequately explains the data.
How is prediction made using a regression model?
By plugging in values of predictors into the model equation.
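Plugging values into ŷ = b₀ + b₁x₁ + … + bₖxₖ can be sketched as below. The coefficients are hypothetical and stand in for estimates from a fitted model.

```python
def predict(intercept, coefficients, values):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one observation."""
    if len(coefficients) != len(values):
        raise ValueError("need exactly one value per coefficient")
    return intercept + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical fitted model: y-hat = 2.0 + 0.5*x1 - 1.0*x2
b0 = 2.0
b = [0.5, -1.0]

y_hat = predict(b0, b, [4.0, 3.0])    # 2.0 + 0.5*4.0 - 1.0*3.0
at_zero = predict(b0, b, [0.0, 0.0])  # all predictors at 0 gives the intercept
```

The second call shows why the intercept is the prediction when every predictor equals 0.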
What does it mean to control for other variables?
To isolate the effect of one variable while holding others constant.
Why might the intercept in a model be meaningless?
Because it represents the prediction when all variables are zero, which may not be realistic.
What is the main advantage of multiple regression?
It allows for controlling multiple variables and assessing their individual contributions.