Chapter 8: Introduction To Linear Regression Flashcards
What is the goal of linear regression?
To model the relationship between two numerical variables with a straight line, so that the response can be predicted from the explanatory variable.
What is the regression line equation?
ŷ = b₀ + b₁x
What does the slope (b₁) represent?
The predicted (average) change in the response variable for a one-unit increase in the explanatory variable.
What does the intercept (b₀) represent?
The predicted value of the response variable when x = 0.
What is a residual?
The difference between the observed value and the predicted value (eᵢ = yᵢ - ŷᵢ).
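Worked example (a minimal sketch): the data and the line ŷ = 2.2 + 0.6x below are made up for illustration. Each residual is the observed value minus the predicted value, so a positive residual means the point lies above the line.

```python
# Assumed fitted line for illustration: y_hat = 2.2 + 0.6 * x
b0, b1 = 2.2, 0.6

# Made-up observations
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Predicted value for each x, then residual e_i = y_i - y_hat_i
predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

for x, y, y_hat, e in zip(xs, ys, predicted, residuals):
    print(f"x={x}  observed={y}  predicted={y_hat:.1f}  residual={e:+.1f}")
```

The assumed coefficients are the least squares fit for this data, so the residuals sum to zero up to rounding error.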
What method is used to find the best-fitting line in linear regression?
Least squares method.
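Worked example (a sketch, not the only way to compute it): for one explanatory variable, the least squares slope and intercept have closed-form expressions, b₁ = Sxy / Sxx and b₀ = ȳ - b₁x̄. The function name `least_squares_fit` and the data are hypothetical.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) for the line that minimizes the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxy: sum of cross-deviations; Sxx: sum of squared x-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = least_squares_fit(xs, ys)
print(f"y_hat = {b0:.2f} + {b1:.2f} x")
```

Because b₀ = ȳ - b₁x̄, the fitted line always passes through the point (x̄, ȳ).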
Why do we use squared residuals instead of absolute values?
Because they are easier to compute and penalize larger errors more.
What are the conditions for using linear regression?
Linearity, nearly normal residuals, constant variability, and independent observations.
What is homoscedasticity?
When the variability of residuals is approximately constant across all x values.
What does R² represent?
The proportion of variability in the response variable explained by the model.
How is R² calculated?
In simple linear regression (one explanatory variable), as the square of the correlation coefficient (R² = r²).
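Worked example (a sketch on made-up data): R² can be computed either as r² or as the share of variability explained, 1 - SSE/SST. For a least squares line with one explanatory variable, the two agree.

```python
import math

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variability (SST)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Route 1: square the correlation coefficient
r = sxy / math.sqrt(sxx * syy)
r_squared_from_r = r ** 2

# Route 2: 1 - (unexplained variability / total variability)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_squared_from_ss = 1 - sse / syy

print(round(r_squared_from_r, 4), round(r_squared_from_ss, 4))
```

Here both routes give 0.6, meaning the line explains about 60% of the variability in y for this toy data.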
What is the purpose of inference for the regression slope?
To test if there is a significant linear relationship between x and y.
What are the null and alternative hypotheses for slope inference?
H₀: β₁ = 0; Hₐ: β₁ ≠ 0.
What is the test statistic formula for regression slope inference?
T = (b₁ - 0) / SE_b₁
What are the degrees of freedom for the regression t-test?
df = n - 2
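Worked example (a sketch on made-up data): the cards give T and df but not how SE_b₁ is obtained. This example assumes the usual textbook formula SE_b₁ = s / √Sxx, where s = √(SSE / (n - 2)) is the residual standard error. Finding the p-value would need a t-distribution table or a statistics library, which is left out here.

```python
import math

# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least squares estimates
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residual standard error, using n - 2 degrees of freedom
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
df = n - 2
s = math.sqrt(sse / df)

# Standard error of the slope (assumed textbook formula), then T under H0: beta1 = 0
se_b1 = s / math.sqrt(sxx)
t_stat = (b1 - 0) / se_b1

print(f"b1={b1:.3f}  SE={se_b1:.3f}  T={t_stat:.3f}  df={df}")
```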
How do you interpret the slope in context?
For each 1-unit increase in x, y is expected to change by b₁ on average.
What is extrapolation in regression?
Making predictions outside the range of observed data.
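Worked example (a minimal sketch): one simple guard is to flag any prediction whose x falls outside the observed range. The function name `predict`, the coefficients, and the range are hypothetical.

```python
def predict(x, b0, b1, x_min, x_max):
    """Return (prediction, is_extrapolation) for the line y_hat = b0 + b1 * x."""
    y_hat = b0 + b1 * x
    # Extrapolation: x lies outside the range of x-values used to fit the line
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation


# Assumed fitted line and observed x-range, for illustration only
b0, b1 = 2.2, 0.6
x_min, x_max = 1, 5

inside = predict(3, b0, b1, x_min, x_max)
outside = predict(20, b0, b1, x_min, x_max)
print(inside, outside)
```

The line still returns a number for x = 20, but nothing in the data supports the assumption that the linear trend continues that far.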
What is an outlier in linear regression?
A point that lies far from the overall pattern of the other points, for example one with an unusually large residual.
What is a high leverage point?
A point far from the center of the x-values.
What is an influential point?
A point that significantly affects the slope of the regression line.
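Worked example (a sketch on made-up data): one way to see influence is to fit the line with and without a suspect point and compare the slopes. The helper `slope` and the added point (15, 2) are hypothetical.

```python
def slope(xs, ys):
    """Least squares slope b1 = Sxy / Sxx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Made-up data for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Add one high leverage point far to the right that does not follow the trend
xs_with = xs + [15]
ys_with = ys + [2]

slope_without = slope(xs, ys)
slope_with = slope(xs_with, ys_with)
print(f"without: {slope_without:.3f}  with: {slope_with:.3f}")
```

In this toy data the single added point changes the slope from positive to negative, which is what makes it influential rather than merely high leverage.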