Linear Regression Flashcards
Assumptions of Linear Regression
E(y|X) = f(X) is a linear function of X
residuals are independent and normally distributed with mean 0 and constant variance
residuals are independent of X
no multicollinearity among the columns of X
number of samples is larger than the number of features
variability in X is positive
no auto-correlation in the residuals
http://r-statistics.co/Assumptions-of-Linear-Regression.html
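A minimal numpy sketch of spot-checking a few of these assumptions numerically on synthetic data (the data, true coefficients, and printed diagnostics are illustrative assumptions, not a formal test suite):

```python
import numpy as np

# Synthetic data: 200 samples, 3 features, true linear model plus Gaussian noise.
rng = np.random.default_rng(0)
n, p = 200, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(size=n)

Xd = np.column_stack([np.ones(n), X])              # design matrix with intercept
beta_hat, *_ = np.linalg.lstsq(Xd, y, rcond=None)
resid = y - Xd @ beta_hat

print("residual mean (should be ~0):", resid.mean())
print("corr(residual, feature) (should be ~0):",
      [np.corrcoef(resid, X[:, j])[0, 1] for j in range(p)])
print("condition number of design (large => multicollinearity):", np.linalg.cond(Xd))
print("lag-1 autocorrelation of residuals (should be ~0):",
      np.corrcoef(resid[:-1], resid[1:])[0, 1])
```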
Estimation of Linear Regression parameters
OLS: minimize the Residual Sum of Squares
Normal equation: hat(beta) = (X^T X)^-1 X^T y
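A minimal sketch of the normal equation on synthetic data (the data and the true coefficients are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept + 2 features
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

# hat(beta) = (X^T X)^-1 X^T y, computed with solve() instead of an explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
rss = np.sum((y - X @ beta_hat) ** 2)                        # the RSS that OLS minimizes
print(beta_hat, rss)
```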
Confidence interval of parameters
hat(beta) ~ N(beta, (X^T X)^-1 \sigma^2)
where sigma^2 is the variance of the residuals
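A sketch of 95% confidence intervals built from this sampling distribution, with sigma^2 estimated by RSS/(N-p-1); the synthetic data and the 95% level are illustrative assumptions:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
dof = n - p - 1                                    # residual degrees of freedom
sigma2_hat = resid @ resid / dof                   # unbiased estimate of the residual variance
cov_beta = np.linalg.inv(X.T @ X) * sigma2_hat     # estimated covariance of hat(beta)
se = np.sqrt(np.diag(cov_beta))
t_crit = stats.t.ppf(0.975, dof)                   # two-sided 95% critical value
print(np.column_stack([beta_hat - t_crit * se, beta_hat + t_crit * se]))
```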
t-test for Linear Regression parameter
null hypothesis: beta_i = 0
t_score = hat(beta_i) / (hat(sigma) * sqrt(v_i)), where v_i is the i-th diagonal element of (X^T X)^-1
t_{N-p-1} distribution
calculate p value
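A sketch of the per-coefficient t-test on synthetic data (the data are an assumption; the last coefficient is truly 0, so its p-value should be large):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([0.5, 1.0, 0.0]) + rng.normal(size=n)       # last coefficient is 0

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma_hat = np.sqrt(resid @ resid / (n - p - 1))
v = np.diag(np.linalg.inv(X.T @ X))                          # the v_i from (X^T X)^-1
t_scores = beta_hat / (sigma_hat * np.sqrt(v))
p_values = 2 * stats.t.sf(np.abs(t_scores), df=n - p - 1)    # two-sided p-values
print(t_scores, p_values)
```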
F-test for Linear Regression parameters
null hypothesis: beta_{i+1} = beta_{i+2} = ... = beta_{i+k} = 0
F = \frac{(RSS_{small} - RSS_{large})/k}{RSS_{large}/(N-i-k-1)}, where the small model has i features and the large model has all i+k features
F_{k, N-i-k-1} distribution
calculate p value
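A sketch of the nested-model F-test; the model sizes (i = 2, k = 3) and the synthetic data are illustrative assumptions, with the k extra features truly irrelevant:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, i, k = 200, 2, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, i + k))])
y = X[:, :i + 1] @ np.array([0.5, 1.0, -1.0]) + rng.normal(size=n)

def rss(Xm, y):
    b, *_ = np.linalg.lstsq(Xm, y, rcond=None)
    return np.sum((y - Xm @ b) ** 2)

rss_small = rss(X[:, :i + 1], y)            # intercept + first i features
rss_large = rss(X, y)                       # intercept + all i + k features
F = ((rss_small - rss_large) / k) / (rss_large / (n - i - k - 1))
print(F, stats.f.sf(F, k, n - i - k - 1))   # F statistic and its p-value
```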
bias vs. variance
expected prediction error: E[(y_0 - hat(y_0))^2] = MSE(hat(y_0)) + sigma^2
where sigma^2 is the variance of the residual and the true model is y = x^T beta + epsilon
MSE(x^T hat(beta)) = Var(x^T hat(beta)) + [E(x^T hat(beta)) - x^T beta]^2
first term is the variance, second term is the squared bias
(* OLS has the smallest variance among all linear unbiased estimates (Gauss-Markov), but a biased estimate can achieve a smaller MSE)
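A Monte Carlo sketch of the decomposition MSE = variance + bias^2 at a single test point, comparing OLS with a (biased) ridge estimate; the data-generating process and the ridge lambda are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, lam, reps = 50, 5, 10.0, 2000
beta = np.ones(p)
x0 = rng.normal(size=p)                       # fixed test point
X = rng.normal(size=(n, p))                   # fixed design; y is redrawn each repetition

preds_ols, preds_ridge = [], []
for _ in range(reps):
    y = X @ beta + rng.normal(size=n)
    preds_ols.append(x0 @ np.linalg.solve(X.T @ X, X.T @ y))
    preds_ridge.append(x0 @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y))

for name, preds in [("OLS", np.array(preds_ols)), ("ridge", np.array(preds_ridge))]:
    var = preds.var()
    bias2 = (preds.mean() - x0 @ beta) ** 2
    print(name, "var:", var, "bias^2:", bias2, "MSE:", var + bias2)
```

Typically OLS shows near-zero bias but higher variance, while ridge trades some bias for lower variance and can end up with a smaller total MSE.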
ways to reduce the variance of hat(beta) and lower the overall MSE (accepting some bias)
feature selection
shrinkage (ridge, lasso)
dimension reduction
Ridge
regularize with the l2 norm of the parameters
the penalty is proportional to the squared magnitude of each coefficient, so large coefficients are shrunk the most, but estimates are rarely set exactly to 0
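A minimal ridge sketch using the closed form hat(beta)_ridge = (X^T X + lambda I)^{-1} X^T y; the value of lambda and the data are assumptions, and X is taken without an intercept (as if already centered) to keep the penalty simple:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, lam = 100, 5, 5.0
X = rng.normal(size=(n, p))
y = X @ np.array([3.0, -2.0, 0.5, 0.0, 0.0]) + rng.normal(size=n)

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
print(beta_ols)
print(beta_ridge)   # every coefficient is shrunk toward 0, but none is exactly 0
```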
Lasso
regularize with l1 norm of parameters
shrinks the estimates and sets them exactly to 0 when they fall below a threshold (soft thresholding), so lasso also performs feature selection
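A lasso sketch under the special case of an orthonormal design (X^T X = I), where minimizing 1/2 ||y - X beta||^2 + lambda ||beta||_1 reduces to soft-thresholding the OLS estimate; the orthonormal assumption, lambda, and the data are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, lam = 100, 5, 0.8
Q, _ = np.linalg.qr(rng.normal(size=(n, p)))     # Q has orthonormal columns, Q^T Q = I
y = Q @ np.array([3.0, -2.0, 0.5, 0.0, 0.0]) + rng.normal(scale=0.5, size=n)

beta_ols = Q.T @ y                               # OLS estimate since Q^T Q = I
beta_lasso = np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0)
print(beta_ols)
print(beta_lasso)   # coefficients below the threshold lam are set exactly to 0
```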
R^2
R^2 = 1 - \frac{SS_{res}}{SS_{tot}}
adjusted R^2
Adjusted R^2 = 1 - \frac{SS_{res}/df_{res}}{SS_{tot}/df_{tot}}
df_res = n - p - 1, df_tot = n - 1
an unbiased (or less biased) estimator of the population R^2; more appropriate for evaluating model fit and for comparing alternative models in the feature selection stage of model building
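A sketch computing both R^2 and adjusted R^2 from a fitted OLS model; the synthetic data are an assumption, and p counts the features excluding the intercept:

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
y = X @ np.array([0.5, 1.0, -2.0, 0.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
ss_res = np.sum((y - X @ beta_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))
print(r2, adj_r2)
```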