Chapter 3 - Linear Regression Flashcards

1
Q

Simple linear regression (form, coefficients, SE, CI)

A

y_i = beta_0 + beta_1*x_i + epsilon_i. The estimates betahat_0 and betahat_1 are chosen to minimize the RSS. You can calculate a standard error for each coefficient (e.g., SE(betahat_1)) and an approximate 95% CI: betahat_1 +- 2*SE(betahat_1).
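A minimal NumPy sketch of these formulas, using simulated data (the true coefficients 3 and 2 below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 3.0 + 2.0 * x + rng.normal(size=n)   # simulated: true beta_0 = 3, beta_1 = 2

# Least-squares estimates for simple linear regression
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# Residual variance estimate and standard error of betahat_1
resid = y - (beta0_hat + beta1_hat * x)
rss = np.sum(resid ** 2)
sigma2_hat = rss / (n - 2)
se_beta1 = np.sqrt(sigma2_hat / np.sum((x - x_bar) ** 2))

# Approximate 95% confidence interval: betahat_1 +- 2 * SE(betahat_1)
ci = (beta1_hat - 2 * se_beta1, beta1_hat + 2 * se_beta1)
print(beta1_hat, se_beta1, ci)
```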

2
Q

what is RSS? TSS?

A

RSS = residual sum of squares: the sum from i = 1 to n of (y_i - yhat_i)^2.

TSS = total sum of squares: the same sum, but with ybar (the mean of y) in place of yhat_i.
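A small sketch of the two quantities, assuming y and y_hat are arrays of observed and fitted values (the numbers here are made up):

```python
import numpy as np

y = np.array([3.1, 4.9, 7.2, 8.8])       # observed responses
y_hat = np.array([3.0, 5.0, 7.0, 9.0])   # fitted values from some model

rss = np.sum((y - y_hat) ** 2)        # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)     # total sum of squares (y_bar replaces y_hat)
print(rss, tss)
```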

3
Q

Hypothesis test

A
H0: there is no relationship between X and Y (i.e., beta_1 = 0)
Ha: there is some relationship (i.e., beta_1 != 0)
4
Q

Test statistic for hypothesis test

A

t = betahat_1 / SE(betahat_1). Under the null hypothesis this statistic has a t-distribution with n - 2 degrees of freedom (be careful about the conclusions drawn from this). Comparing the observed t to that distribution gives the p-value.
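A sketch of the calculation with SciPy; the values of beta1_hat, se_beta1, and n are made up and would normally come from a fit like the one in card 1:

```python
from scipy import stats

# Illustrative values; in practice take these from the fitted model
beta1_hat, se_beta1, n = 2.1, 0.4, 100

t_stat = beta1_hat / se_beta1
# Two-sided p-value from a t-distribution with n - 2 degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)
print(t_stat, p_value)
```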

5
Q

P value

A

If the p-value is below the chosen threshold, the observed estimate is too extreme to occur often under the null hypothesis, so we reject the null. The calculation relies on the model's assumptions (e.g., homoscedasticity, i.e., constant error standard deviation), so it depends strongly on the model we are using. A low p-value indicates statistical significance.

6
Q

Multiple Linear Regression

A

Y = beta_0 + beta_1*X_1 + ... + beta_p*X_p + epsilon. Pad X (the data matrix) with an extra column of 1s on the left so that beta_0 acts as the intercept.
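A minimal sketch of building the design matrix, assuming X_raw is an n x p matrix of predictor values (the numbers are made up):

```python
import numpy as np

# Hypothetical n x p data matrix
X_raw = np.array([[1.0, 2.0],
                  [2.0, 0.5],
                  [3.0, 1.5]])

# Pad with a column of 1s on the left so beta_0 acts as the intercept
X = np.column_stack([np.ones(X_raw.shape[0]), X_raw])
print(X)
```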

7
Q

Questions answered by multiple linear regression (4)

A

1) Is at least one of the predictors X_i useful for predicting the outcome Y?
2) Which subset of the predictors is most important?
3) How good is a linear model for the data?
4) Given a set of predictor values, what is the likely value of Y, and how accurate is that estimate?

8
Q

How do we estimate the beta_i s?

A

Minimize RSS. In multiple linear regression the least-squares solution has the closed form betahat = (X^T * X)^-1 * X^T * y; the matrix X * (X^T * X)^-1 * X^T that maps y to the fitted values yhat is the hat matrix.
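A sketch of the closed-form solution on simulated data (the true coefficients are made up; np.linalg.lstsq is the more numerically stable route in practice):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 3
X_raw = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])            # made-up coefficients
y = 4.0 + X_raw @ beta_true + rng.normal(size=n)

X = np.column_stack([np.ones(n), X_raw])          # intercept column of 1s

# Closed form: betahat = (X^T X)^-1 X^T y, solved without forming the inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent least-squares solution via lstsq
beta_hat_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat, beta_hat_lstsq)
```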

9
Q

Which variables are important?

A

Consider the null hypothesis that the last q coefficients are 0. RSS_0 is the RSS for the reduced model that excludes those variables. Under the null hypothesis the F-statistic has an F distribution. (When q = p, the model under the null hypothesis contains only the intercept.) WARNING: if the number of variables is large, some individual p-values will be small purely by chance even when the null hypothesis is true; the F-statistic does not suffer from this problem.

10
Q

F statistic (relationship to hypothesis test)

A

F = [(RSS_0 - RSS)/q] / [RSS/(n - p - 1)]; for the test that all p coefficients are 0 this becomes [(TSS - RSS)/p] / [RSS/(n - p - 1)]. The t-statistic associated with the i-th predictor is the square root of the F-statistic for the null hypothesis that sets only beta_i = 0. A large F-statistic indicates that at least one of the variables is related to the response (if H0 is true, F ~ 1; if Ha is true, F >> 1).
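A sketch of the partial F-test described above; f_test_last_q is a hypothetical helper name, and X is assumed to already include the leading column of 1s for the intercept:

```python
import numpy as np
from scipy import stats

def f_test_last_q(X, y, q):
    """F-test that the last q coefficients (excluding the intercept) are 0."""
    n, k = X.shape            # k = p + 1 columns, including the intercept
    p = k - 1

    def rss_of(M):
        beta, *_ = np.linalg.lstsq(M, y, rcond=None)
        return np.sum((y - M @ beta) ** 2)

    rss_full = rss_of(X)
    rss_0 = rss_of(X[:, :k - q])          # reduced model drops the last q columns

    f_stat = ((rss_0 - rss_full) / q) / (rss_full / (n - p - 1))
    p_value = stats.f.sf(f_stat, q, n - p - 1)
    return f_stat, p_value
```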

11
Q

How many variables are important?

A

There are 2^p possible subsets of variables, too many to try them all, so instead produce a range of models with a selection method:
1) forward selection: start from the null model and add variables one at a time, at each step adding the variable that gives the lowest RSS (equivalently, the smallest p-value); see the sketch below
2) backward selection: start from the full model and repeatedly remove the variable with the largest p-value
3) mixed selection: start from the null model and add variables one at a time as in forward selection, but throw out any variable whose p-value rises above a threshold
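A simplified sketch of forward selection using RSS as the criterion for which variable to add; forward_selection and max_vars are illustrative names, and a real implementation would add a stopping rule based on p-values or cross-validation:

```python
import numpy as np

def forward_selection(X_raw, y, max_vars):
    """Greedily add the predictor whose inclusion gives the lowest RSS."""
    n, p = X_raw.shape
    selected, remaining = [], list(range(p))

    def rss_of(cols):
        # Fit a model with an intercept plus the chosen columns
        M = np.column_stack([np.ones(n)] + [X_raw[:, j] for j in cols])
        beta, *_ = np.linalg.lstsq(M, y, rcond=None)
        return np.sum((y - M @ beta) ** 2)

    while remaining and len(selected) < max_vars:
        best_j = min(remaining, key=lambda j: rss_of(selected + [j]))
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```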

12
Q

tuning

A

Choosing one model from the range of models produced by a model selection method.

13
Q

Dealing with categorical or qualitative predictors

A

For each qualitative predictor: 1) choose a baseline category (e.g., African American); 2) for every other category, define a new 0/1 dummy predictor (e.g., X_Asian is 1 if the person is Asian and 0 otherwise); 3) beta_Asian is then the effect of being Asian relative to the baseline category.
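A small NumPy sketch of this dummy coding; the category values are made up to echo the ethnicity example above:

```python
import numpy as np

# Hypothetical qualitative predictor with three levels
ethnicity = np.array(["African American", "Asian", "Caucasian",
                      "Asian", "African American"])

baseline = "African American"
levels = [lv for lv in np.unique(ethnicity) if lv != baseline]

# One 0/1 dummy column per non-baseline level
dummies = np.column_stack([(ethnicity == lv).astype(float) for lv in levels])
print(levels)    # column order of the dummies
print(dummies)
```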

14
Q

Does order matter? Choice of baseline?

A

When tuning, yes, order matters.

Model fit and predictions with qualitative predictors are independent of the choice of baseline. However, the hypothesis tests on the individual dummy coefficients do depend on that choice (each measures the effect of one category relative to the baseline). Solution: to check whether ethnicity matters at all, use an F-test of the hypothesis that all the ethnicity dummy coefficients are 0, which is independent of the coding.

15
Q

How good are the predictions?

A

“Confidence intervals” reflect the uncertainty in the estimated coefficients beta. “Prediction intervals” reflect that uncertainty plus the irreducible error epsilon, so they are always wider.
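A sketch of both intervals at a new point, under the standard linear-model formulas; interval_at is a hypothetical helper, and X and x0 are assumed to include the leading intercept entry of 1:

```python
import numpy as np
from scipy import stats

def interval_at(X, y, x0, level=0.95):
    """Confidence and prediction intervals for the response at a new point x0."""
    n, k = X.shape                                # k = p + 1 columns with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    sigma2 = rss / (n - k)
    XtX_inv = np.linalg.inv(X.T @ X)

    y0 = x0 @ beta
    se_mean = np.sqrt(sigma2 * (x0 @ XtX_inv @ x0))        # uncertainty in beta only
    se_pred = np.sqrt(sigma2 * (1 + x0 @ XtX_inv @ x0))    # adds irreducible error
    t = stats.t.ppf(0.5 + level / 2, df=n - k)
    return ((y0 - t * se_mean, y0 + t * se_mean),          # confidence interval
            (y0 - t * se_pred, y0 + t * se_pred))          # prediction interval
```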

16
Q

How good is the fit?

A

Focus on the residuals (at the training points). Three methods for assessing fit: R^2, RSE, and visualizing the data in interesting ways.

17
Q

R^2

A

Method for assessing the goodness of fit of a linear regression model. R^2 = Cor(Y, Yhat)^2 = 1 - RSS/TSS, where TSS is the total sum of squares. R^2 always increases as you add more variables (it is computed on the training data, so it can reward overfitting).
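A short sketch showing the two equivalent ways of computing R^2 on a least-squares fit with an intercept (the simulated data is for illustration only):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)   # simulated data

# Fit simple linear regression and compute fitted values
X = np.column_stack([np.ones(50), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2 = 1 - rss / tss
r2_corr = np.corrcoef(y, y_hat)[0, 1] ** 2   # same value: squared correlation of y and y_hat
print(r2, r2_corr)
```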

18
Q

RSE

A

Method for assessing the goodness of fit of a linear regression model. RSE = residual standard error, an estimate of the standard deviation of the residuals (related to RSS): RSE = sqrt(RSS/(n - p - 1)), which for simple linear regression (p = 1) reduces to sqrt(RSS/(n - 2)).
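A minimal sketch, assuming y and y_hat are the observed and fitted values and p is the number of predictors; rse is a hypothetical helper name:

```python
import numpy as np

def rse(y, y_hat, p):
    """Residual standard error: sqrt(RSS / (n - p - 1)); p = 1 gives sqrt(RSS / (n - 2))."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return np.sqrt(rss / (n - p - 1))
```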