Multiple linear regression Flashcards
disadvantage of doing regressions separately?
ignores the other predictors and any potential synergy (interaction) effect between them, which can lead to misleading results
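A minimal R sketch of the point (assuming the Boston data from the MASS package, with medv as response and lstat/age as hypothetical predictors, none of which come from the card above): the coefficients from separate simple regressions differ from those of the multiple regression, which accounts for both predictors at once.
library(MASS)                                  # Boston data set (assumed example data)
coef(lm(medv ~ lstat, data = Boston))          # lstat alone
coef(lm(medv ~ age, data = Boston))            # age alone
coef(lm(medv ~ lstat + age, data = Boston))    # both predictors together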
RSE?
RSE = sqrt(RSS / (n - p - 1)); the denominator n - p - 1 is the residual degrees of freedom
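A minimal sketch of the RSE formula, assuming a hypothetical fit medv ~ lstat + age on the Boston data (MASS package):
library(MASS)
lm_fit = lm(medv ~ lstat + age, data = Boston)   # hypothetical example model
n = nrow(Boston); p = length(coef(lm_fit)) - 1   # p predictors, n - p - 1 df
rss = sum(residuals(lm_fit)^2)
sqrt(rss / (n - p - 1))                          # manual RSE
sigma(lm_fit)                                    # same value reported by the fitted model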
why does R squared increase when non-zero inputs are added?
RSS always decreases (or at worst stays the same) when another non-zero predictor is added, because the coefficients are chosen by minimising RSS; since R squared = 1 - RSS/TSS, R squared can only increase
what does adjusted R squared do?
adds a penalisation factor to account for the number of predictors included in the model
formula of adjusted R square?
1 - (RSS/(n - p - 1)) / (TSS/(n - 1)), equivalently 1 - ((n - 1)/(n - p - 1)) * RSS/TSS. Always smaller than R squared (and can be negative)
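A sketch of the adjusted R squared formula (same hypothetical Boston fit as above), checked against summary():
library(MASS)
lm_fit = lm(medv ~ lstat + age, data = Boston)   # hypothetical example model
n = nrow(Boston); p = length(coef(lm_fit)) - 1
rss = sum(residuals(lm_fit)^2)
tss = sum((Boston$medv - mean(Boston$medv))^2)
1 - (rss / (n - p - 1)) / (tss / (n - 1))        # manual adjusted R squared
summary(lm_fit)$adj.r.squared                    # value reported by summary()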
what is the null hypothesis H0?
all slope coefficients are zero simultaneously (beta1 = beta2 = ... = betap = 0); the intercept is not part of the test
formula of F stats?
F = ((TSS - RSS)/p) / (RSS/(n - p - 1))
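A sketch of the F-statistic formula (same hypothetical Boston fit), compared with the value summary() reports:
library(MASS)
lm_fit = lm(medv ~ lstat + age, data = Boston)   # hypothetical example model
n = nrow(Boston); p = length(coef(lm_fit)) - 1
rss = sum(residuals(lm_fit)^2)
tss = sum((Boston$medv - mean(Boston$medv))^2)
((tss - rss) / p) / (rss / (n - p - 1))          # manual F statistic
summary(lm_fit)$fstatistic                       # value, numdf, dendf from summary()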
when no relation what is F stat?
approximately 1 (under H0 the F-statistic is expected to be close to 1)
when to reject null hypothesis?
when the p-value of the F-statistic is below the significance level, commonly 0.05
how does forward selection work?
- start with null model with intercept but no predictor
- successively include the most informative variable (lowest RSS, highest R squared)
- stop when a stopping rule is reached (e.g. all included variables have p-value < 0.05); see the step() sketch below
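A hedged forward-selection sketch using step() (the candidate predictors lstat, rm, age and crim are an arbitrary example set; note that step() adds variables by AIC rather than the p-value rule above):
library(MASS)
lm_null = lm(medv ~ 1, data = Boston)                   # intercept-only starting model
step(lm_null, scope = medv ~ lstat + rm + age + crim,   # candidate predictors
     direction = "forward")                             # add most informative first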
how does backward elimination work?
- start with full model with intercept and all predictors
- successively remove the least informative variable (largest p-value; its removal increases RSS the least)
- stop when a stopping rule is reached (e.g. all remaining variables have p-value < 0.05); see the step() sketch below
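A matching backward-elimination sketch with step() (same hypothetical predictors; again step() uses AIC, not p-values, as its stopping rule):
library(MASS)
lm_full = lm(medv ~ lstat + rm + age + crim, data = Boston)   # start from the full model
step(lm_full, direction = "backward")                         # drop least informative first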
how does cross validation work?
- split dataset into training and testing set
- train model using training set
- validate fitted model using testing set
how is validation error rate assessed?
the mean squared error on the testing set: MSE = RSS/n (see the validation-set sketch below)
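A validation-set sketch (assuming Boston, medv ~ lstat): split the data, fit on the training half, and compute the test MSE on the held-out half.
library(MASS)
set.seed(1)
train_id = sample(nrow(Boston), nrow(Boston)/2)            # random 50/50 split
fit = lm(medv ~ lstat, data = Boston, subset = train_id)   # fit on the training set
mean((Boston$medv - predict(fit, Boston))[-train_id]^2)    # test MSE = RSS/n on the test set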
process of leave one out CV?
- fit a model on the training set (obs = n - 1)
- validate the model using the testing set (obs = 1, the held-out observation)
- compute the test MSE for this round
- repeat the steps above n times, holding out each observation once, to obtain n MSEs
- construct the LOOCV estimate as the average of the n MSEs (see the sketch below)
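A manual LOOCV sketch (assuming Boston, medv ~ lstat); cv.glm() further down gives the same estimate without the explicit loop.
library(MASS)
n = nrow(Boston)
errs = rep(0, n)
for (i in 1:n) {
  fit = lm(medv ~ lstat, data = Boston[-i, ])                 # train on n - 1 observations
  pred = predict(fit, newdata = Boston[i, , drop = FALSE])    # predict the held-out row
  errs[i] = (Boston$medv[i] - pred)^2                         # squared error for this round
}
mean(errs)                                                    # LOOCV estimate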
K-fold CV?
- randomly split observations into k groups
- fit a model on the training set (obs = n - n1, where n1 is the fold size)
- validate the model using the testing set (obs = n1, the held-out fold)
- compute the test MSE for this round
- repeat the steps above k times, holding out each fold once, to obtain k MSEs
- construct the k-fold CV estimate as the average of the k MSEs (see the sketch below)
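A manual k-fold sketch (assuming Boston, medv ~ lstat, k = 10); cv.glm(..., K = 10) further down does the same thing.
library(MASS)
set.seed(1)
k = 10
folds = sample(rep(1:k, length.out = nrow(Boston)))              # randomly assign each obs to a fold
cv_mse = rep(0, k)
for (j in 1:k) {
  fit = lm(medv ~ lstat, data = Boston[folds != j, ])            # train on the other k - 1 folds
  test = Boston[folds == j, ]
  cv_mse[j] = mean((test$medv - predict(fit, newdata = test))^2) # test MSE for fold j
}
mean(cv_mse)                                                     # k-fold CV estimate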
lm for multiple linear regression?
lm_fit=lm(y~var1+var2,data=Boston)
summary(lm_fit)
get model with all variables?
lm_fit1=lm(y~. , data=Boston)
remove one or two variables?
lm_fit2=lm(y~. -var1 , data=Boston)
lm_fit3=lm(y~.-var1 -var2, data=Boston)
get null set?
lm_fit4=lm(y~1,data=Boston)
get correlation for all inputs?
round(cor(Boston),2)
how to visualise the pair-wise correlation matrix?
install.packages('corrplot')
library(corrplot)
cor_matrix=round(cor(Boston), 2)
corrplot(cor_matrix, type = "upper", order = 'alphabet',
tl.col = "black", tl.srt = 45, tl.cex = 0.9,
method = "circle")
for two variables that may interact, how to relax the additive assumption?
lm_non_add=lm(y~var1*var2, data=Boston)
scatterplot with linear assumption?
attach(Boston)
plot(var1,y,pch=16,col='gray50')
for polynomial?
lm_nonlinear=lm(y~poly(var1,2),data=Boston)
add line to scatterplot of non linear?
lines(sort(var1),fitted(lm_nonlinear)[order(var1)],lwd=2,col='deeppink3')
find just the coefficient of variable n intercept?
glm_fit=glm(y~x,data=Boston)
coef(glm_fit) #same as lm_fit
find LOOCV estimates?
library(boot) # cv.glm() lives in the boot package
cv_err=cv.glm(Boston,glm_fit)
cv_err$delta[1]
finding CV of different models and lowest MSE?
glm_fit2=glm(y~poly(x,2), data=Boston)
cv_err=cv.glm(Boston,glm_fit2)
OR
USE FOR LOOP
cv_error=rep(0,10)
for (i in 1:10){glm_fit=glm(y~poly(x,i), data=Boston)
cv_error[i]=cv.glm(Boston,glm_fit)$delta[1]}
plotting CV errors?
plot(cv_error,xlab='polynomial',main='test MSE', ylab='',type='b', pch=16)
CV for K fold?
cv_error=rep(0,10)
for (i in 1:10){glm_fit=glm(y~poly(x,i), data=Boston)
cv_error[i]=cv.glm(Boston,glm_fit,K=10)$delta[1]}