Week 5 DSE Flashcards
What is the disadvantage of running separate simple linear regressions instead of one multiple regression?
Ignores potential confounding factors or synergy effects between predictors, leading to misleading results
If there are 2 or more independent variables, how do we fit the model so that the error is minimized?
Use least squares to find the regression plane that best fits the data (the plane that minimizes the sum of squared errors).
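A minimal sketch of least squares with two predictors, assuming the normal equations (XᵀX)b = Xᵀy are solved directly. The function names `solve_linear_system` and `fit_least_squares` and the data are made up for illustration; they are not from the course material.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bp] minimizing the sum of squared errors."""
    # Prepend a column of ones so b0 acts as the intercept.
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])  # k = p + 1 parameters
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2 (no noise).
x = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in x]
coef = fit_least_squares(x, y)  # three estimates: b0, b1, b2
```

Because the data lie exactly on a plane, the fitted coefficients should recover the intercept and both slopes up to floating-point error. With noisy data the same code returns the plane with the smallest residual sum of squares.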
If there are 2 predictors, how many parameters are we supposed to estimate?
3
In general p+1: one slope per predictor plus the intercept
What is the error term of the ith point?
ϵi = yi − ŷi (observed value minus fitted value)
What does b2 = 188 mean?
Holding the expenditure on x1 constant, every 1-unit increase in x2 increases sales by 188 units on average
How many degrees of freedom does the RSE have for a multiple linear regression model?
n-(p+1)
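As a sketch of where these degrees of freedom enter, the residual standard error can be computed as RSE = sqrt(RSS / (n − (p+1))). The function name `residual_standard_error` and the numbers below are made up for illustration.

```python
import math


def residual_standard_error(y, y_hat, p):
    """RSE = sqrt(RSS / (n - (p + 1))) for a model with p predictors."""
    n = len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(rss / (n - (p + 1)))


# Made-up observed and fitted values: residuals are 1, -1, 0, 1, -1, so RSS = 4.
y = [3.0, 5.0, 7.0, 9.0, 11.0]
y_hat = [2.0, 6.0, 7.0, 8.0, 12.0]
rse = residual_standard_error(y, y_hat, p=2)  # n = 5, df = 5 - 3 = 2
```

Here RSS = 4 is divided by 2 degrees of freedom rather than by n = 5, so the estimate is larger than a plain root-mean-square residual would be.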
What is R^2?
fraction of variance explained
What is a flaw of R^2 in multiple linear regression?
Its value never decreases, even if we add redundant variables to the regression model.
What causes the flaw in R^2 in multiple linear regression?
Least squares solves for the coefficients such that RSS is minimized
If the variable does not improve model fit, its estimated coefficient will be zero, so R^2 remains unchanged.
If the variable improves model fit, its estimated coefficient will be nonzero ⇒ R^2 increases.
R^2 cannot decrease; it can only stay the same or increase
Remedy: use adjusted R^2
Can adjusted R^2 be negative?
Yes
What is the formula for the penalization factor?
It is always ______
(n-1) / (n-(p+1))
It is always larger than 1 (whenever there is at least one predictor)
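How does adjusted R^2 compare to R^2?
- always smaller (as long as R^2 < 1), can be negative
- when a new independent variable is added, adjusted R^2 decreases unless the variable improves the fit enough to offset the penalty
- includes the penalization factor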
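A small sketch of the comparison, assuming the usual definitions R^2 = 1 − RSS/TSS and adjusted R^2 = 1 − (RSS/TSS) × (n-1)/(n-(p+1)). The function names and all numbers are made up for illustration.

```python
def r_squared(rss, tss):
    """Fraction of variance explained."""
    return 1 - rss / tss


def adjusted_r_squared(rss, tss, n, p):
    """R^2 with the unexplained fraction scaled by the penalization factor."""
    penalty = (n - 1) / (n - (p + 1))
    return 1 - (rss / tss) * penalty


# Same fit quality (RSS = 20, TSS = 100, n = 25) with 2 versus 3 predictors:
# the third predictor is redundant, so RSS is unchanged.
r2 = r_squared(20.0, 100.0)
adj_p2 = adjusted_r_squared(20.0, 100.0, n=25, p=2)
adj_p3 = adjusted_r_squared(20.0, 100.0, n=25, p=3)

# A weak model on a small sample: adjusted R^2 drops below zero.
adj_weak = adjusted_r_squared(95.0, 100.0, n=10, p=3)
```

In this example R^2 stays at 0.8 when the redundant predictor is added, while adjusted R^2 falls because the penalization factor grows from 24/22 to 24/21. The weak model has R^2 = 0.05 but a negative adjusted R^2.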
What are H0 and H1 for multiple linear regression?
H0 : β1 = β2 = … = βp = 0
H1 : at least one βj is not zero.
Which test is used for hypothesis testing in multiple regression?
F-test (F-statistic)
Refer to the p-value of the F-statistic in the R output table
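For reference, a sketch of how the overall F-statistic is computed, assuming the standard form F = ((TSS − RSS)/p) / (RSS/(n-(p+1))). The function name `f_statistic` and the numbers are made up; the p-value itself needs the F distribution with p and n-(p+1) degrees of freedom, which is not computed here.

```python
def f_statistic(rss, tss, n, p):
    """Overall F-statistic for H0: all p slope coefficients are zero."""
    explained_per_predictor = (tss - rss) / p
    unexplained_per_df = rss / (n - (p + 1))
    return explained_per_predictor / unexplained_per_df


# Made-up values: TSS = 100, RSS = 20, n = 25 observations, p = 2 predictors.
f_value = f_statistic(20.0, 100.0, n=25, p=2)
```

A value far above 1 is evidence against H0; the p-value reported next to the F-statistic in the software output quantifies this.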
What is interpolation?
Predicting Y for a value of X that is within the range of the original data