Basics of Multiple Regression Flashcards
When should you use Logistic regression models?
If the dependent Y variable is discrete
If the dependent Y variable is qualitative (categorical). Qualitative independent X variables alone do not require logistic regression.
When should you use Multiple regression models?
When the dependent variable is continuous (not discrete) and there is more than one explanatory variable (more than one independent variable).
When multiple independent variables determine the outcome of a single dependent variable.
Dependent Y Variable is continuous
We have more than 1 Independent X variable
Assumption of Regression models
L.I.I.N.H.
Linearity: Relationship between dependent Y variable and Independent X variable is linear.
Independence of Errors: Regression residuals are uncorrelated across observations.
Independent: The independent X variables are not random, and there is no exact linear relationship between 2 or more independent variables.
Normality: Regression residuals are normally distributed.
Homoscedasticity: Constant variance of regression residuals
How to determine if a variable is significant?
|T-Stat| > critical t-value (roughly 2 for a two tail test at the 5% significance level with a large sample)
Degrees of freedom for SSR
k (the number of independent variables), taking SSR as the regression sum of squares
Degrees of freedom for SST
N-1
Degrees of freedom for SSE
N-(K+1) = N-K-1
What will happen to adjusted R-Square if we add insignificant variables?
Adjusted R-Square decreases
R-Square formula
SSR/SST = Explained variation / Total variation
1-(unexplained variation/total variation)
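Both forms of the R-Square formula can be checked with a minimal Python sketch. The data below are made up for illustration, and the function name r_squared is hypothetical. The fitted values are assumed to come from an OLS regression with an intercept, which is what makes SST = SSR + SSE hold.

```python
def r_squared(y, y_hat):
    """Return (SST, SSR, SSE, R-Square) for observed y and fitted y_hat."""
    y_bar = sum(y) / len(y)
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # total variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))    # unexplained variation
    ssr = sst - sse                                          # explained variation (OLS with intercept)
    return sst, ssr, sse, ssr / sst


# Made-up observations and their fitted values from y = 2.2 + 0.6x, x = 1..5
y = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

sst, ssr, sse, r2 = r_squared(y, y_hat)
r2_alt = 1 - sse / sst  # second form: 1 - (unexplained / total)
print(round(r2, 4), round(r2_alt, 4))
```

The two printed values agree, which is the point of the card: SSR/SST and 1-(SSE/SST) are the same quantity.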
What kind of test is this?
H0: bi = Bi
Ha: bi ≠ Bi
Two tail test
What kind of test is this?
H0: bi <= Bi
Ha: bi > Bi
Right tail test
Memory aid: the > in Ha points to the right tail
What kind of test is this?
H0: bi >= Bi
Ha: bi < Bi
Left tail test
Memory aid: the < in Ha points to the left tail
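The three kinds of test differ only in where the rejection region sits. A minimal sketch, assuming the t-stat and a positive critical value have already been obtained (the critical value would come from a t-table at the chosen significance level; the function name reject_null and the numbers are hypothetical):

```python
def reject_null(t_stat, critical_value, tail):
    """Decide whether to reject H0.

    critical_value is positive; tail is "two", "right", or "left".
    For a two tail test the caller supplies the alpha/2 critical value.
    """
    if tail == "two":
        return abs(t_stat) > critical_value   # reject in either tail
    if tail == "right":
        return t_stat > critical_value        # Ha: bi > Bi
    if tail == "left":
        return t_stat < -critical_value       # Ha: bi < Bi
    raise ValueError("tail must be 'two', 'right', or 'left'")


# Hypothetical t-stat of 2.5 against a critical value of 2.0
two_tail = reject_null(2.5, 2.0, "two")
right_tail = reject_null(2.5, 2.0, "right")
left_tail = reject_null(2.5, 2.0, "left")
print(two_tail, right_tail, left_tail)  # True True False
```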
Formula and purpose of AIC
AIC = n * ln(SSE/n) + 2(k+1)
AIC is better for forecasting purposes
Formula and purpose of BIC
BIC = n * ln(SSE/n) + ln(n)(k+1)
Better for evaluating goodness-of-fit
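The two formulas share the n * ln(SSE/n) term and differ only in the penalty on k+1. A minimal sketch comparing two models, using hypothetical values for n, k, and SSE (the function names aic and bic are illustrative). For both measures, the model with the lower value is preferred.

```python
import math


def aic(n, k, sse):
    """AIC = n * ln(SSE/n) + 2(k+1)."""
    return n * math.log(sse / n) + 2 * (k + 1)


def bic(n, k, sse):
    """BIC = n * ln(SSE/n) + ln(n)(k+1)."""
    return n * math.log(sse / n) + math.log(n) * (k + 1)


# Hypothetical models fitted on the same 50 observations
n = 50
model_a = {"k": 2, "sse": 120.0}   # fewer variables
model_b = {"k": 3, "sse": 118.0}   # one extra variable, slightly lower SSE

aic_a, aic_b = aic(n, **model_a), aic(n, **model_b)
bic_a, bic_b = bic(n, **model_a), bic(n, **model_b)

print("AIC prefers", "A" if aic_a < aic_b else "B")
print("BIC prefers", "A" if bic_a < bic_b else "B")
```

With these numbers the small drop in SSE does not pay for the extra variable under either measure. Because ln(n) exceeds 2 once n is 8 or more, BIC penalizes extra variables more heavily than AIC in most samples.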
How do we test joint coefficients?
F-Stat
[(SSE restricted - SSE unrestricted) / q] / [SSE unrestricted / (N-k-1)]
Alternative formula:
(SSE restricted - SSE unrestricted) x (N-k-1) / (SSE unrestricted x q)
q: The number of variables being tested (the variables left out of the restricted model). k is the number of independent variables in the unrestricted model.
SSE restricted: Model 1 does not include the two variables we want to test, so it is the restricted model.
SSE unrestricted: Model 2 includes the two variables we want to test, so it is the unrestricted model.
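A minimal sketch of the joint F-Stat, checking that the two versions of the formula give the same number. The SSE values, n, k, and q are hypothetical, and the function name joint_f_stat is illustrative.

```python
def joint_f_stat(sse_restricted, sse_unrestricted, q, n, k):
    """Joint F-Stat for q excluded variables.

    n = number of observations
    k = number of independent variables in the unrestricted model
    """
    numerator = (sse_restricted - sse_unrestricted) / q
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator


# Hypothetical: Model 1 (restricted) drops 2 of the 5 variables in Model 2
sse_r, sse_u = 150.0, 120.0
q, n, k = 2, 60, 5

f_stat = joint_f_stat(sse_r, sse_u, q, n, k)

# Alternative arrangement of the same formula
f_alt = (sse_r - sse_u) * (n - k - 1) / (sse_u * q)
print(round(f_stat, 4), round(f_alt, 4))
```

The result is compared with a critical F value with q and N-k-1 degrees of freedom. That lookup is not shown here because the Python standard library has no F distribution.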
Adjusted R square formula
1 - [(n-1)/(n-k-1)] x (1-R Squared)
Holding all other variables constant, the adjusted R-Square will decrease when all of the following variables increase except…
The number of observations
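A minimal sketch of the adjusted R-Square formula showing both effects, holding R-Square fixed. The inputs are hypothetical and the function name adjusted_r_squared is illustrative.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-Square = 1 - [(n-1)/(n-k-1)] x (1 - R-Square)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r2)


# Hypothetical baseline: R-Square 0.60, 30 observations, 3 variables
base = adjusted_r_squared(0.60, n=30, k=3)

# More observations, everything else unchanged: adjusted R-Square rises
more_obs = adjusted_r_squared(0.60, n=60, k=3)

# More variables that add nothing to R-Square: adjusted R-Square falls
more_vars = adjusted_r_squared(0.60, n=30, k=5)

print(round(base, 4), round(more_obs, 4), round(more_vars, 4))
```

In practice R-Square itself does not fall when a variable is added, so adjusted R-Square only drops when the gain in R-Square is too small to offset the larger (n-1)/(n-k-1) factor.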
Formula For F-Test
MSR / MSE
[ SSR / k ] / [ SSE / (n-(k+1)) ], where SSR is the regression sum of squares (written RSS in some texts)
[ Regression / k ] / [ Residual / (n-(k+1)) ]
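A minimal sketch of the F-Test formula, built from the Regression and Residual sums of squares as they would appear in an ANOVA table. The numbers are hypothetical and the function name f_statistic is illustrative.

```python
def f_statistic(regression_ss, residual_ss, n, k):
    """F = MSR / MSE, with k and n-(k+1) degrees of freedom."""
    msr = regression_ss / k              # mean square regression
    mse = residual_ss / (n - (k + 1))    # mean square error
    return msr / mse


# Hypothetical ANOVA values: 25 observations, 4 independent variables
f_value = f_statistic(regression_ss=80.0, residual_ss=40.0, n=25, k=4)
print(f_value)  # 10.0
```

Here MSR = 80/4 = 20 and MSE = 40/20 = 2, so F = 10.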
Hypothesis test for F test
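F-Stat > critical F value: Reject the null that b1 = b2 = … = bk = 0. At least one coefficient is different from zero.
F-Stat < critical F value: Fail to reject the null. We cannot conclude that any coefficient is different from zero.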
Are the independent variables correlated?
y=2+3x1
y=1.5+2x1+ 3x2
Yes, because when we added an additional variable (x2), the intercept and the coefficient on x1 changed. This suggests x1 and x2 are correlated.
The null hypothesis for F-test
All slope coefficients are equal to zero. In other words, none of the independent variables have a significant effect on the dependent variable.
H0: β1 = β2 = β3 = ⋯ = βk = 0
This implies that the model has no explanatory power: the variation in the dependent variable y is not explained by the independent variables.
The Alternative hypothesis for F-test
At least one regression coefficient is different from zero. In other words, at least one independent variable has a significant effect on the dependent variable.
H1: At least one βi ≠ 0 (for some i = 1, 2, …, k)
How to calculate the joint F-statistic
[(SSE of restricted model − SSE of unrestricted model) / q] / [SSE of unrestricted model / (n−k−1)]
Restricted model: The model that does not include the coefficients being tested. Its SSE is obtained from the SS value in the Residual row.