Business Forecasting Topic 7 Flashcards
multiple regression
- obtain a more powerful result by including more than one independent (explanatory) variable in the model
-> the dependent variable is explained better
model of multiple regression
y = β0 + β1x1 + β2x2 + … + βkxk + ε (population model with k independent variables), estimated from the sample as ŷ = b0 + b1x1 + … + bkxk
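A minimal sketch of fitting such a model in Python with statsmodels (the data and column names are hypothetical, just to illustrate):

import pandas as pd
import statsmodels.api as sm

# hypothetical data: sales explained by two independent variables
df = pd.DataFrame({
    "sales":       [120, 135, 148, 160, 152, 171, 183, 190],
    "price":       [9.5, 9.2, 9.0, 8.8, 8.9, 8.5, 8.3, 8.2],
    "advertising": [20, 24, 26, 30, 28, 34, 38, 40],
})
X = sm.add_constant(df[["price", "advertising"]])  # adds the intercept b0
model = sm.OLS(df["sales"], X).fit()
print(model.summary())  # reports R squared, adjusted R squared, F and t statistics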
coefficient of multiple determination
R squared
- measures the goodness of fit of the model to the data
- shows how well the variation in the dependent variable is explained by the model
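A sketch of the calculation behind R squared (observed and fitted values here are hypothetical):

import numpy as np

y     = np.array([10.0, 12.0, 14.0, 16.0, 18.0])  # observed dependent variable
y_hat = np.array([10.5, 11.8, 14.2, 15.7, 18.1])  # fitted values from the model
ss_res = np.sum((y - y_hat) ** 2)     # variation left unexplained by the model
ss_tot = np.sum((y - y.mean()) ** 2)  # total variation in y
r_squared = 1 - ss_res / ss_tot       # proportion of variation explained
print(r_squared)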
coefficient of multiple correlation
R = the square root of R squared
changes in R squared
- R squared increases (or at least fails to decrease) as the number of independent variables in the regression model increases
- this happens even if the new independent variables have no relationship with the dependent variable (i.e. are not worth including) -> counteract this with the adjusted R squared
adjusted R squared equation
adjusted R squared = 1 - (1 - R squared) x (n - 1) / (n - k - 1), where n = sample size and k = number of independent variables
- adjusted for the increase in the number of independent variables and for the sample size
- big sample = small adjustment
- more variables = smaller adjusted R squared
- corrects for the effect of unwanted extra variables
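A sketch of the adjustment by hand, matching the equation above (numbers are hypothetical):

def adjusted_r_squared(r2, n, k):
    # penalises R squared for each extra independent variable,
    # scaled by the sample size
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r_squared(0.90, n=30, k=4))  # 0.884: small penalty in a big sample
print(adjusted_r_squared(0.90, n=10, k=4))  # 0.820: bigger penalty in a small sample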
significance tests for the multiple regression model
- based on the same assumptions as the bivariate regression model
- address:
1. the validity of the regression model as a whole
2. the validity of the individual regression coefficients
- if an independent variable has no relationship with the dependent variable, its coefficient β in the true population relationship would be zero
F-statistic
- tests the statistical significance of the whole regression model
- most computer packages give F and its associated p-value automatically
- e.g. p < 0.0001 = reject H0; the model has explanatory power and at least one independent variable has a non-zero coefficient
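A sketch of how the F-statistic can be computed from R squared (n observations, k independent variables); scipy gives the p-value:

from scipy import stats

def f_test(r2, n, k):
    # explained variation per variable vs unexplained variation per residual df
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(f, k, n - k - 1)  # P(F >= f) under H0: all the βi are zero
    return f, p

print(f_test(0.90, n=30, k=4))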
significance of each regression coefficient
- hypotheses are set up for each independent variable (H0: its true coefficient βi = 0)
- the t-statistic is used to test the null hypothesis
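A sketch of the t-test for one coefficient (bi and its standard error would come from the regression output; the numbers here are hypothetical):

from scipy import stats

def coefficient_t_test(b_i, se_i, n, k):
    t = b_i / se_i                   # H0: the true coefficient βi = 0
    df = n - k - 1                   # residual degrees of freedom
    p = 2 * stats.t.sf(abs(t), df)   # two-sided p-value
    return t, p

print(coefficient_t_test(b_i=2.4, se_i=0.8, n=30, k=4))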
significance testing answers?
- could the relationship have arisen by chance (is the model useless for forecasting)?
- how well does the model fit the data?
- could individual variables appear significant just by chance?
multicollinearity
- some or all of the independent variables are highly correlated (related) with one another
- e.g. two independent variables are highly correlated, or a linear combination of a subset of the independent variables is highly correlated with another independent variable
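One common diagnostic in practice (an addition, not from the card itself) is the variance inflation factor; a sketch with deliberately correlated hypothetical predictors:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = x1 * 0.95 + rng.normal(scale=0.1, size=100)  # nearly a copy of x1
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2}))
# a VIF well above ~10 is a common warning sign of multicollinearity
for i, name in enumerate(X.columns):
    if name != "const":
        print(name, variance_inflation_factor(X.values, i))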
F test vs t test
t = tests the individual variables
F = tests the whole regression model
problem with multicollinearity
bi’s in the regression model = very imprecise estimates of true regression co-efficients the βi’s -> unreliable
difficult to precisely show the contributions
multicollinearity can lead to...
- estimated coefficients having the wrong signs (+ve vs -ve)
- misleading p-values for the t-tests -> wrong decisions about whether to include a variable
- it does not affect the predictive ability of the model; the danger is misleading indications about the nature of the relationships and about which variables to include
dealing with multicollinearity
- specialised techniques exist
- correlated variables can be combined into a single 'super variable' -> principal components analysis
- simplest procedure = drop one of two (or more) highly correlated variables, though this may mean the model loses some predictive power
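A sketch of the 'super variable' idea using principal components analysis via scikit-learn (the data are hypothetical):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = x1 * 0.9 + rng.normal(scale=0.2, size=100)  # highly correlated with x1
X = np.column_stack([x1, x2])
pca = PCA(n_components=1)
super_variable = pca.fit_transform(X)  # one combined variable replaces the correlated pair
print(pca.explained_variance_ratio_)   # share of the variation the super variable keeps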