Business Forecasting Topic 7 Flashcards

1
Q

multiple regression

A
  • a more powerful result is obtained by including more than one independent (explanatory) variable in the model

-> the dependent variable is explained better

2
Q

model of multiple regression

A
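With k independent variables, the standard population model is:

Y = α + β1X1 + β2X2 + … + βkXk + ε

estimated from sample data as ŷ = a + b1x1 + b2x2 + … + bkxk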
3
Q

Coefficient of multiple determination

A

R squared
the goodness of fit of the model to the data is measured by this

shows how well the model explains the variation in the dependent variable

4
Q

coefficient of multiple correlation

A

R = the square root of R squared (the coefficient of multiple determination)

5
Q

changes in R squared

A

R squared increases (or at least fails to decrease) as the number of independent variables added to the regression model increases

even if the new independent variables have no relationship with the dependent variable (i.e. are not worth including) -> counteract this with adjusted R squared

6
Q

adjusted r squared equation

A

big sample = small adjustment
more variables = adjusted R squared is smaller

corrects for unwanted variables

adjusts R squared for the number of independent variables and the sample size
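In the standard formula, n is the sample size and k the number of independent variables:

adjusted R² = 1 − (1 − R²) × (n − 1) / (n − k − 1)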

7
Q

significance tests for the multiple regression model

A
  • based on the same assumptions as the bivariate regression model

address:
1. the validity of the regression model as a whole
2. the validity of the individual regression coefficients

if an independent variable has no relationship with the dependent variable, its β would have a value of zero in the true population relationship

8
Q

f-statistic

A

tests the statistical significance of the whole regression model

most statistical packages give F and the associated p-value automatically

e.g. p < 0.0001 = reject H0; the model has explanatory power and at least one independent variable has a non-zero coefficient
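A minimal sketch of reading these statistics in Python, assuming statsmodels and made-up variable names (sales, price, advertising):

import numpy as np
import pandas as pd
import statsmodels.api as sm

# illustrative data: sales explained by price and advertising spend
rng = np.random.default_rng(0)
df = pd.DataFrame({"price": rng.uniform(1, 10, 50),
                   "advertising": rng.uniform(0, 5, 50)})
df["sales"] = 100 - 4 * df["price"] + 6 * df["advertising"] + rng.normal(0, 3, 50)

X = sm.add_constant(df[["price", "advertising"]])  # add the intercept term
model = sm.OLS(df["sales"], X).fit()

print(model.fvalue, model.f_pvalue)        # F-statistic and its p-value (whole model)
print(model.rsquared, model.rsquared_adj)  # R squared and adjusted R squared
print(model.pvalues)                       # t-test p-values for each coefficient

The same fit also gives the t-test p-values used on the next card.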

9
Q

significance of each regression coefficient

A

for each independent variable, set up hypotheses (H0: β = 0 vs H1: β ≠ 0)
t-statistic = used to test the null hypothesis that the coefficient is zero

10
Q

significance testing answers?

A
  1. could the apparent relationship have arisen by chance, making the model useless for forecasting?
  2. how good is the model's fit to the data?
  3. could the apparent contribution of individual variables have arisen by chance?
11
Q

multicollinearity

A
  • occurs when some or all of the independent variables are highly correlated (related) with each other

simplest case: two independent variables are highly correlated; more generally, a linear combination of a subset of the independent variables is highly correlated with another independent variable

12
Q

f test vs t test

A

t-test = individual variables

F-test = whole regression model

13
Q

problem with multicollinearity

A

the bi's in the regression model become very imprecise estimates of the true regression coefficients, the βi's -> unreliable
difficult to show precisely the contribution of each independent variable

14
Q

multicollinearity can lead to…

A
  1. estimated coefficients having the wrong signs (+ve vs -ve)
  2. misleading p-values for t-tests -> wrong decisions about whether to include a variable
  • doesn't affect the predictive ability of the model; the danger is misleading indications about the nature of the relationships and about which variables to include
15
Q

dealing with multicollinearity

A
  • specialised techniques exist
  • correlated variables can be combined into a single 'super variable' -> principal components analysis

- simplest procedure = remove one of two (or more) highly correlated variables; because they carry overlapping information, this need not mean the model loses much predictive power

16
Q

finding multicollinearity

A

to check whether it exists in a problem -> look at the correlation matrix

if an explanatory variable is highly correlated with another explanatory variable = chance of multicollinearity

  • or use the variance inflation factor (VIF) and tolerance (the inverse of VIF)
    largest VIF > 10 = multicollinearity (see the sketch below)
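A minimal sketch of both checks in Python, assuming statsmodels and hypothetical column names:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# illustrative data with two deliberately near-collinear predictors
rng = np.random.default_rng(1)
price = rng.uniform(1, 10, 50)
df = pd.DataFrame({"price": price,
                   "competitor_price": price + rng.normal(0, 0.1, 50),
                   "advertising": rng.uniform(0, 5, 50)})

print(df.corr())  # correlation matrix: look for high pairwise correlations

X = sm.add_constant(df)
for i, name in enumerate(X.columns[1:], start=1):  # skip the constant
    vif = variance_inflation_factor(X.values, i)
    print(name, vif, 1 / vif)  # VIF and tolerance; VIF > 10 flags trouble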
17
Q

Dummy variables

A

assume values of either 0 or 1

used to represent nominal variables - often increase the explanatory or predictive power of a regression model

multicollinearity among dummy variables = dealt with by reducing the number of dummy variables in the equation by 1 (see the sketch below)
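A minimal Python sketch, assuming pandas and a hypothetical season variable:

import pandas as pd

# one nominal variable with three categories
df = pd.DataFrame({"season": ["winter", "spring", "summer", "spring"]})

# drop_first=True keeps one fewer dummy than there are categories,
# which avoids multicollinearity among the dummies
print(pd.get_dummies(df, columns=["season"], drop_first=True))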

18
Q

perfect multicollinearity

A

makes it impossible to estimate the model (e.g. when one independent variable is an exact linear function of another)

19
Q

standardised regression coefficients

A
  • interpretation is made easier if the coefficients are standardised

standardisation makes α, the constant in the regression model, disappear; the new regression coefficients = beta coefficients

20
Q

beta coefficients

A
  • don't depend on the units of measurement of the different variables
  • give a better idea of the relative importance of the independent variables -> useful for modelling consumer choice
  • tell us the relative importance to consumers of the different attributes of a product (see the sketch below)
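A minimal sketch of standardisation in Python, with made-up sales data (statsmodels assumed):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
df = pd.DataFrame({"price": rng.uniform(1, 10, 50),
                   "advertising": rng.uniform(0, 5, 50)})
df["sales"] = 100 - 4 * df["price"] + 6 * df["advertising"] + rng.normal(0, 3, 50)

# standardise every variable to mean 0 and standard deviation 1
z = (df - df.mean()) / df.std()

model = sm.OLS(z["sales"], sm.add_constant(z[["price", "advertising"]])).fit()
print(model.params)  # constant ~0; the others are the beta coefficients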
21
Q

building a regression model

A
  • multicollinearity complicates decisions on which variables to include
  • prerequisite = an adequate theory/rationale to justify the decision to include a variable
  • it may be justified to include an independent variable even if the p-value on its t-test is large
22
Q

automated approaches to model building

A
  1. stepwise regression
  2. best subsets regression
23
Q

step-wise regression

A

forward-backward stepwise method:
1. identify a bivariate model by choosing the independent variable most highly correlated with the dependent variable
2. the variable adding the most predictive power is selected next and a new regression equation is estimated
3. both independent variables are then tested to see whether their inclusion is still justified (a sketch of the forward step follows below)
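A minimal sketch of the forward step in Python, assuming statsmodels and a DataFrame df whose columns include the dependent variable (hypothetical names); the backward re-testing step is omitted:

import statsmodels.api as sm

def forward_select(df, dependent, threshold=0.01):
    """Add, one at a time, the candidate whose t-test p-value is
    smallest, while that p-value stays below the threshold."""
    remaining = [c for c in df.columns if c != dependent]
    selected = []
    while remaining:
        pvals = {}
        for candidate in remaining:
            X = sm.add_constant(df[selected + [candidate]])
            pvals[candidate] = sm.OLS(df[dependent], X).fit().pvalues[candidate]
        best = min(pvals, key=pvals.get)
        if pvals[best] >= threshold:  # conservative level, e.g. 1% (see next card)
            break
        selected.append(best)
        remaining.remove(best)
    return selected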

24
Q

limitations of stepwise method

A
  1. potentially useful variables may be excluded if multicollinearity is present
  2. repeated significance tests reduce their power - choose conservative significance levels when determining inclusion or exclusion (e.g. 1%)
25
Q

best sub sets regression

A

every possible model (every combination of the independent variables) is identified and the model with the greatest predictive power is selected
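A minimal Python sketch, assuming statsmodels and using adjusted R squared as the selection criterion (one common choice; card 27 notes its pitfalls):

from itertools import combinations
import statsmodels.api as sm

def best_subset(df, dependent):
    """Fit every combination of explanatory variables and keep
    the fit with the highest adjusted R squared."""
    candidates = [c for c in df.columns if c != dependent]
    best_vars, best_fit = None, None
    for k in range(1, len(candidates) + 1):
        for subset in combinations(candidates, k):
            X = sm.add_constant(df[list(subset)])
            fit = sm.OLS(df[dependent], X).fit()
            if best_fit is None or fit.rsquared_adj > best_fit.rsquared_adj:
                best_vars, best_fit = subset, fit
    return best_vars, best_fit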

26
Q

Forced entry method

A

'Enter' in SPSS

all variables considered to make a contribution towards the outcome (dependent variable) are forced into the multiple regression and a model is obtained

explanatory variables that are not significant in the initial model are removed = final model

27
Q

concerns in using highest adjusted R squared to chose model

A

- variables only need to exceed a low threshold to be included (a t-value just over 1 is enough to raise adjusted R squared) = the model ends up with too many variables
- conflicts with the principle of parsimony (keep the model as simple as possible)

28
Q

concerns with using R squared alone

A

can make us think the model is improved simply because variables have been added, even when they contribute nothing real