Business Forecasting Topic 6 Flashcards

1
Q

Regression Analysis

A

explains the behaviour of the variable we want to predict through its relationship with one or more other variables

  • examines various aspects of the relationship between the criterion and 1 or more explanatory variables (effect of explanatory on criterion)
  • must distinguish between 2
    -used for forecasting
2
Q

criterion

A

dependent variable

3
Q

explanatory variable

A

independent variable

4
Q

correlation

A

measures the strength of association between 2 variables

  • initial assessment = scatter diagram
  • on the scatter diagram the dependent variable goes on the vertical axis (y)
5
Q

regression

A

describe nature of association between variables

6
Q

regression used to

A
  1. provide understanding of the relationship between variables (e.g. effectiveness of activities)
  2. make forecasts
7
Q

product moment correlation coefficient
PMCC

A

r
objective measure of strength of (linear) association between 2 variables

  • -1 = perfect negative correlation
    +1 = perfect positive correlation
    0 = no linear correlation
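
As a rough illustration, r can be computed from the deviations of each variable from its own mean. The function name `pmcc` and the advertising/sales figures are made up for this sketch and are not from the course material.

```python
import math

def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: advertising spend (x) and sales (y)
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r = pmcc(advertising, sales)                     # moderately strong positive association
r_perfect_negative = pmcc([1, 2, 3], [3, 2, 1])  # points lie exactly on a falling line
print(round(r, 3), round(r_perfect_negative, 3))
```

A value near +1 or -1 only says the points lie close to a straight line; it says nothing about why.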
8
Q

interpreting correlation

A
  1. high correlation doesn’t imply a causal relationship; it could be due to other factors
  2. outliers and influential observations can distort (change) the correlation
  3. small sample -> observed correlation can be high by chance even when there is no real association
  4. PMCC only measures linear association
9
Q

2 other causes of a high correlation

A
  1. coincidence (over time period both increase but no link)
  2. hidden third variable/lurking variable (influence both variables)
10
Q

Bivariate regression

A

fitting a line through scatter of points on scatter diagram

11
Q

least squares criterion

A

best fitting line is the one minimising the sum of squared vertical deviations of the points from the line
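
A minimal sketch of the least squares criterion, assuming the line is written as y = a + b*x (a = intercept, b = slope; this notation and the data are assumptions for illustration). It fits the line and then checks that shifting it gives a larger sum of squared deviations.

```python
def least_squares(xs, ys):
    """Return (a, b) for the line y = a + b*x with the smallest sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (mean_x, mean_y)
    return a, b

def sum_squared_residuals(a, b, xs, ys):
    """Sum of squared vertical deviations of the points from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = least_squares(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
shifted = sum_squared_residuals(a + 0.5, b, xs, ys)  # same slope, line moved up by 0.5
print(a, b, best, shifted)
```

Any other choice of intercept or slope gives a sum of squares at least as large as `best`.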

12
Q

residual or error

A

vertical deviation from the line

13
Q

best fitting line represented by equation

A

y = a + bx
a = intercept, b = slope

14
Q

interpolation

A

forecast made for a value of the explanatory variable within the observed data range = more reliable

15
Q

extrapolation

A

anything beyond data limits
falls outside of our observed points
- less reliable

  • assumption that same linear relationship applies may not be valid
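
The difference between interpolation and extrapolation can be sketched as a simple range check on the explanatory variable. The fitted coefficients (a = 2.2, b = 0.6), the observed x values, and the function name `forecast` are hypothetical.

```python
def forecast(x_new, a, b, observed_xs):
    """Forecast y = a + b*x_new and say whether x_new lies inside the observed data range."""
    prediction = a + b * x_new
    inside = min(observed_xs) <= x_new <= max(observed_xs)
    kind = "interpolation" if inside else "extrapolation"
    return prediction, kind

# Hypothetical fitted line and the x values it was fitted on
a, b = 2.2, 0.6
observed_xs = [1, 2, 3, 4, 5]

inside_forecast = forecast(3.5, a, b, observed_xs)   # within 1..5: more reliable
outside_forecast = forecast(10, a, b, observed_xs)   # beyond the data: less reliable
print(inside_forecast, outside_forecast)
```

The arithmetic is identical in both cases; what changes is how far the linear assumption can be trusted.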
16
Q

coefficient of determination

A

r squared
measures goodness of fit
values from 0 to 1.0
1 = perfect fit
e.g. 0.817 = 81.7% of variation in the dependent variable is explained by the model

high value doesn’t guarantee the best regression model has been obtained -> can only say the model fits past data well (it could still yield poor forecasts)

in bivariate regression = square of the coefficient of correlation (PMCC)
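
A small sketch of r squared as the share of variation explained, computed as 1 minus (unexplained variation / total variation). In bivariate regression this equals the square of the PMCC, which the last lines check. The data and helper names are assumptions for illustration.

```python
import math

def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

def r_squared(xs, ys):
    """Coefficient of determination: 1 - unexplained variation / total variation."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - unexplained / total

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2 = r_squared(xs, ys)

# PMCC for the same data, to compare its square with r squared
n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)
r = sxy / math.sqrt(sxx * syy)
print(round(r2, 3), round(r ** 2, 3))
```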

17
Q

significance of the regression line

A

if the two variables are not related, the population slope β is zero -> test whether the sample slope differs significantly from zero

18
Q

population values of intercept and slope

A

intercept = α
slope = β

19
Q

t - test

A

null hypothesis: β = 0 (no relationship between the variables)
generates a p value
e.g. p = 0.001 -> reject null hypothesis, slope is statistically significant
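
A sketch of the t statistic behind this test, assuming the usual formula t = b / SE(b) with n - 2 degrees of freedom. The data are hypothetical. Python's standard library has no t distribution, so the sketch stops at t; the p value would come from statistical software or t tables.

```python
import math

def slope_t_statistic(xs, ys):
    """Return (b, t, degrees_of_freedom) for testing the null hypothesis slope = 0."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    # Residual variance uses n - 2 because two parameters (a and b) were estimated
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    residual_variance = sse / (n - 2)
    standard_error_b = math.sqrt(residual_variance / sxx)
    return b, b / standard_error_b, n - 2

# Hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, t, dof = slope_t_statistic(xs, ys)
print(round(b, 3), round(t, 3), dof)
```

The further t is from zero, the smaller the p value and the stronger the evidence against β = 0.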

20
Q

4 assumptions underpinning the significance test

A
  1. mean of errors is 0
  2. errors = normally distributed
  3. homoscedasticity
  4. errors associated with any 2 observations are independent
  • test how well these are met by inspecting residuals of model
21
Q

homoscedasticity

A

variance of error is same
irrespective of the value of independent variable

22
Q

inspecting the residuals of the regression model

A

see if assumptions appear to be met - useful to obtain plots of residuals

  1. histogram of residuals may reveal that errors are not normally distributed
  2. plotting residuals may reveal assumption of homoscedasticity is not valid
  3. plotting residuals against independent variable may reveal assumption of linearity is wrong
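
As an illustration of the third check, the sketch below fits a straight line to made-up data that actually follow a curve (y = x squared) and lists the residuals in order of the independent variable. A run of positive, then negative, then positive residuals is the kind of pattern a residual plot would show when linearity is wrong.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data with a curved (non-linear) relationship
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Residuals listed against the independent variable: look for a systematic pattern
for x, e in zip(xs, residuals):
    print(x, round(e, 2))
```

With a well-specified linear model the residuals would scatter around zero with no visible pattern.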
23
Q

limitations of bivariate regression analysis

A
  1. model assumes linear relationship between variables
  2. forecasts assume the relationship observed previously will continue, but the underlying relationship may change over time
  3. only 1 variable is used to forecast (other variables will also be associated with the dependent variable)
  4. observation with a large residual = outlier, which is different from an influential observation
24
Q

influential observation

A

KING KONG EFFECT
- large influence on line of best fit (if omitted from analysis, position of line changes)
- lie to extreme right or left of scatter, away from bulk of points

  • draw regression line towards them
  • so do not have large residuals, therefore not outliers
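
A sketch of this effect with made-up data: five points in a cluster plus one observation far to the right. Fitting the line with and without that observation shows how far it pulls the slope, while its own residual stays small.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data: the last point (x = 20) lies far to the right of the bulk
xs = [1, 2, 3, 4, 5, 20]
ys = [2, 4, 5, 4, 5, 6]

a_all, b_all = fit_line(xs, ys)            # fitted with the extreme observation
a_bulk, b_bulk = fit_line(xs[:-1], ys[:-1])  # fitted without it

residuals = [y - (a_all + b_all * x) for x, y in zip(xs, ys)]
influential_residual = abs(residuals[-1])
largest_other_residual = max(abs(e) for e in residuals[:-1])
print(round(b_bulk, 3), round(b_all, 3))
print(round(influential_residual, 3), round(largest_other_residual, 3))
```

Because the line is drawn towards the extreme point, checking residuals alone would not flag it; comparing fits with and without it does.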