HANDOUT 2 Flashcards
Serial correlation =
the presence of some form of linear dependence over time for some series Zt.
Auto-Correlation function =
a pictorial representation of this linear dependency over time. It measures the correlation between Zt and Zt-k for different values of k.
Corr(Zt, Zt-k) formula
= COV(Zt, Zt-k)/sqrt[V(Zt) V(Zt-k)]
V(Zt) = V(Zt-k) = gamma 0 by stationarity
COV(Zt, Zt-k) = gamma k since only depends on distance apart by stationarity.
So Corr = gamma k / gamma 0 = Pk
If p=1 this means
Shock today puts the series on a new path; it never returns to the old equilibrium, as the shock is never forgotten.
If p=0 this means
Shock today, next period no memory of it = immediate adjustment back to equilibrium.
A change in X1t on Yt in a model with no lags means…
change in X1t only affects Y today
At t+1, immediately adjust back to equilibrium.
White noise process formula
Zt = εt
E(Zt) for white noise
E(Zt) = 0
V(Zt) for white noise
V(Zt) = V(εt) = sigma^2
COV(Zt, Zt-k) for white noise
= 0 since E(εt εt-k) = 0
P1 to Pk for white noise
P0 = 1; Pk = 0 for all k ≥ 1
ACF for white noise
Shock at period 0, immediately return back to equilibrium by period 1.
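This can be checked by simulation: a quick numpy sketch (seed and sample size are made up for illustration) where simulated white noise should show sample autocorrelations near 0 at every lag k ≥ 1.

```python
import numpy as np

rng = np.random.default_rng(0)
e = rng.standard_normal(100_000)      # white noise: Zt = epsilon_t

def acf(z, k):
    """Sample autocorrelation at lag k: gamma_k / gamma_0."""
    z = z - z.mean()
    return float(np.dot(z[k:], z[:len(z) - k]) / np.dot(z, z))

print(acf(e, 1), acf(e, 5))           # both close to 0 (acf(e, 0) would be 1)
```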
What does an ACF show?
The proportion of a shock remaining k periods later. The correlation compared to period 0 when the shock hits.
AR1 model formula
Zt = phi Zt-1 + εt
one lag of dependent variable
What condition must we impose on phi for AR1 and why?
|phi| < 1
For stationarity - otherwise shock will never dissipate out of system.
E(Zt) for AR1
E(Zt) = phi E(Zt-1) + E(€t)
E(Zt) = E(Zt-1) by stationarity
(1-phi)E(Zt) = 0
Since |phi| < 1, 1 - phi ≠ 0, so it must be that E(Zt) = 0
V(Zt) for AR1
V(Zt) = phi^2 V(Zt-1) + V(εt) + 2 phi COV(Zt-1, εt); the covariance term is 0
V(Zt) = V(Zt-1) by stationarity
(1 - phi^2)V(Zt) = sigma^2
V(Zt) = sigma^2 / (1 - phi^2)
COV(Zt, Zt-1) for AR1
Zero mean, so COV(Zt, Zt-1) = E(Zt Zt-1) = E[Zt-1(phi Zt-1 + εt)] = phi V(Zt-1).
Hence gamma 1 = phi gamma 0 = phi sigma^2 / (1 - phi^2)
Corr(Zt, Zt-1) for AR1
Corr = gamma 1 / gamma 0 = phi
P2 for AR1
P2 = phi^2
Pk for AR1
Pk = phi^k
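The AR1 results above (variance sigma^2 / (1 - phi^2) and Pk = phi^k) can be verified by simulation; this is an illustrative sketch with made-up phi, seed and sample size.

```python
import numpy as np

rng = np.random.default_rng(1)
phi, T = 0.7, 300_000
eps = rng.standard_normal(T)
z = np.zeros(T)
for t in range(1, T):                  # Zt = phi*Z(t-1) + eps_t
    z[t] = phi * z[t - 1] + eps[t]
z = z[1_000:]                          # drop burn-in so the start-up value washes out

zc = z - z.mean()
gamma0 = zc @ zc / len(zc)             # sample variance, ~ 1/(1 - phi^2) ≈ 1.96
rho = [float(zc[k:] @ zc[:len(zc) - k] / (zc @ zc)) for k in (1, 2, 3)]
print(gamma0, rho)                     # rho_k ≈ phi^k = 0.7, 0.49, 0.343
```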
2 shapes for AR1 ACF
In both cases |phi| < 1:
If phi > 0: smooth decay to 0.
If phi < 0: oscillations, but still decaying to 0.
Roots for AR1 using lag operator
(1 - phi L)Zt = εt
Solve 1 - phi L = 0 in terms of v = L^-1:
v - phi = 0
v = phi, so AR1 has 1 root.
Why when solving for roots, do we solve for L^-1 and not L?
Because if we solved for L, the root in the stationary case would be L = 1/phi > 1; solving for v = L^-1 instead gives the stationarity condition |v| < 1, matching |phi| < 1 directly.
AR2 model formula
Zt = phi1 Zt-1 + phi2 Zt-2 + εt
Roots for AR2
1 - phi1 L - phi2 L^2 = 0
v = L^-1; v^2 = L^-2
v^2 - phi1 v - phi2 = 0
solve by quadratic formula - 2 roots for AR2
E(Zt) for AR2
E(Zt) = 0
V(Zt) for AR2
gamma 0 = phi1 gamma 1 + phi2 gamma 2 + sigma^2
Yule-Walker equations x 2 for AR2
1. gamma 1 = phi1 gamma 0 + phi2 gamma 1
2. gamma 2 = phi1 gamma 1 + phi2 gamma 0
How many possible ACFs for AR2?
4
P1 = Corr(Zt, Zt-1) for AR2
phi1 / (1 - phi2)
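The Yule-Walker result P1 = phi1 / (1 - phi2) can be checked numerically; a sketch with made-up (but stationary) phi1, phi2 and seed.

```python
import numpy as np

rng = np.random.default_rng(2)
phi1, phi2, T = 0.5, 0.3, 300_000      # stationary AR2 (both roots inside unit circle)
eps = rng.standard_normal(T)
z = np.zeros(T)
for t in range(2, T):
    z[t] = phi1 * z[t - 1] + phi2 * z[t - 2] + eps[t]
z = z[1_000:] - z[1_000:].mean()       # drop burn-in, demean

rho1 = float(z[1:] @ z[:-1] / (z @ z))
print(rho1, phi1 / (1 - phi2))         # both ≈ 0.714
```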
MA1 model formula
Zt = εt + theta εt-1
A moving average of the current and previous shock
1 period memory only
What conditions do we need on theta for MA1? Why?
NO conditions - an MA model is stationary by definition.
E(Zt) for MA1
0
V(Zt) for MA1
(1 + theta^2) sigma^2
COV(Zt, Zt-1) for MA1
E[(εt + theta εt-1)(εt-1 + theta εt-2)]
All cross-products are 0; 1 term in common: theta E(εt-1^2)
gamma 1 = theta sigma^2
P1 for MA1
theta / (1 + theta^2)
P2–>Pk for MA1
all 0 - 1 period memory only
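The MA1 moments above are easy to confirm by simulation; an illustrative sketch (theta, seed and sample size are made up).

```python
import numpy as np

rng = np.random.default_rng(3)
theta, T = 0.5, 300_000
eps = rng.standard_normal(T + 1)
z = eps[1:] + theta * eps[:-1]         # Zt = eps_t + theta*eps_(t-1)
zc = z - z.mean()

rho1 = float(zc[1:] @ zc[:-1] / (zc @ zc))
rho2 = float(zc[2:] @ zc[:-2] / (zc @ zc))
print(z.var(), rho1, rho2)   # ≈ (1+theta^2)sigma^2 = 1.25, theta/(1+theta^2) = 0.4, 0
```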
An AR1 can be written as…
AR1 = MA(infinity)
Write AR1 as an MA(infinity)
(1 - phi L)Zt = εt, so Zt = [1 / (1 - phi L)] εt.
Geometric series with a = 1, r = phi L:
1 / (1 - phi L) = 1 + phi L + (phi L)^2 + (phi L)^3 + ...
Multiply by εt:
Zt = εt + phi εt-1 + phi^2 εt-2 + phi^3 εt-3 + ... = infinite MA process.
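The MA(infinity) weights can be seen by feeding one unit shock through the AR1 recursion: the response j periods later should be exactly phi^j. A small sketch (phi is made up):

```python
import numpy as np

phi, T = 0.6, 12
eps = np.zeros(T)
eps[0] = 1.0                           # unit shock at t = 0, nothing after
z = np.zeros(T)
for t in range(T):                     # Zt = phi*Z(t-1) + eps_t
    z[t] = (phi * z[t - 1] if t > 0 else 0.0) + eps[t]
print(np.allclose(z, phi ** np.arange(T)))   # True: weight on eps_(t-j) is phi^j
```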
How do ACF and PACF differ?
ACF - always compare to Zt
PACF - what’s the additional effect of Zt-j, holding all lags before constant. Want to see direct effect over and above all other lags.
How do we determine Ps for PACF?
1. Regress Zt on Zt-1; plot P11 (the coefficient on Zt-1).
2. Regress Zt on Zt-1 & Zt-2; plot P22 only (the coefficient on Zt-2).
3. Regress Zt on Zt-1, Zt-2, Zt-3; plot P33 only.
These are partial effects.
What does the PACF for AR look like?
AR(j) = j non-zero terms; all else zero.
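This PACF pattern can be checked directly by running the lag regressions above on a simulated AR1; a sketch with made-up phi, seed and sample size.

```python
import numpy as np

rng = np.random.default_rng(4)
phi, T = 0.6, 200_000
eps = rng.standard_normal(T)
z = np.zeros(T)
for t in range(1, T):
    z[t] = phi * z[t - 1] + eps[t]
z = z - z.mean()

def pacf(z, j):
    """P_jj: coefficient on Z(t-j) when Zt is regressed on lags 1..j."""
    X = np.column_stack([z[j - i:len(z) - i] for i in range(1, j + 1)])
    b, *_ = np.linalg.lstsq(X, z[j:], rcond=None)
    return float(b[-1])

print(pacf(z, 1), pacf(z, 2), pacf(z, 3))   # ≈ phi, 0, 0 for an AR1
```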
Why is determining PACF for AR easy?
Because we already have Zt as a function of Zt-1 etc.
If we have an AR2, how do we find P11?
P11 = the coefficient from regressing Zt on Zt-1 only, but the true model also contains Zt-2, so Zt-2 is an omitted relevant variable.
Standard omitted-relevant-variable formula:
P11 = phi1 + phi2 [COV(Zt-1, Zt-2) / V(Zt-1)]
Why is it harder to find PACF for MA?
Because we do NOT already have Zt as a function of Zt-1 etc. Need to get rid of εt.
Use lag operator to find PACF coefficients for MA1
Zt = εt + theta εt-1 = (1 + theta L)εt, so εt = [1 / (1 + theta L)]Zt.
Geometric series with a = 1, r = -theta L (needs |theta| < 1 to converge):
εt = Zt - theta Zt-1 + theta^2 Zt-2 - ...
Rearranging: Zt = theta Zt-1 - theta^2 Zt-2 + ... + εt
P11 = theta + bias, P22 = -theta^2 + bias, etc. (each finite regression omits the remaining lags).
How are AR1 and MA1 similar in terms of PACF and ACF?
ACF MA = PACF AR
ACF AR = PACF MA
When we have serial correlation but NO lagged dependent variable, is OLS unbiased?
YES - we only need E(εt | X) = 0 for unbiasedness.
So what is the problem with OLS when we have serial correlation but NO lagged dependent variable?
The variance estimates are wrong
Because the cross-products COV(εt, εs) ≠ 0
So OLS gives wrong SE = wrong t-ratios = all hypothesis testing is wrong.
V(b1) formula when we have no lagged dependent variable but serial correlation
V(b1) = sigma^2 sum(t) Wt^2 + 2 sum(t=1,…,T) sum(s=t+1,…,T) [Wt Ws gamma(s-t)]
Solution to issue with serial correlation with NO lagged dependent variable
Use Newey-West "Heteroskedasticity and Autocorrelation Consistent" standard errors (HAC SEs)
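A minimal numpy sketch of the Newey-West idea (Bartlett-weighted long-run covariance of the scores; the DGP, phi, seed and truncation lag are all made up for illustration). With serially correlated errors and a persistent regressor, the HAC standard error comes out larger than the classical OLS one.

```python
import numpy as np

rng = np.random.default_rng(5)
T, L = 2_000, 8                        # sample size and truncation lag (assumed)

def ar1(phi, n):
    z = np.zeros(n)
    e = rng.standard_normal(n)
    for t in range(1, n):
        z[t] = phi * z[t - 1] + e[t]
    return z

x = ar1(0.7, T)                        # persistent regressor
u = ar1(0.7, T)                        # serially correlated error
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(T), x])
b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b

XtX_inv = np.linalg.inv(X.T @ X)
g = X * e[:, None]                     # per-observation score X_t * e_t
S = g.T @ g                            # lag-0 term
for l in range(1, L + 1):
    w = 1 - l / (L + 1)                # Bartlett weight
    G = g[l:].T @ g[:-l]
    S += w * (G + G.T)
V_nw = XtX_inv @ S @ XtX_inv           # HAC variance of b

V_ols = (e @ e / (T - 2)) * XtX_inv    # classical OLS variance (wrong here)
print(np.sqrt(V_nw[1, 1]), np.sqrt(V_ols[1, 1]))   # HAC SE > naive OLS SE
```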
When we have serial correlation AND lagged dependent variable, is OLS unbiased?
NO: COV(Yt-1, εt-1) ≠ 0, and COV(εt, εt-1) ≠ 0 if we have serial correlation; so COV(Yt-1, εt) ≠ 0
E(b1) formula when we have a lagged dependent variable and serial correlation
E(b1) = B1 + COV(Yt-1, εt) / Var(Yt-1)
SO: when we have a lagged dependent variable AND serial correlation, is OLS any good?
NO - it is BIASED & INCONSISTENT
2 tests for detecting serial correlation
1. Breusch-Godfrey test
2. Durbin-Watson statistic
What does the breusch-godfrey test assume?
We assume that we know the form of the serial correlation: εt = phi1 εt-1 + phi2 εt-2 + ... + phip εt-p + Rt (a well-behaved error term)
εt is unobserved so we use…
the residuals et
Step 1 for breusch-godfrey test
Estimate the original equation by OLS and save residuals:
et = yt - (b0 + b1x1t + b2x2t)
Why can we use OLS for step 1 of breusch-godfrey test?
Because we test under H0 so we assume no serial correlation hence our OLS coefficients are OK.
Step 2 for breusch-godfrey test
regress residuals on lagged residuals
et = sum(j=1,…,p) phij et-j + Rt
What is the DOF problem with breusch-godfrey test?
We do NOT have T observations on our residuals, so Stata gets the DOF wrong.
Because we have lags, there are missing values: with lags down to et-p we only have T - p observations.
How do we solve the DOF problem for breusch-godfrey test?
regress residuals on lagged residuals AND an intercept & original set of explanatory variables.
DOF for breusch-godfrey test
DOF = (T - p) - (p + number of parameters from the intercept & original explanatory variables)
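The two steps above can be sketched in numpy (the model, phi, p, seed and sample size are made up): fit the original model, then regress the residuals on p of their own lags plus an intercept and the original regressors, and form LM = (T - p) R^2.

```python
import numpy as np

rng = np.random.default_rng(6)
T, p = 500, 4                          # sample size and number of lags (assumed)

x = rng.standard_normal(T)
u = np.zeros(T)
eta = rng.standard_normal(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + eta[t]     # AR(1) errors: serial correlation present
y = 1.0 + 2.0 * x + u

# Step 1: OLS on the original model, save residuals
X = np.column_stack([np.ones(T), x])
e = y - X @ np.linalg.solve(X.T @ X, X.T @ y)

# Step 2: regress e_t on p lags of e PLUS an intercept and the original
# regressors (the fix for the missing-values/DOF problem)
lags = np.column_stack([e[p - j:T - j] for j in range(1, p + 1)])
Z = np.column_stack([np.ones(T - p), x[p:], lags])
et = e[p:]
fit = Z @ np.linalg.lstsq(Z, et, rcond=None)[0]
r2 = 1 - ((et - fit) ** 2).sum() / ((et - et.mean()) ** 2).sum()

LM = (T - p) * r2                      # compare to chi^2(p); 5% CV for p=4 is 9.49
print(LM)
```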
If we add an additional lagged explanatory variable into the model, how does this affect DOF for breusch-godfrey test?
Each additional lag –> DOF falls by 2.
Because we lose an observation + gain an extra restriction from including the explanatory variable in the test regression.
2 test statistics for breusch-godfrey test
1. the usual F-test
2. Lagrange multiplier test
F-test statistic for breusch-godfrey test
F = [(RSSR - RSSU) / p] / [RSSU / ((T - p) - (p + number of parameters from the original model))]
H0 for breusch-godfrey test
H0: all Phi j = 0 - no serial correlation
H1: any Phi j ≠ 0 - serial correlation
lagrange multiplier test for breusch-godfrey test
LM = (T - P) R^2
T - P = no observations on residuals
R^2 = from the auxiliary regression, i.e. the test regression with et as the dependent variable
What value of P should we choose i.e. how many lags?
p = 1 annual data
p=4 quarterly data
p=12 monthly data (but should be told in Q)
How does stata do the F-stat for breusch-godfrey test?
F = (chi^2(g) / g) / (chi^2(k) / k)
As k –> infinity, F –> chi^2(g) / g
So Stata uses F = chi^2(p) / p = LM / p
Problem with breusch-godfrey test
Low power: since we assume the form of the serial correlation, we often find no serial correlation when there actually is some.
Durbin watson statistic
DW = sum(et - et-1)^2 / sum et^2
DW approx = 2(1 - phi 1)
where phi 1 = coefficient on et-1 when reg et on et-1.
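The approximation DW ≈ 2(1 - phi1) is easy to see numerically; a sketch on simulated AR(1) residuals (phi, seed and sample size are made up).

```python
import numpy as np

rng = np.random.default_rng(7)
phi, T = 0.6, 100_000                  # illustrative persistence and sample size
eta = rng.standard_normal(T)
e = np.zeros(T)
for t in range(1, T):
    e[t] = phi * e[t - 1] + eta[t]     # residuals with AR(1) serial correlation

dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print(dw, 2 * (1 - phi))               # DW ≈ 2(1 - phi) = 0.8
```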
H0 for DW and DW value under H0
H0: phi1 = 0 –> DW = 2
H1 for DW and DW value under H1
H1: phi1 ≠ 0
If phi1 > 0: as phi1 –> 1, DW –> 0
If phi1 < 0: as phi1 –> -1, DW –> 4
So DW takes values between
0 < DW < 4
When do we reject H0 for DW?
1. if test stat is between 0 and dL (+VE serial correlation)
2. if test stat is between 4 - dL and 4 (-VE serial correlation)
Inconclusive regions for DW
1. test stat between dL and dU
2. test stat between 4 - dU and 4 - dL
2 regions for do not reject for DW
1. test stat between dU and 2
2. test stat between 2 and 4 - dU
2 problems with DW test
1. low powered
2. including a lagged dependent variable biases DW towards 2, i.e. towards accepting H0.
What test do we do instead of DW if we have a lagged dependent variable?
Durbin’s h test
h = phi1 sqrt[n / (1 - n S^2)]
S^2 = OLS variance estimate for the coefficient on the lagged dependent variable.
phi1 = 1st-order autocorrelation coefficient.
CVs from what distribution for durbin’s h test?
N(0,1) normal distribution
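Plugging illustrative numbers into Durbin's h (all values below are made up, not real estimates):

```python
import numpy as np

phi1 = 0.3       # 1st-order autocorrelation coefficient of the residuals (assumed)
n = 200          # sample size (assumed)
S2 = 0.002       # OLS variance estimate of the lagged-dependent coefficient (assumed)

h = phi1 * np.sqrt(n / (1 - n * S2))   # here ≈ 5.48
print(h)         # compare with the N(0,1) critical value 1.96: reject H0
```

Note h is only defined when n S^2 < 1; otherwise the square root fails.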
If a question says “data on variables is available from…” what does this mean?
This period has not accounted for missing values due to lags. So the period we actually can estimate the model over will be smaller due to lags.
If a question says “the model is estimated over…” what does this mean?
This period already takes into account the lags, so it has been reduced from the initial data-collection period. If we have 3 lags on X1t, data must've been collected over the 3 periods (e.g. 3 quarters with quarterly data) before the estimation period as well.
Apparent serial correlation can be caused by?
An omitted relevant variable.
The false model's error term Vt contains the omitted variable, so if the omitted variable is serially correlated, COV(Vt, Vt-1) ≠ 0.