CROSS-SECTIONAL vs. TIME-SERIES DATA | REGRESSION ANALYSIS WITH TIME-SERIES DATA | AUTOCORRELATION - Flashcards

1
Q

a) What is the difference between time series and cross-sectional data?
b) Which variables are related in time series data?
c) What is the relation between successive observations in time series data called?
d) Why are time series observations thought of as outcomes of random processes?

A

a) Differences:
1. Time series data has a temporal (chronological) ordering.
2. Cross-sectional data: a population X and a random sample of size n drawn from it.
Time series data: populations X_1, X_2, ..., X_T and one observation drawn from each (E(X_t) = mu and Var(X_t) = sigma^2).
3. The first (cross-sectional) is static; the second (time series) is dynamic, but it can also be static.

b) Past, current and future values of the variables are all related.
c) Successive observations of a time series are likely to be related to each other. This kind of correlation is called autocorrelation or serial correlation, referring to the degree of association between values of a single variable observed in different time periods.

d) Time series observations are thought of as outcomes of random processes because:
i) they likely depend on many unobservable or unobserved factors;
ii) they most likely contain some measurement error;
iii) hence they can hardly be predicted with full certainty.

2
Q

Because of the peculiar nature of time series data, we need to reconsider some of the assumptions we made in week 7 about regression analysis based on cross-sectional data. What are these assumptions?

A
1. LR4 (the conditional covariance between any two random errors is zero) becomes TSLR4: conditional on the independent variables, the random errors in any two different time periods are uncorrelated.
2. To keep the desirable properties of the OLS estimators, the random errors must also be stationary.
3
Q

Why is regression with cross-sectional data static?
When is regression with time series data static?
Are static models realistic?

A

a) All observations belong to the same time period, and hence the alleged relationship between the dependent and the independent variables is contemporaneous.

b) Time series regressions are also static when the current value of the dependent variable is modelled exclusively with the current value(s) of the independent variable(s), e.g. y_t = beta_0 + beta_1*x_t + epsilon_t.
c) Static models are often too restrictive and unrealistic, because real-life relationships are often dynamic.

4
Q

When a variable is likely determined by its own history, what model is used?
When a variable is supposed to depend on the current and past values of another variable, what kind of model is used?
What is the combination of both models called?

A

a) The appropriate model is an autoregressive (AR) model.
For example, the current value of a macroeconomic variable, like GDP or the rate of unemployment, can often be predicted fairly well from its own past, i.e., from its own lagged values.

b) The appropriate model is a distributed lag (DL) model.
For example, one might model the general fertility rate in year t (children born per 1,000 women of childbearing age in t) as a function of the current and lagged real dollar values of the tax exemptions related to having children (i.e. private health insurance rebate, net medical expenses tax offset, Medicare levy surcharge).

c) The combination of both models is called an ARDL (autoregressive distributed lag) model; see the sketch below.
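As an illustration, here is a minimal sketch of fitting an ARDL model with statsmodels (version 0.12 or later is assumed); the data, lag orders and coefficient values are placeholders invented for the example, not course material.

```python
# A minimal sketch of estimating an ARDL(p, q) model with statsmodels.
# The data below are simulated for illustration only.
import numpy as np
import pandas as pd
from statsmodels.tsa.ardl import ARDL

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = np.zeros(n)
# Simulate an ARDL(1, 1) process: y depends on its own first lag
# and on the current and first lag of x
for t in range(1, n):
    y[t] = 0.5 * y[t - 1] + 1.0 * x[t] + 0.3 * x[t - 1] + rng.normal()

df = pd.DataFrame({"y": y, "x": x})
# lags=1: the AR part (one lag of y); order=1: the DL part (x_t and x_{t-1})
res = ARDL(df["y"], lags=1, exog=df[["x"]], order=1).fit()
print(res.summary())
```

Setting order=0 would reduce this to a pure AR model with x_t as a contemporaneous regressor, while lags=0 with order>0 gives a pure DL model.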

5
Q

How can autocorrelation become a problem, and which TSLR assumption does it violate?
What is the first form serial correlation can take?
What happens when rho < 0 and when rho > 0?

A

a) It can become a problem in time series data when the variables have some natural order; it violates TSLR4 (conditional on the independent variables, the random errors in any two different time periods are uncorrelated).

b) First-order serial correlation takes the form epsilon_t = rho*epsilon_{t-1} + v_t, where the stochastic term v_t meets all the standard assumptions.
rho = rho_(epsilon_t, epsilon_{t-1}) is the first autocorrelation coefficient; it measures the relationship between the current and the previous error term and lies in (-1, 1). (A sketch of estimating rho from OLS residuals follows below.)

rho > 0 (positive first-order autocorrelation): in a plot of e_t against e_{t-1}, most points lie in the first and third quadrants; the error changes sign infrequently.

rho < 0 (negative first-order autocorrelation): most points lie in the second and fourth quadrants; the error changes sign frequently.
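A minimal sketch of estimating rho: since the random errors are unobservable, we regress the OLS residuals e_t on e_{t-1}. The data here are simulated placeholders, not course data.

```python
# A minimal sketch: estimating the first-order autocorrelation coefficient
# of OLS residuals. The data are simulated placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 100
x = rng.normal(size=n)
# Build AR(1) errors with rho = 0.6 so there is autocorrelation to find
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
resid = res.resid

# Estimate rho by regressing e_t on e_{t-1} (no constant needed,
# since OLS residuals have mean zero by construction)
rho_hat = sm.OLS(resid[1:], resid[:-1]).fit().params[0]
print(f"estimated rho: {rho_hat:.3f}")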

6
Q

In an otherwise correctly specified regression model, autocorrelation has consequences similar to those of heteroskedasticity. What are they?

A

The OLS estimators of the betas are still linear, unbiased and consistent.
However:
The OLS estimators of the betas are no longer the best (i.e. efficient) estimators.
Their variances cannot be estimated with the standard formulas, and those formulas often underestimate the standard errors.

As a result, t-ratios are overestimated and the F statistic is highly unreliable. (A small simulation illustrating this follows below.)
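To see this consequence concretely, here is a minimal simulation sketch (my own illustration, not from the course): with positively autocorrelated AR(1) errors and a persistent regressor, the average OLS-reported standard error understates the actual sampling variability of the slope estimate.

```python
# A minimal simulation sketch: under AR(1) errors, the conventional OLS
# standard error understates the true sampling variability of the slope
# estimator. All parameter values are illustrative only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, reps, rho = 100, 2000, 0.8
x = np.cumsum(rng.normal(size=n))  # a persistent regressor, common in time series
X = sm.add_constant(x)

slopes, reported_se = [], []
for _ in range(reps):
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = rho * e[t - 1] + rng.normal()
    y = 1.0 + 2.0 * x + e
    res = sm.OLS(y, X).fit()
    slopes.append(res.params[1])
    reported_se.append(res.bse[1])

print(f"empirical SD of slope estimates: {np.std(slopes):.3f}")
print(f"average OLS-reported std error:  {np.mean(reported_se):.3f}")
# Typically the reported SE is well below the empirical SD,
# so t-ratios based on it are overstated.
```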

7
Q

How do we measure serial correlation?

What is the Durbin-Watson test?

A

a) Because the random error term cannot be observed, we study the OLS residuals:

i. Plot e_t against time (t) or against e_(t-1).
ii. Perform some formal test, like the Durbin-Watson d test.
(Both checks are sketched below.)

b) The Durbin-Watson test can be used to detect first-order serial correlation in estimated regression models that might violate TSLR4 but satisfy the other classical assumptions, including normality.

The DW test can be one-sided or two-sided:
H0: rho = 0 and HA: rho not equal to 0,
or H0: rho = 0 and HA: rho < 0,
or H0: rho = 0 and HA: rho > 0.
First-order autocorrelation is typically positive when the model is correctly specified, and negative first-order autocorrelation often indicates that the model is misspecified.
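A minimal sketch of both checks on simulated placeholder data (the plotting choices are my own, not from the course):

```python
# A minimal sketch of the informal plots and the Durbin-Watson statistic.
# Data are simulated placeholders with positively autocorrelated errors.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(3)
n = 100
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.7 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
resid = res.resid

# i. Informal checks: plot e_t against time and against e_{t-1}
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(resid)
ax1.set(xlabel="t", ylabel="e_t", title="Residuals over time")
ax2.scatter(resid[:-1], resid[1:])
ax2.set(xlabel="e_{t-1}", ylabel="e_t", title="e_t vs e_{t-1}")
plt.show()

# ii. Formal check: Durbin-Watson d; values near 2 suggest rho = 0,
# values near 0 suggest rho > 0, values near 4 suggest rho < 0.
print(f"DW d = {durbin_watson(resid):.3f}")
```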

8
Q

What is the DW test statistic? What does it depend on?

What does the tabulated critical value depend on?

A

d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2, which is approximately 2(1 - r_{e_t, e_{t-1}}), where r_{e_t, e_{t-1}} is the first-order sample autocorrelation of the residuals (see the derivation below). Hence d lies between 0 and 4: d is about 2 when r is about 0, d -> 0 as r -> 1, and d -> 4 as r -> -1.

The tabulated critical values (dL and dU) depend on the sample size n, the number of slope coefficients k, and the chosen significance level.
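For reference, the approximation can be derived as follows (a standard textbook derivation, written here in LaTeX):

$$
d = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}
  = \frac{\sum_{t=2}^{n} e_t^2 + \sum_{t=2}^{n} e_{t-1}^2 - 2\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}
  \approx 2\left(1 - \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}\right)
  = 2\left(1 - r_{e_t, e_{t-1}}\right),
$$

where the approximation uses the fact that the two squared sums in the numerator each differ from the denominator by only one term.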

9
Q

What test would we use to detect higher-order serial correlation?
What are the steps? (They are similar to those of the White test.)

A

The Breusch-Godfrey LM (Lagrange multiplier) test can detect higher-order serial correlation in the residuals.
H0: rho_1 = rho_2 = ... = rho_q = 0
HA: not all rho_i are equal to 0

i. Estimate the model by OLS and obtain the residuals e_t.
ii. Regress the OLS residuals e_t on the original X's and on the lagged residuals (lagged by 1, 2, ..., q) and obtain the coefficient of determination R^2 from this auxiliary regression.

iii. Under H0 ("no autocorrelation up to order q") and for large n, the LM statistic (n - q)R^2 follows approximately a chi-squared distribution with q degrees of freedom. (A sketch using the built-in statsmodels version of this test follows below.)
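A minimal sketch using statsmodels' built-in Breusch-Godfrey test; the data are simulated placeholders and the lag order q = 2 is an arbitrary choice for the example.

```python
# A minimal sketch of the Breusch-Godfrey LM test with statsmodels.
# Data are simulated placeholders; q = 2 is an arbitrary choice.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(4)
n = 150
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(2, n):
    e[t] = 0.5 * e[t - 1] - 0.3 * e[t - 2] + rng.normal()  # AR(2) errors
y = 1.0 + 2.0 * x + e

res = sm.OLS(y, sm.add_constant(x)).fit()
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=2)
print(f"LM statistic = {lm_stat:.2f}, p-value = {lm_pval:.4f}")
```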

10
Q

List the conclusions for the Durbin-Watson test (e.g. what happens if d < dL, if d > dU, or if dL <= d <= dU?).

A

For example, if d is about 0.09 < 1.65 = dL at the 10% significance level, we reject H0 and conclude that there is positive first-order autocorrelation. Alternatively, look at the reported p-value: if it is practically 0, H0 can be rejected at any reasonable significance level.

Likewise, when the BG test's p-value is practically 0, the BG test rejects the null hypothesis of no first- and second-order serial correlation at any reasonable significance level.

11
Q

What should we do when an otherwise correctly specified model suffers from autocorrelation?

A

i. Since the OLS estimators are unbiased, there is no need to do anything if the autocorrelation is relatively weak (say |rho| < 0.3), the estimated coefficients seem reasonable, and you do not intend to rely on the OLS t-scores for model specification.

ii. Estimate an appropriately transformed equation that has the original parameters but independent errors:
-> the generalised least squares (GLS) method.
Like WLS, GLS is an LS procedure applied to a transformed model that, unlike the original model, satisfies assumption TSLR4 as well.

iii. If you are uncertain whether the random error in the model is serially uncorrelated or not, or if the order of serial correlation is unknown, you can use the heteroskedasticity and autocorrelation consistent (HAC) standard errors of Newey and West.

(Sketches of remedies ii and iii follow below.)
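Minimal sketches of remedies ii and iii in statsmodels. The data are simulated placeholders, and the Newey-West lag choice (maxlags=4) is an arbitrary assumption for the example.

```python
# Minimal sketches of the GLS and HAC remedies with statsmodels.
# Data are simulated placeholders; maxlags=4 is an arbitrary choice.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 150
x = rng.normal(size=n)
e = np.zeros(n)
for t in range(1, n):
    e[t] = 0.6 * e[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + e
X = sm.add_constant(x)

# ii. Feasible GLS for AR(1) errors (iterated, Cochrane-Orcutt style)
gls_res = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10)
print("GLS slope and std error:", gls_res.params[1], gls_res.bse[1])

# iii. OLS with Newey-West (HAC) standard errors; the coefficient
# estimates are the same as plain OLS, only the std errors change.
hac_res = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print("OLS slope with HAC std error:", hac_res.params[1], hac_res.bse[1])
```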
