Econometrics - Panel/TS & hetero/autocorrelation Flashcards

1
Q

What is heteroskedasticity and why is it a problem?

A

The variance of the errors is not constant across observations - this violates the classical assumption of homoskedasticity

2
Q

What are 6 reasons heteroskedasticity occurs?

A
  1. Error-learning models
  2. Real income grows through time
  3. Improved data collection over time
  4. Outliers in a sample of data
  5. An incorrectly specified model
  6. Skews in distribution in an X variable
3
Q

What are the consequences of heteroskedasticity?

A

OLS no longer has minimum variance, so it is an inefficient estimator: it is no longer "best" in BLUE, meaning another linear unbiased estimator can have a smaller variance. Inference also breaks down, because the usual standard errors are biased and so t-tests are unreliable.

4
Q

What are the 3 main tests for heteroskedasticity?

A
  1. Goldfeld-Quandt Test (GQ test)
  2. Breusch-Pagan Test (BP test)
  3. White’s Test
5
Q

How does the Breusch-Pagan test work?

A

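Fit the regression, calculate the squared residuals, and fit an auxiliary regression of the squared residuals on the explanatory variables. Then calculate the chi-square test statistic (commonly n x R-squared from the auxiliary regression) and its p-value, and compare with the significance level. The null hypothesis is homoskedasticity.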
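
The steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a single regressor, made-up data whose error spread grows with x, the n x R-squared form of the statistic, and the 5% chi-square critical value for 1 degree of freedom (3.841). The function names are hypothetical helpers, not part of any library.

```python
# Breusch-Pagan sketch for a one-regressor model (illustrative data only).

def ols_simple(x, y):
    """Return (intercept, slope) from an OLS fit of y on a constant and x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y):
    """R-squared of the simple regression of y on a constant and x."""
    intercept, slope = ols_simple(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def breusch_pagan_lm(x, y):
    """LM statistic: n times the R-squared of squared residuals regressed on x."""
    intercept, slope = ols_simple(x, y)
    sq_resid = [(yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, sq_resid)

# Made-up data: the disturbance 0.5 * x * (+1 or -1) has a spread that grows with x.
x = list(range(1, 21))
y = [2 + 3 * xi + 0.5 * xi * (-1) ** xi for xi in x]

CHI2_CRIT_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom, 5% level
lm_stat = breusch_pagan_lm(x, y)
reject_homoskedasticity = lm_stat > CHI2_CRIT_5PCT_1DF
print(f"LM = {lm_stat:.2f}, reject H0 of homoskedasticity: {reject_homoskedasticity}")
```

With more than one regressor the auxiliary regression includes all of them, and the degrees of freedom equal the number of regressors in that auxiliary regression.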

6
Q

How can you correct for heteroskedasticity?

A
  1. Transform model e.g. logs/squares/inverse
  2. Robust standard errors
  3. Generalised Least Squares/Weighted Least Squares (GLS/WLS)
7
Q

What is autocorrelation/serial correlation and why is it a problem?

A

The errors are correlated with their own previous values - this violates the classical assumption of no serial correlation

8
Q

When does serial correlation occur?

A
  1. Time-series data
  2. Spatially organised data
  3. Can be in cross section but less common
9
Q

What causes serial correlation?

A
  1. Omitted lagged variables
  2. Economic shocks that have persistent effects
  3. Transformations applied to data
  4. Model misspecification
  5. Error term being truly dynamic
10
Q

What is first-order autocorrelation?

A

Assumes that the error is linearly correlated only with its own value in the previous period

11
Q

What are the consequences of autocorrelation?

A

OLS estimators no longer have minimum variance, so OLS isn’t BLUE.
R-squared may seem high
Standard errors may be biased downwards - OLS is inefficient and incorrect inferences may be made

12
Q

What does the Durbin Watson Test test for?

A

First-order autocorrelation

13
Q

What values of the Durbin Watson test statistic indicate first-order autocorrelation?

A

DW -> 0 = positive autocorrelation
DW -> 2 = no autocorrelation
DW -> 4 = negative autocorrelation
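
To see where these three reference points come from, here is a minimal sketch of the statistic itself, DW = sum of squared changes in the residuals divided by the sum of squared residuals. The two residual series are made up for illustration, and `durbin_watson` is a hypothetical helper name.

```python
# Durbin-Watson statistic computed directly from a list of residuals.

def durbin_watson(resid):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); roughly 2 * (1 - rho)."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Residuals that stay on one side for several periods: positive autocorrelation.
persistent = [1, 1, 1, -1, -1, -1]
# Residuals that flip sign every period: negative autocorrelation.
alternating = [1, -1, 1, -1, 1, -1]

dw_persistent = durbin_watson(persistent)
dw_alternating = durbin_watson(alternating)
print(dw_persistent, dw_alternating)
```

The persistent series gives a value well below 2 and the alternating series a value well above 2, matching the card.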

14
Q

What are 3 limitations of the Durbin Watson test?

A
  1. Not valid in dynamic models (those with a lagged dependent variable), as the test statistic is biased towards 2
  2. Only applies to first-order autocorrelation
  3. Bounds test doesn’t offer exact critical values so an element of doubt
15
Q

What is a better test for serial correlation?

A

Breusch-Godfrey Lagrange Multiplier Test

16
Q

How is a Breusch-Godfrey LM test conducted?

A

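Estimate the model by OLS and obtain the residuals. Estimate an auxiliary regression of the residuals on the original regressors and p lagged residuals. Then either compute the LM test statistic ((n - p) x R-squared from the auxiliary regression) and compare it with a chi-squared distribution with p degrees of freedom, OR use an F-test on the lagged residuals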

17
Q

What can be done to correct for serial correlation?

A

Employ robust standard errors (HAC standard errors)

18
Q

What do heteroskedasticity and autocorrelation consistent (HAC) standard errors do?

A

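They correct the standard errors for heteroskedasticity and autocorrelation. The corrected standard errors are typically larger, so there is less statistical significance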

19
Q

What is the difference between a True and a Natural experiment?

A

True = observations randomly assigned to different groups
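Natural = observations not randomly assigned by the researcher; an outside event (e.g. a policy change) determines which group they fall into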

20
Q

What is the equation for a Difference-in-Difference model?

A

Yi = b0 + b1*Gi + b2*Ri + b3*(Gi*Ri) + error, where Gi = 1 if observation i is in the treatment group, otherwise 0, and Ri = 1 if observation i is observed in period 2, otherwise 0

21
Q

What is the interpretation of b3 in a general D-in-D?

A

Average treatment effect (ATE), captures the policy effect
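
With only the two dummies and their interaction in the model, the OLS estimate of b3 equals a difference of differences in the four group means. A small sketch with made-up outcome data (the numbers and variable names are purely illustrative):

```python
# Difference-in-differences from the four cell means (illustrative data only).

def mean(values):
    return sum(values) / len(values)

# Outcomes by group (G) and period (R); all numbers are made up.
control_pre = [10, 12]    # G = 0, R = 0
control_post = [13, 15]   # G = 0, R = 1
treated_pre = [20, 22]    # G = 1, R = 0
treated_post = [28, 30]   # G = 1, R = 1

# Change over time within each group.
change_control = mean(control_post) - mean(control_pre)   # estimate of b2
change_treated = mean(treated_post) - mean(treated_pre)   # estimate of b2 + b3

# b3: how much more the treated group changed than the control group.
did_estimate = change_treated - change_control
print(did_estimate)  # 5.0
```

Here the control group rises by 3 and the treated group by 8, so the estimated policy effect is 5. Reading this as a causal effect relies on the two groups having followed the same trend without the policy.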

22
Q

What are the advantages of using panel data? (5)

A
  1. More information, more variability, less collinearity, greater degrees of freedom - hence more efficient
  2. Consider dynamic changes
  3. Detect/measure effects that can’t be observed in other data types
  4. Better model specific types of economic behaviour
  5. Large panels less likely to produce biased estimates
23
Q

What are the 4 main types of panel models?

A
  1. Pooled OLS
  2. Fixed Effects Least Squares Dummy Variable Model (LSDV)
  3. Fixed Effects Within-Group Model
  4. Random Effects Model
24
Q

Explain pooled OLS?

A

Pools data and estimates simple OLS - disregards time and entity dimensions

25
Q

Explain LSDV model?

A

Pools data and gives each entity its own intercept dummy

26
Q

Explain fixed effects (within group) ?

A

Each entity is given its own intercept, but each variable is expressed as a deviation from its entity mean value
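
A minimal sketch of the within-group idea, using two made-up entities that share a slope of 2 but have very different intercepts. Pooled OLS is shown alongside for contrast; the data and helper names are hypothetical.

```python
# Within-group (fixed effects) slope versus pooled OLS on a toy panel.

def slope(x, y):
    """OLS slope of y on a constant and x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

def demean(values):
    """Express each value as a deviation from the entity's own mean."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Made-up panel: y = intercept_i + 2 * x, with intercepts 10 (A) and -20 (B).
panel = {
    "A": {"x": [1, 2, 3], "y": [12, 14, 16]},
    "B": {"x": [11, 12, 13], "y": [2, 4, 6]},
}

# Pooled OLS ignores which entity each observation belongs to.
pooled_x = [v for entity in panel.values() for v in entity["x"]]
pooled_y = [v for entity in panel.values() for v in entity["y"]]
pooled_slope = slope(pooled_x, pooled_y)

# Within estimator: demean x and y entity by entity, then run OLS on the deviations.
within_x = [v for entity in panel.values() for v in demean(entity["x"])]
within_y = [v for entity in panel.values() for v in demean(entity["y"])]
within_slope = slope(within_x, within_y)

print(pooled_slope, within_slope)
```

The demeaning removes each entity's intercept, so the within slope recovers 2, while the pooled slope is negative because entity B combines high x with a low intercept.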

27
Q

Explain random effects model?

A

Assumes intercepts are random draws from a bigger population

28
Q

Problems of LSDV model?

A
  1. Time-invariant variables - can’t estimate the effect of variables that don’t change over time (they are collinear with the entity dummies)
  2. Too many dummies can reduce degrees of freedom
  3. Lots of dummies = multicollinearity
  4. Can have issues with error term
29
Q

How is random effects different from fixed effects?

A

Random effects doesn’t estimate an individual effect for each entity; it estimates an overall intercept that captures the average effect within the sample of data

30
Q

What are the 2 error terms in a random effects model?

A
  1. cross section/individual specific error component
  2. combined time-series and cross-section error component - the idiosyncratic error term (always in a regression)
31
Q

When should REM be used over FEM?

A

If you think differences across entities (e.g. firms) influence the dependent variable and are uncorrelated with the regressors

32
Q

What does the Hausman test test for? What is the null/alternate hypothesis?

A

Tests whether the unique errors are correlated with the regressors. The null hypothesis is that there is no correlation, so the REM estimates are consistent and REM is preferred. The alternate hypothesis is that they are correlated, so FEM is preferred.

33
Q

What type of estimator is used for REM and why?

A

Generalised Least Squares (GLS) - cannot use OLS as would yield inefficient estimators.

34
Q

What models can be employed if you have a binary/limited dependent variable?

A
  1. Linear Probability Model
  2. Logit Model
  3. Probit Model
  4. Tobit Model
35
Q

What type of estimation is used for the Probit Model?

A

Maximum-Likelihood estimator

36
Q

What is the interpretation of a Probit Model?

A

The coefficients are not changes in probability themselves, so you have to calculate marginal effects, which give the change in probability. In software output, the marginal effects are the slope values

37
Q

What is the interpretation of a Logit Model?

A

Coefficients are partial slope coefficients; they measure the change in the logit (log-odds) for a unit change in X
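
A short sketch of the distinction between the logit scale and the probability scale. The coefficients B0 and B1 below are hypothetical values chosen for illustration, not estimates from any data; the marginal effect formula b1 * p * (1 - p) is the standard derivative of the logistic function.

```python
import math

# Hypothetical logit coefficients (assumed for illustration only).
B0, B1 = -1.0, 0.5

def probability(x):
    """P(y = 1 | x) under the logit model."""
    return 1 / (1 + math.exp(-(B0 + B1 * x)))

def log_odds(x):
    """The logit: log of p / (1 - p)."""
    p = probability(x)
    return math.log(p / (1 - p))

x = 2.0
p = probability(x)

# A one-unit change in x shifts the logit by exactly B1 ...
logit_change = log_odds(x + 1) - log_odds(x)

# ... but the effect on the probability depends on where you evaluate it.
marginal_effect = B1 * p * (1 - p)

print(p, logit_change, marginal_effect)
```

At x = 2 the predicted probability is 0.5, where the marginal effect is at its largest; at more extreme probabilities the same coefficient implies a smaller change in probability.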

38
Q

How do you decide between a Logit and a Probit?

A
  1. Measure of fit, (pseudo) R-squared
  2. Hypothesis test
  3. Model interpretation (marginal effects)
39
Q

What is a Tobit Model used for?

A

Limited dependent variables, where the value is continuous but is cut off/censored at a particular value

40
Q

How does the Tobit Model work?

A

Uses maximum likelihood estimation that treats the cutoff/censor values differently

41
Q

What is the general equation for a simultaneous equation model?

A
42
Q

How do you estimate a reduced form equation?

A

Use the relationship between the 2 simultaneous equations e.g. Qd = Qs and rearrange to get a reduced form equation. Treat as a normal simultaneous equation

43
Q

What is the problem with using OLS with simultaneous equation models? What is the solution?

A

Simultaneity bias, use two stage least squares (2SLS) instead

44
Q

How do you conduct 2SLS?

A

Identify exogenous and endogenous variables, use an instrumental variables approach where regressors = endogenous variables and the instruments = exogenous variables
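
The two stages can be sketched for the simplest case of one endogenous regressor and one instrument. Everything below is made up for illustration: the data are constructed so that the unobserved term u drives both x and y (making x endogenous) while being exactly uncorrelated with the instrument z in this sample, and the true slope is 2.

```python
# Two-stage least squares with one endogenous regressor and one instrument.

def ols_simple(x, y):
    """Return (intercept, slope) from an OLS fit of y on a constant and x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up data (assumption): z is the exogenous instrument, u is unobserved.
z = [1, 2, 3, 4]
u = [1, -1, -1, 1]                                   # uncorrelated with z here
x = [zi + ui for zi, ui in zip(z, u)]                # x depends on z and on u
y = [1 + 2 * xi + 3 * ui for xi, ui in zip(x, u)]    # true slope on x is 2

# Naive OLS of y on x picks up the effect of u as well.
_, beta_ols = ols_simple(x, y)

# Stage 1: regress the endogenous variable on the instrument, keep fitted values.
a1, b1 = ols_simple(z, x)
x_hat = [a1 + b1 * zi for zi in z]

# Stage 2: regress y on the fitted values from stage 1.
_, beta_2sls = ols_simple(x_hat, y)

print(beta_ols, beta_2sls)
```

OLS overstates the slope because x moves with the omitted u, while the second-stage slope uses only the variation in x that comes from z. Note that standard errors taken from a manual second stage are not valid; dedicated 2SLS routines correct them.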

45
Q

What makes a good instrument?

A

Z (instrument) must be exogenous in the equation, but related to X (endogenous variable)

46
Q

What is the difference between static and dynamic time series data?

A

Static = change in X at time t has an immediate effect on y e.g. Phillips curve
Dynamic = change in X at time t doesn’t have an immediate effect on y; lags are included in the model to account for the time it takes for the change in X to be fully absorbed by y

47
Q

In TS data - what additional classical assumption is needed?

A

Strict exogeneity = for each time period, the expected value of the error term, given the explanatory variables for all time periods, is 0

48
Q

What problem will you have if strict exogeneity is not achieved?

A

Biased OLS estimates

49
Q

How can you prove consistency?

A

Stationarity and weak dependence

50
Q

What does stationarity mean?

A

Probability distribution is stable over time - the statistical properties of a process generating a time series do not change over time. (the series changes over time but the WAY it changes does not itself change over time)

51
Q

What are 2 problems with dynamic models?

A
  1. High multicollinearity
  2. Loss of degrees of freedom
52
Q

What is an autoregressive model? Why is it used?

A

Replace lagged values of independent variables with lagged dependent variables. Used to mitigate issues with dynamic models

53
Q

What happens when TS data is not weakly dependent?

A

Random walk - which is highly persistent and non-stationary

54
Q

What is it called when you have highly persistent TS data with a trend?

A

Random walk with drift

55
Q

How can you make a problematic series stationary and have weak dependence?

A

First-differencing the data series
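
A small simulation sketch of why differencing helps. A random walk y_t = y_(t-1) + e_t is built from simulated shocks (the seed, sample length, and helper name are arbitrary choices for illustration), and its first difference is compared with the levels.

```python
import random

def lag1_autocorr(series):
    """Sample first-order autocorrelation of a series."""
    m = sum(series) / len(series)
    num = sum((series[t] - m) * (series[t - 1] - m) for t in range(1, len(series)))
    den = sum((v - m) ** 2 for v in series)
    return num / den

random.seed(0)  # arbitrary seed so the sketch is reproducible
shocks = [random.gauss(0, 1) for _ in range(200)]

# Random walk: each level is the previous level plus a new shock.
levels = []
current = 0.0
for e in shocks:
    current += e
    levels.append(current)

# First difference: y_t - y_(t-1), which is just the shock e_t again.
diffs = [levels[t] - levels[t - 1] for t in range(1, len(levels))]

rho_levels = lag1_autocorr(levels)
rho_diffs = lag1_autocorr(diffs)
print(rho_levels, rho_diffs)
```

The levels are highly persistent, while the differenced series is just the sequence of independent shocks, so its autocorrelation is close to zero.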

56
Q

What model can you employ if you have serial correlation in TS?

A

Feasible Generalized Least Squares

57
Q

How do you test for a unit root? What is the null/alternate hypothesis?

A

Dickey-Fuller test (uses special DF critical values)
H0 = there is a unit root, data is non-stationary
H1 = there is not a unit root, the data is stationary