Econometrics - Panel/TS & hetero/autocorrelation Flashcards by Emily Morris

What is heteroskedasticity and why is it a problem?

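Non-constant variance of the error term across observations - violates the classical assumption of homoskedasticity, so OLS is no longer efficient and the usual standard errors are unreliable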

What are 6 reasons heteroskedasticity occurs?

Error-learning models
Real income grows through time
Improved data collection over time
Outliers in a sample of data
An incorrectly specified model
Skews in distribution in an X variable

What are the consequences of heteroskedasticity?

OLS no longer has minimum variance, so it is an inefficient estimator: it is not "best" anymore, meaning another estimator can produce a smaller variance. Inference also breaks down: the usual standard errors are biased, so t-tests are unreliable.

What are the 3 main tests for heteroskedasticity?

Goldfeld-Quandt Test (GQ test)
Breusch-Pagan Test (BP test)
White’s Test

How does the Breusch-Pagan test work?

Fit the regression, calculate the squared residuals, fit an auxiliary regression of the squared residuals on the explanatory variables, then calculate the chi-square test statistic and p-value and compare to the significance level. The null hypothesis is homoskedasticity.
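
As a rough illustration of these steps, here is a minimal sketch in plain Python with a single regressor. It assumes the n x R-squared (Koenker) form of the test statistic, which may not be the variant the card has in mind. The simulated data and the helper names `ols_simple` and `breusch_pagan` are hypothetical.

```python
import math
import random

def ols_simple(x, y):
    """Fit y = a + b*x by OLS; return (residuals, R-squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    sst = sum((yi - my) ** 2 for yi in y)
    return resid, 1.0 - sum(e * e for e in resid) / sst

def breusch_pagan(x, y):
    """Return (LM statistic, p-value) with one regressor (1 degree of freedom)."""
    resid, _ = ols_simple(x, y)                 # step 1: original regression
    squared = [e * e for e in resid]            # step 2: squared residuals
    _, r2_aux = ols_simple(x, squared)          # step 3: auxiliary regression
    lm = max(len(x) * r2_aux, 0.0)              # step 4: LM = n * R-squared
    p_value = math.erfc(math.sqrt(lm / 2.0))    # chi-square(1) upper tail
    return lm, p_value

# Hypothetical data: the error standard deviation grows with x.
rng = random.Random(0)
x = [rng.uniform(1, 10) for _ in range(500)]
y = [1 + 2 * xi + rng.gauss(0, xi) for xi in x]
lm, p_value = breusch_pagan(x, y)
print(f"LM = {lm:.2f}, p-value = {p_value:.4g}")
```

A p-value below the chosen significance level leads to rejecting the null of homoskedasticity.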

How can you correct for heteroskedasticity?

Transform model e.g. logs/squares/inverse
Robust standard errors
Generalised Least Squares/Weighted Least Squares (GLS/WLS)
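
To make the robust standard errors option concrete, here is a minimal sketch in plain Python for a one-regressor model. It assumes White's HC0 form of the robust variance, which is only one of several variants. The simulated data and the function name `slope_with_se` are hypothetical.

```python
import math
import random

def slope_with_se(x, y):
    """Return (slope, conventional SE, White HC0 robust SE) for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    # Conventional SE assumes one common error variance.
    s2 = sum(e * e for e in resid) / (n - 2)
    se_conventional = math.sqrt(s2 / sxx)
    # HC0 lets each observation carry its own squared residual.
    meat = sum((xi - mx) ** 2 * e * e for xi, e in zip(x, resid))
    se_robust = math.sqrt(meat) / sxx
    return b, se_conventional, se_robust

# Hypothetical heteroskedastic data: error spread increases with x.
rng = random.Random(0)
x = [rng.uniform(1, 10) for _ in range(1000)]
y = [1 + 2 * xi + rng.gauss(0, xi) for xi in x]
b, se_conventional, se_robust = slope_with_se(x, y)
print(f"slope = {b:.3f}, conventional SE = {se_conventional:.4f}, robust SE = {se_robust:.4f}")
```

The slope estimate is the same in both cases; only the standard error used for inference changes.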

What is autocorrelation/serial correlation and why is it a problem?

Errors are correlated with their own previous values - violates the classical assumption of no autocorrelation

When does serial correlation occur?

Time-series data
Spatially organised data
Can be in cross section but less common

What causes serial correlation?

Omitted lagged variables
Economic shocks that have persistent effects
Transformations applied to data
Model misspecification
Error term being truly dynamic

What is first-order autocorrelation?

The error term is assumed to be linearly correlated only with its own value in the previous period
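
A minimal sketch of what this looks like, assuming the usual AR(1) form u_t = rho * u_(t-1) + e_t. The value rho = 0.7, the sample size and the function names are hypothetical choices for illustration only.

```python
import random

def simulate_ar1(n, rho, seed=0):
    """Generate n errors where each depends on the previous one: u_t = rho*u_(t-1) + e_t."""
    rng = random.Random(seed)
    u = [rng.gauss(0, 1)]
    for _ in range(n - 1):
        u.append(rho * u[-1] + rng.gauss(0, 1))
    return u

def estimate_rho(u):
    """Estimate rho by regressing u_t on u_(t-1) without an intercept."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
    return num / den

errors = simulate_ar1(2000, rho=0.7)
rho_hat = estimate_rho(errors)
print(f"estimated rho = {rho_hat:.3f}")
```

With a long enough series the estimate should land close to the value used in the simulation.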

What are the consequences of autocorrelation?

OLS estimators no longer have minimum variance, so OLS isn't BLUE.
R-squared may be overstated
Standard errors may be biased downwards - OLS is inefficient and incorrect inferences may be made

What does the Durbin Watson Test test for?

First-order autocorrelation

What values of the Durbin Watson test statistic indicate first-order autocorrelation?

DW -> 0 = positive autocorrelation
DW -> 2 = no autocorrelation
DW -> 4 = negative autocorrelation
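
For illustration, the statistic is the sum of squared changes in the residuals divided by the sum of squared residuals. The sketch below applies that formula to two made-up residual series; the function name `durbin_watson` is a hypothetical helper, not a library call.

```python
def durbin_watson(resid):
    """DW = sum of (e_t - e_(t-1))^2 divided by sum of e_t^2."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

# Made-up residuals: identical neighbours (extreme positive autocorrelation).
dw_positive = durbin_watson([1.0, 1.0, 1.0, 1.0])     # 0.0
# Made-up residuals: alternating signs (extreme negative autocorrelation).
dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])   # 3.0
print(dw_positive, dw_negative)
```

With only four alternating residuals the statistic is 3 rather than 4; for an alternating series of length n it equals 4(n-1)/n, which approaches 4 as the sample grows.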

What are 3 limitations of the Durbin Watson test?

Not valid in dynamic models (with a lagged dependent variable) as the test statistic is biased towards 2
Only applies to first-order autocorrelation
The bounds test doesn't offer exact critical values, so there is an inconclusive region and an element of doubt

What is a better test for serial correlation?

Breusch-Godfrey Lagrange Multiplier Test

How is a Breusch-Godfrey LM test conducted?

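Estimate the model by OLS and obtain the residuals, estimate an auxiliary regression of the residuals on the original regressors and lagged residuals, and then either compute the LM test statistic and compare it to the chi-squared distribution OR use an F-test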
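
A minimal sketch of the LM version in plain Python, assuming one regressor and a test for first-order serial correlation only (one lagged residual, so 1 degree of freedom). The simulated data and the helper names `solve`, `ols` and `breusch_godfrey_order1` are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def ols(X, y):
    """OLS via the normal equations; return (residuals, R-squared)."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    my = sum(y) / len(y)
    sst = sum((yi - my) ** 2 for yi in y)
    return resid, 1.0 - sum(e * e for e in resid) / sst

def breusch_godfrey_order1(x, y):
    """Return (LM statistic, p-value) for first-order serial correlation."""
    e, _ = ols([[1.0, xi] for xi in x], y)                       # original model
    X_aux = [[1.0, x[t], e[t - 1]] for t in range(1, len(e))]    # add lagged residual
    _, r2_aux = ols(X_aux, e[1:])                                # auxiliary regression
    lm = max((len(e) - 1) * r2_aux, 0.0)                         # LM = (n-1) * R-squared
    return lm, math.erfc(math.sqrt(lm / 2.0))                    # chi-square(1) upper tail

# Hypothetical data with AR(1) errors (rho = 0.6).
rng = random.Random(1)
n = 300
x = [rng.uniform(0, 10) for _ in range(n)]
u = [rng.gauss(0, 1)]
for _ in range(n - 1):
    u.append(0.6 * u[-1] + rng.gauss(0, 1))
y = [1 + 0.5 * xi + ui for xi, ui in zip(x, u)]
lm, p_value = breusch_godfrey_order1(x, y)
print(f"LM = {lm:.2f}, p-value = {p_value:.4g}")
```

Testing for higher-order serial correlation would add more lagged residuals to the auxiliary regression and use a chi-squared distribution with that many degrees of freedom.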

What can be done to correct for serial correlation?

Employ robust standard errors (HAC standard errors)

What do heteroskedasticity and autocorrelation consistent (HAC) standard errors do?

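They correct the standard errors for heteroskedasticity and autocorrelation without changing the coefficient estimates. The corrected standard errors are typically larger, so there is less statistical significance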

What is the difference between a True and a Natural experiment?

True = observations are randomly assigned to different groups
Natural = observations are not randomly assigned; the groups arise from an external event such as a policy change

What is the equation for a Difference-in-Difference model?

Y = b0 + b1*G + b2*R + b3*(G*R) + error, where G = 1 if the observation is in the treatment group and 0 otherwise, and R = 1 if the observation is observed in period 2 and 0 otherwise

What is the interpretation of b3 in a general D-in-D?

Average treatment effect (ATE), captures the policy effect
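
With only the two dummies and their interaction in the model, the OLS estimate of b3 equals the change in the treatment group's mean outcome minus the change in the control group's mean outcome. The sketch below computes it that way on made-up numbers; the data and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def did_estimate(outcomes):
    """outcomes maps (G, R) to a list of Y values; returns the b3 estimate."""
    change_treated = mean(outcomes[(1, 1)]) - mean(outcomes[(1, 0)])
    change_control = mean(outcomes[(0, 1)]) - mean(outcomes[(0, 0)])
    return change_treated - change_control

# Made-up outcomes, keyed by (G, R): G = treatment group, R = period 2.
outcomes = {
    (0, 0): [10.0, 12.0],   # control, period 1 (mean 11)
    (0, 1): [13.0, 15.0],   # control, period 2 (mean 14)
    (1, 0): [20.0, 22.0],   # treated, period 1 (mean 21)
    (1, 1): [28.0, 30.0],   # treated, period 2 (mean 29)
}
b3 = did_estimate(outcomes)
print(b3)   # (29 - 21) - (14 - 11) = 5.0
```

The control group's change (3) stands in for what would have happened to the treated group without the policy, so only the extra change (5) is attributed to it.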

What are the advantages of using panel data? (5)

More information, more variability, less collinearity, greater degrees of freedom - hence more efficient
Consider dynamic changes
Detect/measure effects that can’t be observed in other data types
Better model specific types of economic behaviour
Large panels less likely to produce biased estimates

What are the 4 main types of panel models?

Pooled OLS
Fixed Effects Least Squares Dummy Variable Model (LSDV)
Fixed Effects Within-Group Model
Random Effects Model

Explain pooled OLS?

Pools data and estimates simple OLS - disregards time and entity dimensions

Explain LSDV model?

Pools data and gives each entity its own intercept dummy

Explain fixed effects (within group) ?

Each entity is given its own intercept, but each variable is expressed as a deviation from its entity mean, so the entity intercepts drop out of the estimated equation
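
A minimal sketch of the within-group transformation in plain Python, assuming one regressor. The two-entity panel is made up so that both entities share a slope of 2 but have very different intercepts; the function name `within_slope` is hypothetical.

```python
def within_slope(panel):
    """panel maps entity -> list of (x, y) pairs; returns the within-group slope."""
    num = den = 0.0
    for obs in panel.values():
        mx = sum(x for x, _ in obs) / len(obs)   # entity mean of x
        my = sum(y for _, y in obs) / len(obs)   # entity mean of y
        for x, y in obs:
            num += (x - mx) * (y - my)           # deviations from entity means
            den += (x - mx) ** 2
    return num / den

# Made-up panel: entity-specific intercepts, common slope.
panel = {
    "A": [(1.0, 12.0), (2.0, 14.0), (3.0, 16.0)],   # y = 10 + 2x
    "B": [(2.0, 54.0), (3.0, 56.0), (4.0, 58.0)],   # y = 50 + 2x
}
slope = within_slope(panel)
print(slope)   # 2.0
```

Demeaning removes the intercepts 10 and 50, so the common slope is recovered without estimating any dummies.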

Explain random effects model?

Assumes intercepts are random draws from a bigger population

Problems of LSDV model?

1. Time-invariant variables - can't estimate the effect of variables that don't change over time
2. Too many dummies can reduce degrees of freedom
3. Lots of dummies = multicollinearity
4. Can have issues with the error term

How is random effects different from fixed effects?

Random effects doesn't estimate an individual effect for each entity; it estimates a single overall intercept that captures the average effect within the sample of data

What are the 2 error terms in a random effects model?

1. Cross-section/individual-specific error component
2. Combined time-series and cross-section error component - the idiosyncratic error term (always in a regression)

When should REM be used over FEM?

If you think differences across firms influence the dependent variable and are uncorrelated with the regressors

What does the Hausman test test for? What is the null/alternate hypothesis?

Tests whether the unique errors are correlated with the regressors. The null hypothesis is that there is no correlation, so the REM estimates are consistent and REM is preferred. The alternate hypothesis is that there is correlation, so FEM is preferred.

What type of estimator is used for REM and why?

Generalised Least Squares (GLS) - OLS cannot be used as it would yield inefficient estimators.

What models can be employed if you have a binary/limited dependent variable?

1. Linear Probability Model 2. Logit Model 3. Probit Model 4. Tobit Model

What type of estimator is used for the Probit Model?

Maximum-Likelihood estimator

What is the interpretation of a Probit Model?

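The coefficients can't be read directly as changes in probability. You have to calculate marginal effects, which give the change in probability for a unit change in X; in software the marginal effects are reported as slope values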
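
As a small worked example, the marginal effect of a continuous X in a probit is its coefficient multiplied by the standard normal density evaluated at the fitted index. The coefficients and the X value below are made up, and the function names are hypothetical.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def probit_marginal_effect(beta_j, index):
    """Change in P(Y = 1) for a small change in X_j, evaluated at the given index."""
    return beta_j * normal_pdf(index)

# Made-up coefficients for P(Y = 1) = Phi(beta0 + beta1 * x).
beta0, beta1 = -1.0, 0.5
x = 2.0
index = beta0 + beta1 * x            # 0.0
probability = normal_cdf(index)      # 0.5
effect = probit_marginal_effect(beta1, index)
print(f"P(Y=1) = {probability:.3f}, marginal effect = {effect:.4f}")
```

Because the density changes with the index, the marginal effect differs across observations, which is why software reports it at the mean or averaged over the sample.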

What is the interpretation of a Logit Model?

Each coefficient is a partial slope coefficient: it measures the change in the logit (log-odds) for a unit change in X
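
A short sketch of that interpretation with made-up coefficients. It shows the change in the logit for a one-unit change in X, the matching odds ratio, and the implied marginal effect on the probability; all names and numbers are hypothetical.

```python
import math

def logistic(z):
    """Convert a logit (log-odds) into a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Made-up coefficients for logit(P) = beta0 + beta1 * x.
beta0, beta1 = -1.0, 0.5
x = 2.0

logit_before = beta0 + beta1 * x           # 0.0
logit_after = beta0 + beta1 * (x + 1.0)    # 0.5
change_in_logit = logit_after - logit_before   # equals beta1

odds_ratio = math.exp(beta1)               # odds are multiplied by this per unit of X
p_before = logistic(logit_before)          # 0.5
marginal_effect = beta1 * p_before * (1.0 - p_before)   # 0.125

print(change_in_logit, round(odds_ratio, 4), marginal_effect)
```

The change in the logit is constant at beta1, while the change in probability depends on where along the curve it is evaluated.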

How do you decide between a Logit and a Probit?

1. Measure of fit, R-squared 2. Hypothesis test 3. Model interpretation (marginal effects)

What is a Tobit Model used for?

Limited dependent variables, where the value is continuous but is cut off/censored at a particular value

How does the Tobit Model work?

Uses maximum likelihood estimation that treats the cutoff/censor values differently

What is the general equation for a simultaneous equation model?

How do you estimate a reduced form equation?

Use the relationship between the 2 simultaneous equations e.g. Qd = Qs and rearrange to get a reduced form equation. Treat as a normal simultaneous equation

What is the problem with using OLS with simultaneous equation models? What is the solution?

Simultaneity bias, use two stage least squares (2SLS) instead

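How do you conduct a 2SLS?

Identify the exogenous and endogenous variables, then use an instrumental variables approach where the regressors = endogenous variables and the instruments = exogenous variables. Stage 1: regress each endogenous regressor on the instruments and save the fitted values. Stage 2: re-estimate the original equation using the fitted values in place of the endogenous regressors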
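
A minimal sketch of the two stages in plain Python, assuming a single endogenous regressor x and a single instrument z. The simulated data (true slope 2, with x correlated with the error) and the function names are hypothetical.

```python
import random

def fit_line(x, y):
    """Simple OLS of y on x with an intercept; return (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted x."""
    a1, b1 = fit_line(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    return fit_line(x_hat, y)

# Hypothetical data: u enters both x and y, so x is endogenous.
rng = random.Random(42)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1 + 2 * xi + ui for xi, ui in zip(x, u)]

_, b_ols = fit_line(x, y)
_, b_2sls = two_stage_least_squares(z, x, y)
print(f"OLS slope = {b_ols:.3f}, 2SLS slope = {b_2sls:.3f}")
```

Running stage 2 by hand like this gives the right coefficient but not the right standard errors, so in practice a dedicated 2SLS routine is used for inference.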

What makes a good instrument?

Z (instrument) must be exogenous in the equation (uncorrelated with the error term), but related to X (endogenous variable)

What is the difference between static and dynamic time series data?

Static = a change in X at time t has an immediate effect on Y, e.g. the Phillips curve
Dynamic = a change in X at time t doesn't have an immediate effect on Y; lags are included in the model to account for the time it takes for the change in X to be fully absorbed by Y

In TS data - what additional classical assumption is needed?

Strict exogeneity = for each time period the expected value of the error term given all explanatory variables for all time periods is 0

What problem will you have if strict exogeneity is not achieved?

Biased OLS estimates

How can you prove consistency?

Stationarity and weak dependence

What does stationarity mean?

Probability distribution is stable over time - the statistical properties of a process generating a time series do not change over time. (the series changes over time but the WAY it changes does not itself change over time)

What are 2 problems with dynamic models?

1. High multicollinearity 2. Loss of degrees of freedom

What is an autoregressive model? Why is it used?

Replace lagged values of independent variables with lagged dependent variables. Used to mitigate issues with dynamic models

What happens when TS data is not weakly dependent?

Random walk - which is highly persistent and non-stationary

What is it called when you have highly persistent TS data with a trend?

Random walk with drift

How can you make a problematic series stationary and have weak dependence?

First-differencing the data series
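
A brief sketch of first-differencing in plain Python, applied to a simulated random walk with drift. The drift of 0.5, the sample size and the function name `first_difference` are hypothetical.

```python
import random

def first_difference(series):
    """Return the period-to-period changes y_t - y_(t-1)."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# Tiny hand-made example.
changes = first_difference([1.0, 3.0, 6.0, 10.0])   # [2.0, 3.0, 4.0]

# Simulated random walk with drift: y_t = 0.5 + y_(t-1) + e_t.
rng = random.Random(3)
walk = [0.0]
for _ in range(1000):
    walk.append(0.5 + walk[-1] + rng.gauss(0, 1))

diffs = first_difference(walk)
mean_change = sum(diffs) / len(diffs)
print(changes, round(mean_change, 3))
```

The level of the walk wanders without settling, but its differences are just the drift plus the shocks, so they fluctuate around a constant mean.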

What model can you employ if you have serial correlation in TS?

Feasible Generalized Least Squares

How do you test for a unit root? What is the null/alternate hypothesis?

Dickey-Fuller test (uses special DF critical values)
H0 = there is a unit root, the data is non-stationary
H1 = there is not a unit root, the data is stationary
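
A minimal sketch of the basic (non-augmented) test in plain Python, assuming the version with a constant and no trend: regress the change in y on the lagged level and look at the t-statistic on the lag. The threshold -2.86 is the approximate large-sample 5% Dickey-Fuller critical value for that version, stated here as an assumption. The simulated series and function name are hypothetical.

```python
import math
import random

def dickey_fuller_t(y):
    """t-statistic on g in: (y_t - y_(t-1)) = a + g * y_(t-1) + e_t."""
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    lag = y[:-1]
    n = len(dy)
    ml, md = sum(lag) / n, sum(dy) / n
    sxx = sum((l - ml) ** 2 for l in lag)
    g = sum((l - ml) * (d - md) for l, d in zip(lag, dy)) / sxx
    a = md - g * ml
    ssr = sum((d - a - g * l) ** 2 for l, d in zip(lag, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return g / se

# Hypothetical stationary series: y_t = 0.5 * y_(t-1) + e_t.
rng = random.Random(7)
stationary = [0.0]
for _ in range(499):
    stationary.append(0.5 * stationary[-1] + rng.gauss(0, 1))

CRITICAL_5PCT = -2.86   # assumed approximate DF value (constant, no trend)
t_stat = dickey_fuller_t(stationary)
reject_unit_root = t_stat < CRITICAL_5PCT
print(f"DF t-statistic = {t_stat:.2f}, reject H0: {reject_unit_root}")
```

The t-statistic is compared with the DF critical value rather than the usual t table, and H0 is rejected only when the statistic is more negative than the critical value.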

Econometrics - Panel/TS & hetero/autocorrelation Flashcards

(57 cards)