Time series Flashcards
What are common sources of endogeneity?
Omitted variables, simultaneity, measurement error
What are omitted variables and what are they a source of?
When a statistical model leaves out one or more relevant variables: it omits an independent variable that is correlated with both the dependent variable and one or more of the included independent variables. Source of endogeneity
What is simultaneity bias and what can it cause
Where the explanatory variable is jointly determined with the dependent variable (X causes Y, Y causes X). Source of endogeneity. Education determines wages but wages also determine future education
What is measurement error and what can it cause
Difference between a measured quantity and its true value. Source of endogeneity.
2 good examples of omitted variable bias in wage education
Education of individual’s parents,
Ability
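A minimal simulation sketch of the ability example. All numbers are made-up assumptions (a true return to education of 0.08, a direct ability effect of 0.10, and education loading on ability), and `ols_slope` is a hypothetical helper. Leaving ability out pushes the education coefficient above its true value.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)
n = 20000
true_return = 0.08  # assumed true effect of education on log wage

ability = [rng.gauss(0, 1) for _ in range(n)]
# Education is correlated with ability (assumed loading of 1.0).
educ = [12 + 1.0 * a + rng.gauss(0, 2) for a in ability]
# Ability also raises wages directly (assumed effect of 0.10).
log_wage = [1.5 + true_return * e + 0.10 * a + rng.gauss(0, 0.3)
            for e, a in zip(educ, ability)]

# Regression that omits ability: the slope picks up part of the ability effect.
short_slope = ols_slope(educ, log_wage)
print(f"true return: {true_return}, estimate with ability omitted: {short_slope:.3f}")
```

Under these assumptions the estimate should settle near 0.08 + 0.10 × Cov(educ, ability)/Var(educ) = 0.08 + 0.10 × 1/5 = 0.10.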
Example of measurement error in wage education model
Not so much measurement error as such, but years of education does not take into account the quality of education
what is a chi squared distribution mean and variance
mean is the degrees of freedom,
variance is 2 x the degrees of freedom
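A quick simulation sketch to check the mean and variance. It uses the fact that a chi-squared variable with k degrees of freedom is the sum of k squared independent standard normals; k = 5 and the number of draws are arbitrary choices.

```python
import random

rng = random.Random(1)
k = 5          # degrees of freedom (arbitrary choice)
draws = 50000

# Each chi-squared(k) draw is a sum of k squared standard normal draws.
samples = [sum(rng.gauss(0, 1) ** 2 for _ in range(k)) for _ in range(draws)]

mean = sum(samples) / draws
var = sum((s - mean) ** 2 for s in samples) / (draws - 1)
print(f"sample mean {mean:.2f} (theory {k}), sample variance {var:.2f} (theory {2 * k})")
```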
log-level what does β mean
100(β1) is the approximate percentage change in y for a one-unit increase in x
log-log what is β
β is the elasticity: the percentage change in y for a 1% change in x
level-log what is β
∆y=(β1/100)%∆x, so a 1% increase in x changes y by about β1/100 units
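A small arithmetic sketch of the three interpretations, comparing each rule of thumb with the exact change. The coefficient values (0.08, 0.5, 50) are made-up illustrations.

```python
import math

# log-level: log(y) = b0 + b1*x, with an assumed b1 = 0.08.
b1 = 0.08
log_level_approx = 100 * b1                  # rule of thumb: % change in y per unit of x
log_level_exact = 100 * (math.exp(b1) - 1)   # exact % change, slightly larger

# log-log: log(y) = b0 + b1*log(x), with an assumed elasticity of 0.5.
elasticity = 0.5
log_log_approx = elasticity * 10             # a 10% rise in x: roughly 5% in y
log_log_exact = 100 * (1.10 ** elasticity - 1)

# level-log: y = b0 + b1*log(x), with an assumed b1 = 50.
b_level_log = 50.0
level_log_approx = b_level_log / 100         # a 1% rise in x: roughly 0.5 units of y
level_log_exact = b_level_log * math.log(1.01)

print(log_level_approx, round(log_level_exact, 2))
print(log_log_approx, round(log_log_exact, 2))
print(level_log_approx, round(level_log_exact, 4))
```

The approximations are close for small changes and drift away from the exact values as the change in x gets larger.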
what do you need for unbiased estimates
linear in parameters,
random sampling,
sample variation in explanatory variable,
zero conditional mean (E(u|x)=0)
what does unbiased mean
E(βhat)=β,
the sampling distribution of βhat is centred around β
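A Monte Carlo sketch of what "centred around β" means. The true slope of 2, the sample size and the number of replications are assumptions chosen for illustration, and `ols_slope` is a hypothetical helper. Individual estimates vary, but their average should be close to the true β.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(2)
beta0, beta1 = 1.0, 2.0   # assumed true parameters
n, reps = 50, 2000

estimates = []
for _ in range(reps):
    x = [rng.gauss(0, 1) for _ in range(n)]
    # The error is drawn independently of x, so E(u|x) = 0 holds by construction.
    y = [beta0 + beta1 * xi + rng.gauss(0, 1) for xi in x]
    estimates.append(ols_slope(x, y))

mean_estimate = sum(estimates) / reps
print(f"average of {reps} slope estimates: {mean_estimate:.3f} (true value {beta1})")
```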
what are the main assumptions for the main properties of OLS in matrix form
data generating process is linear in parameters, y=Xβ+u,
random sampling of n observations,
no perfect collinearity: matrix X of full (column) rank, rank k+1,
Zero conditional mean E(u|x1,…,xk)=0
what does —>p(above) and —>d(above) mean
- –>p is convergence in probability
- –>d is convergence in distribution
what is stationarity
stationary time series is a process whose probability distributions are stable over time
what is significant about the first-order autocovariances for the MA(1) model (yt=εt+αεt-1)
only the first-order autocovariance is nonzero; autocovariances at lags 2 and higher are zero
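A simulation sketch of this property, assuming α = 0.6 and unit-variance shocks (both arbitrary). `autocov` is a hypothetical helper computing the sample autocovariance.

```python
import random

def autocov(y, lag):
    """Sample autocovariance of y at the given lag."""
    n = len(y)
    m = sum(y) / n
    return sum((y[t] - m) * (y[t - lag] - m) for t in range(lag, n)) / n

rng = random.Random(3)
alpha, n = 0.6, 100000   # assumed MA coefficient and sample size
eps = [rng.gauss(0, 1) for _ in range(n + 1)]

# MA(1): y_t = eps_t + alpha * eps_{t-1}
y = [eps[t] + alpha * eps[t - 1] for t in range(1, n + 1)]

gamma0 = autocov(y, 0)   # theory: (1 + alpha^2) * sigma^2 = 1.36
gamma1 = autocov(y, 1)   # theory: alpha * sigma^2 = 0.6
gamma2 = autocov(y, 2)   # theory: 0
print(f"lag 0: {gamma0:.3f}, lag 1: {gamma1:.3f}, lag 2: {gamma2:.3f}")
```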
what is the strong exogeneity assumption
zero conditional mean assumption E(ut|x)=0, imposes that the error at time t be uncorrelated with each explanatory variable in every time period
what can a model with a lagged dependent variable not satisfy
model with lagged dependent variable cannot satisfy strong exogeneity
what is weakly dependent
yt and yt-j are ‘almost independent’ as j gets large
what is a stable AR(1) process
an AR(1) with |θ|<1; it is weakly dependent
what is serial correlation
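when the errors are correlated across time periods, Corr(ut,us)≠0 for some t≠s (not the same as heteroskedasticity, which is when homoskedasticity doesn’t hold)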
what happens to OLS in the presence of serial correlation
OLS remains consistent, but becomes inefficient and its standard errors need to be adjusted
what happens to the Gauss-Markov property under serial correlation
Gauss-Markov requires homoskedastic and serially uncorrelated errors, so OLS is no longer BLUE in the presence of serial correlation
what’s the difference between the test for serial correlation and the test for serial correlation without strong exogeneity
Without strong exogeneity, do an OLS regression of uthat on x1t,x2t,… and ut-1hat for all t, as opposed to just uthat on ut-1hat
how do you adapt the test for serial correlation to test for higher-order serial correlation (second order ut-2hat)
only need to add ut-2hat (,…,ut-qhat) to the equation,
uthat=ρ1ut-1hat+ρ2ut-2hat+et,
Null: H0:ρ1=ρ2=…=ρq=0
Then do an F test of the joint significance of ρ1 and ρ2 (up to ρq)
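A sketch of the second-order test under strong exogeneity, so the regressors are left out of the auxiliary regression. The simulated model, the AR(1) error coefficient of 0.5 and the helper `ols` (normal equations solved by Gauss-Jordan elimination) are assumptions for illustration, not a library routine.

```python
import random

def ols(X, y):
    """OLS via the normal equations; returns (coefficients, residual sum of squares)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    b = [A[i][k] / A[i][i] for i in range(k)]
    ssr = sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2 for r, yi in zip(X, y))
    return b, ssr

rng = random.Random(4)
n, rho = 500, 0.5   # assumed sample size and AR(1) coefficient of the errors

# Simulate y_t = 1 + 2*x_t + u_t with u_t = rho*u_{t-1} + e_t.
x = [rng.gauss(0, 1) for _ in range(n)]
u = [0.0] * n
for t in range(1, n):
    u[t] = rho * u[t - 1] + rng.gauss(0, 1)
y = [1 + 2 * xt + ut for xt, ut in zip(x, u)]

# Step 1: residuals from the main OLS regression.
b, _ = ols([[1.0, xt] for xt in x], y)
uhat = [yt - b[0] - b[1] * xt for xt, yt in zip(x, y)]

# Step 2: auxiliary regression of uhat_t on uhat_{t-1} and uhat_{t-2}.
q = 2
rows = [[uhat[t - 1], uhat[t - 2]] for t in range(q, n)]
dep = uhat[q:]
rho_hat, ssr_ur = ols(rows, dep)

# Step 3: F statistic for H0: rho1 = rho2 = 0 (restricted model has no regressors).
ssr_r = sum(v ** 2 for v in dep)
f_stat = ((ssr_r - ssr_ur) / q) / (ssr_ur / (len(dep) - q))
print(f"rho1_hat={rho_hat[0]:.2f}, rho2_hat={rho_hat[1]:.2f}, F={f_stat:.1f}")
```

A large F statistic rejects the null of no serial correlation. Without strong exogeneity, the rows of the auxiliary regression would also need the original regressors.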
In the AR(1) model yt=θyt-1+et what makes it nonstationary
whenever |θ|≥1, the process yt has a variance that goes to infinity and is nonstationary; θ=1 is the random walk
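A simulation sketch for the random walk case θ=1, where the variance of yt should grow like tσ^2. The number of paths, the horizon and σ=1 are arbitrary assumptions, and the walk starts at y0=0.

```python
import random

def sample_variance(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(5)
paths, T, sigma = 2000, 100, 1.0   # assumed simulation settings

at_25, at_100 = [], []
for _ in range(paths):
    y = 0.0
    for t in range(1, T + 1):
        y += rng.gauss(0, sigma)   # y_t = y_{t-1} + e_t
        if t == 25:
            at_25.append(y)
    at_100.append(y)

# Variance across paths at two dates: theory gives t * sigma^2.
var_25 = sample_variance(at_25)
var_100 = sample_variance(at_100)
print(f"var(y_25)={var_25:.1f} (theory 25), var(y_100)={var_100:.1f} (theory 100)")
```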
where does the term unit root come from
called a unit root because θ=1 in the AR(1), so the root of the characteristic equation 1-θz=0 is z=1 (unity),
strong memory
when dealing with a unit root how do you transform it
when dealing with a unit root, first differencing turns a unit root process into a weakly dependent process.
The original process is then said to be integrated of order one, I(1). Also called difference stationary
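A minimal sketch of why first differencing works for a random walk: the differenced series is just the sequence of shocks, which is weakly dependent by construction. The shocks here are simulated standard normals (an assumption).

```python
import random

rng = random.Random(6)
shocks = [rng.gauss(0, 1) for _ in range(200)]

# Random walk: y_t = y_{t-1} + e_t, starting from y_0 = 0.
y = [0.0]
for e in shocks:
    y.append(y[-1] + e)

# First difference: delta y_t = y_t - y_{t-1}, which should equal e_t.
diff = [y[t] - y[t - 1] for t in range(1, len(y))]

# Largest gap between the differenced series and the original shocks.
max_gap = max(abs(d - e) for d, e in zip(diff, shocks))
print(f"largest gap between delta y_t and e_t: {max_gap:.2e}")
```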
what does difference stationary mean
when first differencing turns a process (for example a unit root process) into a weakly dependent process
what is the order of integration
the number of times the variable has to be differenced to arrive at a weakly dependent process
what order of integration is a weakly dependent process
if a process is weakly dependent, it is integrated of order zero, I(0)
what is the purpose of differencing to get weak stationarity
need to keep on differencing until the mean, variance and covariances don’t depend on time
what does trend stationary mean
in a nonstationary model including a trend, if after removing the trend the resulting variable becomes stationary the variable is trend stationary
if a model is trend stationary what is the impact of a shock to yt
the shock affects yt for one period only; yt then returns to its trend value if there are no further shocks
what is the impact of a shock if a model is difference stationary
difference stationary like random walk: a shock in period s has not only an impact on yt but also on yt+1 and yt+2 and so on
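A deterministic sketch contrasting the two cases: a single unit shock at period s is fed into a trend-stationary series and into a random walk, and each path is compared with its no-shock counterpart. The function names and the trend parameters are hypothetical.

```python
def trend_stationary(shocks, a=0.0, b=0.5):
    """y_t = a + b*t + e_t: each shock only affects its own period."""
    return [a + b * t + e for t, e in enumerate(shocks)]

def random_walk(shocks):
    """y_t = y_{t-1} + e_t: each shock is carried into every later period."""
    path, level = [], 0.0
    for e in shocks:
        level += e
        path.append(level)
    return path

T, s = 10, 3
no_shock = [0.0] * T
one_shock = [1.0 if t == s else 0.0 for t in range(T)]

# Effect of the shock = path with the shock minus path without it.
ts_effect = [p - q for p, q in zip(trend_stationary(one_shock), trend_stationary(no_shock))]
rw_effect = [p - q for p, q in zip(random_walk(one_shock), random_walk(no_shock))]

print("trend stationary:     ", ts_effect)
print("difference stationary:", rw_effect)
```

The trend-stationary effect is nonzero only in period s, while the random walk effect stays at its full size in every period from s onwards.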
why can’t you do a t test when testing for unit roots
H0: |θ|=1 so process nonstationary so the standard results on the distribution of OLS are no longer valid,
t ratio does not have t distribution
what did Dickey and Fuller do
discovered unit root hypothesis could still be tested by t-type of statistics provided the critical values are appropriately adjusted
what did MacKinnon do
MacKinnon used computer simulations to calculate a ‘response surface’ for critical values of the Dickey-Fuller t tests. Can be used to compute critical values for any sample size
what’s the adjustment to the model when testing for unit roots with a constant but no trend
∆yt = c + γyt-1 + et,
γ=(θ-1)
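A sketch of the Dickey-Fuller regression with a constant, run on a simulated random walk and on a stable AR(1) with θ=0.5. The helper names and simulation settings are assumptions. The t ratio on γ has to be compared with Dickey-Fuller critical values (roughly -2.86 at the 5% level in large samples for the constant-only case), not with the usual t tables.

```python
import math
import random

def df_t_stat(y):
    """Estimate delta y_t = c + gamma*y_{t-1} + e_t; return (gamma_hat, t ratio)."""
    lagged = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(dy)
    mx, my = sum(lagged) / n, sum(dy) / n
    sxx = sum((v - mx) ** 2 for v in lagged)
    sxy = sum((v - mx) * (d - my) for v, d in zip(lagged, dy))
    gamma = sxy / sxx
    c = my - gamma * mx
    ssr = sum((d - c - gamma * v) ** 2 for v, d in zip(lagged, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)   # usual OLS standard error of the slope
    return gamma, gamma / se

def simulate_ar1(theta, n, rng):
    """y_t = theta*y_{t-1} + e_t with y_0 = 0 and standard normal shocks."""
    y = [0.0]
    for _ in range(n):
        y.append(theta * y[-1] + rng.gauss(0, 1))
    return y

rng = random.Random(7)
gamma_rw, t_rw = df_t_stat(simulate_ar1(1.0, 500, rng))   # unit root: gamma = 0
gamma_ar, t_ar = df_t_stat(simulate_ar1(0.5, 500, rng))   # stable: gamma = -0.5

print(f"random walk: gamma_hat={gamma_rw:.3f}, t={t_rw:.2f}")
print(f"AR(1) 0.5:   gamma_hat={gamma_ar:.3f}, t={t_ar:.2f}")
```

The stable AR(1) should give a strongly negative t ratio, so the unit root null is rejected there.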
what’s the adjustment to the model when testing for unit roots with a constant and trend
∆yt = c + γyt-1 + δt + et,
γ=(θ-1)
what is the augmented Dickey-Fuller
allows for serial correlation by having lagged first differences which absorb the serial correlation,
∆yt=γyt-1+β1∆yt-1+β2∆yt-2+…+βp∆yt-p+et
what are some problems with the Dickey-Fuller test
low power (high probability of failing to reject the null when the alternative is true),
'near' unit roots: not good at distinguishing values such as θ=0.98 from θ=1,
structural breaks: if breaks in the series are ignored, the null of difference stationarity can be wrongly accepted
what is a spurious regression
not what it purports to be: false or fake.
Finding a statistically significant relationship between xt and yt which is spurious because xt and yt are unrelated
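A Monte Carlo sketch of spurious regression: two independent random walks are generated, yt is regressed on xt, and the usual t test is applied to the slope. The sample size, the number of replications and the helper names are assumptions. With truly unrelated stationary series the rejection rate would be close to 5%.

```python
import math
import random

def slope_t_stat(x, y):
    """Usual OLS t ratio on the slope in a regression of y on x with an intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ssr = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return slope / math.sqrt(ssr / (n - 2) / sxx)

def random_walk(n, rng):
    path, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0, 1)
        path.append(level)
    return path

rng = random.Random(8)
reps, n = 200, 100
rejections = 0
for _ in range(reps):
    x = random_walk(n, rng)   # x and y are generated independently
    y = random_walk(n, rng)
    if abs(slope_t_stat(x, y)) > 1.96:
        rejections += 1

share = rejections / reps
print(f"share of 'significant' slopes at the nominal 5% level: {share:.2f}")
```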
what are the consequences for OLS of heteroskedasticity
remains unbiased, consistent and asymptotically normally distributed,
the variance is different, so the usual s.e. lead to invalid inference (e.g. t tests do not have the correct size and confidence intervals are not correct),
OLS no longer efficient
what does the AR(1) correlogram do
decays gradually (geometrically) to 0
what does the MA(1) correlogram do
drops to 0 after one period
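A sketch of the two theoretical correlograms, using the standard results that the AR(1) autocorrelation at lag j is θ^j and the MA(1) autocorrelation is α/(1+α^2) at lag 1 and zero afterwards. The parameter values 0.8 and 0.6 are arbitrary and the function names are hypothetical.

```python
def ar1_acf(theta, max_lag):
    """Theoretical autocorrelations of a stable AR(1): rho_j = theta**j."""
    return [theta ** j for j in range(max_lag + 1)]

def ma1_acf(alpha, max_lag):
    """Theoretical autocorrelations of an MA(1): nonzero only at lags 0 and 1."""
    rho1 = alpha / (1 + alpha ** 2)
    return [1.0 if j == 0 else rho1 if j == 1 else 0.0 for j in range(max_lag + 1)]

ar = ar1_acf(0.8, 5)
ma = ma1_acf(0.6, 5)

print("AR(1):", [round(r, 3) for r in ar])   # geometric decay towards 0
print("MA(1):", [round(r, 3) for r in ma])   # cuts off after lag 1
```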
what does weak dependence say that’s different to weak stationarity
adds that correlation goes to 0 as j->∞,
Corr(yt,yt+j) -> 0 as j->∞
what does weak dependence allow for
contemporaneous exogeneity as opposed to strong exogeneity: E(ut|xt)=0 only for period t, so the mean of ut doesn’t have to be zero conditional on past values of the regressors; likewise the variance only has to be constant conditional on xt
what is contemporaneous exogeneity
E(ut|xt)=0 only for period t; the mean of ut doesn’t have to be zero conditional on past values of the regressors, and likewise the variance only has to be constant conditional on xt
when the regressors are lagged so θyt-1 does it satisfy strong exogeneity
when the regressor is a lagged dependent variable it never satisfies strong exogeneity, so weak dependence and contemporaneous exogeneity are needed instead
what is the implication of serial correlation on OLS
consistency preserved (most of the time), usual standard errors wrong and efficiency lost
if strong exogeneity does not hold, what must you do when testing for serial correlation
must include all regressors in auxiliary regression
what is the main issue with some time series
when yt is not stationary and weakly dependent, which is quite typical for many economic variables
what is a leading case of a time series that is not stationary or weakly dependent
random walk, where the variance grows over time: var(yt)=tσ^2
what is another example of a time series that is not stationary or weakly dependent that is not a random walk
linear model with a trend, where mean of yt varies over time
how do you deal with a linear model with a trend
de-trend it
what can cause spurious regression
nonstationarity
what is a random walk (not d)
unit root (highly persistent time series)
why do we need to difference a unit root process
the process is nonstationary and inference based on the usual OLS standard errors is invalid; inference on the differenced model is then valid
why can’t you use t stat for DF test, why do we have to use different CVs
under the null γ=0, so θ=1 and the process is nonstationary, so the usual OLS distribution results don’t hold and the t ratio does not have a t distribution
what are the assumptions needed for DF
error term homoskedastic and not serially correlated
how do you deal with serial correlation for DF
use the augmented Dickey-Fuller test, which introduces lagged first differences to absorb the serial correlation