11 Flashcards

1
Q

Which models allow us to measure two-way relationships?

A

  1. Simultaneous equation models – models in which several variables can have a cyclical (two-way) relationship with each other.
  2. Vector autoregressive (VAR) models

2
Q

Important concepts for simultaneous equations

A
  1. Jointly determined – there is a two-way causal relationship between the variables (e.g., a chicken produces an egg, which produces a chicken, ad infinitum). This is also often called a feedback effect, feedback loop, or dual causality.
  2. Endogenous variables – variables that are simultaneously determined and are therefore part of a feedback loop (i.e. the 𝑌𝑠).
  3. Exogenous variables – variables that are NOT simultaneously determined (i.e. the 𝑋𝑠), but which are important as controls.
  4. Structural equations – characterise the underlying economic theory behind each endogenous variable by expressing it in terms of both endogenous and exogenous variables. Such equations should not be viewed separately but as a system.
3
Q

What is a reduced-form equation?

A

Reduced-form equation – an equation that expresses a particular endogenous variable solely in terms of an error term and all the predetermined (exogenous plus lagged endogenous) variables in the simultaneous system. For example, for a system with two endogenous variables (𝑌_1𝑡, 𝑌_2𝑡) and three predetermined variables (𝑋_1𝑡, 𝑋_2𝑡, 𝑋_3𝑡), the reduced-form equations would look as follows:
𝑌_1𝑡=𝜋_0+𝜋_1 𝑋_1𝑡+𝜋_2 𝑋_2𝑡+𝜋_3 𝑋_3𝑡+𝑣_1𝑡
𝑌_2𝑡=𝜋_4+𝜋_5 𝑋_1𝑡+𝜋_6 𝑋_2𝑡+𝜋_7 𝑋_3𝑡+𝑣_2𝑡
where the 𝜋𝑠 are reduced-form coefficients. They are also called impact multipliers because they measure the impact on the endogenous variable of a one-unit increase in the value of the predetermined variable, after allowing for the feedback effects from the entire simultaneous system.
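
To see where reduced-form coefficients come from, here is a sketch of the substitution for one possible structural system. The structural equations and the symbols α, β, ε are assumptions made for illustration (the deck does not show the underlying structural equations); only the π and v notation is taken from the card.

```latex
% Assumed structural system (hypothetical, for illustration only)
\begin{align*}
Y_{1t} &= \alpha_0 + \alpha_1 Y_{2t} + \alpha_2 X_{1t} + \alpha_3 X_{2t} + \epsilon_{1t} \\
Y_{2t} &= \beta_0 + \beta_1 Y_{1t} + \beta_2 X_{3t} + \epsilon_{2t}
\end{align*}

% Substitute the second equation into the first and collect Y_{1t}
% (this requires \alpha_1 \beta_1 \neq 1):
\begin{align*}
(1 - \alpha_1\beta_1)\, Y_{1t}
  &= \alpha_0 + \alpha_1\beta_0 + \alpha_2 X_{1t} + \alpha_3 X_{2t}
     + \alpha_1\beta_2 X_{3t} + \epsilon_{1t} + \alpha_1 \epsilon_{2t}
\end{align*}

% Dividing by (1 - \alpha_1\beta_1) gives the reduced form
% Y_{1t} = \pi_0 + \pi_1 X_{1t} + \pi_2 X_{2t} + \pi_3 X_{3t} + v_{1t} with
\begin{align*}
\pi_0 &= \frac{\alpha_0 + \alpha_1\beta_0}{1 - \alpha_1\beta_1}, &
\pi_1 &= \frac{\alpha_2}{1 - \alpha_1\beta_1}, &
\pi_2 &= \frac{\alpha_3}{1 - \alpha_1\beta_1}, \\
\pi_3 &= \frac{\alpha_1\beta_2}{1 - \alpha_1\beta_1}, &
v_{1t} &= \frac{\epsilon_{1t} + \alpha_1 \epsilon_{2t}}{1 - \alpha_1\beta_1}
\end{align*}
```

In this sketch 𝑋_3𝑡 does not appear in the structural equation for 𝑌_1𝑡, yet it has a non-zero reduced-form coefficient because it affects 𝑌_1𝑡 through 𝑌_2𝑡. That is the feedback the impact multipliers account for.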

4
Q

Why use reduced-form equations?

A
  1. Since the reduced-form equations have no inherent simultaneity, they do not violate Classical Assumption III (all explanatory variables are uncorrelated with the error term).
  2. The interpretation of the reduced-form coefficients as impact multipliers means that they have economic meaning and useful applications of their own (i.e. no need to interpret the whole structural equation at once).
  3. Reduced-form equations play a crucial role in the estimation technique most frequently used for simultaneous equations – the Two-Stage Least Squares (2SLS) approach.
5
Q

What is an instrumental variable?

A

Instrumental variable – a variable that is highly correlated with the endogenous variable it stands in for and uncorrelated with the error term.
Instrumental variables can be constructed using the Two-Stage Least Squares (2SLS) approach – a method of systematically creating variables to replace the endogenous variables where they appear as explanatory variables in simultaneous equations systems.

6
Q

2SLS estimation stages

A
  1. Run OLS on the reduced-form equations for each of the endogenous variables that appear as explanatory variables in the structural equations in the system.
  2. Substitute the reduced-form 𝑌̂𝑠 for the 𝑌𝑠 that appear on the right side (only) of the structural equations, and then estimate these revised structural equations with OLS.
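
The two stages can be sketched on simulated data. This is a minimal illustration rather than a full estimator: the structural system, its coefficients, the sample size, and the helper `ols_simple` are all assumptions invented for the example, and only one endogenous regressor and one exogenous variable are used so that plain Python suffices.

```python
import random

def ols_simple(x, y):
    """OLS of y on a constant and a single regressor x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical structural system (numbers made up for illustration):
#   Y1 = 1.0 + 0.5*Y2 + e1
#   Y2 = 2.0 + 0.4*Y1 + 1.0*X + e2
TRUE_SLOPE = 0.5
random.seed(0)
x, y1, y2 = [], [], []
for _ in range(5000):
    xi, e1, e2 = random.gauss(0, 1), random.gauss(0, 1), random.gauss(0, 1)
    # Solve the two structural equations jointly for this draw.
    y2i = (2.0 + 0.4 * (1.0 + e1) + xi + e2) / (1 - 0.4 * 0.5)
    y1i = 1.0 + 0.5 * y2i + e1
    x.append(xi)
    y1.append(y1i)
    y2.append(y2i)

# Naive OLS on the structural equation for Y1 (Y2 is correlated with e1).
_, slope_ols = ols_simple(y2, y1)

# Stage 1: OLS on the reduced-form equation for Y2, then form fitted values.
pi0, pi1 = ols_simple(x, y2)
y2_hat = [pi0 + pi1 * xi for xi in x]

# Stage 2: replace Y2 with its fitted values on the right-hand side.
_, slope_2sls = ols_simple(y2_hat, y1)

print(f"true slope {TRUE_SLOPE}, OLS {slope_ols:.3f}, 2SLS {slope_2sls:.3f}")
```

Note that the standard errors reported by a plain second-stage OLS are not the correct 2SLS standard errors, so a dedicated 2SLS routine should be used for inference.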
7
Q

The Properties of Two-Stage Least Squares

A
  1. 2SLS estimates are still biased – the expected value of 𝛽̂ produced by 2SLS is not exactly equal to the true 𝛽, but it is closer than what we can get with simple OLS, and this bias drops as the sample size grows.
  2. If the fit of the reduced-form equation is poor, then 2SLS will not rid the equation of bias – if the instrumental variables are poorly estimated, there is no reason to expect the 2SLS model to be well estimated (garbage in, garbage out).
  3. 2SLS estimates have increased variances and SE(𝛽̂)𝑠 – 2SLS produces a relatively accurate estimate of 𝛽, but at the cost of a larger variance and SE(𝛽̂) than OLS.
8
Q

What is a vector autoregressive (VAR) model?

A

Vector autoregressive (VAR) model – an 𝑛-equation, 𝑛-variable linear model with 𝑝 lags in which each variable is explained by its own lagged values, plus current and past values of the remaining 𝑛−1 variables.

This framework provides a systematic way to capture rich dynamics in multiple time series.

9
Q

What are the forms of VAR?

A
  1. Reduced-form VAR model – expresses each variable as a linear function of its own past values, the past values of all other variables being considered, and a serially uncorrelated error term (i.e., similar to the reduced-form equations discussed earlier).
  2. Recursive VAR model – contains all the components of the reduced-form model, but also allows some variables to be functions of other concurrent variables (i.e., similar to simultaneous equations).
  3. Structural VAR models – use economic theory to sort out the contemporaneous links among the variables. Structural VARs require “identifying assumptions” that allow correlations to be interpreted causally. These identifying assumptions can involve the entire VAR, so that all of the causal links in the model are spelt out, or just a single equation. This produces instrumental variables that permit the contemporaneous links to be estimated using instrumental variables regression (i.e., a more complex form of a simultaneous equation).
10
Q

What questions need to be answered when specifying a VAR?

A
  1. How many endogenous variables to include
  2. How many autoregressive terms to include

If we have two endogenous variables and two autoregressive terms, we have a Bivariate VAR(2) model. If we have three endogenous variables and four autoregressive terms we have a Trivariate VAR(4) model.
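
Written out, a bivariate VAR(2) in reduced form could look as follows. The symbols (c for intercepts, a for coefficients with the lag in the superscript, u for errors) are notation assumed here for illustration and do not come from the card.

```latex
\begin{align*}
y_{1t} &= c_1 + a_{11}^{(1)} y_{1,t-1} + a_{12}^{(1)} y_{2,t-1}
              + a_{11}^{(2)} y_{1,t-2} + a_{12}^{(2)} y_{2,t-2} + u_{1t} \\
y_{2t} &= c_2 + a_{21}^{(1)} y_{1,t-1} + a_{22}^{(1)} y_{2,t-1}
              + a_{21}^{(2)} y_{1,t-2} + a_{22}^{(2)} y_{2,t-2} + u_{2t}
\end{align*}
```

Each equation has 1 + 2 × 2 = 5 coefficients, so the system has 10 in total. In general an 𝑛-variable VAR(𝑝) with intercepts has 𝑛(𝑛𝑝 + 1) coefficients, which is why adding variables or lags quickly uses up degrees of freedom.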

11
Q

Choosing the lags and the variables to include in the VAR model

A

To identify an optimal number of lags, you build several models using different numbers of lags and find the one with the best (lowest) value of an information criterion: AIC (Akaike), BIC (Schwarz-Bayesian), or HQ (Hannan-Quinn). If the criteria point to different lag lengths, choose the model whose residuals show the fewest problems, such as autocorrelation, heteroscedasticity, and non-normality. This is the same procedure as for Granger causality estimation.

The choice of variables to include in a VAR model should be grounded in theory as much as possible. Statistical tools, such as Granger causality tests, can also be applied to check the relevance of candidate variables.
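
As a rough sketch of the lag comparison, the three criteria can be computed from the log-determinant of each fitted model's residual covariance matrix. The formulas below follow one common textbook convention that counts only the 𝑝𝑛² lag coefficients (some software also counts intercepts), and the log-determinant values are made-up placeholders, not results from any real data.

```python
import math

def var_information_criteria(log_det_sigma, T, n, p):
    """AIC, BIC and HQ for a VAR with n variables, p lags and T observations.

    log_det_sigma is ln|Sigma_hat|, the log-determinant of the residual
    covariance matrix. Convention assumed here: k = p * n^2 parameters.
    """
    k = p * n * n
    return {
        "AIC": log_det_sigma + 2 * k / T,
        "BIC": log_det_sigma + math.log(T) * k / T,
        "HQ": log_det_sigma + 2 * math.log(math.log(T)) * k / T,
    }

# Hypothetical inputs: ln|Sigma_hat| for VARs with 1, 2 and 3 lags.
log_dets = {1: -2.00, 2: -2.30, 3: -2.32}
T, n = 100, 2

scores = {p: var_information_criteria(ld, T, n, p) for p, ld in log_dets.items()}

# For each criterion, pick the lag length with the lowest value.
best = {
    name: min(scores, key=lambda p: scores[p][name])
    for name in ("AIC", "BIC", "HQ")
}
print(best)
```

BIC penalises extra parameters most heavily and AIC least, so when the criteria disagree BIC tends to pick the shorter lag length.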

12
Q

How can the data be described using a VAR model?

A
  1. Use Granger causality tests
  2. Use impulse response functions
13
Q

How does an impulse response function describe data in a VAR model?

A
  1. Estimating the VAR model
  2. Implementing a one-unit increase in the error of one of the variables in the model, while holding the other errors equal to zero.
  3. Predicting the impact of the error shock 𝑘 periods ahead.
  4. Plotting the forecasted impacts, along with the one-standard-deviation confidence intervals.
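
Steps 2 and 3 can be sketched for the simplest case, a VAR(1) with an already estimated coefficient matrix. The matrix `A_hat` holds made-up values, the function names are hypothetical, and the sketch computes plain (non-orthogonalised) responses without confidence bands or plotting.

```python
def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def impulse_response(A, shock_index, horizons):
    """Responses of all variables to a one-unit shock in one error term.

    For a VAR(1) y_t = A y_{t-1} + u_t, the response h periods after the
    shock is A^h applied to the unit shock vector; other errors stay at zero.
    """
    response = [1.0 if i == shock_index else 0.0 for i in range(len(A))]
    path = [response]
    for _ in range(horizons):
        response = mat_vec(A, response)
        path.append(response)
    return path

# Hypothetical estimated coefficient matrix for a bivariate VAR(1).
A_hat = [[0.5, 0.1],
         [0.2, 0.3]]

irf = impulse_response(A_hat, shock_index=0, horizons=4)
for h, resp in enumerate(irf):
    print(h, [round(r, 4) for r in resp])
```

Because this example matrix describes a stable system, the responses die out as the horizon grows. With more lags the same idea applies, but the recursion has to carry the previous 𝑝 responses.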
14
Q

What can a VAR model be used for?

A
  1. Forecasting – similarly to ARIMA, the reduced-form VAR can be used to forecast by iterating it forward.
  2. Structural inference – an extension of the impulse response function that allows us to capture how a change in one variable affects the others in the future.
  3. Policy analysis – how the introduction of a policy would shift the equilibrium in the model in the long run.
15
Q

What are different types of analysis in econometrics?

A
  1. Survival analysis – a regression model that looks at a duration of time until an event happens.
  2. Counterfactual analysis – evaluating the impact of an intervention (e.g. introduction of a new policy) using regressions and similar approaches.
  3. Synthetic control methods – an extension of counterfactual analysis where you estimate the impact of the intervention by simulating what would have happened if there had not been an intervention.
16
Q

What is classification?

A

Classification is a family of data mining and machine learning techniques that are used to build “rules” according to which you could separate your observations into some predefined groups.

17
Q

What are examples of classification?

A
  1. Predicting whether an email is spam or not
  2. Classifying texts according to some list of topics
  3. Classifying individuals into those that are likely to buy a specific product and those that are unlikely to do so
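
As a toy illustration of learning a “rule” from labelled observations, the sketch below fits a one-threshold classifier (a decision stump) to a spam example. The feature (count of suspicious words per email), the data, and the function names are all invented for the example. Real classifiers use many features and are evaluated on data not used for fitting.

```python
def fit_stump(values, labels):
    """Find the threshold t that minimises training errors for the rule
    'predict 1 (spam) if value > t, else 0 (not spam)'."""
    best_t, best_errors = None, len(values) + 1
    for t in sorted(set(values)):
        errors = sum((v > t) != bool(y) for v, y in zip(values, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t, best_errors

def predict(value, threshold):
    """Apply the learned rule to a new observation."""
    return 1 if value > threshold else 0

# Made-up training data: suspicious-word counts and labels (1 = spam).
counts = [0, 1, 0, 2, 5, 7, 6, 1, 8, 3]
labels = [0, 0, 0, 0, 1, 1, 1, 0, 1, 0]

threshold, train_errors = fit_stump(counts, labels)
print("rule: spam if count >", threshold, "| training errors:", train_errors)
print("new email with 4 suspicious words ->", predict(4, threshold))
```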
18
Q

What does dimension reduction allow us to do?

A
  1. Visualise an N-dimensional data set in fewer dimensions (e.g. 2-3)
  2. Combine highly correlated variables into new variables
  3. Help some algorithms that might perform better with compressed data (e.g. k-means clustering).
    Dimension reduction is most useful when variables in a data set are highly correlated.
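
One common dimension reduction technique is principal component analysis (PCA). The card does not name a method, so PCA is an assumption here. The sketch below uses two made-up, highly correlated variables and shows how much of their total variance a single combined variable (the first principal component) can carry. It works on the raw covariance matrix, whereas in practice variables are often standardised first.

```python
import math

def first_component_share(x, y):
    """Share of total variance captured by the first principal component
    of two variables, using the eigenvalues of their 2x2 covariance matrix."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] in closed form.
    root = math.sqrt((sxx - syy) ** 2 + 4 * sxy ** 2)
    largest = (sxx + syy + root) / 2
    return largest / (sxx + syy)

# Made-up data: y is roughly twice x, so the two are highly correlated.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

share = first_component_share(x, y)
print(f"first component explains {share:.1%} of the variance")
```

When the share is close to 1, replacing the two variables with their first component loses very little information.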
19
Q

What is text mining and what is it used for?

A

Text mining – application of data mining to non-structured or less structured text files. It entails the generation of meaningful numerical indices from the unstructured text and then processing these indices using various data mining algorithms. It uncovers previously unknown information.

It can be used to:
Find the “hidden” content of documents, including useful relationships
Relate documents across previously unnoticed divisions
Group documents by common themes
Summarise documents
And much more
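
The step of generating numerical indices from unstructured text can be sketched as a simple document-term count matrix (a bag-of-words representation). The three documents and the tokenisation rule are made up for illustration. Real pipelines usually add steps such as stop-word removal, stemming, or term weighting.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Made-up example documents.
docs = [
    "Interest rates rise as inflation rises",
    "Inflation falls and rates fall",
    "The team wins the cup final",
]

# Count word occurrences per document.
counts = [Counter(tokenize(d)) for d in docs]

# Shared vocabulary (sorted for a stable column order).
vocab = sorted(set().union(*counts))

# Rows are documents, columns are words; missing words count as 0.
matrix = [[c[w] for w in vocab] for c in counts]

for doc, row in zip(docs, matrix):
    print(row, "<-", doc)
```

Once the text is in this numerical form, the rows can be passed to standard data mining algorithms, for example to group documents that share vocabulary.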