Applied econometrics Flashcards

Exam

1
Q

How can we determine causality?

A

Ideally, through an experiment designed by researchers that randomly assigns subjects to treatment and control groups. It may also be through a quasi-experiment, in which some source of variation is “as if” randomly assigned. Causality is established when only the variable of interest changes while all other variables are held constant.

2
Q

How do we interpret variance and correlation?

A

Variance: the average squared deviation of an observation from the mean, measured in squared units of the variable. Correlation: the strength of the linear relationship between two variables; it always lies between -1 and 1.

3
Q

What is the relationship between correlation and covariance?

A

rho(x,y) = cov(x,y) / (std.dev(x) * std.dev(y))
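
As a minimal sketch of this relationship, the correlation can be computed by dividing the sample covariance by the product of the two standard deviations. The function names and the data are made up for illustration, and the n - 1 divisor is an assumption (it cancels in the ratio anyway).

```python
from math import sqrt


def covariance(x, y):
    """Sample covariance with the n - 1 divisor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    std_x = sqrt(covariance(x, x))  # cov(x, x) is the variance of x
    std_y = sqrt(covariance(y, y))
    return covariance(x, y) / (std_x * std_y)


# Illustrative data (not from the course material)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = correlation(x, y)
print(round(r, 4))
```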

4
Q

What is cross-sectional data? Examples?

A

Usually a random sample. Each observation is a different individual, with information recorded at a single point in time. Examples: grade distributions, everyone’s mood at a specific point in time.

5
Q

What is panel data? Examples?

A

Also known as longitudinal data. The same (randomly sampled) individuals are followed over time. Example: the performance of a number of firms over time.

6
Q

What is time-series data? Examples?

A

A separate observation of a specific variable for each time period. Examples: stock prices, inflation, commodity prices. With such data, trends and seasonality should be taken into account.

7
Q

What is the least square principle?

A

Choosing the estimates such that the residual sum of squares is as small as possible

8
Q

How do we estimate b1?

A

Minimizing the SSR through the first-order conditions (FOCs), to end up with: b1 = Cov(x,y)/Var(x)

9
Q

How do we estimate b0?

A

Minimizing the SSR through the FOCs, to end up with: b0 = Ybar - b1*xbar, where Ybar and xbar are the sample means. This can be thought of as normalising b0 so that the residuals average to zero, in line with the OLS assumption that the error term has zero mean.
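
The two closed-form estimates from the first-order conditions can be sketched in a few lines. This is only an illustration on made-up data; `ols_simple` is a hypothetical helper name, not something from the course.

```python
def ols_simple(x, y):
    """Simple-regression OLS: slope = Cov(x, y) / Var(x), intercept from the sample means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    # The 1/(n - 1) factors of covariance and variance cancel, so sums suffice.
    cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var_x = sum((a - x_bar) ** 2 for a in x)
    b1 = cov_xy / var_x
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Illustrative data (not from the course material)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_simple(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1, sum(residuals))
```

With an intercept in the model, the residuals sum to (numerically) zero, which is the sample counterpart of the zero-mean assumption on the error term.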

10
Q

What is homoskedasticity? and what is the opposite?

A

Var(u|x) = sigma^2, a constant, so the variance of the error term does not vary with x. Heteroskedasticity is the opposite: the error variance varies with x.

11
Q

What to do about heteroskedasticity?

A

Use heteroskedasticity-robust standard errors, such as White standard errors. In practice you should always use heteroskedasticity-robust standard errors: they are valid whether or not the errors are heteroskedastic, so readers will have no reason to doubt your inference on that account.

12
Q

What is R^2? and Adj. R^2?

A

R^2 = ESS/TSS = 1 - SSR/TSS. It is a measure of goodness of fit: the share of the variation in Y that is explained by the regression. Adj. R^2 = 1 - (SSR/(n-k-1))/(TSS/(n-1)), which penalizes adding regressors.
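
A small sketch of both formulas, assuming we already have fitted values from some regression with k regressors plus a constant. The numbers are illustrative and `r_squared` is a hypothetical helper name.

```python
def r_squared(y, fitted, k):
    """Return (R^2, adjusted R^2) for a regression with k regressors and a constant."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # sum of squared residuals
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    r2 = 1 - ssr / tss
    adj_r2 = 1 - (ssr / (n - k - 1)) / (tss / (n - 1))
    return r2, adj_r2


# Illustrative data: fitted values from the line 2.2 + 0.6 * x with x = 1..5
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2, adj_r2 = r_squared(y, fitted, k=1)
print(round(r2, 4), round(adj_r2, 4))
```

The adjusted value is lower than the raw one here because of the degrees-of-freedom correction.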

13
Q

What is a type I and a type II error?

A

Type I: Reject H0 when it is true. Type II: not rejecting H0 when it is false

14
Q

How many degrees of freedom are needed to use the critical value 1.96?

A

120. Otherwise, look the critical value up in the book on page 805

15
Q

What is the formula of the t-statistic?

A

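t = (Ybar - mu_Y,0) / SE(Ybar), where Ybar is the sample mean and mu_Y,0 is the value of the mean under the null hypothesis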
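
For a test about a population mean, the t-statistic can be sketched as below. The sample and the hypothesized mean are made up, and `t_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu_0):
    """(sample mean - hypothesized mean) / standard error of the sample mean."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - mu_0) / standard_error


# Illustrative data: test H0: mean = 3
sample = [2.0, 4.0, 5.0, 4.0, 5.0]
t = t_statistic(sample, 3.0)
print(round(t, 4))
```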

16
Q

Which distribution function should be used for creating CIs for the variance?

A

The chi-squared (X^2) distribution

17
Q

Can R^2 be negative?

A

Yes. The raw R^2 (computed as 1 - SSR/TSS) can be negative in a regression without a constant.

18
Q

In multiple regression, when is beta1 equal to beta1 in a linear regression?

A

In two instances: when the estimated coefficients on all the other regressors are equal to zero, and when the regressor of interest is uncorrelated (in the sample) with the other regressors.

19
Q

What are the properties of R^2?

A

It lies between 0 and 1. It can never decrease when another variable is added, so it cannot be used to compare models with different numbers of regressors. For that you can use adj. R^2, as long as the Y variable is the same.

20
Q

Are OLS estimates unbiased?

A

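Strictly speaking, no. An estimate is a fixed number obtained from one particular sample, so it cannot be unbiased. When we say that OLS is unbiased under the assumptions, we mean that the procedure (the estimator) used to get the estimates is unbiased.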

21
Q

Effects of rescaling variables: What happens when you change the Y variable?

A

Multiplying Y by a constant multiplies all coefficients and standard errors by the same constant, so the t-statistics and significance are unchanged and the interpretation changes only in units.

22
Q

Effects of rescaling variables: What happens when you change the X variable?

A

Multiplying X by a constant divides its coefficient and standard error by that constant, so the significance is unchanged and the interpretation changes only in units.

23
Q

What is a standardized variable?

A

A variable with its mean subtracted, divided by its standard deviation. When all variables are standardized, a coefficient is interpreted as the change in Y, in standard deviations, for a one standard deviation change in X. There is no constant in this regression.
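
Standardizing is a two-step transformation, sketched here on made-up data. `standardize` is a hypothetical helper name, and using the sample (n - 1) standard deviation is an assumption.

```python
from statistics import mean, stdev


def standardize(values):
    """Subtract the mean and divide by the (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Illustrative data (not from the course material)
z = standardize([2.0, 4.0, 5.0, 4.0, 5.0])
print([round(v, 3) for v in z])
```

The result has mean zero and standard deviation one, which is why a regression on standardized variables needs no constant.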

24
Q

What is perfect multicollinearity?

A

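A phenomenon in which one predictor variable in a multiple regression model is an exact linear function of the others, so that OLS cannot be computed. With high but imperfect multicollinearity, one predictor can be linearly predicted from the others with a substantial degree of accuracy; this generally shows up as few significant t-ratios but a high R^2.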

25
Q

What are the consequences of high, but non-perfect multicollinearity?

A

OLS is still BLUE but: Large variances and covariances, precise estimation difficult, wider confidence intervals, t-ratio tends to be statistically insignificant, R^2 tends to be very high, OLS estimators and standard errors can be sensitive to small changes in data

26
Q

What is an auxiliary regression? How do we use it?

A

When two variables are highly correlated, we can regress one of them on the other explanatory variables and use the residuals from that regression in place of the variable in the primary regression.

27
Q

How can we detect multicollinearity?

A

Looking for: High R^2 values but few significant t-values, High correlation between two explanatory variables, Scatterplot, Auxiliary regressions

28
Q

Interpretation of b1 in log-models: log-log, log-linear, linear-log?

A

Log-log: b1 is the elasticity of Y with respect to X. Log-linear: 100*b1 is approx. the percentage change in Y for a one-unit change in X. Linear-log: b1/100 is approx. the change in Y for a 1 percent change in X.

29
Q

Why should we use a log model?

A

Log models are invariant to the scale of the variables, since they measure percentage changes. They give a direct estimate of elasticity. For models with y > 0, the conditional distribution is often heteroskedastic or skewed, while ln(Y) is much less so. The distribution of ln(Y) is narrower, limiting the effect of outliers.

30
Q

What is heteroskedasticity and what are the implications of heteroskedasticity?

A

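Heteroskedasticity means that the variance of the error term varies with x. OLS is still unbiased but no longer BLUE, the usual standard errors are biased, and the usual t- and F-statistics cannot be used.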

31
Q

What is serial independence in autocorrelation?

A

When the covariances between the error terms are zero, i.e. they are independently distributed. If not, there is autocorrelation.

32
Q

What is the effect of adding a dummy variable? And an interaction term?

A

A dummy variable can be thought of as changing the intercept. An interaction term between a dummy and a continuous variable can be thought of as changing the slope; the intercept changes only if the dummy itself is also included.

33
Q

What does the Chow Test test for?

A

It tests whether one regression line or two different regression lines best fit the data. The null hypothesis is that the coefficients are equal across the two groups (one line). If the null hypothesis is rejected, two lines fit better than one.

34
Q

What are the problems of having a dummy variable as dependent variable in the linear probability model?

A

The predicted probabilities can lie outside [0;1]; therefore, use probit or logit. Also, the error, u, has a discrete, non-normal distribution.

35
Q

What are the properties of the probit and logit function?

A

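Probit: Pr(y=1|X) = Phi(beta0 + beta1*X), where Phi is the cumulative standard normal distribution. Logit: Pr(y=1|X) = F(beta0 + beta1*X), where F is the cumulative logistic distribution. In both cases 0 <= Pr(y=1|X) <= 1.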
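
To see why predicted probabilities from these models stay inside the unit interval, here is a minimal sketch of the two link functions. The coefficient values are hypothetical, and the function names are made up for illustration.

```python
from math import erf, exp, sqrt


def probit_prob(index):
    """Standard normal CDF evaluated at the linear index beta0 + beta1 * x."""
    return 0.5 * (1.0 + erf(index / sqrt(2.0)))


def logit_prob(index):
    """Logistic CDF evaluated at the linear index beta0 + beta1 * x."""
    return 1.0 / (1.0 + exp(-index))


# Hypothetical coefficients, chosen only for illustration
beta0, beta1 = -1.0, 0.5
for x in (-4.0, 0.0, 2.0, 8.0):
    index = beta0 + beta1 * x
    print(x, round(probit_prob(index), 4), round(logit_prob(index), 4))
```

Both functions are increasing in the index and bounded by 0 and 1, unlike the linear probability model.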

36
Q

In panel data, is omitted variables a problem?

A

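Not if the omitted variable is constant over time: it then drops out when we look at changes within each entity, so the change in Y must be caused by the observed factors. Omitted variables that vary over time can still cause bias.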

37
Q

Which methods for fixed effects regression do we use? and how?

A

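1) changes specification method (regress the change in Y on the change in X) 2) n-1 binary regressors method (one dummy per entity, omitting one) 3) entity-demeaned fixed effects method (subtract each entity's mean from Y and X)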
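
The entity-demeaned method can be sketched as follows: subtract each entity's own mean from X and Y, then run OLS on the demeaned data. The panel below is made up so that every entity has its own intercept but a common slope of 2; `entity_demeaned_slope` is a hypothetical helper name.

```python
def entity_demeaned_slope(panel):
    """OLS slope after subtracting each entity's mean from x and y."""
    numerator = 0.0
    denominator = 0.0
    for observations in panel.values():
        xs = [x for x, _ in observations]
        ys = [y for _, y in observations]
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        for x, y in observations:
            numerator += (x - x_bar) * (y - y_bar)
            denominator += (x - x_bar) ** 2
    return numerator / denominator


# Illustrative panel: y = alpha_i + 2 * x, with alpha_A = 1 and alpha_B = 10
panel = {
    "A": [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)],
    "B": [(2.0, 14.0), (4.0, 18.0), (6.0, 22.0)],
}
beta1 = entity_demeaned_slope(panel)
print(beta1)
```

Demeaning removes the entity-specific intercepts, so the common slope is recovered even though the two entities have very different levels.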

38
Q

What does endogeneity mean? what is the opposite?

A

That a regressor is correlated with the error term. The opposite is exogeneity: the variable is uncorrelated with the error term.

39
Q

What is simultaneity and reverse causality?

A

Simultaneity: X causes changes in Y and Y causes changes in X.
Reverse causality: Y causes changes in X.

40
Q

What are the two assumptions of including an instrumental variable?

A

It should be relevant (there should be a (strong) correlation to the explanatory variable) and exogenous (there should be no correlation to the error term)

41
Q

In which cases do we use instrumental variables? and how do we use them?

A

When the independent variable is correlated with the error term. Estimate the function that describes x, then insert this estimate as xhat in the function for y.

42
Q

What is the estimated beta1 for an instrumental variable? and how is it obtained?

A

Cov(Zi,Yi)/Cov(Zi,Xi). The beta is obtained through a two-stage procedure. First, you estimate an OLS regression of Xi on Zi (and other exogenous regressors). Second, you estimate the original equation, replacing Xi with its predicted value from the first stage.
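
With one instrument and one endogenous regressor, the covariance ratio and the two-stage procedure give the same number. The sketch below checks this on made-up data; the helper names are hypothetical and no other exogenous regressors are included.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance with the n - 1 divisor."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / (len(a) - 1)


# Illustrative data: z is the instrument, x the endogenous regressor, y the outcome
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.5, 1.9, 3.2, 3.8, 5.1, 5.9]
y = [2.1, 2.9, 4.2, 4.8, 6.3, 6.6]

# Direct formula
beta_iv = cov(z, y) / cov(z, x)

# Stage 1: OLS of x on z, then predicted values x_hat
pi1 = cov(z, x) / cov(z, z)
pi0 = mean(x) - pi1 * mean(z)
x_hat = [pi0 + pi1 * zi for zi in z]

# Stage 2: OLS of y on x_hat
beta_2sls = cov(x_hat, y) / cov(x_hat, x_hat)
print(round(beta_iv, 6), round(beta_2sls, 6))
```

Only the point estimate is reproduced this way; the second-stage OLS standard errors are not valid and should come from dedicated software.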

43
Q

What happens with fewer instrumental variables than original variables? With more?

A

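With m instruments and k endogenous regressors. Fewer (m < k): under-identified, so the coefficients cannot be estimated. More (m > k): over-identified, and one should test instrument exogeneity with the J-statistic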

44
Q

Instrumental variables: What test is used to test the hypothesis that Z1, …, Zm do not enter the first stage?

A

The F-statistic. Weak instruments imply small first stage F-statistics

45
Q

Instrumental variables: What test is used to test the second assumption of whether an IV-variable is exogenous?

A

The J-statistic. It can only be used when the regression is overidentified, and it tests whether the instruments are correlated with the error term. The null hypothesis is that they are uncorrelated.

46
Q

What are some of the drawbacks of using IVs?

A
1. It is difficult to find good instruments that capture all of the variation in the endogenous variable that is not correlated with the error. 2. The instrument is often not well correlated with the endogenous variable, which is a problem (weak instrument/low relevance). 3. The OLS standard errors from the second-stage regression are not right (use the ones provided by the software).
47
Q

What are the differences between an experiment, a quasi-experiment and a program evaluation?

A

An experiment is a consciously implemented study that randomly assigns subjects to treatment and control groups. A quasi-experiment has a source of randomization that is “as if” randomly assigned, but this variation was not the result of an explicit randomized treatment and control design. Program evaluation is the field of statistics aimed at evaluating the effect of a program or policy (e.g. an ad campaign to stop smoking).

48
Q

What is internal validity? what is external validity?

A

Internal validity refers to the validity of the findings within the research study. It is primarily concerned with controlling the extraneous variables and outside influences that may impact the outcome. External validity refers to the extent to which the results of a study can be generalized or applied to other members of the larger population being studied.

49
Q

What are threats to internal validity?

A

1) Failure to randomize (X is correlated with u), 2) Failure to follow the treatment protocol, 3) Attrition, 4) Experimental effects that change individual behavior

50
Q

What are threats to external validity?

A

1) Nonrepresentative sample, 2) Nonrepresentative “treatment”, 3) General equilibrium effects

51
Q

What effects does it have to include a regressor not correlated with the other regressor?

A

As there is no correlation, there was no omitted variable bias to begin with. However, by including W, the variance of the error term is reduced and the standard errors will be smaller.

52
Q

How do we use difference-in-differences?

A

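We test the effect of treatment by comparing the average change in Y for the treated group with the average change in Y for the control group, before and after treatment: (Ybar_treated,after - Ybar_treated,before) - (Ybar_control,after - Ybar_control,before). Equivalently, regress delta Y on a treatment dummy.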
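
A minimal sketch of the difference-in-differences estimator from four group means. The outcome values are made up, and `diff_in_diff` is a hypothetical helper name.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group's mean minus change in the control group's mean."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Illustrative outcomes (not from the course material)
did = diff_in_diff(
    treated_before=[10.0, 12.0, 11.0],
    treated_after=[15.0, 17.0, 16.0],
    control_before=[9.0, 10.0, 11.0],
    control_after=[11.0, 12.0, 13.0],
)
print(did)
```

Subtracting the control group's change removes whatever would have happened to both groups anyway, under the assumption that their trends would otherwise have been parallel.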

53
Q

What is the difference between sharp and fuzzy regression discontinuity design?

A

In sharp RDD, crossing the threshold increases the probability of treatment from 0 to 1. In fuzzy RDD, crossing the threshold increases the probability of treatment by less than 1 (the probability jumps, but not all the way from 0 to 1).

54
Q

What is a dynamic causal effect?

A

When a change today has a causal effect several periods forward, thus estimating the effect of current and past changes in x on y.

55
Q

What is the first difference of the logarithm of Yt? Is this a precise measurement?

A

d = delta. dln(yt) = ln(yt) - ln(yt-1), and 100*dln(yt) is approximately the percentage change in yt. It is not a precise measurement but an approximation, which is accurate when the change is small.
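
To get a feel for how close the log approximation is, the sketch below compares it with the exact percentage change for a made-up move from 100 to 105. The function names are hypothetical.

```python
from math import log


def exact_pct_change(previous, current):
    """Exact percentage change between two periods."""
    return 100.0 * (current - previous) / previous


def log_pct_change(previous, current):
    """100 times the first difference of the logarithm."""
    return 100.0 * (log(current) - log(previous))


exact = exact_pct_change(100.0, 105.0)
approx = log_pct_change(100.0, 105.0)
print(exact, round(approx, 3))
```

The gap between the two measures grows with the size of the change, which is why the log difference is treated as an approximation.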

56
Q

What is autocorrelation?

A

A series exhibiting autocorrelation is related to its own past values

57
Q

How do we calculate the autocovariance?

A

Cov(yt, yt-j) = E[(yt - E(yt)) * (yt-j - E(yt-j))]

58
Q

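What is the autocorrelation coefficient when variances are equal across periods?

A

rho(j) = gamma(j)/gamma(0), where gamma(j) is the j-th autocovariance and gamma(0) is the variance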
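
A sample version of this ratio can be sketched as below. Dividing both autocovariances by the full sample size is one common convention and is an assumption here; the series is made up and the function names are hypothetical.

```python
def autocovariance(series, lag):
    """Sample autocovariance at the given lag, using the full-sample mean."""
    n = len(series)
    series_mean = sum(series) / n
    return sum(
        (series[t] - series_mean) * (series[t - lag] - series_mean)
        for t in range(lag, n)
    ) / n


def autocorrelation(series, lag):
    """gamma(j) / gamma(0)."""
    return autocovariance(series, lag) / autocovariance(series, 0)


# Illustrative series (not from the course material)
y = [1.0, 2.0, 3.0, 4.0, 5.0]
rho_1 = autocorrelation(y, 1)
print(rho_1)
```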

59
Q

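In AR models, how do the different methods rank in the number of lags they select?

A

F/t-tests >= AIC >= BIC, i.e. F/t-tests tend to select the most lags and the BIC the fewest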

60
Q

What does Granger causality test?

A

An F-test of whether the x-variables added to an AR regression (making it an ADL) have coefficients that are significantly different from 0. Be aware that Granger “causality” does not test for causality, but rather whether the variable is a good predictor of Y.

61
Q

What is stationarity?

A

A time series is stationary if its probability distribution does not change over time.

62
Q

When is an AR(p) stationary?

A

When all roots of the lag polynomial 1 - beta1*z - ... - betap*z^p lie outside the unit circle

63
Q

When is the AR(1) model stationary?

A

When the coefficient on Yt-1 is less than one in absolute value

64
Q

How is a trend defined? what are the two types and the difference between them?

A

It is a persistent long-term movement of a variable over time. A deterministic trend is a nonrandom function of time. A stochastic trend is random and varies over time (e.g. the random walk model); it can also have a drift.

65
Q

How do we define the random walk model? and with a drift?

A

The normal random walk model is: Yt = Yt-1 + ut. With a drift, b0 is included: Yt = b0 + Yt-1 + ut, so the h-step-ahead forecast is Y(T+h|T) = Y(T) + b0*h. All errors are independently and identically distributed.

66
Q

Why is a random walk model nonstationary?

A

Because the distribution of Yt changes over time: Var(Yt) = Var(Yt-1) + Var(ut). For Yt to be stationary, the variance must be constant over time, i.e. Var(Yt) = Var(Yt-1), which would imply that Var(ut) = 0. That cannot be true; in fact, starting from Y0 = 0, Var(Yt) = t*Var(ut), i.e. the variance increases over time as the model becomes “more random”.
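
The growing variance can be illustrated with a small simulation. This is only a sketch: the seed, the number of paths and the standard normal errors are arbitrary choices, Y0 is assumed to be 0, and the simulated variances will only be roughly equal to t.

```python
import random
from statistics import pvariance


def simulate_random_walk(steps, rng):
    """One path of Y_t = Y_{t-1} + u_t with Y_0 = 0 and standard normal u_t."""
    level = 0.0
    path = []
    for _ in range(steps):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


rng = random.Random(12345)  # fixed seed so the run is repeatable
paths = [simulate_random_walk(50, rng) for _ in range(2000)]

# Variance across paths at t = 10 and t = 50; theory says roughly 10 and 50
var_at_10 = pvariance([path[9] for path in paths])
var_at_50 = pvariance([path[49] for path in paths])
print(round(var_at_10, 2), round(var_at_50, 2))
```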

67
Q

How can you make a nonstationary model stationary?

A

This can be done by making an integrated model (taking differences). We say that a nonstationary series is integrated if its nonstationarity is appropriately “undone” by differencing.

68
Q

How do we avoid problems caused by trends in times series?

A

To handle a stochastic trend, take the first difference; the series then becomes stationary. Be aware that we, like most econometricians, treat trends as stochastic. Nonetheless, to handle a deterministic trend, remove the trend: run the regression Yt = alpha0 + alpha1*t + e_t and use the residuals for modeling purposes.
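
Removing a deterministic linear trend amounts to an OLS regression on time followed by keeping the residuals. A minimal sketch on made-up data, with time indexed 0, 1, 2, ... as an assumption and `detrend` as a hypothetical helper name:

```python
def detrend(series):
    """Regress the series on a constant and a time index; return the residuals."""
    n = len(series)
    time_index = list(range(n))
    t_bar = sum(time_index) / n
    y_bar = sum(series) / n
    alpha1 = sum(
        (t - t_bar) * (y - y_bar) for t, y in zip(time_index, series)
    ) / sum((t - t_bar) ** 2 for t in time_index)
    alpha0 = y_bar - alpha1 * t_bar
    return [y - (alpha0 + alpha1 * t) for t, y in zip(time_index, series)]


# Illustrative upward-trending series (not from the course material)
y = [1.0, 2.5, 2.0, 3.5, 3.0, 4.5]
residuals = detrend(y)
print([round(r, 3) for r in residuals])
```

By construction the residuals average to zero and are uncorrelated with the time index, so the linear trend is gone.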

69
Q

What is the ADF test?

A

The augmented Dickey-Fuller test tests for a stochastic trend (unit root) in the series. The null hypothesis is that we have a unit root (beta = 1); the alternative hypothesis is that beta < 1, i.e. the series is stationary

70
Q

What is a structural break?

A

Breaks arise from a change in the population regression coefficients at a distinct time or from a gradual evolution of the coefficients. Thus, it is another type of nonstationarity.

71
Q

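What problems do structural breaks cause?

A

If we ignore the break, we estimate a relationship that only holds on average over the sample. It may not describe the relationship at any particular time, which can lead to poor forecasts.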

72
Q

How do we detect breaks?

A

For a known date of the potential break: Chow test of no difference in the coefficients before and after the date. With an unknown date: use the QLR statistic, i.e. compute F-statistics over a range of candidate break dates and pick the largest (note: its critical values come from a different distribution).

73
Q

What are ARMA and ARIMA models? What are p, d and q?

A

ARMA: a combination of an autoregressive function and a moving average function. ARIMA includes an integrated part, which controls for the number of unit roots. Thus your ARIMA is defined by p, d and q: p is the order (number of lags) of the autoregression, d is the number of unit roots (i.e. the number of times you have to take the difference to eliminate the effect of stochastic trends), and q is the number of lags you use to estimate the moving average.

74
Q

When are residuals considered to be white noise?

A

When they are not autocorrelated. This can be checked by plotting the ACF of the residuals once you have decided on your ARMA/ARIMA model.

75
Q

In time series data, how do we think of a randomized controlled experiment?

A

The same subjects are being given different treatments at different points in time

76
Q

What is a distributed lag model?

A

A model in which Yt is regressed only on current and lagged values of X

77
Q

What are the two different types of exogeneity of Xt in a distributed lag model?

A

Past and present exogeneity: the error term has a mean of zero given past and present values of Xt (all of the coefficients in the distributed lag model constitute all the non-zero dynamic effects). Strict exogeneity: the error term has a mean of zero given past, present AND future values of Xt. Note: strict exogeneity implies past/present exogeneity, though the opposite is not true.

78
Q

When do we need to use HAC?

A

In a distributed lag model with autocorrelation between error terms

79
Q

What is the long-run dynamic multiplier?

A

The sum of all the dynamic multipliers in the distributed lag model.

80
Q

When should we use HAC-robust SEs? (heteroskedasticity and autocorrelation consistent)

A

When we have a distributed lag model with autocorrelated errors, AND NOT when we have an AR, ARMA, ADL or ARIMA model. Because in that case we have included a sufficient number of lags, which implies that we have no autocorrelation in the errors anymore.

81
Q

When do we use VAR models?

A

When we want to develop a model that forecasts several variables and makes the forecasts mutually consistent.

82
Q

What is the Cross-correlation function?

A

It is the analog of ACF for the multivariate case: it is a correlation between a variable and lags of another variable.

83
Q

Forecasting with VARs (and AR) uses the chain algorithm; how does this work?

A

To forecast two periods into the future, we first forecast YT+1 and use this in the equation to get YT+2. This is iterated until period T+h.
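
For an AR(1), the chain (iterated) forecast can be sketched in a few lines. The coefficients and the last observed value are hypothetical, and `iterated_forecast` is a made-up helper name.

```python
def iterated_forecast(last_value, beta0, beta1, horizon):
    """Iterate Y_hat(t+1) = beta0 + beta1 * Y_hat(t), feeding each forecast into the next."""
    forecasts = []
    current = last_value
    for _ in range(horizon):
        current = beta0 + beta1 * current
        forecasts.append(current)
    return forecasts


# Hypothetical AR(1): Y_t = 1 + 0.5 * Y_{t-1} + u_t, last observed value 4
forecasts = iterated_forecast(last_value=4.0, beta0=1.0, beta1=0.5, horizon=3)
print(forecasts)
```

Future errors are replaced by their expected value of zero, so each step only needs the previous forecast. With a stationary AR(1), the forecasts move toward the long-run mean beta0 / (1 - beta1), which is 2 in this example.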

84
Q

What is linear dependence?

A

In the theory of vector spaces, a set of vectors is said to be linearly dependent if one of the vectors in the set can be defined as a linear combination of the others; if no vector in the set can be written in this way, then the vectors are said to be linearly independent.

85
Q

What is a fixed effects model?

A

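A regression performed on panel data that captures the effect of being in entity (e.g. state) i, i.e. of unobserved factors that differ across entities but are constant over time. The model can be either entity demeaned, time demeaned or both. All entities' regression lines will have the same slope, but different intercepts.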

86
Q

When do we use clustered standard errors?

A

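In panel data, when the errors are correlated within an entity (cluster) over time but uncorrelated across entities. This is not to be confused with volatility clustering, where the variance of the regression error is different in different time periods; in that case we use ARCH or GARCH models to correct our model.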

87
Q

What is a partial autocorrelation function?

A

The autocorrelation between yt and yt-k, controlling for all intermediate observations yt-1, ..., yt-k+1.

88
Q

What is the difference between ARMA and ARIMA?

A

The integrated part, which is the numbers of differences, d, that is to be taken to remove the stochastic trend in the data.

89
Q

What is an ADL model? How is it different than an AR model?

A

ADL models include lagged x-values in addition to lagged y-values, while AR only consists of lagged y-values.

90
Q

What is a moving average model? What is it used for?

A

A moving average model regresses y on the present and past (white noise) error terms and is used to improve an AR function in forecasting future values.