Econometrics Flashcards

1
Q

What would the hypothesis test be for a positive correlation?

A

H0: β ≤ 0
H1: β > 0

2
Q

What does TSS stand for?

A

Total Sum of Squares. The total sum of squares measures the total variation of the observed values of the dependent variable about their sample mean; it is the sum of the squared deviations of each observation from the mean of the dependent variable.

3
Q

What does RSS stand for?

A

In statistics, the residual sum of squares (RSS), also known as the sum of squared errors (SSE), is the sum of the squares of the residuals (the deviations of the values predicted by the model from the actual observed values). It is a measure of the discrepancy between the data and an estimation model, such as a linear regression.

4
Q

What does ESS stand for?

A

Explained Sum of Squares. The explained sum of squares (ESS), also called the regression sum of squares or the model sum of squares (not to be confused with the residual sum of squares, RSS, above), is a statistical quantity used in the modelling of a process. ESS gives an estimate of how well a model explains the observed data for the process.
It tells how much of the variation in the observed data is being explained by the proposed model. Mathematically, it is the sum of the squared differences between the predicted values and the mean of the observed values.

5
Q

What is the formula that links R-Squared with ESS and TSS?

A

R-Squared = ESS/TSS

6
Q

What is the formula that links RSS, TSS, and ESS?

A

RSS = TSS - ESS
TSS = RSS + ESS
ESS = TSS - RSS
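
A minimal numerical sketch of how these quantities relate, using a simple OLS fit on made-up data (the variable names and numbers are purely illustrative):

```python
import numpy as np

# Illustrative data (made up): y regressed on a single regressor x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS slope and intercept for a simple linear regression
beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
y_hat = alpha_hat + beta_hat * x          # fitted (predicted) values

TSS = np.sum((y - y.mean()) ** 2)         # total variation about the mean
ESS = np.sum((y_hat - y.mean()) ** 2)     # variation explained by the model
RSS = np.sum((y - y_hat) ** 2)            # residual (unexplained) variation

print(TSS, ESS + RSS)                     # TSS = ESS + RSS (up to rounding)
print(ESS / TSS, 1 - RSS / TSS)           # R-squared, two equivalent forms
```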

7
Q

What are the 4 steps to perform a hypothesis test?

A

1) Hypothesis - State the hypotheses:
Null hypothesis (H0): This is the default assumption or claim that you want to test. It represents no effect or no difference.
Alternative hypothesis (Ha): This is the claim or effect you want to support if there is sufficient evidence against the null hypothesis. It represents the direction of the effect or difference.
2) Test Statistic - Calculate the test statistic by first calculating the standard error
3) Critical Value - Find the critical region by calculating the critical value using the significance level, type of tail test, and degrees of freedom
4) Decision - Compare the test statistic to the critical value, or the p-value to the significance level (α).
If the test statistic falls in the critical region (beyond the critical value) or the p-value is smaller than α, reject the null hypothesis.
If the test statistic falls in the non-critical region (within the critical value) or the p-value is larger than α, fail to reject the null hypothesis. (A worked sketch follows below.)
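
A minimal sketch of these four steps for the one-sided test from card 1 (H0: β ≤ 0 vs H1: β > 0), assuming a coefficient estimate, its standard error, and the residual degrees of freedom are already available; the numbers are illustrative:

```python
from scipy import stats

beta_hat = 0.45     # illustrative coefficient estimate
se_beta  = 0.18     # illustrative standard error of the estimate
df       = 28       # illustrative residual degrees of freedom (n - k)
alpha    = 0.05     # significance level

# Step 2: test statistic (value of beta under H0 is 0)
t_stat = beta_hat / se_beta

# Step 3: critical value for a one-tailed (right-tailed) test
t_crit = stats.t.ppf(1 - alpha, df)

# Step 4: decision
p_value = stats.t.sf(t_stat, df)          # P(T > t_stat)
if t_stat > t_crit:                       # equivalently: p_value < alpha
    print("Reject H0: evidence that beta > 0")
else:
    print("Fail to reject H0")
```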

8
Q

Which test would one use to assess the statistical significance of the regression coefficients?

A

T-Test

9
Q

Which test would one use to assess if the overall regression is statistically significant?

A

F Test

10
Q

What are the 6 steps to conduct a T-Test?

A

1) State the null hypothesis (H0) and the alternative hypothesis (Ha):
2) Calculate the test statistic:
Calculate the sample means (x̄1 and x̄2) for group 1 and group 2.
Calculate the sample standard deviations (s1 and s2) for group 1 and group 2.
Calculate the standard error of the difference between the means using the formula:
SE = sqrt[(s1^2 / n1) + (s2^2 / n2)], where n1 and n2 are the sample sizes of group 1 and group 2, respectively.
3) Calculate the t-value using the formula:
t = (x̄1 - x̄2) / SE
4) Determine the degrees of freedom (df):
The degrees of freedom can be calculated using the formula:
df = n1 + n2 - 2, where n1 and n2 are the sample sizes of group 1 and group 2, respectively (this simple formula assumes the two population variances are equal).
5) Determine the critical value or p-value:
Look up the critical value from the t-distribution table based on the chosen significance level and degrees of freedom.
6) Decision
If the absolute value of the t-value is greater than the critical value, reject the null hypothesis.
If the p-value is smaller than the significance level (α), reject the null hypothesis.
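
A minimal sketch of these steps for two made-up samples (the numbers are illustrative), following the card's formulas and using scipy only for the t distribution:

```python
import numpy as np
from scipy import stats

# Illustrative samples (made up)
g1 = np.array([12.1, 13.4, 11.8, 14.0, 12.9, 13.7])
g2 = np.array([10.9, 12.0, 11.4, 10.5, 11.9, 12.3])

# Step 2: sample means, standard deviations, and standard error
x1, x2 = g1.mean(), g2.mean()
s1, s2 = g1.std(ddof=1), g2.std(ddof=1)
n1, n2 = len(g1), len(g2)
se = np.sqrt(s1**2 / n1 + s2**2 / n2)

# Step 3: t-value
t_val = (x1 - x2) / se

# Step 4: degrees of freedom (as on the card)
df = n1 + n2 - 2

# Steps 5-6: critical value / p-value and decision (two-tailed, alpha = 0.05)
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df)
p_val = 2 * stats.t.sf(abs(t_val), df)
print(t_val, t_crit, p_val)
print("Reject H0" if abs(t_val) > t_crit else "Fail to reject H0")
```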

11
Q

What are the 6 steps to conduct an F Test?

A

1) State the null hypothesis (H0) and the alternative hypothesis (Ha):
2) Determine the sample sizes (n1, n2, …, nk) and the sample variances (s1^2, s2^2, …, sk^2).
3) Calculate the test statistic:
Calculate the ratio of the larger sample variance to the smaller sample variance:
F = (s1^2 / s2^2) or (s2^2 / s1^2), depending on which variance is larger.
Note: Putting the larger variance in the numerator ensures F ≥ 1, so the right tail of the F distribution can be used.
4) Determine the degrees of freedom:
For the two-sample variance-ratio test, the numerator degrees of freedom (df1) equal the numerator sample size minus 1, and the denominator degrees of freedom (df2) equal the denominator sample size minus 1.
For the ANOVA / overall-regression version of the F test, df1 equals the number of groups minus 1 (k - 1) and df2 equals the total sample size minus the number of groups (n - k), where n is the total sample size.
5) Determine the critical value or p-value:
Look up the critical value from the F-distribution table based on the chosen significance level, df1, and df2.
6) Decision
If the F-value is greater than the critical value, reject the null hypothesis.
If the p-value is smaller than the significance level (α), reject the null hypothesis.
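
A minimal sketch of these steps for the two-sample variance-ratio version of the test (illustrative made-up data), treating it as a right-tailed test with the larger variance in the numerator:

```python
import numpy as np
from scipy import stats

# Illustrative samples (made up)
g1 = np.array([12.1, 13.4, 11.8, 14.0, 12.9, 13.7, 15.2])
g2 = np.array([10.9, 12.0, 11.4, 10.5, 11.9, 12.3])

s1_sq, s2_sq = g1.var(ddof=1), g2.var(ddof=1)
n1, n2 = len(g1), len(g2)

# Step 3: larger variance in the numerator so F >= 1
if s1_sq >= s2_sq:
    F, df1, df2 = s1_sq / s2_sq, n1 - 1, n2 - 1
else:
    F, df1, df2 = s2_sq / s1_sq, n2 - 1, n1 - 1

# Steps 5-6: critical value / p-value and decision (alpha = 0.05)
alpha = 0.05
F_crit = stats.f.ppf(1 - alpha, df1, df2)
p_val = stats.f.sf(F, df1, df2)
print(F, F_crit, p_val)
print("Reject H0" if F > F_crit else "Fail to reject H0")
```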

12
Q

Define the F Test

A

An F-test is used to test whether two population variances are equal. The null and alternative hypotheses for the test are as follows:

H0: σ1^2 = σ2^2 (the population variances are equal)

H1: σ1^2 ≠ σ2^2 (the population variances are not equal)

The F test statistic is calculated as s1^2 / s2^2.

If the p-value of the test statistic is less than some significance level (common choices are 0.10, 0.05, and 0.01), then the null hypothesis is rejected.

13
Q

Define the T Test

A

A two sample t-test is used to test whether or not the means of two populations are equal.

A two-sample t-test always uses the following null hypothesis:

H0: μ1 = μ2 (the two population means are equal)
The alternative hypothesis can be either two-tailed, left-tailed, or right-tailed:

H1 (two-tailed): μ1 ≠ μ2 (the two population means are not equal)
H1 (left-tailed): μ1 < μ2 (population 1 mean is less than population 2 mean)
H1 (right-tailed): μ1 > μ2 (population 1 mean is greater than population 2 mean)
The test statistic is calculated as:

t = (x̄1 − x̄2) / (sp · √(1/n1 + 1/n2))

where x̄1 and x̄2 are the sample means, n1 and n2 are the sample sizes, and where sp is calculated as:

sp = √[((n1 − 1)s1^2 + (n2 − 1)s2^2) / (n1 + n2 − 2)]

where s1^2 and s2^2 are the sample variances.

If the p-value that corresponds to the test statistic t with (n1 + n2 − 2) degrees of freedom is less than your chosen significance level (common choices are 0.10, 0.05, and 0.01), then you can reject the null hypothesis.

14
Q

What is the formula for TSS?

A

∑y^2 − n(ȳ)^2

Where:

Σ denotes the sum of
y represents each observed value of the dependent variable
ȳ represents the mean of the dependent variable
n represents the number of observations
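
A quick numerical check (illustrative data) that this shortcut form equals the definitional form Σ(y − ȳ)^2 from card 2:

```python
import numpy as np

y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # illustrative values
n = len(y)

tss_definition = np.sum((y - y.mean()) ** 2)
tss_shortcut = np.sum(y ** 2) - n * y.mean() ** 2
print(tss_definition, tss_shortcut)        # identical up to rounding
```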

15
Q

What is the formula for standard error?

A

se(β̂) = √(σ̂^2 / ∑(X − X̄)^2) = √(σ̂^2 / (∑X^2 − n·X̄^2))
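
A minimal sketch of this formula for a simple regression on made-up data, where σ̂^2 is estimated as RSS / (n − k) with k = 2 estimated parameters (see card 19):

```python
import numpy as np

# Illustrative data (made up)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n, k = len(y), 2                           # k = intercept + slope

beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
alpha_hat = y.mean() - beta_hat * x.mean()
rss = np.sum((y - (alpha_hat + beta_hat * x)) ** 2)

sigma2_hat = rss / (n - k)                 # estimated error variance
se_beta = np.sqrt(sigma2_hat / np.sum((x - x.mean()) ** 2))
print(beta_hat, se_beta)
```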

16
Q

What do n and k refer to respectively?

A

n refers to the sample size

k refers to the number of estimated parameters (the regressors, including the intercept)

17
Q

What does the “hat” mean in statistics?

A

In statistics, a circumflex (ˆ), called a “hat”, is used to denote an estimator or an estimated value.

18
Q

What does the “bar” mean in statistics?

A

A “bar” over a variable is used to denote its sample mean (e.g. x̄ is the mean of x).

19
Q

What is the formula for the estimated variance?

A

σ̂^2 = RSS / (n − k)

20
Q

What is the difference between variance and standard deviation?

A

Standard deviation measures the spread of a group of numbers around their mean and is expressed in the same units as the data.
The variance measures the average of the squared deviations of each point from the mean.
Standard deviation is the square root of the variance.
Both concepts are useful and significant for traders, who use them to measure market volatility.

21
Q

How does one calculate the degrees of freedom for a simple linear regression?

A

Simple Linear Regression (One Independent Variable):

The degrees of freedom in simple linear regression can be calculated using the formula:
df = n - 2
Where n is the number of observations (sample size).
In simple linear regression, you have one independent variable (regressor) and one dependent variable.

22
Q

How does one calculate the degrees of freedom for a multiple linear regression?

A

Multiple Linear Regression (Multiple Independent Variables):

In multiple linear regression, where you have multiple independent variables, the formula for calculating the degrees of freedom depends on the number of independent variables (p) and the number of observations (n):
df = n - p - 1
Where n is the number of observations (sample size) and p is the number of independent variables (regressors).
The “- 1” term accounts for the loss of one degree of freedom due to estimating the intercept term.

23
Q

What is a nested model?

A

A nested model is simply a regression model that contains a subset of the predictor variables in another regression model.

For example, suppose we have the following regression model (let’s call it Model A) that predicts the number of points scored by a basketball player based on four predictor variables:

Points = β0 + β1(minutes) + β2(height) + β3(position) + β4(shots) + ε

One example of a nested model (let’s call it Model B) would be the following model with only two of the predictor variables from model A:

Points = β0 + β1(minutes) + β2(height) + ε

We would say that Model B is nested in Model A because Model B contains a subset of the predictor variables from Model A.

However, consider if we had another model (let’s call it Model C) that contains three predictor variables:

Points = β0 + β1(minutes) + β2(height) + β3(free throws attempted)

We would not say that Model C is nested in Model A because each model contains predictor variables that the other model does not.
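
A minimal sketch of fitting a full model and a nested model on synthetic data and comparing them with an F test; the variable names follow the card (the categorical "position" is dropped for simplicity), the data are made up, and statsmodels and pandas are assumed to be available:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Synthetic data (illustrative only)
rng = np.random.default_rng(0)
n = 200
data = pd.DataFrame({
    "minutes": rng.uniform(10, 40, n),
    "height": rng.normal(200, 10, n),
    "shots": rng.poisson(12, n),
})
data["points"] = (2 + 0.5 * data["minutes"] + 0.05 * data["height"]
                  + 0.8 * data["shots"] + rng.normal(0, 3, n))

# Model B (nested) uses a subset of Model A's predictors
model_b = smf.ols("points ~ minutes + height", data=data).fit()
model_a = smf.ols("points ~ minutes + height + shots", data=data).fit()

# F test of the restriction: does adding 'shots' significantly improve the fit?
print(anova_lm(model_b, model_a))
```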

24
Q

What is the difference between homoscedasticity and heteroscedasticity?

A

If the variance of the errors is roughly constant across observations, the function will be homoscedastic. The degree of scatter around the regression line is then uniform. In such a case, estimates of the standard errors tend to be unbiased. As a result, the overall test results will be reliable.

On the contrary, if the observed values scatter irregularly around the regression line, the degree of scattering varies across observations and the error variances differ. Hence, there is no easily recognizable, constant level of spread. This can lead to bias in the standard error estimates, thus contributing to less reliable test results.
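
A small simulation sketch (synthetic data, illustrative names) contrasting the two cases: in the first, the error spread is constant across x (homoscedastic); in the second, it grows with x (heteroscedastic):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1, 10, 500)

y_homo = 2 + 0.5 * x + rng.normal(0, 1.0, x.size)          # constant error spread
y_hetero = 2 + 0.5 * x + rng.normal(0, 0.3 * x, x.size)    # spread grows with x

# Compare residual variance in the lower and upper halves of x
for label, y in [("homoscedastic", y_homo), ("heteroscedastic", y_hetero)]:
    resid = y - (2 + 0.5 * x)                               # residuals around the true line
    low, high = resid[x < 5.5].var(), resid[x >= 5.5].var()
    print(label, round(low, 2), round(high, 2))
```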

25
Q

What is the difference between the control and the treatment groups in economic policy?

A

Control Group:

The control group is a group of individuals or entities that do not receive the policy intervention or treatment.
The control group serves as a reference or baseline against which the effects of the policy intervention are measured.
The control group is generally kept under similar conditions as the treatment group, except for the absence of the specific policy intervention.
By comparing the outcomes or behaviors of the control group with those of the treatment group, researchers can isolate the impact of the policy intervention.

Treatment Group:

The treatment group is a group of individuals or entities that receive the policy intervention or treatment.
The treatment group represents the group that is directly affected by the policy change or program implementation.
The treatment group typically receives the new policy or program under evaluation, and its outcomes or behaviors are compared to the control group to assess the effectiveness or impact of the intervention.
The treatment group allows policymakers and researchers to analyze and evaluate the effects of the policy intervention relative to the counterfactual scenario where the intervention is not implemented.

The control and treatment groups are fundamental components of experimental or quasi-experimental study designs used to evaluate the causal effects of economic policies or programs. By comparing outcomes between these groups, policymakers can gain insights into the effectiveness, efficiency, and impact of specific policy interventions on various economic and social outcomes.

26
Q

What is Difference-In-Difference (DiD)?

A

Difference in Difference (DiD) is an “identification strategy” to identify the effect of policy using a control (C) group and a treatment (T) group pre and post policy change.

To observe the policy effect, one must first difference-out the pre-existing differences between groups C and T in addition to any temporal effects.

Then, using the DiD equation one can capture the DiD policy effect by calculating the key model estimate, which is also referred to as Average Treatment Effect (ATE).
See D-in-D Econometric Model Interpretation table.
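
A minimal sketch of a two-period DiD regression on synthetic data, assuming the standard specification with a treatment-group dummy, a post-period dummy, and their interaction, whose coefficient is the DiD estimate (ATE); names and numbers are illustrative and statsmodels/pandas are assumed:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 400
data = pd.DataFrame({
    "treat": rng.integers(0, 2, n),          # 1 = treatment group, 0 = control
    "post": rng.integers(0, 2, n),           # 1 = after the policy change
})
true_ate = 3.0
data["y"] = (5 + 2 * data["treat"] + 1.5 * data["post"]
             + true_ate * data["treat"] * data["post"]
             + rng.normal(0, 1, n))

# y = b0 + b1*treat + b2*post + b3*(treat x post); b3 is the DiD / ATE estimate
did = smf.ols("y ~ treat + post + treat:post", data=data).fit()
print(did.params["treat:post"])              # should be close to true_ate
```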

27
Q

What does it mean if the policy effect is “time invariant”?

A

It does not vary over time

28
Q

Briefly define the Fixed Effects Least Squares Dummy Variable (LSDV) Model

A

Pool the data and give each cross-sectional unit (group) its own intercept dummy (see the sketch after the next card).

29
Q

Briefly define the Fixed Effects Within-Group Model

A

Pool the data and express each variable as a deviation from its group mean, then run OLS on the mean-corrected (demeaned) data.
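
A minimal sketch on synthetic panel data showing that the LSDV approach (card 28) and the within-group / demeaned approach (card 29) recover the same slope; entity labels and numbers are made up, and statsmodels/pandas are assumed:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
entities, periods = 10, 20
panel = pd.DataFrame({
    "entity": np.repeat(np.arange(entities), periods),
    "x": rng.normal(0, 1, entities * periods),
})
fixed_effects = rng.normal(0, 5, entities)                  # unit-specific intercepts
panel["y"] = (fixed_effects[panel["entity"]] + 1.5 * panel["x"]
              + rng.normal(0, 1, len(panel)))

# LSDV: one intercept dummy per entity
lsdv = smf.ols("y ~ x + C(entity)", data=panel).fit()

# Within-group: demean y and x within each entity, then run OLS (no intercept)
panel["y_dm"] = panel["y"] - panel.groupby("entity")["y"].transform("mean")
panel["x_dm"] = panel["x"] - panel.groupby("entity")["x"].transform("mean")
within = smf.ols("y_dm ~ x_dm - 1", data=panel).fit()

print(lsdv.params["x"], within.params["x_dm"])              # same slope estimate
```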