Term One Flashcards

1
Q

What are the three Ln power rules?

A

ln(XY) = ln X + ln Y

ln(X/Y) = ln X - ln Y

ln(X^y) = y ln X
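
A quick numerical check of the three rules can help them stick. This is only a sketch: the values of x and y are arbitrary positive numbers chosen for illustration.

```python
import math

# Arbitrary positive values, chosen only for illustration.
x, y = 3.0, 4.0

# ln(XY) = ln X + ln Y
product_rule = math.isclose(math.log(x * y), math.log(x) + math.log(y))
# ln(X/Y) = ln X - ln Y
quotient_rule = math.isclose(math.log(x / y), math.log(x) - math.log(y))
# ln(X^y) = y ln X
power_rule = math.isclose(math.log(x ** y), y * math.log(x))

print(product_rule, quotient_rule, power_rule)
```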

2
Q

What makes a model dynamic?

A

It incorporates data from other time periods, e.g. a lagged dependent variable:

Y_t = C_0 + B_1 Y_{t-1} + B_2 X_t + U_t (DYNAMIC MODEL)

Y_t = C_0 + B_2 X_t + U_t (STATIC MODEL)

3
Q

What is the difference between Cardinal and Ordinal Measurement?

A

Cardinal is the measurement of something’s magnitude, i.e. how large it is.

Ordinal is the measurement of how a value is ranked amongst other values.

4
Q

What are the types of economic data?

A

Time-Series: measures a certain variable across a number of time periods.

Cross-Sectional: the measurement of a sampled variable at a single point in time.

Pooled Cross-Sectional: where two or more cross-sections are combined to create a single data set.

Panel/Longitudinal Data: data that contains both a time-series and a cross-sectional element.

5
Q

What does the variable U represent?

A

It is a random error term. It captures the effect of variables which are not in the model but whose effect can still be observed.

6
Q

What is the definition of:

Deterministic

Stochastic

A

Deterministic means non-random: the value is fully determined, with no random component. It usually describes a trend.

Stochastic means random, as in a random variable or a random trend.

From the Greek stochos, to guess.

7
Q

What are the eight classical assumptions about Ut?

A

1) Zero mean: the expected value of U_t is zero, E(U_t) = 0.
2) Homoscedasticity, which means constant variance: Var(U_t) = sigma^2.
3) U_t and U_s are independent for all t not equal to s: Cov(U_t, U_s) = 0.
4) Cov(X_t, U_t) = 0, or X is fixed in repeated samples.
5) The regression model is linear in its coefficients.
6) n > k: the number of observations is greater than the number of regressors (positive degrees of freedom).
7) X takes a number of different values; otherwise every X_t equals X bar and the slope cannot be estimated.
8) The random errors are normally distributed (their distribution follows a normal curve).

8
Q

How to find a residual?

A

Y_t = Y_t hat + U_t hat, so the residual is U_t hat = Y_t - Y_t hat. A hat shows an estimate.

9
Q

How to find the RSS

A

You sum (Y_t - a - b X_t)^2 from t = 1 to t = n, where a and b are the estimated intercept and slope.

10
Q

Give a different formula to calculate b hat, and explain why it works.

A

b hat = Cov(X,Y)/Var(X). This is because dividing both the numerator and the denominator of the derived OLS formula by n-1 gives the sample covariance and the sample variance. However, you don’t have to do that, as the n-1 terms cancel.
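
The cancellation can be checked numerically. The sketch below computes the slope both ways; the data are made up for illustration and the function names are hypothetical.

```python
def slope_from_sums(xs, ys):
    """Slope from the sums-of-deviations form of the OLS formula."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

def slope_from_cov_var(xs, ys):
    """Slope as sample covariance over sample variance (both divided by n-1)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - x_bar) ** 2 for x in xs) / (n - 1)
    return cov_xy / var_x

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b_sums = slope_from_sums(xs, ys)
b_cov = slope_from_cov_var(xs, ys)
print(b_sums, b_cov)
```

Both functions return the same slope because the n-1 in the covariance and in the variance cancel.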

11
Q

How to calculate R squared?

A

R squared = ESS/TSS (Formulas given in the formula sheet).
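
As a rough illustration, the sketch below fits a simple regression with an intercept and computes ESS, RSS and TSS directly. The data and function names are made up; with an intercept in the model, ESS/TSS should agree with 1 - RSS/TSS.

```python
def fit_line(xs, ys):
    """Simple OLS with an intercept: returns (a_hat, b_hat)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

def sums_of_squares(xs, ys):
    """Return (ESS, RSS, TSS) for the fitted line."""
    a_hat, b_hat = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [a_hat + b_hat * x for x in xs]
    ess = sum((f - y_bar) ** 2 for f in fitted)          # explained
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # residual
    tss = sum((y - y_bar) ** 2 for y in ys)              # total
    return ess, rss, tss

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

ess, rss, tss = sums_of_squares(xs, ys)
r_squared = ess / tss
print(round(r_squared, 4))
```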

12
Q

Give an example of both one and two tailed hypothesis testing.

A

Two tail: Ho: b = 0, H1: b not equal to 0.

One tail: Ho: b <= 0, H1: b > 0.

13
Q

What do you do differently in hypothesis testing between a one-tail and a two-tail alternative?

A

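The t-distribution table shows critical values for a two-tailed alternative. For a one-tailed test, you use the column for the next significance level up from the one that you are looking for (for example, the 10% two-tailed column for a 5% one-tailed test).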

14
Q

What does R squared show?

A

It shows the fraction of sample variation in Y that is explained by X.

15
Q

How to carry out a t-test on a multiple regression set?

A

Set Ho: b = 0 and H1: b not equal to 0.

Compare the t-statistic, b hat / s.e.(b hat), with the critical value t(n-k) at the 5% level,

where k is the number of estimated parameters, including the constant b_1.

16
Q

How would you compute a confidence interval for a small sample?

A

b hat +/- t(n-k) at 5% x s.e.(b hat)

17
Q

How do you conduct an F-test with a multiple regression model?

A

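Find RSSu from the unrestricted model.

Then impose the restrictions on the model.

Find RSSr from the restricted model.

Put them into the F-test formula:

F = ((RSSr - RSSu)/d) / (RSSu/(n-k))

where d is the number of restrictions in the model and k is the number of parameters in the unrestricted model.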
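
The calculation can be sketched as follows, assuming the usual form F = ((RSSr - RSSu)/d) / (RSSu/(n-k)). The function name and the numbers are hypothetical; the critical value F(d, n-k) still has to come from the tables.

```python
def f_statistic(rss_r, rss_u, d, n, k):
    """F-statistic for d restrictions.

    rss_r: RSS of the restricted model
    rss_u: RSS of the unrestricted model
    n:     number of observations
    k:     number of parameters in the unrestricted model
    """
    return ((rss_r - rss_u) / d) / (rss_u / (n - k))

# Hypothetical numbers, for illustration only.
f_value = f_statistic(rss_r=120.0, rss_u=100.0, d=2, n=50, k=5)
print(f_value)
```

If the computed value exceeds the tabulated critical value, the restrictions are rejected.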

18
Q

Give a formula for R^2

A

ESS/TSS

19
Q

What is an alternative formula for the general hypothesis test?

A

F = (ESS/(k-1)) / (RSS/(n-k))

20
Q

What happens if you cannot find the exact degrees of freedom on the formula table that you need?

A

You go for the degrees of freedom that is closest to the one that you are trying to calculate.

21
Q

What is Multicollinearity?

A

It is where movements in one explanatory variable are closely matched by movements in another explanatory variable.

The consequence is that it may not be possible to estimate the separate effects of each explanatory variable.

22
Q

What is perfect Multicollinearity?

A

It is where two explanatory variables are exactly linearly related,

e.g. X_2 = 2 + 3 X_3

23
Q

What is one reason for Multicollinearity?

A

The dummy variable trap,

where the sum of the dummy variables is equal to the constant.

If you include all of the dummy variables, then the model will have exact linear dependence and cannot be estimated.

It is a case of perfect MC.

24
Q

What is Imperfect MC?

A

It is where there is a linear relationship between variables, plus a random error.

25
Q

What will reveal imperfect M/C?

A

It will be shown in the statistical precision, i.e. there will be large standard errors and very low t-test statistics.

However, you can still estimate B_i.

26
Q

What are the consequences or symptoms of M/C?

A

OLS estimates are still BLUE, but variances and covariances are likely to be large.

It is difficult to achieve precise coefficient estimates.

Statistical inference will be problematic, with wide confidence intervals and statistical insignificance likely (i.e. low t-ratios), but R^2 will be high (results look “strange”).

Hence, the regression equation appears to explain the dependent variable well, but no individual explanatory variable appears significant.

Results can be sensitive to small changes in the sample.

So if you add or delete a few observations in a sample, you can see large changes in the significance of the parameter estimates.

The residuals are not affected, so the F-test is reliable.

Estimated parameters are subject to error, so the t-test statistics are low.

27
Q

What suggests high collinearity?

A

When you include both explanatory variables and their standard errors increase to much more than when each is in the model on its own.

28
Q

How to detect Multicollinearity?

A

You conduct a t-test on the partial correlation coefficient: t = r/s, where

r = sample correlation coefficient

s = square root((1-r^2)/(n-2)).

You can also inspect the R^2 and F-test. If the R^2 is high and the F-test is significant but the individual t-ratios are not, then by trial and error you remove variables until you find the problem one.

29
Q

What is the Variance Inflation Factor?

A

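It is 1/(1-R^2), where R^2 comes from regressing one explanatory variable on the other explanatory variables.

If it is greater than or equal to ten, then the variable is said to be highly collinear.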
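
A minimal sketch of the calculation and the rule of thumb, assuming the R^2 has already been obtained from regressing one explanatory variable on the others. The function names and R^2 values are hypothetical.

```python
def vif(r_squared_aux):
    """Variance inflation factor, 1/(1 - R^2), from an auxiliary R^2."""
    return 1.0 / (1.0 - r_squared_aux)

def is_highly_collinear(r_squared_aux, threshold=10.0):
    """Rule of thumb from the card: VIF >= 10 signals high collinearity."""
    return vif(r_squared_aux) >= threshold

# Hypothetical auxiliary R^2 values, for illustration only.
low = vif(0.5)    # mild overlap between regressors
high = vif(0.95)  # strong overlap between regressors
print(low, high, is_highly_collinear(0.5), is_highly_collinear(0.95))
```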

30
Q

What are the solutions to M/C?

A

get better data

Re-specify the model and reduce explanatory variables.

31
Q

What is the adjusted coefficient of determination?

A

It is R bar squared. The formula is 1 - (1 - R^2)((n-1)/(n-(k+1))), where k here is the number of explanatory variables excluding the constant.
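
The formula can be sketched as a small function. The assumption here is that k counts the explanatory variables excluding the constant, which is what the (k+1) in the denominator implies; the input numbers are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """R bar squared = 1 - (1 - R^2) * (n - 1) / (n - (k + 1)).

    n: number of observations
    k: number of explanatory variables, excluding the constant (assumption)
    """
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - (k + 1))

# Hypothetical numbers, for illustration only.
r_bar_sq = adjusted_r_squared(r_squared=0.8, n=30, k=4)
print(round(r_bar_sq, 3))
```

The adjusted value is never above the unadjusted R^2, because the penalty factor (n-1)/(n-k-1) is at least one.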

32
Q

What is the null and alternate hypothesis for the validity of statistical coefficients?

A

Ho: B_1 = B_2 = 0

H1: at least one of B_1 and B_2 is not equal to zero.

33
Q

Give another way that you can detect Multicollinearity

A

It is where R squared does not decrease significantly even though you respecify the model by dropping one of the explanatory variables.

34
Q

How do you calculate the elasticity of a linear regression?

A

We know that e(Y,X) = dY/dX x X/Y

and that dY/dX = b,

therefore b hat x Xbar/Ybar is the general estimate (the elasticity at the sample means).

Note that you can evaluate it at any of your (X, Y) combinations.
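
As a small sketch, the elasticity of a linear regression at a chosen point is just the slope scaled by X/Y. The function name and numbers below are hypothetical.

```python
def elasticity(b_hat, x, y):
    """Elasticity of Y with respect to X at the point (x, y): b_hat * x / y."""
    return b_hat * x / y

# Hypothetical slope and sample means, for illustration only.
b_hat = 2.0
x_bar, y_bar = 10.0, 40.0

at_means = elasticity(b_hat, x_bar, y_bar)
# The same function works at any other (X, Y) combination.
at_other_point = elasticity(b_hat, 20.0, 50.0)
print(at_means, at_other_point)
```

In a linear model the slope is constant, but the elasticity changes from point to point because X/Y changes.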

35
Q

What is the elasticity of a log-linear regression?

A

It is b, because b = d ln Y / d ln X = (dY/dX) x (X/Y), which is the elasticity itself.

36
Q

What is the elasticity of a semi-log regression?

1) ln Y_t = a + b X_t + u_t
2) Y_t = a + b ln X_t + u_t

A

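1) ln Y_t = a + b X_t + u_t:

d ln Y_t / d X_t = (1/Y) x dY/dX = b

b is an estimate of the proportional change in Y for a one-unit change in X.

b Xbar is the elasticity (at the sample mean).

2) Y_t = a + b ln X_t + u_t:

d Y_t / d ln X_t = (dY_t/dX_t) x X_t = b

b/Ybar is the elasticity (at the sample mean).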

37
Q

how to find the elasticity of quadratic functional form?

A

Y = b_0 + b_1 X + b_2 X^2

dY_t/dX_t = b_1 + 2 b_2 X_t

Elasticity = dY_t/dX_t x X/Y, which at the sample means is

(b_1 + 2 b_2 Xbar) x (Xbar/Ybar)

38
Q

How to find the elasticity for a multiple regression?

A

It is the same as normal; you just partially differentiate.

Y_t = a + b X_t + c Z_t + U_t

E(Y,X) = (partial Y / partial X) x Xbar/Ybar = b x Xbar/Ybar

E(Y,Z) = (partial Y / partial Z) x Zbar/Ybar = c x Zbar/Ybar

39
Q

When can you see the constant elasticity form? What can it be used for?

A

When the equation is in log log form.

It can also be used for production analysis.

40
Q

how to find max earnings age from lnw = bo + b1A + b2A^2 + b3S ?

A

Set d ln w / dA = b_1 + 2 b_2 A = 0

A = -b_1 / (2 b_2)

and that is the max earnings age. If you get a positive value, that should be correct.

If you take the second derivative, you can tell whether it is a max or not.

Also, if b_2 < 0 the turning point is a maximum.
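
The turning point can be sketched in a few lines. The coefficient values are hypothetical, and the check on the sign of b2 reflects the second derivative of ln w with respect to A, which is 2 b2.

```python
def turning_point_age(b1, b2):
    """Age at which d ln w / dA = b1 + 2*b2*A equals zero."""
    return -b1 / (2.0 * b2)

def is_maximum(b2):
    """The second derivative is 2*b2, so the turning point is a max if b2 < 0."""
    return b2 < 0

# Hypothetical coefficient estimates, for illustration only.
b1, b2 = 0.08, -0.001

age = turning_point_age(b1, b2)
print(age, is_maximum(b2))
```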

41
Q

What are the types and signs of model misspecification?

A

Types:

  • Omitting relevant variables.
  • Including Irrelevant variables
  • Choosing the wrong functional form.
  • Biased measurement of model errors

Signs:

  • residuals show a certain pattern.
  • If you add powers of Y hat to the regression, they are statistically significant and the R^2 goes up.
42
Q

What is the Ramsey RESET test a measure of?

A

It checks to see if there has been any misspecification of the functional form.

43
Q

How to Run a Ramsey RESET Test?

A

1) Take your model:

Y = B_1 + B_2 X_2 + U

Estimate it and save the fitted values, Y hat.

2) Re-estimate the model with the fitted values added back in as squared and higher-order terms:

Y = B_1 + B_2 X_2 + B_3 (Y hat)^2 + B_4 (Y hat)^3 + U

3) Find the R squared values of models 1 and 2 (R^2_1 and R^2_2) and run an F-test.
4) F = ((R^2_2 - R^2_1)/d) / ((1 - R^2_2)/(n-k))

d = number of new regressors.
k = the number of parameters in equation number 2.

5) Complete the F-test using F(d, n-k).

If you reject Ho, you are saying that there is evidence of misspecification.

44
Q

What does it mean if something is homogenous of degree zero. e.g a demand equation

A

It means that if you increase income and all prices by the same proportion, there should be no change in demand.

45
Q

What is the Chow Test a test for?

A

It is a test for ‘structural stability’: it checks whether the coefficients are constant over time.

46
Q

How to conduct a chow test?

A

You split the data into two sub-periods and find the RSS for each. Knowing the RSS from the original full-sample regression, you then put them into the formula and compute an F-statistic. You compare this with the critical value F(d, n-2k).

47
Q

What is the Null and Alternative hypothesis in a chow test?

A

Ho: Coefficients remain constant over the sample.
H1: Coefficients vary across the sample.

48
Q

What is Autocorrelation?

A

It is where errors are correlated with their own past values.

It violates the OLS assumption:

Cov(U_t, U_s) = 0 where t does not equal s.

49
Q

What is the difference between positive and negative autocorrelation?

A

Negative: errors are negatively correlated with their past values.

Positive: errors are positively correlated with their past values.

50
Q

What is first order autocorrelation?

A

Where error is correlated with the error that immediately precedes it.

i.e: t and t-1.

51
Q

What are the causes of Autocorrelation?

A

Misspecifying a dynamic model by omitting a lagged variable.

Generally misspecifying a model.

Applying a transformation to the data.

The error term is dynamic; it is determined independently of the model.

An economic shock with persistent effects.

52
Q

What are the consequences of autocorrelation?

A

OLS does not have minimum variance (it is not BLUE).

Standard errors of the coefficients will be biased downwards, meaning all inference will be wrong.

The estimates of the coefficients will be biased in a dynamic model.

R^2 will appear high.

53
Q

What is the formula for the Durbin Watson statistic?

A

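DW = sum of (U_t hat - U_{t-1} hat)^2 / sum of (U_t hat)^2, where the numerator runs from t = 2 to n and the denominator from t = 1 to n.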
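
A minimal sketch of the statistic, using the standard definition with squared differences of successive residuals in the numerator. The residual series are made up to show the two extremes.

```python
def durbin_watson(residuals):
    """DW = sum over t=2..n of (u_t - u_{t-1})^2, divided by sum over t=1..n of u_t^2."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(u ** 2 for u in residuals)
    return numerator / denominator

# Made-up residuals, for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]  # sign flips every period: negative autocorrelation
persistent = [1.0, 1.0, 1.0, 1.0]     # no change between periods: positive autocorrelation

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)
print(dw_alternating, dw_persistent)
```

Values near 0 point to positive autocorrelation, values near 4 to negative autocorrelation, and values near 2 to none.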

54
Q

What are the hypotheses for the DW test?

A

Ho: p = 0

H1: p not equal to 0,

where p (rho) is the first-order autocorrelation coefficient of the errors.

Ho: no positive AC
Ho*: no negative AC.

55
Q

What are the decision bounds for DW test?

A
0 to dL: reject Ho (positive autocorrelation).
dL to dU: indecision.
dU to 4-dU: do NOT reject.
4-dU to 4-dL: indecision.
4-dL to 4: reject Ho* (negative autocorrelation).
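
The decision bounds above can be sketched as a small function. The dL and dU values used here are hypothetical placeholders, not table values, and the treatment of values exactly on a boundary is an arbitrary choice.

```python
def dw_decision(dw, d_l, d_u):
    """Classify a Durbin-Watson statistic using the bounds dL and dU."""
    if dw < d_l:
        return "reject Ho: positive autocorrelation"
    if dw <= d_u:
        return "indecision"
    if dw < 4 - d_u:
        return "do not reject"
    if dw <= 4 - d_l:
        return "indecision"
    return "reject Ho*: negative autocorrelation"

# Hypothetical bounds, for illustration only (real ones come from the DW tables).
d_l, d_u = 1.2, 1.4

for dw in (0.8, 1.3, 2.0, 2.7, 3.5):
    print(dw, dw_decision(dw, d_l, d_u))
```
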
56
Q

Where do you get dU and dL from for the Durbin Watson test?

A

You find it in the tables for the values of K and N in your model.

57
Q

What is the Breusch-Godfrey test?

A

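It tests for autocorrelation, where Ho: p = 0.

1) Estimate your model and find the residuals.
2) Use the residuals to estimate an ‘auxiliary regression’:

U_t hat = p U_{t-1} hat + a_0 + a_1 X_1 + a_2 X_2 + v_t.

3) Compute the test statistic n R^2, using the R^2 of the auxiliary regression, and compare it with ChiSQ(1), i.e. with 1 degree of freedom.
4) You can also use the F-statistic from the auxiliary regression, with 1 degree of freedom in the numerator (k-1 = 1).

Then compare with the critical value; reject Ho if the test statistic exceeds it.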

58
Q

What is the advantage of the Breusch-Godfrey (CHISQ) test?

A

It can test for autocorrelation of higher than first order.

59
Q

What is a dummy variable?

A

It is a variable that can take the value of 1 or 0. It can help divide your data into subgroups.

60
Q

Give an example of a Binary Dummy.

A

A dummy that records whether an observation is male or female, e.g. 1 = male, 0 = female.

61
Q

What are homoscedasticity and heteroscedasticity?

A

Homo: where the errors all have the same constant variance.

E(U_t^2) = sigma^2

Hetero: the variance of the errors is not constant. For example, if the conditional variance of Y increases as X increases, we see heteroscedasticity.

So as X changes, the variance of Y around its predicted value changes.

A common case is where the variance increases in proportion to a specific explanatory variable.

62
Q

What are the causes of Heteroscedasticity?

A
  • You improve your data collecting skills through time, so your errors become smaller and hence the variance will change.
  • Changing the sample from which you got the data will change the outliers which are present.
  • Individuals carrying out a procedure will learn through time, so errors get smaller.
  • If it is a model regarding income, then as income increases through time, discretionary income will increase. This means the variance of consumption amounts will increase considerably.
  • An incorrectly specified model.
  • A skewed distribution of the data.
  • It tends to be a problem with cross-section data, as you are looking at data at one point in time.
63
Q

What are the consequences of heteroskedasticity?

A

It will cause:

  • Biased standard errors.
  • Biased test statistics.

B hat will not be biased, but OLS is no longer BLUE.

This is because the OLS estimates that it produces are not of minimum variance.

OLS gives the same weight to all observations, when it should give more weight to observations with smaller disturbance variance.

  • For the t-test, the value of the s.e. is too low, giving t-test values that are too high. This means that you are more likely to reject Ho and say that B is statistically significant even when it is not.
64
Q

How would you graphically test for heteroscedasticity?

A

If you see a fanning-out pattern in the residuals, it means the variance is changing and heteroscedasticity is present.

65
Q

How can you test for heteroscedasticity using an F-test?

Goldfeld-Quandt test

A
  • Rank all of your data from low to high.
  • Split it into small, medium and large sub-groups.
  • Omit the middle sub-group (e.g. n = 30, omit 8; n = 60, omit 16).
  • Run a regression on the high and low sub-groups and find the residual variance of each.

Then form the F-statistic S^2h / S^2l,

where each s^2 comes from that sub-group's RSS, i.e. RSS/(n_i - k); with equal-sized sub-groups the ratio reduces to RSSh/RSSl.

Compare it with F(n1-k, n2-k).

n1 and n2 are the numbers of observations in each separate sub-group, not the full sample size.

Rejecting Ho shows the presence of heteroskedasticity.
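
The final step of the Goldfeld-Quandt test can be sketched as below, assuming each sub-group variance is its RSS divided by its degrees of freedom. The function name and the numbers are hypothetical; the critical value F(n1-k, n2-k) comes from the tables.

```python
def goldfeld_quandt_f(rss_high, rss_low, n_high, n_low, k):
    """Ratio of the residual variances of the high and low sub-groups.

    n_high, n_low: number of observations in each sub-group
    k:             number of parameters in each sub-group regression
    """
    s2_high = rss_high / (n_high - k)
    s2_low = rss_low / (n_low - k)
    return s2_high / s2_low

# Hypothetical sub-group results, for illustration only.
f_value = goldfeld_quandt_f(rss_high=90.0, rss_low=30.0, n_high=11, n_low=11, k=2)
print(f_value)
```

With equal-sized sub-groups the degrees of freedom cancel, so the statistic is simply the ratio of the two RSS values.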

66
Q

What are the null and alternative hypothesis for heteroskedasticity test?

A

Ho: sigma^2_high = sigma^2_low, which shows HOMO.
H1: sigma^2_high > sigma^2_low, which shows HETERO.

67
Q

How to test for Heteroskedasticity using White’s General hetero test?

A

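1) Estimate a model:

Y = B_1 + B_2 X_2 + B_3 X_3 + U

and find the residuals.

2) Run the auxiliary regression on the squared residuals:

U hat^2 = a_1 + a_2 X_2 + a_3 X_3 + a_4 X_2^2 + a_5 X_3^2 + a_6 X_2 X_3 + V.

3) Ho: homoscedasticity.

Using the CHISQ test:

compare n R^2 from the auxiliary regression with the chi-squared critical value with k-1 degrees of freedom, where k is the number of parameters in the auxiliary regression.
If the test statistic > critical value, reject Ho.

Also:

If a p-value is given for the test statistic and you find that it is below your chosen significance level, reject Ho.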

68
Q

How to transform the data to deal with heteroskedasticity of the form Var(e) = sigma^2 X^2?

A

First you must divide by X to give

Var(e/X) = (1/X)^2 Var(e).

The 1/X must be squared when you take it out of the variance operator.

You then multiply it by the original variance, to give

(1/X^2) x sigma^2 X^2.

This gives Var(e/X) = sigma^2.

To ensure the model remains valid, you must apply this to every term in the model.

So if you had

Y = B_1 + B_2 X + e

you now have Y/X = B_1/X + B_2 + e/X

Y* = B_1* + B_2 + e*

where Y* = Y/X, B_1* = B_1/X and e* = e/X.

The classical assumption is now satisfied and the heteroscedasticity has been removed.
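
The mechanics of the transformation can be sketched as follows. The data are made up and generated exactly from Y = 1 + 2X with no error term, so the example only shows how the roles of the coefficients swap after dividing by X; the function names are hypothetical.

```python
def transform(xs, ys):
    """Divide the model through by X: returns (1/X, Y/X) for each observation."""
    inv_x = [1.0 / x for x in xs]
    y_star = [y / x for x, y in zip(xs, ys)]
    return inv_x, y_star

def fit_line(xs, ys):
    """Simple OLS with an intercept: returns (intercept, slope)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Made-up data from Y = 1 + 2X (B1 = 1, B2 = 2), for illustration only.
xs = [1.0, 2.0, 4.0]
ys = [3.0, 5.0, 9.0]

inv_x, y_star = transform(xs, ys)
intercept, slope = fit_line(inv_x, y_star)

# In Y/X = B1*(1/X) + B2 + e/X, the intercept estimates B2
# and the slope on 1/X estimates B1.
b2_hat, b1_hat = intercept, slope
print(b1_hat, b2_hat)
```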

69
Q

Problems with Hetero Transformations?

A
  • It is not always clear which X variable needs transforming.
  • Log transformation can be a problem if X and Y take negative or 0 values.
  • If you are not sure of the form of the variance, it can be hard to know which Ho to test.
70
Q

How does GLS correct for Hetero?

A

OLS minimises the sum of squared residuals.

GLS minimises the sum of weighted squared residuals:

sum of (Y_t - Y_t hat)^2 / Var(e_t)

So the lower an observation's error variance, the more weight it gets in the estimation.

71
Q

If a regressor that has non-constant variance is (incorrectly) omitted from a model, the (OLS) residuals will be heteroscedastic. True or false?

A

False. Heteroscedasticity is about the variance of the error term ui and not about the variance of a regressor.

72
Q

What is panel data?

A

It is the pooling of cross-section and time-series data.

73
Q

What are the advantages of panel data?

A

More information from the data set.

More degrees of freedom, so estimates are more efficient.

Able to study heterogeneity in the data.

Can consider dynamic changes in the data.

Estimates are less likely to be biased.

Can model specific economic behaviour better.

Can measure effects which would be difficult to see from other data.

74
Q

What are degrees of freedom?

A

The number of independent values which can be assigned to a distribution.

The higher it is, the more efficient the data is.

75
Q

What are the options for estimating a model with panel data?

A
  • Pool the data together and employ OLS regression.
  • Fixed effects least squares dummy variable (LSDV) regression: pool the data and give each cross-section unit its own intercept dummy.
  • Fixed effects within group: pool the data but express each variable as a deviation from its mean value (mean-corrected OLS).
  • Random effects model (REM): assume that the intercepts are randomly drawn from a much larger population, which is bigger than the scope of the model.
76
Q

What are the results of pooling OLS?

A
  • The t-statistics on the explanatory variables appear significant.
  • There is a high R^2.
  • Low DW stat (autocorrelation or a misspecified model).
  • The error term loses the individuality of each airline.
  • You lose heterogeneity, as you cannot tell the differences between airlines from the explanatory variables.
  • The error could be correlated with the regressors.
  • Estimates are biased.
77
Q

What are the effects of Pooling OLS results?

A

There is heterogeneity causing autocorrelation, as the error terms are correlated.

78
Q

Give an example of LSDV equation

A

C_it = a_1 + a_2 D_2 + a_3 D_3 + ... + regressors and error.

Each dummy variable represents a different sub-category in the data (here, an airline).

a_1 has no dummy, to prevent perfect collinearity.

airline 1 = a_1

airline 2 = a_1 + a_2

airline 3 = a_1 + a_3

Airlines 2 to 6 have higher or lower intercepts than airline 1, depending on whether the dummy estimate is positive or negative.

79
Q

How can you tell the effect of the dummy variables in LSDV?

IT IS THE ONE WAY FIXED EFFECT MODEL

A

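Do an F-test.

Ho: all dummy coefficients = 0. H1: at least one dummy coefficient is not equal to 0 (therefore the dummies are significant).

m = number of restrictions (5 when there are 6 groups, since one group is the base category).

k is the number of regressors in the model.

If you do not reject Ho, you are saying that the pooled model is preferable.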

80
Q

What is done in the two-way fixed effect model?

A

Dummy variables are introduced to see both firm and time effect.

If 15 time periods, you have 14 time dummies.

Issues:
Loss of degrees of freedom from too many dummies.

Multicollinearity from too many dummies.

Hard to see the impact of variables that do not vary over time.

Autocorrelation between firms.

81
Q

How does the fixed effect within group estimator work?

A

Take away the fixed effect by expressing all of the data as deviations from each individual's mean value.

This gives mean-corrected data.

tc_it = B_2 q_it + B_3 pf_it + B_4 lf_it + u_it.

82
Q

How to derive the fixed effect within group estimator

A

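y_it = b_0 + b_1 x_1it + b_2 x_2it + v_i + u_it

where v_i is the individual fixed effect.

  • Averaging over time for each individual gives the same equation in means: ybar_i = b_0 + b_1 xbar_1i + b_2 xbar_2i + v_i + ubar_i.

Subtracting the second equation from the first removes b_0 and v_i:

(y_it - ybar_i) = b_1 (x_1it - xbar_1i) + b_2 (x_2it - xbar_2i) + (u_it - ubar_i)

When there are only 2 observations per individual, differencing the data is efficient.

If T > 2 and you difference, you will ignore valuable information for each individual.

Instead, you subtract the specific mean for that individual from each observation.

You do the same for every variable to get

y*_it = b_1 x*_1it + b_2 x*_2it + u*_it

y* = y - ybar
x* = x - xbar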
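
The demeaning step can be sketched as below: each observation has its own individual's mean subtracted. The airline labels and values are made up for illustration, and the function name is hypothetical.

```python
def within_transform(values_by_individual):
    """Subtract each individual's own time mean from its observations.

    values_by_individual: dict mapping an individual's id to its list of
    observations over time.
    """
    demeaned = {}
    for individual, values in values_by_individual.items():
        mean = sum(values) / len(values)
        demeaned[individual] = [v - mean for v in values]
    return demeaned

# Made-up panel: two individuals observed over three periods.
costs = {
    "airline_1": [10.0, 12.0, 14.0],
    "airline_2": [20.0, 25.0, 30.0],
}

costs_star = within_transform(costs)
print(costs_star)
```

After the transformation every individual's series has mean zero, so differences in level between individuals (the fixed effects) have been removed.
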
83
Q

Why is it called the within estimator, and what are the results?

A
  • Demeaning by specific mean gets rid of need for dummy variables.
  • Takes away variation between individuals.

Eliminates Heterogeneity

Consistent slope coefficients.

84
Q

What happens with the fixed-effect model?

A

Heterogeneity is lost in the error term.

If the regressors are correlated with that error, this gives biased estimates.

85
Q

How to conduct a Breusch-Pagan Test?

A

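Estimate the OLS model and find the residuals.

Run an auxiliary regression of the squared residuals on the explanatory variables:

U hat^2 = a_0 + a_1 X_1 + a_2 X_2 + … + a_k X_k.

Find the R^2 of this auxiliary regression.

Then use the F-test for the joint significance of the auxiliary regression.

If the p-value is below your chosen significance level, reject Ho of homoscedasticity.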

86
Q

How to calculate the Standard error of the whole regression line?

A

SQUARE ROOT (RSS / (n-k))

This is the sigma (standard error) of the regression line as a whole.

Its square can then be used to find the s.e. of the B estimates.
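
A minimal sketch of the calculation, with hypothetical numbers chosen so the arithmetic is easy to follow.

```python
import math

def standard_error_of_regression(rss, n, k):
    """Square root of RSS / (n - k).

    n: number of observations
    k: number of estimated parameters, including the constant
    """
    return math.sqrt(rss / (n - k))

# Hypothetical numbers, for illustration only.
sigma_hat = standard_error_of_regression(rss=100.0, n=27, k=2)
sigma_hat_squared = sigma_hat ** 2  # the form used when computing s.e. of the B estimates
print(sigma_hat, sigma_hat_squared)
```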