Econometrics 2 Flashcards
Time-Series Data
Data across a period of time
e.g. GDP over 20 years
Cross-Section Data
Data for a single time period; the unit of observation varies, e.g. workers
Panel (longitudinal) data
Data that vary across both units and time, e.g. countries observed over several years
Regression analysis
Study of relationship between dependent variable and explanatory variable(s)
Estimating and/or predicting the (population) mean or average value of the dependent variable on the basis of the known or fixed values of the explanatory variables
Population Regression Line (PRL)
Gives mean value of dependent variable corresponding to each value of explanatory variable (X)
Line that passes through the conditional mean of Y
Ordinary Least Squares (OLS)
Method for estimating the unknown parameters in a linear regression model
b1 and b2 should be chosen such that the residual sum of squares (RSS) is minimised
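A minimal numpy sketch of the bivariate OLS formulas (the data here are hypothetical): the slope is the sample covariance of X and Y over the variance of X, and the intercept makes the line pass through the sample means.

```python
import numpy as np

# Hypothetical data: X is the explanatory variable, Y the dependent variable.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# OLS estimates that minimise the residual sum of squares (RSS):
# b2 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2),  b1 = Ybar - b2 * Xbar
x_dev = X - X.mean()
b2 = (x_dev * (Y - Y.mean())).sum() / (x_dev ** 2).sum()
b1 = Y.mean() - b2 * X.mean()

residuals = Y - (b1 + b2 * X)
rss = (residuals ** 2).sum()  # no other (b1, b2) pair gives a smaller value
```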
Linear functional form
Yi = B1 + B2 Xi + ui
B2 measures the absolute change in Y for a
one-unit change in X
Log-lin model
ln Yi = B1 + B2 Xi + ui
B2 measures the relative change in Y for an
absolute change in X
Lin-log model
Yi = B1 + B2 ln Xi + ui
B2 measures the absolute change in Y for a
relative change in X
Log-linear model
lnYi = B1 + B2 ln Xi + ui
B2 measures the elasticity of Y with respect to X, that is the percentage change in Y for a
given percentage change in X
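The log-linear (log-log) interpretation can be checked numerically. In this sketch the data are generated with a known constant elasticity of 0.75, so a log-log regression should recover B2 = 0.75 exactly (there is no noise in this made-up example).

```python
import numpy as np

# Hypothetical data with constant elasticity: Y = 2 * X^0.75
X = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
Y = 2.0 * X ** 0.75

# Regress ln Y on ln X; the slope is the elasticity of Y with respect to X.
ln_x, ln_y = np.log(X), np.log(Y)
x_dev = ln_x - ln_x.mean()
b2 = (x_dev * (ln_y - ln_y.mean())).sum() / (x_dev ** 2).sum()  # elasticity
b1 = ln_y.mean() - b2 * ln_x.mean()  # log of the multiplicative constant
```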
Properties of the regression line
- The regression line passes through the sample means of X and Y
- The mean value of the estimated Y equals the mean value of the actual Y
- The mean value of the residuals ei is zero
- The residuals ei are uncorrelated with the predicted Yi
- The residuals ei are uncorrelated with Xi
What are TSS, ESS and RSS?
TSS = total sum of squares
ESS = explained sum of squares
RSS = residual sum of squares
TSS = ESS + RSS
What is r squared?
r squared is the (sample) coefficient of determination. It measures the proportion or percentage of the total variation in Y explained by the regression model
R-squared is a statistical measure of how close the data are to the fitted regression line
What are the properties of r squared?
- It is a non-negative quantity
- Its limits are 0 ≤ r squared ≤ 1
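The decomposition TSS = ESS + RSS and the definition r² = ESS/TSS can be sketched with hypothetical data:

```python
import numpy as np

# Hypothetical data; fit a bivariate OLS line and decompose the variation in Y.
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

x_dev = X - X.mean()
b2 = (x_dev * (Y - Y.mean())).sum() / (x_dev ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
Y_hat = b1 + b2 * X

tss = ((Y - Y.mean()) ** 2).sum()      # total sum of squares
ess = ((Y_hat - Y.mean()) ** 2).sum()  # explained sum of squares
rss = ((Y - Y_hat) ** 2).sum()         # residual sum of squares
r_squared = ess / tss                  # equivalently 1 - rss / tss
```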
Gauss-Markov theorem
Given the assumptions of the classical linear regression model, the OLS estimators, in the class of unbiased linear estimators, have minimum variance; that is, they are BLUE
OLS estimators b1 and b2 are said to be the best
linear unbiased estimators (BLUE) of B1 and
B2 if they are…
- Linear
- Unbiased
- Minimum Variance
Homoscedasticity
if all random variables in the sequence or vector have the same finite variance
also known as homogeneity of variance
Why use adjusted R squared
The problem with the multiple coefficient of
determination, R squared, is that it never decreases as the number of explanatory variables (k) increases, even when the added variables contribute little
What is a dummy variable
Qualitative rather than quantitative in nature
i.e. variable with no natural scale of measurement
Examples: Gender, race, religion, education level
binary i.e. equal to 1 or zero
What are the two different approaches to comparing two regressions?
Chow Test
Dummy Variable Approach
What are the advantages of the dummy variable approach over the Chow test for structural stability?
- Only need to estimate one regression in the dummy variable approach
- The dummy variable approach is more flexible.
- Chow test does not tell us which of the coefficients have changed over time
- Pooling (under the dummy variable approach) increases the degrees of freedom
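The Chow test mentioned above can be sketched as an F test comparing the pooled (restricted) regression against separate regressions for the two subsamples (a bivariate version, with k = 2 parameters per regime; the data in the test are hypothetical):

```python
import numpy as np

def rss_ols(X, Y):
    """RSS from a bivariate OLS fit of Y on X (with intercept)."""
    x_dev = X - X.mean()
    b2 = (x_dev * (Y - Y.mean())).sum() / (x_dev ** 2).sum()
    b1 = Y.mean() - b2 * X.mean()
    return ((Y - b1 - b2 * X) ** 2).sum()

def chow_f(X1, Y1, X2, Y2, k=2):
    """Chow F statistic for structural stability:
    F = ((RSS_pooled - (RSS1 + RSS2)) / k) / ((RSS1 + RSS2) / (n1 + n2 - 2k)),
    compared with F(k, n1 + n2 - 2k) under the null of no structural break."""
    rss_pooled = rss_ols(np.concatenate([X1, X2]), np.concatenate([Y1, Y2]))
    rss_ur = rss_ols(X1, Y1) + rss_ols(X2, Y2)
    n = len(Y1) + len(Y2)
    return ((rss_pooled - rss_ur) / k) / (rss_ur / (n - 2 * k))
```

Because the separate regressions nest the pooled one, the statistic is never negative, and it is (numerically) zero when the two regimes are identical.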
Points to note about dummy (explanatory) variables
- If a qualitative variable has m categories, introduce m-1 dummy variables, because of the dummy variable trap
- The assignment of 1 and zero to the two categories is arbitrary. The important point is to know which way round they have been assigned
- The group or category that is assigned the value zero is the base, control or omitted category.
- The coefficient attached to the dummy variable D can be called the differential intercept coefficient
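A sketch of the m-1 rule with a hypothetical qualitative variable that has m = 3 categories; "primary" is taken as the base (omitted) category:

```python
import numpy as np

# Hypothetical qualitative variable with m = 3 categories: introduce
# m - 1 = 2 dummies, leaving "primary" as the base (omitted) category.
education = ["primary", "secondary", "tertiary", "secondary", "primary"]

d_secondary = np.array([1 if e == "secondary" else 0 for e in education])
d_tertiary = np.array([1 if e == "tertiary" else 0 for e in education])

# Design matrix: intercept plus the two dummies.  Adding a third dummy for
# "primary" would make the dummy columns sum to the intercept column --
# exact collinearity, i.e. the dummy variable trap.
X = np.column_stack([np.ones(len(education)), d_secondary, d_tertiary])
```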
Classical (multiple) Linear Regression Model
assumptions
Assumption 1: E(ui | X2i , X3i … Xki) = 0
Assumption 2: cov(ui , uj) = 0, i not equal to j
Assumption 3: var(ui) = sigma squared
Assumption 4: cov(ui , X2i) = cov(ui , X3i) = … = cov(ui , Xki) =0
Assumption 5: The regression model is correctly specified
Assumption 6: ui ~ N (0 , sigma squared)
Assumption 7: No exact linear relationship between the explanatory variables
How do we judge a model to be “good”?
- Parsimony
- Identifiability
- Goodness of fit
- Theoretical consistency
- Predictive power
Detection of heteroscedasticity
(a) Informal methods:
- Graphical method
(b) Formal methods:
- Park test
- Glejser test
- Breusch-Pagan test
- Goldfeld-Quandt test
- White’s general heteroscedasticity test
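The Breusch-Pagan test (in its LM form) can be sketched by hand for a bivariate model: fit OLS, regress the squared residuals on X, and compare LM = n * R² (from that auxiliary regression) with a chi-squared critical value. The data below are simulated with an error variance that grows with X, so the test should reject homoscedasticity.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.uniform(1.0, 10.0, n)
Y = 1.0 + 2.0 * X + rng.normal(0.0, X)  # error standard deviation grows with X

def ols_residuals(x, y):
    """Residuals from a bivariate OLS fit of y on x (with intercept)."""
    x_dev = x - x.mean()
    b2 = (x_dev * (y - y.mean())).sum() / (x_dev ** 2).sum()
    b1 = y.mean() - b2 * x.mean()
    return y - b1 - b2 * x

# Step 1: residuals from the main regression; Step 2: auxiliary regression
# of the squared residuals on X; Step 3: LM = n * R^2 of that regression,
# asymptotically chi-squared with 1 degree of freedom here (5% critical
# value about 3.84).
e2 = ols_residuals(X, Y) ** 2
u = ols_residuals(X, e2)
aux_r2 = 1 - (u ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
lm_stat = n * aux_r2
```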
Tests of specification errors
(a) Detecting the presence of unnecessary variables:
Individual (t-test) and joint (F-test) tests of significance
(b) Tests for omitted variables and incorrect functional forms:
Examination of residuals
Ramsey RESET test
Lagrange Multiplier (LM) test
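The Ramsey RESET test listed above can be sketched with numpy: refit the model with powers of the fitted values added, then F-test their joint significance. The data below are deliberately generated from a quadratic relationship, so fitting a straight line is a functional-form misspecification that the test should detect.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0.0, 5.0, n)
y = 1.0 + x ** 2 + rng.normal(0.0, 0.5, n)  # true relationship is quadratic

def rss(design, target):
    """Residual sum of squares from an OLS fit of target on the design matrix."""
    beta = np.linalg.lstsq(design, target, rcond=None)[0]
    return ((target - design @ beta) ** 2).sum()

# Restricted model: straight line.  Unrestricted model: add y_hat^2 and
# y_hat^3; if the linear form were correct they should add nothing.
X_r = np.column_stack([np.ones(n), x])
y_hat = X_r @ np.linalg.lstsq(X_r, y, rcond=None)[0]
X_ur = np.column_stack([X_r, y_hat ** 2, y_hat ** 3])

rss_r, rss_ur = rss(X_r, y), rss(X_ur, y)
q, df = 2, n - X_ur.shape[1]  # 2 added terms tested jointly
f_stat = ((rss_r - rss_ur) / q) / (rss_ur / df)  # F(2, n - 4) under the null
```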