ECONOMETRICS Flashcards
What is econometrics?
Econometrics is the branch of economics that applies statistical methods to estimate economic relationships. It helps in testing theories and making predictions.
What is the difference between deterministic and stochastic relationships?
Deterministic: A relationship where the outcome is exactly determined by the inputs (e.g., Force = Mass × Acceleration).
Stochastic: A relationship where the outcome has some randomness (e.g., House Price = B1 + B2 × Size + error term).
What is the population regression function (PRF)?
The PRF represents the true relationship between the dependent variable (Y) and independent variables (X’s) in the entire population:
Yi = B1 + B2X2i + B3X3i + ….. + BkXki +ui
What is the sample regression function (SRF)?
The SRF is the estimated version of the PRF based on sample data: Yi = b1 + b2X2i + b3X3i + ….. + bkXki +u^i
Here, b1,b2 and b3 are estimated coefficients.
What is Y^ (Y-hat) in regression?
Y^ is the predicted value of
𝑌
Y using the estimated regression equation:
Y = b1 + b2X2i + …..+ bkXk
What is the method of Ordinary Least Squares (OLS)?
OLS is a method used to estimate regression coefficients by minimizing the sum of squared errors (SSE).
What is the formula for the sum of squared errors (SSE)?
SSE = \sum (Y_i - Y^_i)^2
It measures how far off our predictions are from actual values.
Why do we prefer OLS?
OLS provides unbiased coefficient estimates with the smallest variance among all unbiased methods.
What is R -squared?
measures how well the independent variables explain the variation in the dependent variable.
Formula:
R^2 = 1 - (SSE/SST)
It ranges from 0 to 1, where higher values indicate a better fit.
What is adjusted
R^2?
Adjusted R^2
adjusts for the number of predictors in the model. Unlike R^2 , it penalizes unnecessary variables.
What is the ANOVA F-statistic used for?
The F-statistic tests whether at least one independent variable is significantly related to
𝑌.
If the p-value is small, at least one
X variable significantly affects 𝑌
What is the formula for the F-statistic?
F = (MSR / MSE)
MSR (Mean Square Regression) measures explained variance.
MSE (Mean Square Error) measures unexplained variance.
How do you test if a regression coefficient is significant?
t = (bk / SE (bk) )
Compare with critical value or use the p-value.
How do you interpret regression coefficients?
If X is in natural log: A 1% increase in
X leads to a B% change in
Y.
If X is a dummy variable (0 or 1): It shows the difference in
Y between the two groups.
What is a confidence interval for a regression coefficient?
The confidence interval gives a range in which the true coefficient likely falls.
bj +/- t critical * SE (bk)
What are the assumptions needed for OLS to be BLUE?
BLUE = Best Linear Unbiased Estimator
Linearity:
𝑌
Y is a linear function of
𝑋
X.
No omitted variables: Model includes all relevant factors.
No perfect multicollinearity:
𝑋
X variables are not perfectly correlated.
Zero mean error: Expected value of residuals is zero.
Homoscedasticity: Errors have constant variance.
No autocorrelation: Residuals are not correlated with each other.
What is the Chow test?
The Chow test checks whether two groups of data (e.g., different time periods) have the same regression coefficients.
What is a restricted least squares F-test?
It compares a restricted model (fewer variables) to an unrestricted model (all variables) to see if removing variables significantly worsens the model.
What is a subset F-test?
A subset F-test checks whether a group of independent variables (X’s) in a regression model can be removed without losing important information.
How would you predict house prices using regression?
Collect data on house prices (
𝑌
Y) and predictors (
𝑋
X) such as square footage, number of bedrooms, etc.
Estimate regression coefficients using OLS.
Predict prices for new homes using:
What is multiple regression?
Multiple regression extends simple regression by including more than one independent variable to explain the dependent variable.
Example:
Y_i = B1 + B2X2_i + B3X3_i + u_i
How do we interpret coefficients in a multiple regression model?
B2: The change in Y, on average, for a one-unit increase in X2, holding X3 constant.
B3: The change in Y, on average, for a one-unit increase in X3, holding X2 constant.
B1: The predicted value of Y when all X’s are zero.
Why is multiple regression useful?
It allows us to isolate the effect of one variable on Y while holding other factors constant.
What is the difference between simple and multiple regression graphs?
Simple regression: A straight line on a scatterplot.
Multiple regression: A plane (if 2 X’s) or a hyperplane (more than 2 X’s), which is harder to visualize.