Violations of CLRM Flashcards

1
Q

What is Autocorrelation?

A

Autocorrelation, also known as serial correlation, occurs when the residuals (error terms) in a regression model are correlated across different time periods. This means that the error term of one observation is influenced by the error term of a previous observation. Autocorrelation is primarily an issue in time-series data, where observations are recorded sequentially over time.

Mathematically, autocorrelation is present if:
Cov(u_t, u_{t-1}) ≠ 0
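
A minimal sketch of checking this condition in Python (assuming numpy and statsmodels; the data are simulated purely for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200

# Simulate a regressor and AR(1) errors: u_t = 0.7 * u_{t-1} + e_t
x = rng.normal(size=n)
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
y = 1.0 + 2.0 * x + u

# Fit OLS and inspect the residuals
res = sm.OLS(y, sm.add_constant(x)).fit()
uhat = res.resid

# Sample Cov(u_t, u_{t-1}); a value far from zero signals autocorrelation
lag_cov = np.cov(uhat[1:], uhat[:-1])[0, 1]
lag_corr = np.corrcoef(uhat[1:], uhat[:-1])[0, 1]
print(f"Cov(u_t, u_t-1)  = {lag_cov:.3f}")
print(f"Corr(u_t, u_t-1) = {lag_corr:.3f}")
```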

2
Q

Forms of Autocorrelation

A

Positive Autocorrelation (ρ > 0): Errors from one period tend to be followed by errors of the same sign. This is common in economic time series (e.g., GDP, inflation).
Negative Autocorrelation (ρ < 0): Errors from one period tend to be followed by errors of the opposite sign, producing an alternating pattern in the residuals.

2
Q

Causes of Autocorrelation

A

OMIID

  1. Omitted Variables: If important explanatory variables are missing, their effect may spill over into the error term, creating correlation.
  2. Misspecification of the Model: Using an incorrect functional form or excluding lagged variables can introduce autocorrelation.
  3. Inertia or Persistence in Data: Economic and financial data often exhibit trends or cycles, leading to serial correlation in errors.
  4. Incorrect Measurement of Variables: Errors in data collection can introduce patterns in residuals.
  5. Data Manipulation Issues: When data is interpolated or smoothed, it can introduce artificial correlation.
3
Q

Consequences of Autocorrelation

A

I^3O/U

1. Inefficient OLS Estimates: OLS estimators remain unbiased, but they are no longer the Best Linear Unbiased Estimators (BLUE) because they no longer have minimum variance.

2. Inconsistent Standard Errors: The usual standard-error formulas are no longer valid (typically understated under positive autocorrelation), making t-tests, F-tests, and confidence intervals unreliable.

3. Inflated R-Squared: The model may appear to fit the data well even when it does not.

4. Over- or Underestimation of Coefficient Variances: Serial correlation in the residuals distorts the apparent precision of the estimates, which can affect policy implications.

4
Q

What are the types of autocorrelation?

A
  1. Positive Autocorrelation – Errors follow the same sign, creating patterns.
  2. Negative Autocorrelation – Errors alternate in sign, causing frequent fluctuations.
5
Q

How is autocorrelation mathematically expressed?

A

Corr(u_t, u_{t-k}) ≠ 0 for k ≠ 0, where u_t is the error term at time t.

6
Q

What are the tests for autocorrelation?

A
  1. Durbin-Watson Test – Detects first-order autocorrelation; d ≈ 2 means no autocorrelation.
  2. Breusch-Godfrey Test – General test for higher-order autocorrelation (both tests are sketched in code below).
  3. Graphical Methods – Residual plots and the Autocorrelation Function (ACF).
  4. Runs Test – Non-parametric test checking for randomness in residuals.
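
A sketch of the first two tests in Python with statsmodels (an already-fitted OLS result is assumed; the data and variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

rng = np.random.default_rng(1)
x = rng.normal(size=150)
y = 0.5 + 1.5 * x + rng.normal(size=150)   # illustrative data

res = sm.OLS(y, sm.add_constant(x)).fit()

# Durbin-Watson: d close to 2 suggests no first-order autocorrelation
d = durbin_watson(res.resid)
print(f"Durbin-Watson d = {d:.2f}")

# Breusch-Godfrey: LM test for autocorrelation up to the chosen lag order
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=4)
print(f"Breusch-Godfrey LM p-value = {lm_pval:.3f}")
```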
7
Q

What does the Durbin-Watson test measure?

A

First-order autocorrelation in regression residuals.

8
Q

How is the Durbin-Watson statistic interpreted?

A

d ≈ 2 → No autocorrelation. d < 2 → Positive autocorrelation. d > 2 → Negative autocorrelation.

9
Q

What does the Breusch-Godfrey test detect?

A

Higher-order autocorrelation using an auxiliary regression approach.

10
Q

What graphical methods detect autocorrelation?

A

Residual plots and the Autocorrelation Function (ACF).

11
Q

What does the Runs Test check?

A

It tests residual randomness to detect autocorrelation.

12
Q

What are the remedial measures for autocorrelation?

A

GLADLA

  1. Generalized Least Squares (GLS): Transforms the data to correct for autocorrelation by estimating and removing the serial correlation before applying OLS.
  2. Logarithmic transformation: Can help if the relationship between the variables is non-linear.
  3. Adding missing variables: Including relevant variables not initially considered can capture hidden dependencies and reduce autocorrelation.
  4. Differencing: Subtracting the previous value from the current one removes constant trends and can alleviate autocorrelation.
  5. Including lagged variables: Adding past values of the dependent variable as predictors can account for temporal dynamics (differencing and lagged terms are sketched in code below).
  6. Using ARIMA models: These explicitly model autoregressive and moving-average processes, making them suitable for time-series data with autocorrelation.
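
A rough sketch of two of these remedies in Python (differencing and a lagged dependent variable), using pandas and statsmodels with purely illustrative data:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200
x = np.cumsum(rng.normal(size=n))                 # trending regressor (illustrative)
y = 2.0 + 0.8 * x + np.cumsum(rng.normal(size=n))
df = pd.DataFrame({"y": y, "x": x})

# Remedy: first-difference both sides to remove persistent trends
d = df.diff().dropna()
res_diff = sm.OLS(d["y"], sm.add_constant(d["x"])).fit()

# Remedy: include a lagged dependent variable as an extra regressor
df["y_lag1"] = df["y"].shift(1)
dyn = df.dropna()
res_dyn = sm.OLS(dyn["y"], sm.add_constant(dyn[["x", "y_lag1"]])).fit()

print(res_diff.params)
print(res_dyn.params)
```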
13
Q

How do GLS and Cochrane-Orcutt correct autocorrelation?

A

They estimate and remove serial correlation before applying OLS.
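
One way to sketch this in Python is statsmodels' GLSAR, whose iterative_fit alternates between estimating the AR(1) parameter of the residuals and re-estimating the coefficients, in the spirit of Cochrane-Orcutt (the data below are simulated for illustration):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal()       # AR(1) errors
y = 1.0 + 2.0 * x + u
X = sm.add_constant(x)

# GLSAR with an AR(1) error process; iterative_fit re-estimates rho and beta in turn
model = sm.GLSAR(y, X, rho=1)
res = model.iterative_fit(maxiter=10)
print("Estimated AR(1) rho:", model.rho)
print(res.params)
```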

14
Q

How do lagged variables or differencing help with autocorrelation?

A

They remove persistence in data trends.

15
Q

How does an autoregressive (AR) model address autocorrelation?

A

It explicitly models serial correlation in errors.
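
A hedged sketch of this idea in Python: a regression with explicitly modelled AR(1) errors can be estimated with statsmodels' SARIMAX by passing the regressor as exog and giving the error process an AR(1) order (data simulated purely for illustration):

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal()       # AR(1) errors
y = 1.0 + 2.0 * x + u

# Regression with an AR(1) error term: y_t = c + b*x_t + u_t, u_t = rho*u_{t-1} + e_t
mod = SARIMAX(y, exog=x.reshape(-1, 1), order=(1, 0, 0), trend="c")
res = mod.fit(disp=False)
print(res.params)   # includes the AR(1) coefficient of the error process
```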

16
Q

What is Heteroscedasticity in Econometrics?

A

Heteroscedasticity occurs when the variance of the error term in a regression model is not constant across observations.

17
Q

What is the Nature of Heteroscedasticity?

A

In a homoscedastic model, error variance remains constant. In a heteroscedastic model, error variance changes with an independent variable or over time.

18
Q

What are the Causes of Heteroscedasticity?

A

Pizza Often Makes Vacation Taste Cooler

1. Presence of Outliers: Extreme values in the dataset can distort the residual variance.
2. Omitted Variable Bias: When an important explanatory variable is left out, its effect may show up in the residuals, causing non-constant variance.
3. Misspecification of the Regression Model: If the true relationship between the dependent and independent variables is non-linear but a linear model is used, the variance of the residuals may change with the level of the explanatory variable.
4. Varied Scale of Measurement: When variables are measured on very different scales (e.g., incomes of both large corporations and small businesses), the error variances may differ significantly.
5. Time-Series Effects: In economic data, variance may change over time due to economic cycles, inflation, or structural breaks.
6. Cross-Sectional Heterogeneity: In datasets containing individuals, firms, or countries with significantly different characteristics, heteroscedasticity can emerge from underlying differences in behavior.

19
Q

What are the Consequences of Heteroscedasticity?

A

IBUI

1. Inefficiency of OLS Estimators: The OLS estimators remain unbiased but are no longer the Best Linear Unbiased Estimators (BLUE), because the Gauss-Markov conditions no longer hold and OLS no longer has minimum variance.
2. Biased Standard Errors: The estimated variances of the regression coefficients are biased, which carries over to the standard errors, test statistics, and confidence intervals.
3. Unreliable Hypothesis Tests: Because of the biased standard errors, hypothesis tests may be unreliable; for example, t-statistics may appear more significant than they actually are.
4. Incorrect Inferences: Incorrect inferences from the data can lead to flawed business or research decisions.

20
Q

What are the Tests for Heteroscedasticity?

A
  1. Graphical Methods (Informal Tests)
    i) Residual Plot: A scatterplot of the residuals (û) against the fitted values (Ŷ). A systematic pattern (e.g., a cone shape) suggests heteroscedasticity.
    ii) Residuals vs. Independent Variables: If the residual variance changes as an explanatory variable increases, heteroscedasticity may be present.
  2. Formal Statistical Tests (the first two are sketched in code below)
    i) Breusch-Pagan Test
    ii) White Test
    iii) Goldfeld-Quandt Test
    iv) Park Test
    v) Glejser Test
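
A sketch of the Breusch-Pagan and White tests with statsmodels (illustrative data; both functions take the OLS residuals and the regressor matrix):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

rng = np.random.default_rng(5)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5 * x)   # error spread grows with x

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

bp_lm, bp_pval, bp_f, bp_fpval = het_breuschpagan(res.resid, X)
w_lm, w_pval, w_f, w_fpval = het_white(res.resid, X)
print(f"Breusch-Pagan p-value: {bp_pval:.4f}")
print(f"White test p-value:    {w_pval:.4f}")
```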
21
Q

What are the Remedial Measures for Heteroscedasticity?

A

Rob Goes Without Tea More

1. Robust Standard Errors

Use heteroscedasticity-robust standard errors (e.g., White's standard errors) to obtain correct statistical inference without modifying the model (robust standard errors and WLS are sketched in code below).

2. Generalized Least Squares (GLS) and Feasible GLS (FGLS)

GLS transforms the model to stabilize the variance by weighting observations appropriately.
FGLS is used when the precise form of heteroscedasticity is unknown but can be estimated.

3. Weighted Least Squares (WLS)

Assigns weights to observations inversely proportional to their estimated variance, giving more weight to observations with smaller variance.

4. Transforming Variables

Log Transformation: Applying a log transformation to the dependent variable (Y) often reduces heteroscedasticity, especially in models involving income or expenditure.
Square Root Transformation: Helps stabilize variance in count-data models.

5. Model Specification Corrections

Adding omitted variables that might be causing the non-constant variance.
Using interaction terms if the variance pattern suggests a non-linear relationship.
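
A sketch of remedies 1 and 3 in statsmodels; the WLS weights below assume, purely for illustration, that the error variance is proportional to x²:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5 * x)   # heteroscedastic by construction
X = sm.add_constant(x)

# Robust (White/HC) standard errors: same OLS coefficients, corrected inference
ols_robust = sm.OLS(y, X).fit(cov_type="HC3")
print(ols_robust.bse)

# WLS with weights inversely proportional to the assumed error variance
wls = sm.WLS(y, X, weights=1.0 / x**2).fit()
print(wls.params)
```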

22
Q

Nature of Multicollinearity

A

Multicollinearity refers to a situation in which two or more independent variables in a regression model are highly correlated, meaning they provide redundant information. This violates the assumption of no perfect multicollinearity in the classical linear regression model (CLRM), leading to unreliable estimates of the regression coefficients.
Mathematically, multicollinearity occurs when:
X_j = α_1X_1 + α_2X_2 + … + α_kX_k + u   (near or imperfect multicollinearity)
X_j = α_1X_1 + α_2X_2 + … + α_kX_k       (perfect multicollinearity)
where one independent variable (X_j) can be expressed, exactly or approximately, as a linear combination of the other independent variables.
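
A quick numerical illustration with numpy (made-up data): when one column is an exact linear combination of the others, the regressor matrix loses rank and X'X becomes (near-)singular.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 0.5 * x2    # exact linear combination -> perfect multicollinearity

X = np.column_stack([np.ones(n), x1, x2, x3])
print("rank of X:", np.linalg.matrix_rank(X), "out of", X.shape[1], "columns")
print("condition number of X:", np.linalg.cond(X))   # explodes under perfect collinearity
```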

23
Q

Causes of Multicollinearity

A

HIPDD

1. Inclusion of Highly Correlated Variables: When two or more independent variables measure similar phenomena (e.g., GDP and income).
2. Insufficient Data Variation: When the sample data do not vary enough (e.g., due to a short time period).
3. Overuse of Polynomial or Interaction Terms: Including squared or interaction terms in the model can introduce artificial correlation.
4. Data Collection Errors: Inaccurate or missing data can inflate correlations between variables.
5. Dummy Variable Trap: Using all categories of a categorical variable instead of omitting one as a reference category.
6. Aggregation of Data: Combining similar groups in a way that increases correlation among predictors.

24
Q

Consequences of Multicollinearity

A

ICUR

  1. Inflated Standard Errors: Makes estimates of individual coefficients unstable.
  2. Unreliable t-Statistics: High multicollinearity leads to large p-values, making it difficult to determine which variables are significant.
  3. Conflicting Results: Some variables may have unexpected signs or magnitudes despite strong theoretical backing.
  4. Reduced Predictive Power: A model with multicollinearity might not generalize well to new data.
25
Q

Tests for Multicollinearity

A

CVHC

  1. Correlation Matrix: If two independent variables have a correlation coefficient above 0.8 or 0.9, multicollinearity is likely.
  2. Variance Inflation Factor (VIF): Measures how much the variance of a coefficient is inflated due to correlation; if VIF > 10, multicollinearity is a serious concern (see the sketch below).
  3. High R² but insignificant t-values.
  4. Condition Index: If the largest condition index (based on eigenvalues) exceeds 30, multicollinearity is problematic.
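
A sketch of the VIF check with statsmodels (hypothetical column names, illustrative data):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(8)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)    # nearly collinear with x1
x3 = rng.normal(size=n)
X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# A VIF above 10 for a regressor is usually read as serious multicollinearity
for i, col in enumerate(X.columns):
    if col == "const":
        continue                           # VIF is normally reported for slopes only
    print(col, round(variance_inflation_factor(X.values, i), 1))
```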
26
Q

Remedial Measures for Multicollinearity

A

RICCR

  1. Removing Highly Correlated Variables: Drop one of the correlated variables if it does not add unique value to the model.
  2. Combining Variables: Create an index, or use principal component analysis (PCA), to replace highly correlated predictors.
  3. Increasing Sample Size: More observations may help reduce collinearity.
  4. Centering Variables: If polynomial terms cause multicollinearity, mean-centering the variables (X − X̄) can help.
  5. Using Ridge Regression: Where removing variables is not feasible, ridge regression applies a penalty to the coefficient estimates to reduce their variance (sketched below).
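
A minimal sketch of the ridge idea using its closed-form solution (numpy only; the penalty value is arbitrary for illustration rather than chosen by cross-validation):

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # highly collinear pair
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
lam = 1.0                                  # ridge penalty (illustrative value)

# Ridge: beta = (X'X + lam*I)^(-1) X'y; the penalty stabilises the inversion
I = np.eye(X.shape[1])
I[0, 0] = 0.0                              # conventionally the intercept is not penalised
beta_ridge = np.linalg.solve(X.T @ X + lam * I, X.T @ y)
print(beta_ridge)
```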
27
Q

Breusch-Pagan Test

A

Regresses the squared residuals on the independent variables; if the explanatory variables significantly explain the error variance, heteroscedasticity is present.
Best for: Detecting systematic heteroscedasticity in linear regression models.
28
Q

White Test

A

Extends the Breusch-Pagan test by including squared and interaction terms, and does not assume a specific functional form.
Best for: Detecting both linear and non-linear heteroscedasticity.
29
Q

Glejser Test

A

Regresses the absolute residuals on the independent variables; significant coefficients suggest systematic heteroscedasticity.
Best for: Detecting heteroscedasticity linked to proportional relationships.
30
Q

Park Test

A

Assumes the error variance follows a log-linear function of an independent variable and regresses the log of the squared residuals on the log of that variable.
Best for: Detecting heteroscedasticity when a power-function relationship exists.
31
Q

Goldfeld-Quandt Test

A

Splits the data into two groups, omits the middle observations, and compares the residual variances; if the variances differ significantly, heteroscedasticity is present (see the sketch below).
Best for: Detecting heteroscedasticity when variance changes across subgroups.
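
A sketch with statsmodels' het_goldfeldquandt (illustrative data, sorted by the regressor suspected of driving the variance):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

rng = np.random.default_rng(10)
n = 300
x = np.sort(rng.uniform(1, 10, size=n))         # sorted by the suspect regressor
y = 2.0 + 1.5 * x + rng.normal(scale=0.5 * x)   # variance rises with x
X = sm.add_constant(x)

# Compares residual variance in the low-x and high-x subsamples (middle 20% dropped)
f_stat, p_value, ordering = het_goldfeldquandt(y, X, drop=0.2)
print(f"Goldfeld-Quandt F = {f_stat:.2f}, p-value = {p_value:.4f}")
```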
32
Q

Definition of Non-Stationarity

A

Non-stationarity is a property of a time series whose mean, variance, or autocovariance structure changes over time. Because the statistical characteristics of the series are not constant, traditional regression analysis on such data is unreliable.
33
Q

Nature of Non-Stationarity

A

  1. Trend Stationarity: The series has a deterministic trend (e.g., linear or quadratic); once the trend is removed, it becomes stationary.
  2. Difference Stationarity (Unit Root Process): The series follows a stochastic trend, so differencing is required to make it stationary.
  3. Structural Breaks: A sudden shift in the mean or variance due to policy changes, economic crises, or technological advances.
34
Q

Causes of Non-Stationarity

A

ESPER

  1. Economic growth: Series such as GDP, inflation, and population often grow over time.
  2. Seasonality: Patterns repeating at regular intervals (e.g., quarterly or monthly variations).
  3. Policy changes: New regulations or economic policies can alter the statistical properties of a series.
  4. External shocks: Wars, financial crises, and pandemics can create sudden structural breaks.
  5. Random walks: Some economic variables follow a unit-root process (e.g., stock prices).
35
Q

Consequences of Non-Stationarity

A

SUDI

  1. Spurious Regression: High R² and statistically significant coefficients despite no meaningful economic relationship.
  2. Unreliable Hypothesis Testing: The standard errors and t-statistics become misleading.
  3. Incorrect Model Estimation: The estimated model does not correctly represent the underlying economic process.
  4. Difficulty in estimating a long-run relationship.
36
Q

Tests for Non-Stationarity

A

APVC

  1. Augmented Dickey-Fuller (ADF) Test: Tests for a unit root by checking whether the series can be represented as a stationary process (see the sketch below).
  2. Phillips-Perron (PP) Test: Similar to the ADF test but robust to autocorrelation and heteroscedasticity.
  3. Variance Ratio Test: Checks whether the variance grows over time, indicating a random walk.
  4. Correlogram: A slowly decaying autocorrelation function suggests non-stationarity.
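
A sketch of the ADF test on a simulated random walk, before and after first differencing (statsmodels' adfuller; data purely illustrative):

```python
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(11)
y = np.cumsum(rng.normal(size=500))   # random walk: non-stationary by construction

adf_level = adfuller(y)
adf_diff = adfuller(np.diff(y))

# A p-value above 0.05 means the unit-root null cannot be rejected
print(f"ADF p-value, level:            {adf_level[1]:.3f}")
print(f"ADF p-value, first difference: {adf_diff[1]:.3f}")
```

First differencing, one of the remedies listed on the next card, is what turns the failing level test into a rejection of the unit root here.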
37
Q

Remedial Measures for Non-Stationarity

A

  1. Differencing: Take the first (or higher-order) difference of the series to remove a unit root: Y'_t = Y_t − Y_{t−1}. If first differencing is not enough, second differencing may be needed.
  2. Transformation: Apply logarithms or Box-Cox transformations to stabilize the variance.
  3. Detrending: Remove deterministic trends using regression models.
  4. Error Correction Models (ECM): If the variables are cointegrated, an ECM corrects short-run deviations while preserving the long-run relationship.
  5. Structural Break Tests: If non-stationarity is due to a structural break, methods such as the Chow test can help identify and adjust for breakpoints.
38
Q

Nature of Non-Linearity

A

Non-linearity in econometrics refers to a situation where the relationship between the dependent variable and the independent variables cannot be adequately described by a linear function. Instead of a straight-line relationship, the equation may involve exponents, logarithms, or interaction terms.
39
Q

Causes of Non-Linearity

A

TIMB

  1. True Non-Linearity: The underlying economic relationship is genuinely non-linear (e.g., diminishing marginal returns).
  2. Incorrect Functional Form: A specification that ignores potential squared terms, interaction effects, or log transformations.
  3. Multiplicative Relationships: The variables interact multiplicatively rather than additively.
  4. Behavioral or Structural Factors: Certain economic behaviors or market structures follow non-linear trends.
40
Q

Consequences of Non-Linearity

A

MPIF

  1. Misspecification Bias: The estimated coefficients may be incorrect or misleading.
  2. Poor Goodness of Fit: A low R² indicates that the model does not capture the data well.
  3. Incorrect Hypothesis Testing: t-tests and F-tests may give misleading results.
  4. Forecasting Errors: Poor predictive performance due to an incorrect functional form.
41
Q

Tests for Non-Linearity

A

  1. Ramsey RESET Test: Checks whether omitted non-linear terms improve the model fit (see the sketch below).
  2. Graphical Methods: Scatter plots of residuals against the independent variables may reveal patterns.
  3. Box-Cox Transformation Test: Helps determine the most appropriate transformation.
  4. Partial Residual Plots: Show whether the linearity assumption holds.
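
A sketch of the RESET idea done by hand with statsmodels (recent versions also provide a linear_reset helper): add powers of the fitted values to the regression and F-test their joint significance. Data are simulated for illustration.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
n = 300
x = rng.uniform(0, 5, size=n)
y = 1.0 + 0.5 * x**2 + rng.normal(size=n)   # true relationship is non-linear

X = sm.add_constant(x)
restricted = sm.OLS(y, X).fit()

# Augment the model with powers of the fitted values (the RESET auxiliary terms)
yhat = restricted.fittedvalues
X_aug = np.column_stack([X, yhat**2, yhat**3])
unrestricted = sm.OLS(y, X_aug).fit()

# Joint F-test of the added terms; a small p-value suggests the linear form is wrong
f_stat, p_value, df_diff = unrestricted.compare_f_test(restricted)
print(f"RESET-style F = {f_stat:.2f}, p-value = {p_value:.4f}")
```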
42
Q

Remedial Measures for Non-Linearity

A

TPNI

  1. Transform the Variables: Use logarithms, squared terms, or other transformations to linearize the relationship (e.g., log-log or semi-log models).
  2. Polynomial Regression: Introduce higher-order terms (X², X³) to capture curvature.
  3. Non-Linear Estimation Techniques: Use methods such as Non-linear Least Squares (NLS).
  4. Interaction Terms: Incorporate interaction terms if relationships vary across groups.
43
Q

Spearman Rank Test for Heteroscedasticity

A

The Spearman rank correlation test can detect heteroscedasticity by computing the correlation between the ranks of the absolute residuals and the ranks of an independent variable; a significant correlation indicates the presence of heteroscedasticity (see the sketch below).
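
A sketch of this check with scipy and statsmodels (illustrative data): rank-correlate the absolute residuals with the regressor and look at the p-value.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import spearmanr

rng = np.random.default_rng(13)
n = 300
x = rng.uniform(1, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5 * x)   # residual spread rises with x

res = sm.OLS(y, sm.add_constant(x)).fit()

# Significant rank correlation between |residuals| and x points to heteroscedasticity
rho, p_value = spearmanr(np.abs(res.resid), x)
print(f"Spearman rho = {rho:.2f}, p-value = {p_value:.4f}")
```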