RS cards Flashcards
What is homoskedasticity?
Errors with a constant variance
What is heteroskedasticity, and what does it cause?
Errors that do not have a constant variance. The OLS estimator will still be unbiased, but it will no longer be the best linear unbiased estimator; other estimators will typically have lower variance.
What are the assumptions behind the best linear unbiased estimator (BLUE)?
1 Mean of Zero and Independence: The error terms u(i) have a mean of zero and are independent of the independent variables X, that is, E(u(i) | X) = 0.
2 Constant Variance (Homoskedasticity): The error terms u(i) have constant variance, denoted as var(u(i)) = σ^2.
3 No Autocorrelation (Serially Uncorrelated): The error terms u(i) are serially uncorrelated, meaning E(u(i) u(j)) = 0 for i ≠ j (there is no autocorrelation).
4 No Perfect Multicollinearity: There is no exact linear relationship between the independent variables.
→ Under these assumptions, the Ordinary Least Squares (OLS) estimator beta-hat is the best linear unbiased estimator (BLUE) for beta.
→ Routinely computed standard errors, t-statistics, etc., are correct.
Optionally:
5 Normality of Error Terms: The error terms u(i) follow a normal distribution.
→ The estimator beta-hat has a normal distribution too (if not, it is approximately normal for large samples).
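A minimal Monte Carlo sketch of what these assumptions buy you (the true coefficients and sample sizes below are made-up illustrations, assuming numpy is available): when the assumptions above hold, OLS recovers the true beta on average.
```python
import numpy as np

# Illustrative Monte Carlo: under the Gauss-Markov assumptions
# (E(u(i)|X) = 0, constant variance, no autocorrelation, no perfect
# multicollinearity), the OLS estimate of beta is unbiased.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])   # hypothetical intercept and slope
n, n_sims = 200, 5000

estimates = np.empty((n_sims, 2))
for s in range(n_sims):
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])       # design matrix with intercept
    u = rng.normal(scale=1.0, size=n)          # homoskedastic, serially uncorrelated errors
    y = X @ beta_true + u
    estimates[s] = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS fit

print(estimates.mean(axis=0))  # close to (1.0, 2.0): unbiasedness
```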
What if you don’t meet the criteria?
Mean of Zero and Independence (E(u(i) | X) = 0):
Consequence: If this assumption is violated, it means that there is a systematic error in the model. This often indicates omitted variable bias, where important explanatory variables are missing from the model, causing the error terms to be correlated with the independent variables. As a result, the OLS estimators will be biased and inconsistent, leading to unreliable regression coefficients.
Constant Variance (Homoskedasticity) (var(u(i)) = σ^2):
Consequence: Violation of this assumption leads to heteroskedasticity, where the variance of the error terms changes across observations. In such cases, the OLS estimators remain unbiased and consistent, but they are no longer efficient. This inefficiency means that there are other estimators with smaller variances. Moreover, the standard errors of the OLS estimators will be biased, leading to unreliable hypothesis tests (like t-tests) and confidence intervals.
No Autocorrelation (Serially Uncorrelated) (E(u(i) u(j)) = 0 for i ≠ j):
Consequence: When this assumption is violated, it results in autocorrelation (also known as serial correlation), commonly occurring in time series data. The presence of autocorrelation, like heteroskedasticity, does not cause bias in the OLS estimators, but it does make them inefficient. Additionally, it leads to biased standard error estimates, resulting in misleading hypothesis tests and confidence intervals.
No Perfect Multicollinearity:
Consequence: Multicollinearity occurs when two or more independent variables in a regression model are highly correlated. Perfect multicollinearity, which violates this assumption, makes it impossible to estimate the regression coefficients uniquely, as it becomes unclear how to attribute the effect on the dependent variable among the highly correlated independent variables. In practice, even high (but not perfect) multicollinearity can cause issues, such as inflated standard errors and unstable coefficient estimates, making it difficult to assess the effect of each independent variable.
Optional:
Normality of Error Terms (u(i) follows a normal distribution):
Consequence: If error terms are not normally distributed, the primary consequence is on inference—specifically, hypothesis tests and confidence intervals for the regression coefficients may not be valid. The Central Limit Theorem assures that for large sample sizes, the distribution of the OLS estimators will be approximately normal even if the errors are not. However, for smaller samples, non-normality can significantly impact the validity of these statistical tests.
In summary, violation of these assumptions can lead to various problems, such as biased or inefficient estimators, incorrect standard errors, and invalid hypothesis tests, undermining the reliability and validity of the regression analysis. It’s important to perform diagnostic checks and consider remedial measures or alternative estimation methods when these assumptions are violated.
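As an illustration of such diagnostic checks and remedial measures, here is a hedged sketch using statsmodels (the data-generating process is synthetic): a Breusch-Pagan test for heteroskedasticity, a Durbin-Watson statistic for autocorrelation, and HC3 heteroskedasticity-robust standard errors as one remedy.
```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Toy data with heteroskedastic errors (error variance grows with x).
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=500)
u = rng.normal(scale=0.5 * x)          # variance depends on x -> heteroskedastic
y = 1.0 + 2.0 * x + u
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()

# Breusch-Pagan: a small p-value signals heteroskedasticity.
lm_stat, lm_pval, _, _ = het_breuschpagan(ols.resid, X)
print(f"Breusch-Pagan p-value: {lm_pval:.4f}")

# Durbin-Watson: values far from 2 suggest autocorrelation.
print(f"Durbin-Watson: {durbin_watson(ols.resid):.2f}")

# Remedial measure: heteroskedasticity-robust (HC3) standard errors.
robust = sm.OLS(y, X).fit(cov_type="HC3")
print(robust.bse)   # robust SEs; compare with ols.bse
```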
Describe data transformation, and why it's used
Typically a log transformation. It is used to pull extreme values toward the mean, and is applied to ratios and to variables with positive skewness that do not take negative values.
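A small synthetic sketch (assuming numpy and scipy) of how the log transformation reduces positive skewness in a strictly positive variable:
```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)   # positive, right-skewed

print(f"skewness before log: {skew(x):.2f}")          # strongly positive
print(f"skewness after log:  {skew(np.log(x)):.2f}")  # roughly 0
```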
What is an advantage of using normal returns rather than ln(returns)?
With normal (simple) returns, you can calculate the return of a portfolio as the weighted sum of the individual returns: r(p) = w1*r(1) + w2*r(2) + etc.
What is an advantage of log returns?
Multi-period returns are simply the sum of single-period returns.
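A quick numerical check of both properties, with purely illustrative returns: simple returns aggregate across assets by weighting; log returns aggregate across time by summing.
```python
import numpy as np

# Cross-section: portfolio simple return is the weighted sum of asset returns.
w = np.array([0.6, 0.4])           # hypothetical weights
r = np.array([0.05, -0.02])        # simple returns of two assets
print(w @ r)                       # portfolio return: 0.022

# Time series: a multi-period log return is the sum of single-period log returns.
r_t = np.array([0.03, -0.01, 0.02])          # simple returns over 3 periods
log_r = np.log1p(r_t)                        # convert to log returns
total_simple = np.prod(1 + r_t) - 1          # compounded simple return
print(np.isclose(np.sum(log_r), np.log1p(total_simple)))  # True
```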
What are the steps of an event study?
- Identify event and select sample firms
- Determine length of event window
- Define estimation period (pre-event window)
- Calculate ‘normal’ (expected) returns
- Construct (cumulative) abnormal returns (CAR)
- Determine statistical significance of CAR
- Analyze patterns in CARs (regressions, subsamples, etc.)
In an event study, what are the ways to calculate expected returns?
Computing Expected Returns
Event studies aim to estimate ‘abnormal’ return, AR(t), occurring around a specified corporate event.
R(t) = E[R(t)|I(t)] + AR(t)
R(t) is the actual return during the event period.
E[R(t)|I(t)] is the expected or ‘normal’ return given information I(t).
Common approaches to compute expected return:
Constant-mean-return model: E[R(t)] = mu
Take historical average return as a proxy for ‘normal’ stock return.
Market-adjusted-return model: E[R(t)] = R(m)(t)
Take market return as a proxy for ‘normal’.
Imprecise but useful if no pre-event data is available to estimate model parameters.
Market model: E[R(t)] = alpha + beta R(m)(t)
Alpha and beta are estimated over an estimation window disjoint from the event window to avoid contamination (time-series regression).
Choice of estimation window: trade-off between precision and timeliness; use half a year or one year of daily data.
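A compact sketch of the market-model approach (all returns below are simulated placeholders): estimate alpha and beta on the estimation window, then compute abnormal returns and the CAR over the event window.
```python
import numpy as np

def market_model_car(r_stock, r_market, est_slice, event_slice):
    """Estimate alpha/beta on the estimation window (disjoint from the
    event window), then return abnormal returns and the CAR for the event window."""
    X = np.column_stack([np.ones(len(r_market[est_slice])), r_market[est_slice]])
    alpha, beta = np.linalg.lstsq(X, r_stock[est_slice], rcond=None)[0]
    expected = alpha + beta * r_market[event_slice]   # 'normal' returns
    ar = r_stock[event_slice] - expected              # abnormal returns AR(t)
    return ar, ar.sum()                               # ARs and CAR

# Hypothetical daily returns: one year of estimation data, 11-day event window.
rng = np.random.default_rng(3)
r_m = rng.normal(0.0005, 0.01, size=263)
r_i = 0.0002 + 1.2 * r_m + rng.normal(0, 0.015, size=263)
ar, car = market_model_car(r_i, r_m, slice(0, 252), slice(252, 263))
print(f"CAR over event window: {car:.4f}")
```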
Why do cross-sectional analysis of CARs in an event study?
Cross-sectional variation in firm-specific CARs can be large, yet hidden in the average CAR. Think of large differences in stock returns after earnings announcements for good-news and bad-news firms that cancel each other out.
What do we do with a null hypothesis if the T-value is larger than 1.96 or lower than −1.96?
(Strongly) reject the null hypothesis that there is no impact. Note: if instead |T| < 1.96, we only fail to reject the null; we do not accept that the impact is insignificant!
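A minimal sketch of the standard cross-sectional test behind this rule (the CAR values are hypothetical): t = mean(CAR) / (s / sqrt(N)), compared against ±1.96.
```python
import numpy as np

# Hypothetical CARs for N sample firms around the same event type.
cars = np.array([0.021, -0.004, 0.035, 0.012, 0.008, 0.027, -0.010, 0.019])
n = len(cars)

t_stat = cars.mean() / (cars.std(ddof=1) / np.sqrt(n))
print(f"t = {t_stat:.2f}")
if abs(t_stat) > 1.96:
    print("Reject the null of no impact (5% level).")
else:
    print("Fail to reject the null; this does not prove the impact is zero.")
```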
What is a cross-sectional regression for CARs relating to firm characteristics?
Explaining whether firms with certain characteristics have stronger/weaker CARs. For example, firm size may relate to CARs for a certain event study (e.g., CAR is lower for larger firms due to better monitoring by analysts), or CEO replacement may have more impact at a small company. See the sketch below.
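A hedged sketch of such a cross-sectional regression using statsmodels (firm size and CARs are simulated; a negative slope would match the ‘better monitoring of larger firms’ story):
```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
log_firm_size = rng.normal(7.0, 1.5, size=n)              # hypothetical log market cap
car = 0.08 - 0.01 * log_firm_size + rng.normal(0, 0.02, size=n)

# Regress firm-level CARs on the firm characteristic.
X = sm.add_constant(log_firm_size)
res = sm.OLS(car, X).fit(cov_type="HC1")   # robust SEs, as CAR variances may differ
print(res.summary())                       # slope < 0: larger firms, weaker CARs
```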
What are the drawbacks of event studies?
- We assume that the benchmark model used to compute expected returns is correct → if it is misspecified, ARs will be incorrect
- We assume that event windows of firms do not overlap
a. Intuition: if event windows overlap (event clustering), observations (ARs and CARs) are not independent across securities, and typically the variance of the average CAR is underestimated
b. Solution: form portfolios
- We assume that a firm’s beta (systematic risk) remains constant → but beta may change due to corporate events
- Choice of event window and estimation window is a bit arbitrary → use various lengths to check robustness of results
- We assume that abnormal returns follow a normal distribution → use non-parametric tests
What are things to worry about with event studies?
- Anticipated events: information may be leaked to the market in advance of an announcement. (Firms may also time the announcement as a function of recent stock returns)
- Confounding effects: other events may happen (more or less) simultaneously (e.g., an earnings announcement plus a CEO change)
- Event day uncertainty
- Thin or non-synchronous trading: stock prices used to calculate returns may not correspond to the same time within the day if a stock is thinly traded
- Event-induced variance: the volatility of returns is assumed to be the same in event and non-event periods. (If not, the standard test is invalid)
- Clustering/cross-sectional dependence: if event windows overlap, t-tests may reject too often, because abnormal returns exhibit small correlations across securities
What is important for long-horizon event studies?
Method selection, e.g., proper measurement of the volatility of the abnormal returns and a proper method for calculating expected returns (e.g., a Fama-French 3- or 4-factor model).
What is the best way to calculate long-horizon event study returns?
BHARs (buy-and-hold abnormal returns), which include compounding.
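A minimal sketch of the BHAR calculation (returns are placeholders): compound the firm's returns and subtract the compounded benchmark (expected) return over the same horizon.
```python
import numpy as np

def bhar(firm_returns, benchmark_returns):
    """Buy-and-hold abnormal return: compounded firm return minus
    compounded benchmark (expected) return over the same horizon."""
    return np.prod(1 + firm_returns) - np.prod(1 + benchmark_returns)

# Hypothetical monthly returns over a 36-month horizon.
rng = np.random.default_rng(5)
r_bench = rng.normal(0.008, 0.04, size=36)            # e.g., a factor-model benchmark
r_firm = r_bench + rng.normal(0.002, 0.05, size=36)
print(f"BHAR: {bhar(r_firm, r_bench):.4f}")
```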