RS cards Flashcards
What is homoskedasticity?
Errors with a constant variance
What is heteroskedasticity, and what does it cause?
Errors that do not have a constant variance. The OLS estimator will still be unbiased, but it will no longer be the best linear unbiased estimator; other estimators will typically have lower variance.
What are the assumptions behind the best linear unbiased estimator (BLUE)?
1 Mean of Zero and Independence: The error terms u(i) have a mean of zero and are independent of the independent variables X, that is, E(u(i) | X) = 0.
2 Constant Variance (Homoskedasticity): The error terms u(i) have constant variance, denoted as var(u(i)) = σ^2.
3 No Autocorrelation (Serially Uncorrelated): The error terms u(i) are serially uncorrelated, meaning E(u(i) u(j)) = 0 for i ≠ j (there is no autocorrelation).
4 No Perfect Multicollinearity: There is no exact linear relationship between the independent variables.
→ Under these assumptions, the Ordinary Least Squares (OLS) estimator beta-hat is the best linear unbiased estimator (BLUE) for beta.
→ Routinely computed standard errors, t-statistics, etc., are correct.
Optionally:
5 Normality of Error Terms: The error terms u(i) follow a normal distribution.
→ The estimator beta-hat has a normal distribution too (if not, it is approximately normal for large samples).
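A minimal Monte Carlo sketch of what these assumptions buy you (the true coefficients and sample sizes below are made-up illustrations, assuming numpy is available): when the assumptions above hold, OLS recovers the true beta on average.
```python
import numpy as np

# Illustrative Monte Carlo: under the Gauss-Markov assumptions
# (E(u(i)|X) = 0, constant variance, no autocorrelation, no perfect
# multicollinearity), the OLS estimate of beta is unbiased.
rng = np.random.default_rng(0)
beta_true = np.array([1.0, 2.0])   # hypothetical intercept and slope
n, n_sims = 200, 5000

estimates = np.empty((n_sims, 2))
for s in range(n_sims):
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])       # design matrix with intercept
    u = rng.normal(scale=1.0, size=n)          # homoskedastic, serially uncorrelated errors
    y = X @ beta_true + u
    estimates[s] = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS fit

print(estimates.mean(axis=0))  # close to (1.0, 2.0): unbiasedness
```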
What if you don’t meet the criteria?
Mean of Zero and Independence (E(u(i) | X) = 0):
Consequence: If this assumption is violated, it means that there is a systematic error in the model. This often indicates omitted variable bias, where important explanatory variables are missing from the model, causing the error terms to be correlated with the independent variables. As a result, the OLS estimators will be biased and inconsistent, leading to unreliable regression coefficients.
Constant Variance (Homoskedasticity) (var(u(i)) = σ^2):
Consequence: Violation of this assumption leads to heteroskedasticity, where the variance of the error terms changes across observations. In such cases, the OLS estimators remain unbiased and consistent, but they are no longer efficient. This inefficiency means that there are other estimators with smaller variances. Moreover, the standard errors of the OLS estimators will be biased, leading to unreliable hypothesis tests (like t-tests) and confidence intervals.
No Autocorrelation (Serially Uncorrelated) (E(u(i) u(j)) = 0 for i ≠ j):
Consequence: When this assumption is violated, it results in autocorrelation (also known as serial correlation), commonly occurring in time series data. The presence of autocorrelation, like heteroskedasticity, does not cause bias in the OLS estimators, but it does make them inefficient. Additionally, it leads to biased standard error estimates, resulting in misleading hypothesis tests and confidence intervals.
No Perfect Multicollinearity:
Consequence: Multicollinearity occurs when two or more independent variables in a regression model are highly correlated. Perfect multicollinearity, which violates this assumption, makes it impossible to estimate the regression coefficients uniquely, as it becomes unclear how to attribute the effect on the dependent variable among the highly correlated independent variables. In practice, even high (but not perfect) multicollinearity can cause issues, such as inflated standard errors and unstable coefficient estimates, making it difficult to assess the effect of each independent variable.
Optional:
Normality of Error Terms (u(i) follows a normal distribution):
Consequence: If error terms are not normally distributed, the primary consequence is on inference—specifically, hypothesis tests and confidence intervals for the regression coefficients may not be valid. The Central Limit Theorem assures that for large sample sizes, the distribution of the OLS estimators will be approximately normal even if the errors are not. However, for smaller samples, non-normality can significantly impact the validity of these statistical tests.
In summary, violation of these assumptions can lead to various problems, such as biased or inefficient estimators, incorrect standard errors, and invalid hypothesis tests, undermining the reliability and validity of the regression analysis. It’s important to perform diagnostic checks and consider remedial measures or alternative estimation methods when these assumptions are violated.
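As an illustration of such diagnostic checks and remedial measures, here is a hedged sketch using statsmodels (the data-generating process is synthetic): a Breusch-Pagan test for heteroskedasticity, a Durbin-Watson statistic for autocorrelation, and HC3 heteroskedasticity-robust standard errors as one remedy.
```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

# Toy data with heteroskedastic errors (error variance grows with x).
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=500)
u = rng.normal(scale=0.5 * x)          # variance depends on x -> heteroskedastic
y = 1.0 + 2.0 * x + u
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()

# Breusch-Pagan: a small p-value signals heteroskedasticity.
lm_stat, lm_pval, _, _ = het_breuschpagan(ols.resid, X)
print(f"Breusch-Pagan p-value: {lm_pval:.4f}")

# Durbin-Watson: values far from 2 suggest autocorrelation.
print(f"Durbin-Watson: {durbin_watson(ols.resid):.2f}")

# Remedial measure: heteroskedasticity-robust (HC3) standard errors.
robust = sm.OLS(y, X).fit(cov_type="HC3")
print(robust.bse)   # robust SEs; compare with ols.bse
```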
Describe data transformation, and why it's used
Typically a log transformation. It is used to pull extreme values toward the mean, and is applied to ratios and to variables with positive skewness that do not take negative values.
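A small synthetic sketch (assuming numpy and scipy) of how the log transformation reduces positive skewness in a strictly positive variable:
```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(2)
x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)   # positive, right-skewed

print(f"skewness before log: {skew(x):.2f}")          # strongly positive
print(f"skewness after log:  {skew(np.log(x)):.2f}")  # roughly 0
```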
What is an advantage of using normal returns rather than ln(returns)?
With normal (simple) returns, you can calculate the return of a portfolio as the weighted sum of the individual returns: r(p) = w1*r(1) + w2*r(2) + etc.
What is an advantage of log returns?
Multi-period returns are simply the sum of single-period returns.
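A quick numerical check of both properties, with purely illustrative returns: simple returns aggregate across assets by weighting; log returns aggregate across time by summing.
```python
import numpy as np

# Cross-section: portfolio simple return is the weighted sum of asset returns.
w = np.array([0.6, 0.4])           # hypothetical weights
r = np.array([0.05, -0.02])        # simple returns of two assets
print(w @ r)                       # portfolio return: 0.022

# Time series: a multi-period log return is the sum of single-period log returns.
r_t = np.array([0.03, -0.01, 0.02])          # simple returns over 3 periods
log_r = np.log1p(r_t)                        # convert to log returns
total_simple = np.prod(1 + r_t) - 1          # compounded simple return
print(np.isclose(np.sum(log_r), np.log1p(total_simple)))  # True
```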
What are the steps of an event study?
- Identify event and select sample firms
- Determine length of event window
- Define estimation period (pre-event window)
- Calculate ‘normal’ (expected) returns
- Construct (cumulative) abnormal returns (CAR)
- Determine statistical significance of CAR
- Analyze patterns in CARs (regressions, subsamples, etc.)
In an event study, what are the ways to calculate expected returns?
Computing Expected Returns
Event studies aim to estimate ‘abnormal’ return, AR(t), occurring around a specified corporate event.
R(t) = E[R(t)|I(t)] + AR(t)
R(t) is the actual return during the event period.
E[R(t)|I(t)] is the expected or ‘normal’ return given information I(t).
Common approaches to compute expected return:
Constant-mean-return model: E[R(t)] = mu
Take historical average return as a proxy for ‘normal’ stock return.
Market-adjusted-return model: E[R(t)] = R(m)(t)
Take market return as a proxy for ‘normal’.
Imprecise but useful if no pre-event data is available to estimate model parameters.
Market model: E[R(t)] = alpha + beta R(m)(t)
Alpha and beta are estimated over an estimation window disjoint from the event window to avoid contamination (time-series regression).
Choice of estimation window: trade-off between precision and timeliness; use half a year or one year of daily data.
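A compact sketch of the market-model approach (all returns below are simulated placeholders): estimate alpha and beta on the estimation window, then compute abnormal returns and the CAR over the event window.
```python
import numpy as np

def market_model_car(r_stock, r_market, est_slice, event_slice):
    """Estimate alpha/beta on the estimation window (disjoint from the
    event window), then return abnormal returns and the CAR for the event window."""
    X = np.column_stack([np.ones(len(r_market[est_slice])), r_market[est_slice]])
    alpha, beta = np.linalg.lstsq(X, r_stock[est_slice], rcond=None)[0]
    expected = alpha + beta * r_market[event_slice]   # 'normal' returns
    ar = r_stock[event_slice] - expected              # abnormal returns AR(t)
    return ar, ar.sum()                               # ARs and CAR

# Hypothetical daily returns: one year of estimation data, 11-day event window.
rng = np.random.default_rng(3)
r_m = rng.normal(0.0005, 0.01, size=263)
r_i = 0.0002 + 1.2 * r_m + rng.normal(0, 0.015, size=263)
ar, car = market_model_car(r_i, r_m, slice(0, 252), slice(252, 263))
print(f"CAR over event window: {car:.4f}")
```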
Why do cross-sectional analysis of CARs in an event study?
Cross-sectional variation in firm-specific CARs can be large, yet hidden in the average CAR. Think of large differences in stock returns after earnings announcements for good-news and bad-news firms that cancel each other out.
What do we do with a null hypothesis if the T-value is larger than 1.96 or lower than −1.96?
(Strongly) reject the null hypothesis that there is no impact. Note: if instead |T| < 1.96, we only fail to reject the null; we do not accept that the impact is insignificant!
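A minimal sketch of the standard cross-sectional test behind this rule (the CAR values are hypothetical): t = mean(CAR) / (s / sqrt(N)), compared against ±1.96.
```python
import numpy as np

# Hypothetical CARs for N sample firms around the same event type.
cars = np.array([0.021, -0.004, 0.035, 0.012, 0.008, 0.027, -0.010, 0.019])
n = len(cars)

t_stat = cars.mean() / (cars.std(ddof=1) / np.sqrt(n))
print(f"t = {t_stat:.2f}")
if abs(t_stat) > 1.96:
    print("Reject the null of no impact (5% level).")
else:
    print("Fail to reject the null; this does not prove the impact is zero.")
```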
What is a cross-sectional regression for CARs relating to firm characteristics?
Explaining whether firms with certain characteristics have stronger/weaker CARs. For example, firm size may relate to CARs for a certain event study (e.g., CAR is lower for larger firms due to better monitoring by analysts), or CEO replacement may have more impact at a small company. See the sketch below.
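A hedged sketch of such a cross-sectional regression using statsmodels (firm size and CARs are simulated; a negative slope would match the ‘better monitoring of larger firms’ story):
```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 200
log_firm_size = rng.normal(7.0, 1.5, size=n)              # hypothetical log market cap
car = 0.08 - 0.01 * log_firm_size + rng.normal(0, 0.02, size=n)

# Regress firm-level CARs on the firm characteristic.
X = sm.add_constant(log_firm_size)
res = sm.OLS(car, X).fit(cov_type="HC1")   # robust SEs, as CAR variances may differ
print(res.summary())                       # slope < 0: larger firms, weaker CARs
```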
What are the drawbacks of event studies?
- We assume that the benchmark model used to compute expected returns is correct → if it is misspecified, ARs will be incorrect
- We assume that event windows of firms do not overlap
a. Intuition: if event windows overlap (event clustering), observations (ARs and CARs) are not independent across securities, and typically the variance of the average CAR is underestimated
b. Solution: form portfolios
- We assume that a firm’s beta (systematic risk) remains constant → but beta may change due to corporate events
- Choice of event window and estimation window is a bit arbitrary → use various lengths to check robustness of results
- We assume that abnormal returns follow a normal distribution → use non-parametric tests
What are things to worry about with event studies?
- Anticipated events: information may be leaked to the market in advance of an announcement. (Firms may also time the announcement as a function of recent stock returns)
- Confounding effects: other events may happen (more or less) simultaneously (e.g., an earnings announcement plus a CEO change)
- Event day uncertainty
- Thin or non-synchronous trading: stock prices used to calculate returns may not correspond to the same time within the day if a stock is thinly traded
- Event-induced variance: the volatility of returns is assumed to be the same in event and non-event periods. (If not, the standard test is invalid)
- Clustering/cross-sectional dependence: if event windows overlap, t-tests may reject too often, because abnormal returns exhibit small correlations across securities
What is important for long-horizon event studies?
Method selection, e.g., proper measurement of the volatility of the abnormal returns and a proper method for calculating expected returns (e.g., a Fama-French 3- or 4-factor model).
What is the best way to calculate long-horizon event study returns?
BHARs (buy-and-hold abnormal returns), which include compounding.
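A minimal sketch of the BHAR calculation (returns are placeholders): compound the firm's returns and subtract the compounded benchmark (expected) return over the same horizon.
```python
import numpy as np

def bhar(firm_returns, benchmark_returns):
    """Buy-and-hold abnormal return: compounded firm return minus
    compounded benchmark (expected) return over the same horizon."""
    return np.prod(1 + firm_returns) - np.prod(1 + benchmark_returns)

# Hypothetical monthly returns over a 36-month horizon.
rng = np.random.default_rng(5)
r_bench = rng.normal(0.008, 0.04, size=36)            # e.g., a factor-model benchmark
r_firm = r_bench + rng.normal(0.002, 0.05, size=36)
print(f"BHAR: {bhar(r_firm, r_bench):.4f}")
```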