Regression, event, panel, portfolio Flashcards

1
Q

Regression analysis shows correlations, not causal relationships. Why?

A

Because the direction or nature of causality depends on a solid theory, not just statistical modeling.

2
Q

Explain Exogeneity of covariates

A

That the error u is not a function of X.

  • The covariates (independent variables) don’t contain any information that predicts the error term (u).
  • This ensures that the model is correctly specified, and the independent variables only explain the dependent variable, not the errors.
  • For every data point, the expected value of the error term, given the independent variables, is zero.
  • This ensures that errors are purely random and not systematically related to the covariates.
3
Q

What is endogeneity?

A

Endogeneity is when the error term u is related to the independent variables → biased and inconsistent estimates

4
Q

What is homoskedasticity?

A

Also known as the constant variance assumption: the error u has the same variance given any value of the explanatory variables, i.e. Var(u | X) = σ².

Homoskedasticity ensures that the regression model treats all observations equally. If the error variance changes across observations (heteroskedasticity), the coefficient estimates remain unbiased but are inefficient, and the usual standard errors and hypothesis tests become unreliable.

5
Q

What can you say about the data generating process of the covariates and errors?

A

The data in X can be a combination of constant and random variables

  • OLS relies on variance in the covariates to estimate the relationship between independent variables and the dependent variable.
  • If a covariate doesn’t vary (e.g., all values are the same), OLS cannot estimate its effect because it has no explanatory power.
6
Q

What does the exogeneity assumption say?

A

The error term u is unrelated to the independent variables X.

It ensures that the model captures the true relationship between X and Y without bias from omitted variables.

7
Q

Which are the OLS derivations?

A
  1. standard errors
  2. the t-test
  3. goodness-of-fit (R-squared)
8
Q

What are standard errors?

A

Standard errors tell us how much the model’s predictions and estimates (like the coefficients) might vary due to random noise or limited data.

9
Q

What is residual variance?

A

Measures how far the actual data points are from the model’s prediction on average (tells how much error is left after fitting the regression line).

10
Q

What is the residual standard error?

A

The average size of the errors in the model’s predictions.

11
Q

What is the t-test?

A

T-tests are used in regression to check whether a regression coefficient β is significantly different from zero. This helps determine whether an independent variable significantly contributes to the model.

The p-value should be below the significance level (commonly 0.05) for a variable to be considered statistically significant.

12
Q

What is heteroskedasticity?

A

Occurs when the variance of the error terms u in a regression model is not constant. So the “errors” (mistakes) in your regression model don’t have a consistent spread (their variability changes across observations).

Heteroskedasticity doesn’t bias the regression coefficients but it makes standard errors and hypothesis testing unreliable.

13
Q

How can heteroskedasticity be addressed?

A
  • Robust Standard Errors: Adjusts the standard errors to account for heteroskedasticity.
  • Weighted Least Squares (WLS): Reweights observations to stabilize variance.
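
A minimal sketch of both remedies in Python with statsmodels, assuming a hypothetical DataFrame in which the error variance grows with the regressor (the column names `y` and `x` and the simulated data are made up for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical data: error spread grows with x (heteroskedastic)
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.uniform(size=200)})
df["y"] = 1 + 2 * df["x"] + rng.normal(size=200) * (0.5 + df["x"])

# (1) OLS with heteroskedasticity-robust (White-type) standard errors
ols_robust = smf.ols("y ~ x", data=df).fit(cov_type="HC1")
print(ols_robust.bse)   # robust standard errors

# (2) Weighted least squares: weight each observation by 1 / assumed error variance
wls = smf.wls("y ~ x", data=df, weights=1.0 / (0.5 + df["x"]) ** 2).fit()
print(wls.params)       # WLS coefficient estimates
```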
14
Q

What are auto-correlated errors?

A

Autocorrelated errors occur when the errors (residuals) in a regression model are not independent but instead show a pattern or relationship over time. This violates one of the key assumptions in regression analysis, leading to unreliable results.

15
Q

How can you solve auto-correlated errors? (2)

A
  • Adjust your model to directly address the source of autocorrelation (e.g., include lagged terms, past values or leads)
  • Use robust standard errors (like Newey-West) to correct for the issues in residuals.
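
A minimal sketch of both fixes in Python with statsmodels, assuming a hypothetical time series with AR(1) errors (all names and data here are invented for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical time series with autocorrelated (AR(1)) errors
rng = np.random.default_rng(1)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 0.6 * e[t - 1] + rng.normal()
df = pd.DataFrame({"x": rng.normal(size=300)})
df["y"] = 1 + 0.5 * df["x"] + e

# (1) Model the dynamics directly by adding a lagged dependent variable
df["y_lag"] = df["y"].shift(1)
dynamic = smf.ols("y ~ x + y_lag", data=df.dropna()).fit()
print(dynamic.params)

# (2) Keep the static model but use Newey-West (HAC) standard errors
hac = smf.ols("y ~ x", data=df).fit(cov_type="HAC", cov_kwds={"maxlags": 4})
print(hac.bse)
```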
16
Q

What is multicollinearity?

A

Multicollinearity occurs when independent variables are highly (but not perfectly) correlated with each other. It doesn’t violate the assumption of “no perfect linear dependence” (as long as predictors aren’t perfectly collinear), but it still causes numerical issues in estimating coefficients.

Large standard errors due to multicollinearity make it hard to determine the true effect of each variable, leading to unstable regression results.

Multicollinearity can be measured through VIF, Variance Inflation Factor. High VIF indicates severe multicollinearity and inflated standard errors.
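
A minimal sketch of computing VIFs in Python with statsmodels, assuming a hypothetical DataFrame with two nearly collinear regressors (`x1`, `x2`) and one unrelated regressor (`x3`):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# hypothetical regressors: x2 is almost a copy of x1
rng = np.random.default_rng(2)
x1 = rng.normal(size=500)
df = pd.DataFrame({"x1": x1,
                   "x2": x1 + 0.05 * rng.normal(size=500),
                   "x3": rng.normal(size=500)})

X = sm.add_constant(df)   # include an intercept, as in the regression itself
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(1, X.shape[1])],
    index=X.columns[1:],
)
print(vif)  # a VIF above ~10 is a common rule of thumb for severe multicollinearity
```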

17
Q

How can you solve multicollinearity?

A
  • increase sample size N (this increases SST)
  • remove or combine highly correlated variables
18
Q

What are irrelevant variables?

A

An over-specified model occurs when irrelevant variables are included in the model. The irrelevant variable is denoted z.

Including irrelevant variables (z) does not introduce bias in the coefficient estimates (β). However, it increases variance in the estimates due to sampling error, making the model less efficient.

Over-specifying the model adds unnecessary noise, which can affect the reliability and interpretability of the results.

19
Q

What are omitted variables?

A

When the error term u is not purely random noise; it contains an omitted variable z, which creates bias.

Omitted variables create:

Bias in coefficients:

Omitting a relevant variable z introduces bias in β̂ because the effect of z is wrongly attributed to X.

The bias increases if z is strongly correlated with X or has a large γ (strong effect on y).

In contrast to over-specified models (where coefficients remain unbiased but less efficient), under-specified models produce biased estimates.

Practical Impact:

Omitting relevant variables can severely distort conclusions from the model, leading to incorrect inferences about the relationship between X and y.

20
Q

What is the difference between sampling error and omitted variables bias?

A

Sampling error diminishes as the sample size N increases. This is not the case for omitted variable bias, because that bias is systematic and stems from the structure of the model itself: a relevant variable is left out and is correlated with X, so X picks up the effect of the omitted variable.

21
Q

How can you treat outliers?

A
  • Transformation:

Apply mathematical transformations to reduce the influence of extreme values. Example: Use the natural logarithm to compress large values and spread smaller ones.

  • Trimming:

Remove extreme values (e.g., top and bottom 5% of the dataset) from the analysis.

  • Winsorizing:

Replace extreme values with the nearest non-outlier value. Example: Cap values at the 95th percentile or floor them at the 5th percentile.
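
A minimal sketch of all three treatments in Python with pandas, assuming a hypothetical right-skewed series (e.g., firm size); the percentile cut-offs are just the examples above:

```python
import numpy as np
import pandas as pd

# hypothetical skewed variable
rng = np.random.default_rng(3)
s = pd.Series(rng.lognormal(mean=10, sigma=1.5, size=1000))

# Transformation: natural log compresses large values
s_log = np.log(s)

# Trimming: drop observations outside the 5th-95th percentiles
lo, hi = s.quantile([0.05, 0.95])
s_trimmed = s[(s >= lo) & (s <= hi)]

# Winsorizing: cap/floor extreme values at the same percentiles instead of dropping them
s_winsorized = s.clip(lower=lo, upper=hi)
```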

22
Q

What is the constant elasticity model?

A

A constant elasticity model (a log-log model) is a type of regression model in which the relationship between the dependent variable and the independent variable(s) exhibits a constant elasticity: a 1% change in X is associated with a constant percentage change in Y.
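
A minimal sketch of a log-log (constant elasticity) regression in Python with statsmodels, using hypothetical price/quantity data generated with a true elasticity of about -1.2:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical demand data with constant price elasticity of roughly -1.2
rng = np.random.default_rng(4)
df = pd.DataFrame({"price": rng.uniform(1, 100, size=400)})
df["quantity"] = np.exp(5 - 1.2 * np.log(df["price"]) + rng.normal(scale=0.3, size=400))

# log-log specification: the slope is the (constant) elasticity
model = smf.ols("np.log(quantity) ~ np.log(price)", data=df).fit()
print(model.params["np.log(price)"])  # ~ -1.2: a 1% price rise -> ~1.2% fall in quantity
```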

23
Q

What is ordinary least squares?

A

OLS chooses the estimates that minimize the sum of squared residuals (SSR).

Each slope estimate measures the partial effect of the corresponding independent variable on the dependent variable, holding all other independent variables fixed.

24
Q

What does no perfect collinearity mean?

A

In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables

25
Q

What is perfect collinearity? And how is it different from multicollinearity?

A

Perfect Collinearity: X1 = X2
The two independent variables are indistinguishable in the model. Correlation = 1
Regression cannot be estimated.

Multicollinearity: X1≈ X2
The independent variables are highly similar but not identical, correlation is close to, but not 1.
Regression can be estimated, but coefficients are unreliable.

26
Q

What is zero conditional mean?

A

The error u has an expected value of zero given any values of the independent variables. When this assumption holds, we often say that we have exogenous explanatory variables.

27
Q

What is micronumerosity?

A

Problems of small sample size

28
Q

What is BLUE?

A

Best linear unbiased estimator. Under the Gauss-Markov assumptions, the OLS estimators are the best linear unbiased estimators (BLUEs).

29
Q

What is the error variance?

A

The error variance measures the “noise” or randomness in the model. Higher error variance means more variability in the dependent variable Y that cannot be explained by the independent variables X

30
Q

What is the normality assumption?

A

States that the error terms u are normally distributed. This assumption is particularly important for conducting hypothesis tests and creating confidence intervals for the regression coefficients.

31
Q

What is a dummy variable?

A

A dummy variable is defined to distinguish between two groups. Dummy variables are also useful for incorporating ordinal information in regression models: we simply define a set of dummy variables representing the different outcomes of the ordinal variable, leaving one of the categories as the base group.

32
Q

What is the dummy variable trap?

A

Arises when too many dummy variables describe a given number of groups (e.g., a dummy for every category plus an intercept), which creates perfect collinearity.

33
Q

What is an ordinal variable?

A

An ordinal variable is a type of categorical variable where the categories have a natural, meaningful order or ranking, but the differences between the ranks are not necessarily equal or measurable (e.g., good, better, best).

34
Q

What is the self-selection problem?

A

the problem of participation decisions differing systematically by individual characteristics

35
Q

How can you identify heteroskedasticity?

A

To detect the presence of heteroskedasticity, you can use a White test.
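
A minimal sketch of running a White test in Python with statsmodels, assuming a hypothetical regression whose error spread grows with x:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.diagnostic import het_white

# hypothetical regression with heteroskedastic errors
rng = np.random.default_rng(5)
df = pd.DataFrame({"x": rng.uniform(size=500)})
df["y"] = 1 + 2 * df["x"] + rng.normal(size=500) * (0.2 + df["x"])

res = smf.ols("y ~ x", data=df).fit()
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(res.resid, res.model.exog)
print(lm_pvalue)  # a small p-value -> reject homoskedasticity
```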

36
Q

What is the sample size in a time-series dataset determined by?

A

the sample size is determined by the number of time periods for which we observe the variables of interest (e.g., 30 daily observations of stock prices = sample size of 30).

37
Q

What is the spurious regression problem?

A

happens when you analyze two unrelated data sets that both have trends (non-stationary), and the regression falsely shows a strong relationship. This misleading result happens because the trends overlap, not because there’s any real connection. Ex: If you compare ice cream sales to sea level rise, both might increase over time, but they have no real link. The regression might say they are connected just because they both trend upward

38
Q

What is the Durbin-Watson statistic?

A

a test for autocorrelation. it checks whether the errors in your model are independent over time.

“D for Durbin-Watson, D for Detecting Dependence.”

39
Q

What is the Breusch-Godfrey test?

A

is a statistical test used to detect serial correlation (autocorrelation) in the residuals of a regression model. Unlike the Durbin-Watson test, which is limited to detecting first-order autocorrelation, the Breusch-Godfrey test can detect higher-order autocorrelation
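
A minimal sketch of both autocorrelation tests in Python with statsmodels, assuming a hypothetical time-series regression with AR(1) errors:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# hypothetical time-series regression with autocorrelated errors
rng = np.random.default_rng(6)
e = np.zeros(300)
for t in range(1, 300):
    e[t] = 0.5 * e[t - 1] + rng.normal()
df = pd.DataFrame({"x": rng.normal(size=300)})
df["y"] = 0.3 * df["x"] + e

res = smf.ols("y ~ x", data=df).fit()
print(durbin_watson(res.resid))                 # values near 2 -> no first-order autocorrelation
lm, lm_pvalue, f, f_pvalue = acorr_breusch_godfrey(res, nlags=4)
print(lm_pvalue)                                # small p-value -> autocorrelation up to lag 4
```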

40
Q

When is weighted least squares used?

A

when the assumption of homoskedasticity is violated. WLS assigns weights to each observation to give less influence to observations with higher variance and more influence to those with lower variance

41
Q

What is panel data regression?

A

Panel data regression is a method used to analyze data that varies across two dimensions: entities (e.g., individuals, firms, countries) and time periods (e.g., years, quarters). This type of data allows us to control for unobserved characteristics that are constant within entities or time periods, making the analysis more robust.

Panel data is data where you observe the same entities (e.g., people, companies, countries) over multiple time periods. Example: Tracking the income of individuals over 5 years.

Panel data has two dimensions:

Entities (cross-sectional dimension): Different individuals, firms, or countries.

Time (time-series dimension): Multiple observations over time for each entity.

42
Q

What is pooled OLS?

A

Pooled OLS simplifies panel data analysis by ignoring the panel structure, but it requires strong assumptions. Pooled OLS simplifies the model by ignoring the entity-specific and time-specific effects.

Pooled OLS is appropriate when entity-specific and time-specific effects are uncorrelated with the independent variables.

It assumes:

  • No Correlation Between Covariates and Entity-Specific Effects
  • No Correlation Between Covariates and Time-Specific Effects
43
Q

What is first differencing?

A

First-differencing is a way to clean up your data by removing things that don’t change over time. First-differencing removes anything that stays the same over time (e.g., intelligence, personality), so you can focus on how changes in your independent variable (e.g., education) impact changes in your dependent variable (e.g., salary).

“first difference frees constant”
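
A minimal sketch of first-differencing a panel in Python with pandas, using a small made-up person-year panel (the column names and values are purely illustrative):

```python
import pandas as pd

# hypothetical panel: the same people observed over several years
df = pd.DataFrame({
    "person": [1, 1, 1, 2, 2, 2],
    "year":   [2018, 2019, 2020, 2018, 2019, 2020],
    "educ":   [12, 13, 14, 16, 16, 17],
    "salary": [300, 320, 345, 500, 505, 540],
}).sort_values(["person", "year"])

# first differences within each person: time-invariant traits (e.g. ability) drop out
fd = df.groupby("person")[["educ", "salary"]].diff().dropna()
# regressing the change in salary on the change in educ (e.g. with statsmodels)
# then estimates the effect of changes in education on changes in salary
print(fd)
```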

44
Q

What is LSDV?

A

Least Squares Dummy Variables. LSDV is a panel data regression approach that explicitly includes dummy variables to control for unobserved factors that vary:

  • Across entities (cross-sectional effects).
  • Across time (time-specific effects).

LSDV is equivalent to the de-meaning (within transformation) used for the fixed-effects regression: it yields the same slope coefficients as fixed effects, but with more output (a coefficient for every dummy).
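
A minimal sketch of LSDV in Python with statsmodels, assuming a hypothetical long-format panel file with columns `firm`, `year`, `y`, and `x` (the file name and columns are placeholders, not real data):

```python
import pandas as pd
import statsmodels.formula.api as smf

# placeholder: load your own long-format panel with columns firm, year, y, x
df = pd.read_csv("panel.csv")

# LSDV: one dummy per firm and (optionally) per year via C(...)
lsdv = smf.ols("y ~ x + C(firm) + C(year)", data=df).fit()
print(lsdv.params["x"])  # same slope on x as the within (de-meaned) fixed-effects estimator
```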

45
Q

What is a natural experiment?

A

natural experiment occurs when some exogenous event—often a change in government policy—changes the environment in which individuals, families, firms, or cities operate. A natural experiment always has a control group, which is not affected by the policy change, and a treatment group, which is thought to be affected by the policy change. Unlike a true experiment, in which treatment and control groups are randomly and explicitly chosen, the control and treatment groups in natural experiments arise from the particular policy change.

46
Q

What is parallel trends assumption?

A

The assumption that average trends (e.g., health trends for the low-income and middle-income families) would have been the same for the treatment and control groups in the absence of the intervention.

48
Q

When is the random effects estimator appropriate?

A

when the unobserved effect is thought to be uncorrelated with all the explanatory variables; Random Effects assumes exogeneity

49
Q

What is a cluster sample?

A

A cluster sample has the same appearance as a cross-sectional data set, but there is an important difference: clusters of units are sampled from a population of clusters rather than sampling individuals from the population of individuals. For example, each family is sampled from the population of families, and then we obtain data on at least two family members; each family is therefore a cluster.

50
Q

What are different analyses used for panel data?

A

Panel data combines both cross-sectional and time-series elements:

  1. Cross-sectional format: one time period but several entities (horizontal)
  2. Time-series format: one entity, several time periods (vertical)
51
Q

What is a balanced panel?

A

Equal number of observations for each firm for the entire period. (every firm has 3 years of observations)

When there are no missing observations.

52
Q

What is an unbalanced panel?

A

Number of observations are NOT the same for all firms

53
Q

What is the residual error term called in panel data regression?

A

Epsilon (ε), the idiosyncratic error, which varies over time. It sits in the composite error alongside eta (η_i), the time-invariant individual-specific component (time-invariant = does not change over time, stays the same).

54
Q

A common way of estimating a panel regression is by using the fixed-effects regression. What is another name for it?

A

Within regression

“fixed friends stay within”

55
Q

Why is the standard OLS regression a bad model for panel data?

A

It does not take into account the panel structure of the data. It fails because it does not take into account unobserved time-invariant heterogeneity.

56
Q

How are FE and FD similar?

A

Both can deal with the problem of eta(i), the time-invariant individual-specific component.

57
Q

What is another word for panel data?

A

Longitudinal data

58
Q

What is cross-sectional data?

A

Observations of the subjects are obtained at the same point in time

59
Q

What is time-series data?

A

Observations are generated over time

60
Q

What is time-series regression?

A

Estimate a time-series model for each firm using OLS

But we end up with disparate pieces of information, which would not enable a comprehensive assessment of how X1 and X2 jointly affect Y.

It also ignores information about other firms operating in the same environment, and serial correlation might be a problem because of the time-dependent nature of Y.

61
Q

What is an event study?

A

An event study typically examines return behavior for a sample of firms experiencing a common type of event, i.e., the behavior of firms’ stock prices around corporate events. Event studies focusing on announcement effects over a short horizon around an event provide evidence relevant for understanding corporate policy decisions. Event studies are joint tests of market efficiency and a model of expected returns.

62
Q

How do event studies test market efficiency?

A

Event studies also test market efficiency, since systematically nonzero abnormal security returns that persist after a particular type of corporate event are inconsistent with market efficiency. Event studies focusing on long horizons following an event can provide key evidence on market efficiency: examination of post-event returns provides information on market efficiency.

63
Q

What is a model of normal returns?

A

A model of normal returns must be specified before an abnormal return can be defined. A variety of expected-return models have been used in event studies, such as the capital asset pricing model or the constant mean return model. The approaches can be grouped into two categories, statistical and economic. For example:

Statistical

  1. Constant Mean Return Model
  2. Market Model
  3. Factor Model

Economic

  1. Capital Asset Pricing Model
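
A minimal sketch of two of these models in Python with statsmodels, assuming hypothetical daily return series split into an estimation window and an event window (all data here is simulated for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# hypothetical returns for one stock and the market
rng = np.random.default_rng(7)
est = pd.DataFrame({"r_stock": rng.normal(0, 0.02, 250),
                    "r_mkt":   rng.normal(0, 0.015, 250)})   # estimation window
event = pd.DataFrame({"r_stock": rng.normal(0, 0.02, 11),
                      "r_mkt":   rng.normal(0, 0.015, 11)})  # event window

# Constant mean return model: normal return = average return in the estimation window
ar_const = event["r_stock"] - est["r_stock"].mean()

# Market model: r_stock = alpha + beta * r_mkt + e, estimated on the estimation window
mm = smf.ols("r_stock ~ r_mkt", data=est).fit()
ar_market = event["r_stock"] - mm.predict(event)   # abnormal returns in the event window
```
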
64
Q

What is the Type I error?

A

occurs when the null hypothesis is falsely rejected

65
Q

What is type II error?

A

occurs when the null is falsely accepted

66
Q

What is the joint test problem?

A

All tests are joint tests. That is, event study tests are well-specified only to the extent that the assumptions underlying their estimation are correct. This poses a significant challenge because event study tests are joint tests of whether abnormal returns are zero and of whether the assumed model of expected returns (CAPM etc.) is correct.

67
Q

How is Pooled OLS a Panel Data method?

A

Pooled OLS is the simplest way to estimate a model with panel data. It pools all observations across time and individuals, treating them as a single dataset

68
Q

What is First Differencing?

A

First-differencing focuses on changes between periods, while time-demeaning focuses on deviations from the individual’s average.

69
Q

What is autocorrelation?

A

Autocorrelation means that the error terms / residuals in a model are related to each other over time. In simple terms, it’s when past errors influence current errors, creating a pattern instead of randomness. This can lead to biased or inefficient results in regression analysis.

70
Q

What is LSDV?

A

Adds dummy variables for each individual to directly account for unobserved, time-invariant factors (fixed effects)

71
Q

What is Time-Demeaning Fixed Effects?

A

Removes individual averages (demeaning) to account for unobserved, time-invariant factors without adding dummy variables.
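
A minimal sketch of the within (time-demeaning) transformation in Python with pandas and statsmodels, assuming a hypothetical long-format panel file with columns `firm`, `y`, and `x`; note that the standard errors from this shortcut would need a degrees-of-freedom correction (dedicated panel packages handle this):

```python
import pandas as pd
import statsmodels.api as sm

# placeholder: load your own long-format panel with columns firm, y, x
df = pd.read_csv("panel.csv")

# subtract each firm's mean from y and x (the "within" transformation)
demeaned = df[["y", "x"]] - df.groupby("firm")[["y", "x"]].transform("mean")

# OLS on the de-meaned data gives the fixed-effects (within) slope
fe = sm.OLS(demeaned["y"], demeaned[["x"]]).fit()
print(fe.params["x"])
```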

72
Q

What is random effects?

A

Assumes unobserved factors are random and uncorrelated with explanatory variables, making it more efficient but less robust if the assumption is wrong.

73
Q

How do you decide between Random Effect or Fixed Effect?

A

Hausman test

74
Q

How to handle autocorrelation?

A

Check if the model is properly specified. Consider adding lagged variables (past values) or leads (future values) to make it dynamic.

75
Q

How to handle contemporaneous correlation?

A

A contemporaneous correlation refers to the relationship between two variables or residuals measured at the same point in time. It indicates how strongly the two variables are associated during the same time period, without considering time lags or leads. If errors across units are related, try a two-way model to account for this.

76
Q

How to handle heteroskedasticity and autocorrelation together?

A

Use robust standard errors (Heteroskedasticity and Autocorrelation Consistent) to adjust for these issues.

77
Q

What is Calendar time portfolio?

A

Rather than stacking all events into “event time”, we may form a so-called calendar-time portfolio.

All stocks that have experienced an event in a given time period are included in the portfolio.

78
Q

How is portfolio performance evaluated?

A

By using Jensen’s alpha

79
Q

How does Jensens alpha work?

A

Jensen’s alpha assumes only beta risk matters (1-factor model), and any non-zero alpha is an anomaly (challenges market efficiency).

80
Q

How does Jensens alpha lead to the joint hypothesis problem?

A

Jensen’s alpha assumes only beta risk matters (a 1-factor model, in line with the Efficient Market Hypothesis), and any non-zero alpha is an anomaly that challenges market efficiency.

This leads to the joint hypothesis problem: Is the issue with market efficiency, the asset-pricing model, or both? It’s unclear.

81
Q

Does a cross-sectional regression allow for a hedge portfolio?

82
Q

Does a cross-sectional regression allow for multivariate regression?

A

Yes. Unlike portfolio sorts, a cross-sectional regression can include several characteristics (factors) at once.

83
Q

Portfolio sorts are difficult to apply with more than two factors, and cross-sectional regression (which does allow for multivariate regression) may suffer from omitted variables. What is the solution?

A
  1. To allow for panel fixed effects and perhaps apply a two-way-fixed effect event study
84
Q

What is an event study?

A

A quasi-experiment where the event is the treatment

85
Q

What does the research design in an event study do?

A

The research design isolates the treatment effect from other confounding factors

86
Q

Is the treatment always permanent?

A

No, it can be temporary

87
Q

Is an event study transparent and replicable?

88
Q

When is an event study particularly useful?

A

When Randomized Control Trials cannot be used

89
Q

What is a staggered rollout design?

A

Events occur at different calendar times

90
Q

What is a non-staggered rollout design?

A

All events occur at the same calendar time

91
Q

How can you predict the counterfactual (what would have happened without a treatment or event)?

A
  1. Ignore it (assumes a random walk)
  2. Use pre-event data to identify trends
  3. Combine pre-event and post-event data
92
Q

What does the regression-based event study method look like?

A

It’s a pooled OLS with dummies.

Can be:
One-way fixed effect with dummies or two-way fixed effect with dummies
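
A minimal sketch of both variants in Python with statsmodels, assuming a hypothetical firm-year panel file with columns `firm`, `year`, `y`, and a `post` dummy that switches on (and stays on) after each firm's event:

```python
import pandas as pd
import statsmodels.formula.api as smf

# placeholder: load your own firm-year panel with columns firm, year, y, post
df = pd.read_csv("event_panel.csv")

# one-way fixed effects: firm dummies only
one_way = smf.ols("y ~ post + C(firm)", data=df).fit()

# two-way fixed effects: firm and year dummies
two_way = smf.ols("y ~ post + C(firm) + C(year)", data=df).fit()
print(two_way.params["post"])   # estimated treatment effect
```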

93
Q

What are the dummies like for a regression based event study method?

A

Typically binary (on/off)
Typically permanent (always “on” after an event)

94
Q

How can you simplify the two-way dynamic fixed effect design?

A

Simpler models can ignore fixed effects OR use fewer dummies

  • a single treatment dummy
  • separate dummies for different time windows
95
Q

What are some common research design issues in event study?

A

Cohort changes: Pre- and post-event groups may differ.

Anticipation: Participants might know about the treatment before it happens, affecting the pre-event window.

Parallel-trend assumption: Assumes trends in treatment and control groups would have been the same without treatment.

Homogeneous treatment effect: Assumes the treatment effect is the same across all groups (cross-sections).

96
Q

Which are the two versions of an event study research design?

A
  1. A pure event study
  2. Portfolio sorts
97
Q

What does CAR measure?

A

the total abnormal returns over a period
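
A minimal sketch of computing CAR in Python with pandas, assuming a hypothetical series of daily abnormal returns over an 11-day event window (days -5 to +5):

```python
import numpy as np
import pandas as pd

# hypothetical abnormal returns for one stock over the event window
rng = np.random.default_rng(8)
ar = pd.Series(rng.normal(0, 0.01, 11), index=range(-5, 6), name="abnormal_return")

car_path = ar.cumsum()      # cumulative abnormal return up to each event day
car_total = ar.sum()        # CAR over the whole window
print(car_total)
```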

98
Q

What is the counterfactual?

A

Represents the expected return without the event.

Calculated using both pre-event and post-event data to predict what would have occurred.

99
Q

Do event studies assume that the market is efficient?

100
Q

What does market efficiency assume about share prices?

A

share prices reflect all available information.

101
Q

What happens to share prices if information is released randomly?

A

If information is released randomly, share prices should follow a random walk, with no predictable gains (zero-NPV transactions).

102
Q

What does it mean if shares behave differently after new information?

A

It is nonrandom performance: it indicates that the event impacts security values.

103
Q

What does the constant return model assume?

A

That the mean return for a share is constant

104
Q

Is an equally weighted index or value-weighted index preferred?

A

An equally weighted index

105
Q

How does the market model reduce the variance of abnormal returns?

A

It removes the portion of returns related to market variations

106
Q

Is lower or higher variance better in a market model?

A

Lower. It improves the ability to detect event effects. The benefit depends on the model’s R²: a higher R² means greater variance reduction and better accuracy.

107
Q

What does the market-adjusted model ignore?

A

Risk factors

108
Q

When does the market-adjusted model work well?

A

For short-window event studies.

109
Q

How does the market-adjusted model work?

A

It simplifies analysis by directly comparing stock returns to market returns.

110
Q

Which 3 factors are included in the Fama and French 3 factor model?

A
  1. Market returns
  2. Size factor
  3. Value factor
111
Q

Adding factors reduces abnormal return variance only slightly because…

A

their explanatory power is marginal

112
Q

When does the Fama and French 3 factor model work best?

A

when firms share common traits (e.g., same industry or market capitalization group).

113
Q

Sometimes we have two factors that affect returns. Which can they be?

A

Size and B/M (book to market)

114
Q

What’s the difference between sorts and Fama French 3 factor model?

A

The Fama-French 3-factor model shows larger abnormal returns than sorts.

115
Q

What is Jensens alpha?

A

Jensen’s alpha measures the portion of returns that cannot be explained by the model (i.e., by beta alone)
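
A minimal sketch of estimating Jensen's alpha in Python with statsmodels, assuming a hypothetical file of monthly calendar-time portfolio returns with columns `r_p` (portfolio), `r_m` (market), and `rf` (risk-free rate):

```python
import pandas as pd
import statsmodels.formula.api as smf

# placeholder: load your own calendar-time portfolio returns
df = pd.read_csv("calendar_portfolio.csv")

df["excess_p"] = df["r_p"] - df["rf"]
df["excess_m"] = df["r_m"] - df["rf"]

# regress excess portfolio returns on excess market returns:
# the intercept is Jensen's alpha, the slope is beta
capm = smf.ols("excess_p ~ excess_m", data=df).fit()
print(capm.params["Intercept"], capm.pvalues["Intercept"])
```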

116
Q

If the alpha is significantly non-zero, what does that mean?

A

it implies the model fails to fully explain the observed returns, suggesting the presence of “anomalies.”

117
Q

A non-zero alpha could mean either of what two things?

A
  1. The model is incorrect (e.g., missing factors that influence returns).
  2. Markets are not efficient.

This dual interpretation makes it hard to pinpoint the root cause of anomalies.

118
Q

Why is it hard to pinpoint the root cause of anomalies?

A

It could be because of either (or both) of two reasons:

  1. The model is incorrect (e.g., missing factors that influence returns).
  2. Markets are not efficient.
119
Q

What does zero alpha mean?

A

The model (e.g., CAPM) perfectly explains the observed returns, meaning the asset’s performance aligns entirely with what the model predicts based on its beta (systematic risk).

120
Q

What does non-zero alpha mean?

A

Alpha > 0 (positive alpha): the asset is performing better than the model predicts,
or
Alpha < 0 (negative alpha): the asset is performing worse than the model predicts.

121
Q

What could be the cause of positive alpha?

A

it could mean an asset has some characteristics or advantages that the model is not accounting for (omitted variable)

122
Q

What could be the cause of negative alpha?

A

It might indicate risks or disadvantages overlooked by the model

123
Q

What is the difference between CAPM and APT?

A

CAPM: single factor model
APT: multi factor model

124
Q

Does APT assume that the market is efficient?

125
Q

How is the outcome calculated?

A

The spread between the treatment outcome and the expected (counterfactual) outcome, which is measured as described above.

126
Q

What is Book-to-Market ratio?

A

It compares a company’s book value to its market value.

It is used, particularly in asset pricing models, to identify undervalued or overvalued stocks and to assess a firm’s valuation.

127
Q

Stocks with a high BM ratio tend to outperform those with

A

A low BM ratio, over the long term

128
Q

What factors are used in the Fama French 3 Factor model?

A
  1. Market risk (beta)
  2. Size of the firm
  3. Book-to-Market value
129
Q

What are portfolio sorts?

A

used in finance to evaluate the relationship between stock characteristics (e.g., size, value, momentum) and returns by grouping stocks into portfolios based on certain characteristics or factors
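
A minimal sketch of a one-dimensional portfolio sort in Python with pandas, assuming a hypothetical file with one row per stock containing a sorting characteristic (`book_to_market`) and the next-period return (`ret_next`):

```python
import pandas as pd

# placeholder: load your own cross-section of stocks
df = pd.read_csv("stocks.csv")

# sort stocks into quintile portfolios on the characteristic (0 = low, 4 = high)
df["bm_quintile"] = pd.qcut(df["book_to_market"], 5, labels=False)

# equal-weighted average return of each portfolio
port_returns = df.groupby("bm_quintile")["ret_next"].mean()

# a simple high-minus-low hedge portfolio return
hml = port_returns.loc[4] - port_returns.loc[0]
print(port_returns, hml)
```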

130
Q

What is Cross-sectional regression?

A

a method where you regress the returns of multiple assets at a single point in time on their characteristics or factors. It helps identify the relationship between asset returns and certain variables (e.g., beta, size, value) in the cross-section of stocks.

131
Q

What is the biggest difference between portfolio sorts and cross-sectional regression?

A

Portfolio sorts allow for a NON-LINEAR relationship; cross-sectional regression does not.

132
Q

What is Two-Way-Fixed Effect?

A

is a regression framework commonly used in panel data analysis. It accounts for unobserved heterogeneity across two dimensions, such as time and entities (e.g., individuals, firms, or countries). By doing so, it controls for factors that vary along these dimensions but remain constant within them

133
Q

Can cross-sectional regressions suffer from omitted variables?

A

Yes. Cross-sectional regressions may suffer from omitted variables, which is one motivation for using panel fixed effects instead.

134
Q

What is the sorting process?

A

a common methodology in financial and asset pricing research where stocks (or firms) are grouped into portfolios based on specific characteristics or factors. This process helps analyze the relationship between these factors and stock returns.

135
Q

What is the difference between Two-Way-Fixed Effect and Fixed Effect?

A

While both deal with unobserved heterogeneity, TWFE includes fixed effects for two dimensions, whereas a fixed effects model (often called “one-way fixed effects”) includes fixed effects for only one dimension.

Use one-way fixed effects if you suspect heterogeneity across entities but are not concerned about time effects.
Use two-way fixed effects if you suspect heterogeneity across entities and over time.

136
Q

What is autocorrelation?

A

Autocorrelation occurs when the residuals (errors) of a regression model are correlated with each other.

137
Q

What is fixed effects?

A

Fixed effects is the same as “within regression”. It measures what happens within firms over time and addresses unobserved heterogeneity (equivalently to including entity dummy variables, as in LSDV).

138
Q

What is random effects?

A

Fixed effects measures variation within firms over time; random effects uses both within and between variation. Random effects handles unobserved heterogeneity as part of the error term u, which requires exogeneity (the unobserved effect must be uncorrelated with the explanatory variables).

139
Q

What is unobserved heterogeneity?

A

Unobserved heterogeneity refers to differences across individuals, groups, or entities in a dataset that are not directly measured or observed, but still affect the outcome variable in a model. Like personality.

These unmeasured factors can lead to bias or inaccurate estimates if they are correlated with the independent variables in your analysis.

140
Q

What is some characteristics of dummies?

A

Dummies are often binary, but they can be continuous too.

Dummies are often permanent (the switch is turned on and stays on), but they can be non-permanent too (on only for the duration of the post-event window).

141
Q

What is sorting bias in the sample selection criteria?

A

There is a risk of selection bias when choosing a sample based on specific characteristics (e.g., market cap, book-to-market ratio). It means that focusing only on certain types of firms or stocks can distort the results because the sample may not represent the entire population. This can lead to inaccurate or misleading conclusions, as the analysis may overemphasize the selected traits while ignoring broader patterns or variations.

142
Q

Portfolio returns are calculated and evaluated using…?

A

Jensen’s alpha

143
Q

The calendar time portfolio is updated periodically by doing three steps. Which?

A
  • Re-weighting the stocks.
  • Removing stocks if the event no longer applies.
  • Adding new stocks that experienced the event in the current period.