Regression, event, panel, portfolio Flashcards

1
Q

Regression analysis shows correlations, not causal relationships. Why?

A

Because the direction or nature of causality depends on a solid theory, not just statistical modeling.

2
Q

Which are the OLS assumptions? (3)

A
  1. no perfect linear dependence among the independent variables
  2. exogeneity of covariates
  3. homoskedasticity
3
Q

Explain Exogeneity of covariates

A

That the error u is not a function of X.

  • The covariates (independent variables) don’t contain any information that predicts the error term (u).
  • This ensures that the model is correctly specified, and the independent variables only explain the dependent variable, not the errors.
  • For every data point, the expected value of the error term, given the independent variables, is zero.
  • This ensures that errors are purely random and not systematically related to the covariates.
4
Q

What is endogeneity?

A

Endogeneity is when the error term u is related to the independent variables → biased and inconsistent estimates

5
Q

What is homoskedasticity?

A

aka the constant variance assumption.
The error u has the same variance given any value of the explanatory variables. (This is distinct from the no-serial-correlation assumption, which says the covariance between any two error terms is zero, i.e. one error does not affect another.)

Homoskedasticity ensures that the regression model treats all observations equally. If the variance changes across observations (heteroskedasticity), the coefficient estimates remain unbiased but become inefficient, and the usual standard errors are unreliable.

6
Q

What can you say about the data generating process of covariates and errors?

A

The data in X can be a combination of constant and random variables

  • OLS relies on variance in the covariates to estimate the relationship between independent variables and the dependent variable.
  • If a covariate doesn’t vary (e.g., all values are the same), OLS cannot estimate its effect because it has no explanatory power.
7
Q

What does the exogeneity assumption say?

A

The error term u is unrelated to the independent variables X.

It ensures that the model captures the true relationship between X and Y without bias from omitted variables.

8
Q

Which are the OLS derivations?

A
  1. standard errors
  2. the t-test
  3. goodness-of-fit (R-squared)
9
Q

What are standard errors?

A

Standard errors tell us how much the model’s predictions and estimates (like the coefficients) might vary due to random noise or limited data.

10
Q

What is residual variance?

A

Measures how far the actual data points are from the model’s prediction on average (tells how much error is left after fitting the regression line).

11
Q

What is the residual standard error?

A

The average size of the errors in the model’s predictions.

12
Q

What is the t-test?

A

T-tests are used in regression to check if a regression coefficient β is significantly different from zero. They help determine if an independent variable significantly contributes to the model.

The significance level (p-value) should be below 0.05 for a variable to be considered meaningful in most cases.
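As an illustrative sketch (synthetic data and names, not from the deck), the standard error and t-statistic for a slope can be computed by hand with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)   # true slope is 1.5

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 2)          # residual variance
cov = sigma2 * np.linalg.inv(X.T @ X)     # Var(beta_hat)
se = np.sqrt(np.diag(cov))                # standard errors
t_slope = beta[1] / se[1]                 # t-statistic for H0: slope = 0
```

The t-statistic is then compared with a critical value (roughly 2 for a 5% two-sided test in large samples).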

13
Q

What is goodness-of-fit?

A

aka r squared or the coefficient of determination.
Used to evaluate how well a regression model fits the data. It helps you assess whether the model is good at predicting the dependent variable Y or if it leaves too much unexplained variability.

The value ranges from 0 to 1. Higher R2 is better.

14
Q

When is it good to use the adjusted R2?

A

Good to use when evaluating and comparing different models when having multiple independent variables.
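A numpy sketch (synthetic data, illustrative names) showing why: R² can only rise when a regressor is added, even an irrelevant one, while adjusted R² applies a penalty for the extra parameter:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
x = rng.normal(size=n)
z = rng.normal(size=n)                  # irrelevant regressor
y = 1.0 + 0.8 * x + rng.normal(size=n)

def fit_r2(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sst = ((y - y.mean()) ** 2).sum()
    r2 = 1 - resid @ resid / sst
    k = X.shape[1] - 1                  # regressors excluding the constant
    adj = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

r2_small, adj_small = fit_r2(np.column_stack([np.ones(n), x]), y)
r2_big, adj_big = fit_r2(np.column_stack([np.ones(n), x, z]), y)
# r2_big >= r2_small always; adjusted R^2 need not rise
```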

15
Q

Which are the OLS assumption violations? (6)+1

A
  1. non-linearity
  2. heteroskedasticity
  3. auto-correlated errors
  4. multicollinearity
  5. irrelevant variables (over specified model)
  6. omitted variables (under specified model)
    other issues
  7. outliers
16
Q

What is heteroskedasticity?

A

Occurs when the variance of the error terms u in a regression model is not constant. So the “errors” (mistakes) in your regression model don’t have a consistent spread (their variability changes across observations).

Heteroskedasticity doesn’t bias the regression coefficients but it makes standard errors and hypothesis testing unreliable.

17
Q

How can heteroskedasticity be addressed?

A
  • Robust Standard Errors: Adjusts the standard errors to account for heteroskedasticity.
  • Weighted Least Squares (WLS): Reweights observations to stabilize variance.
  • Model Refinements: Modify the model to better explain the variability in the data.
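A minimal sketch of the first remedy, White/HC0 heteroskedasticity-robust standard errors, computed by hand with numpy. The sandwich formula is standard; the synthetic model and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
x = rng.normal(size=n)
u = (0.5 + np.abs(x)) * rng.normal(size=n)   # heteroskedastic errors
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
# classical covariance, valid only under homoskedasticity
cov_ols = (resid @ resid / (n - 2)) * XtX_inv
# White/HC0 sandwich: inv(X'X) X' diag(e^2) X inv(X'X)
meat = X.T @ (X * resid[:, None] ** 2)
cov_hc0 = XtX_inv @ meat @ XtX_inv
se_ols = np.sqrt(np.diag(cov_ols))
se_hc0 = np.sqrt(np.diag(cov_hc0))
```

With variance increasing in |x|, the robust slope standard error exceeds the classical one, which understates the true uncertainty.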
18
Q

What are auto-correlated errors?

A

Autocorrelated errors occur when the errors (residuals) in a regression model are not independent but instead show a pattern or relationship over time. This violates one of the key assumptions in regression analysis, leading to unreliable results.

19
Q

How can you solve auto-correlated errors?

A
  • Adjust your model to directly address the source of autocorrelation (e.g., include lagged terms).
  • Use robust standard errors (like Newey-West) to correct for the issues in residuals.
  • Robust standard errors are a good default because they work under a variety of error conditions.
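A sketch of a Newey-West (Bartlett-kernel HAC) covariance computed by hand on synthetic AR(1) errors; the `newey_west_cov` helper is hypothetical, written here only to illustrate the formula:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
u = np.zeros(n)                          # AR(1) errors: serial correlation
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

def newey_west_cov(X, resid, lags):
    """HAC covariance with Bartlett-kernel (Newey-West) weights."""
    n, _ = X.shape
    u = X * resid[:, None]               # score contributions
    S = u.T @ u / n
    for lag in range(1, lags + 1):
        w = 1 - lag / (lags + 1)         # Bartlett weight
        G = u[lag:].T @ u[:-lag] / n
        S += w * (G + G.T)
    Q_inv = np.linalg.inv(X.T @ X / n)
    return Q_inv @ S @ Q_inv / n

cov_nw = newey_west_cov(X, resid, lags=5)
se_nw = np.sqrt(np.diag(cov_nw))
```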
20
Q

What is multicollinearity?

A

Multicollinearity doesn’t violate the assumption of “no perfect linear dependence” (as long as predictors aren’t perfectly collinear), but it still causes numerical issues in estimating coefficients.

Large standard errors due to multicollinearity make it hard to determine the true effect of each variable, leading to unstable regression results.

Multicollinearity can be measured through VIF, Variance Inflation Factor. High VIF indicates severe multicollinearity and inflated standard errors.
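VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing covariate j on the remaining covariates. A numpy sketch on synthetic data (the `vif` helper is an illustrative assumption, not a library function):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)              # unrelated covariate
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF_j = 1 / (1 - R^2_j) from regressing x_j on the other covariates."""
    target = X[:, j]
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ beta
    r2 = 1 - resid @ resid / ((target - target.mean()) ** 2).sum()
    return 1 / (1 - r2)
```

`vif(X, 0)` is large because x1 and x2 are nearly collinear; `vif(X, 2)` stays near 1.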

21
Q

How can you solve multicollinearity?

A
  • increase sample size N (this increases SST)
  • remove or combine highly correlated variables
22
Q

What are irrelevant variables?

A

An over-specified model occurs when irrelevant variables are included in the model. The irrelevant variable is denoted z.

Including irrelevant variables (z) does not introduce bias in the coefficient estimates (β). However, it increases variance in the estimates due to sampling error, making the model less efficient.

Over-specifying the model adds unnecessary noise, which can affect the reliability and interpretability of the results.

23
Q

What are omitted variables?

A

When the error term u is not purely random noise; it contains an omitted variable z, which creates bias.

Omitted variable bias creates:

Bias in Coefficients:

Omitting a relevant variable z introduces bias in β̂ because the effect of z is wrongly attributed to X.

The bias increases if z is strongly correlated with X or has a large γ (strong effect on y).

In contrast to over-specified models (where coefficients remain unbiased but less efficient), under-specified models produce biased estimates.

Practical Impact:

Omitting relevant variables can severely distort conclusions from the model, leading to incorrect inferences about the relationship between X and y.

24
Q

What is the difference between sampling error and omitted variables bias?

A

Sampling error diminishes as the sample size N increases. Not so for omitted variable bias: the bias is systematic and stems from the structure of the model itself, caused by leaving out a relevant variable that is correlated with X. X picks up the effect of the omitted variable.

25
Q

What are outliers?

A

Outliers are extreme values that deviate a lot from the rest of the data. Outliers can disrupt or distort the causal relationship between the dependent variable and the independent variables.

26
Q

How can you treat outliers?

A
  • Transformation:

Apply mathematical transformations to reduce the influence of extreme values. Example: Use the natural logarithm to compress large values and spread smaller ones.

  • Trimming:

Remove extreme values (e.g., top and bottom 5% of the dataset) from the analysis.

  • Winsorizing:

Replace extreme values with the nearest non-outlier value. Example: Cap values at the 95th percentile or floor them at the 5th percentile.
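Winsorizing can be sketched in a few lines of numpy (the `winsorize` helper and the 5th/95th percentile cut-offs are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
data = rng.normal(size=1000)
data[:5] = [25.0, -30.0, 40.0, 18.0, -22.0]   # planted outliers

def winsorize(a, lower=5, upper=95):
    """Cap values at the given lower/upper percentiles."""
    lo, hi = np.percentile(a, [lower, upper])
    return np.clip(a, lo, hi)

w = winsorize(data)   # same length as data, extremes capped
```

Unlike trimming, no observations are dropped; the extremes are pulled in to the nearest non-outlier value.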

27
Q

What is the constant elasticity model?

A

A constant elasticity model is a type of regression model where the relationship between the dependent variable and the independent variable(s) exhibits a constant percentage change (elasticity).
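In practice this is the log-log specification ln y = β₀ + β₁ ln x, where β₁ is the elasticity. A sketch with synthetic data (true elasticity 1.5 is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 300
x = np.exp(rng.normal(size=n))                           # positive regressor
y = 3.0 * x ** 1.5 * np.exp(0.05 * rng.normal(size=n))   # elasticity = 1.5

ly, lx = np.log(y), np.log(x)
X = np.column_stack([np.ones(n), lx])
beta, *_ = np.linalg.lstsq(X, ly, rcond=None)
elasticity = beta[1]   # a 1% change in x implies ~beta[1]% change in y
```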

28
Q

What are the Gauss-Markov assumptions for simple regression?

A

Justifies the use of the OLS method rather than using a variety of competing estimators.

The Gauss-Markov theorem requires errors to have constant variance (homoskedasticity) and no correlation over time (no serial correlation). If errors are serially correlated, OLS is no longer the best estimator, and its standard errors and test statistics become invalid, even for large samples.

A1: Linear in Parameters
A2: Random Sampling
A3: Sample Variation in the Explanatory Variable
A4: Zero Conditional Mean
A5: Homoskedasticity

29
Q

What is ordinary least squares?

A

chooses the estimates to minimize the sum of squared residuals. the method of ordinary least squares is easily applied to estimate the multiple regression model. Each slope estimate measures the partial effect of the corresponding independent variable on the dependent variable, holding all other independent variables fixed.

30
Q

What is the Frisch-Waugh theorem?

A

the general partialling out result
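The theorem can be checked numerically: the coefficient on x2 from the full multiple regression equals the slope from regressing the partialled-out y on the partialled-out x2. A numpy sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 250
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# full multiple regression: coefficient on x2 is beta_full[2]
beta_full = ols(np.column_stack([np.ones(n), x1, x2]), y)

# partial out the constant and x1 from both y and x2, regress the residuals
W = np.column_stack([np.ones(n), x1])
ry = y - W @ ols(W, y)
rx = x2 - W @ ols(W, x2)
beta_fwl = (rx @ ry) / (rx @ rx)   # equals beta_full[2] exactly
```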

31
Q

What does no perfect collinearity mean?

A

In the sample (and therefore in the population), none of the independent variables is constant, and there are no exact linear relationships among the independent variables

32
Q

What is perfect collinearity?

A

If an independent variable in is an exact linear combination of the other independent variables, then we say the model suffers from perfect collinearity, and it cannot be estimated by OLS.

33
Q

What is zero conditional mean?

A

the error u has an expected value of zero given any values of the independent variables. when this assumption holds, we often say that we have exogenous explanatory variables

34
Q

What is micronumerosity?

A

Problems of small sample size

35
Q

What is BLUE?

A

Best linear unbiased estimator. Under the Gauss-Markov assumptions, the OLS estimators are the best linear unbiased estimators (BLUEs).

36
Q

What is the error variance?

A

The error variance measures the “noise” or randomness in the model. A higher error variance means more variability in the dependent variable that cannot be explained by the independent variables. When it is large, the variances of the OLS estimators increase, making them less precise. To reduce the error variance, include more relevant independent variables in the model (which reduces unexplained variation). However, finding these variables can be challenging.

37
Q

What is the sample variation in the independent variable?

A

Measures the total variation in the independent variable X (SST). Higher variation in X improves the precision of the estimate. When SST is small, Var(β̂) becomes large.

Improve SST by increasing the sample size.

38
Q

What is the normality assumption?

A

states that the error terms u are normally distributed. This assumption is particularly important for conducting hypothesis tests and creating confidence intervals for the regression coefficients

39
Q

Which are the classical linear model assumptions?

A

A1: Linear in Parameters

A2: Random Sampling

A3: No Perfect Collinearity

A4: Zero Conditional Mean

A5: Homoskedasticity

40
Q

What is a dummy variable?

A

A dummy variable is defined to distinguish between two groups, and the coefficient estimate on the dummy variable estimates the ceteris paribus difference between the two groups. Dummy variables are also useful for incorporating ordinal information in regression models: we simply define a set of dummy variables representing different outcomes of the ordinal variable, allowing one of the categories to be the base group.

41
Q

What is the dummy variable trap?

A

Arises when too many dummy variables describe a given number of groups (e.g., a dummy for every category plus an intercept), which creates perfect collinearity.
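A quick numerical illustration of the trap with synthetic groups: an intercept plus one dummy per category makes the design matrix lose full column rank, so OLS cannot be estimated; dropping a base group restores full rank:

```python
import numpy as np

groups = np.repeat([0, 1, 2], 50)                      # three categories
D = (groups[:, None] == np.arange(3)).astype(float)    # one dummy per group
ones = np.ones((150, 1))

X_trap = np.hstack([ones, D])        # intercept + all three dummies
X_ok = np.hstack([ones, D[:, 1:]])   # drop the base group's dummy

rank_trap = np.linalg.matrix_rank(X_trap)   # 3 < 4 columns: perfect collinearity
rank_ok = np.linalg.matrix_rank(X_ok)       # 3 = 3 columns: full rank
```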

42
Q

What is an ordinal variable?

A

an ordinal variable is a type of categorical variable where the categories have a natural, meaningful order or ranking, but the differences between the ranks are not necessarily equal or measurable (good, better, best)

43
Q

What is the self-selection problem?

A

the problem of participation decisions differing systematically by individual characteristics

44
Q

What is the white test for heteroskedasticity?

A

is a statistical test used to detect the presence of heteroskedasticity in a regression model. Heteroskedasticity occurs when the variance of the residuals (errors) is not constant across observations, violating a key assumption of the Classical Linear Regression Model (CLRM). The White test is a versatile and widely used diagnostic tool because it does not require prior knowledge of the specific form of heteroskedasticity
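One common implementation regresses the squared residuals on the regressors, their squares, and cross-products, and uses LM = n·R² of that auxiliary regression as a χ² statistic. A numpy sketch on synthetic heteroskedastic data (a simplified one-regressor variant, so no cross-products are needed):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 500
x = rng.normal(size=n)
u = (0.5 + np.abs(x)) * rng.normal(size=n)   # error variance depends on x
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ beta) ** 2                     # squared residuals

# auxiliary regression of squared residuals on x and x^2
Z = np.column_stack([np.ones(n), x, x ** 2])
g, *_ = np.linalg.lstsq(Z, e2, rcond=None)
fit = Z @ g
r2_aux = 1 - ((e2 - fit) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()
LM = n * r2_aux    # ~ chi2(2) under the null of homoskedasticity
```

Here LM far exceeds the 5% χ²(2) critical value of 5.99, so homoskedasticity is rejected.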

45
Q

What is the sample size in time series dataset determined by?

A

the sample size is determined by the number of time periods for which we observe the variables of interest (e.g., 30 daily observations of stock prices = sample size of 30).

46
Q

What is the spurious regression problem?

A

happens when you analyze two unrelated data sets that both have trends (non-stationary), and the regression falsely shows a strong relationship. This misleading result happens because the trends overlap, not because there’s any real connection. Ex: If you compare ice cream sales to sea level rise, both might increase over time, but they have no real link. The regression might say they are connected just because they both trend upward

47
Q

What is the Durbin-Watson statistic?

A

a test for serial correlation: the Durbin-Watson statistic measures the presence of serial correlation (autocorrelation) in the residuals of a regression model. It checks whether the errors in your model are independent over time, as required by standard regression assumptions
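The statistic is DW = Σ(e_t − e_{t−1})² / Σe_t², roughly 2 when residuals are independent and below 2 under positive autocorrelation. A numpy sketch on synthetic residual series:

```python
import numpy as np

def durbin_watson(e):
    """DW statistic: sum of squared first differences over sum of squares."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(9)
n = 500
e_iid = rng.normal(size=n)        # independent residuals
ar1 = np.zeros(n)                 # positively autocorrelated residuals
for t in range(1, n):
    ar1[t] = 0.7 * ar1[t - 1] + e_iid[t]

dw_iid = durbin_watson(e_iid)     # close to 2
dw_ar1 = durbin_watson(ar1)       # well below 2
```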

48
Q

What is the Breusch-Godfrey test?

A

is a statistical test used to detect serial correlation (autocorrelation) in the residuals of a regression model. Unlike the Durbin-Watson test, which is limited to detecting first-order autocorrelation, the Breusch-Godfrey test can detect higher-order autocorrelation
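A sketch of the test's auxiliary regression: residuals on the original regressors plus p lagged residuals, with LM = (n − p)·R² compared to a χ²(p) critical value. Synthetic data and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
x = rng.normal(size=n)
u = np.zeros(n)                          # AR(1) errors: serial correlation
for t in range(1, n):
    u[t] = 0.6 * u[t - 1] + rng.normal(scale=0.5)
y = 1.0 + 2.0 * x + u

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta

p = 2   # test up to second-order autocorrelation
lagged = np.column_stack([e[p - L: n - L] for L in range(1, p + 1)])
Z = np.column_stack([X[p:], lagged])     # regressors + p lagged residuals
g, *_ = np.linalg.lstsq(Z, e[p:], rcond=None)
target, fit = e[p:], Z @ g
r2_aux = 1 - ((target - fit) ** 2).sum() / ((target - target.mean()) ** 2).sum()
LM_bg = (n - p) * r2_aux    # ~ chi2(p) under the null of no serial correlation
```

With AR(1) errors, LM_bg far exceeds the 1% χ²(2) critical value of 9.21.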

49
Q

What is quasi-differenced data?

A

is a transformation of a time series that helps remove serial correlation or handle trends in regression models. It’s particularly useful when working with time series data that exhibits autoregressive behavior, where past values influence current values

50
Q

What is the Cochrane-Orcutt estimation?

A

it transforms the original regression model to correct for first-order serial correlation

51
Q

What is the Prais-Winsten estimation?

A

similar to Cochrane-Orcutt, but it modifies the procedure to retain the first observation, addressing the data loss issue

52
Q

What is weighted least squares?

A

Weighted Least Squares (WLS) is a regression method used when the assumption of homoskedasticity (constant variance of errors) is violated. In cases of heteroskedasticity (errors with unequal variances), WLS assigns weights to each observation to give less influence to observations with higher variance and more influence to those with lower variance

53
Q

What is the ARCH model?

A

stands for autoregressive conditional heteroskedasticity. It is a statistical model used to describe and predict time-varying volatility in time series data, especially in financial markets. It is particularly useful when the variance of the errors (or returns) changes over time, which is a common feature in financial data like stock prices or exchange rates

54
Q

What is panel data regression?

A

Panel data regression is a method used to analyze data that varies across two dimensions: entities (e.g., individuals, firms, countries) and time periods (e.g., years, quarters). This type of data allows us to control for unobserved characteristics that are constant within entities or time periods, making the analysis more robust.

Panel data is data where you observe the same entities (e.g., people, companies, countries) over multiple time periods. Example: Tracking the income of individuals over 5 years.

Panel data has two dimensions:

Entities (cross-sectional dimension): Different individuals, firms, or countries.

Time (time-series dimension): Multiple observations over time for each entity.

55
Q

What is pooled OLS?

A

Pooled OLS simplifies panel data analysis by ignoring the panel structure, but it requires strong assumptions. When these assumptions do not hold, alternative models like fixed or random effects are more appropriate. Pooled OLS simplifies the model by ignoring the entity-specific and time-specific effects.

Pooled OLS is appropriate when entity-specific and time-specific effects are uncorrelated with the independent variables.

It assumes:

  • No Correlation Between Covariates and Entity-Specific Effects
  • No Correlation Between Covariates and Time-Specific Effects
56
Q

Which are the implications of pooled OLS?

A
  • Collapses the Panel Structure:

Pooled OLS treats the data as a single dataset without accounting for individual-specific or time-specific variation. It does not differentiate between differences across entities or time periods.

  • Efficiency and Bias:

Pooled OLS can be efficient if the assumptions are valid, as it does not estimate fixed or random effects.

However, if the assumptions are violated, the model produces biased and inconsistent estimates.

Alternative Methods

If the assumptions of Pooled OLS are violated, consider:

  1. Fixed Effects Model:
    • Accounts for entity-specific or time-specific effects by removing their influence.
  2. Random Effects Model:
    • Models entity- and time-specific effects as random and assumes they are uncorrelated with X
  3. Two-Way Fixed Effects:
    • Controls for both entity- and time-specific effects simultaneously.
57
Q

What is first differencing?

A

First-differencing is a way to clean up your data by removing things that don’t change over time. First-differencing removes anything that stays the same over time (e.g., intelligence, personality), so you can focus on how changes in your independent variable (e.g., education) impact changes in your dependent variable (e.g., salary).
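A numpy sketch on a synthetic panel: the entity effects alpha_i vanish once each entity's series is differenced, so the slope can be recovered from the changes alone:

```python
import numpy as np

rng = np.random.default_rng(10)
n_ent, T = 50, 4
alpha = 5.0 * rng.normal(size=(n_ent, 1))            # time-invariant effects
x = rng.normal(size=(n_ent, T))
y = alpha + 1.0 * x + 0.1 * rng.normal(size=(n_ent, T))  # true slope 1.0

# differencing within each entity removes alpha_i entirely
dy = np.diff(y, axis=1).ravel()
dx = np.diff(x, axis=1).ravel()
beta_fd = (dx @ dy) / (dx @ dx)   # no-intercept OLS on differenced data
```

Note that one time period per entity is lost in the differencing.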

58
Q

What is LSDV?

A

Least Squares Dummy Variables. LSDV is a panel data regression approach that explicitly includes dummy variables to control for unobserved factors that vary:

  • Across entities (cross-sectional effects).
  • Across time (time-specific effects).

LSDV yields the same estimates as the de-meaning (within) transformation used in fixed-effects regression; it is essentially the fixed effects estimator with more output (an explicit coefficient for each dummy)

59
Q

What is an independently pooled cross section?

A

is obtained by sampling randomly from a large population at different points in time (usually years). From a statistical standpoint, these data sets have an important feature: they consist of independently sampled observations. This was also a key aspect in our analysis of cross-sectional data: among other things, it rules out correlation in the error terms across different observations.

An independently pooled cross section differs from a single random sample in that sampling from the population at different points in time likely leads to observations that are not identically distributed.

OLS using pooled data is the leading method of estimation, and the usual inference procedures are available, including corrections for heteroskedasticity. (Serial correlation is not an issue because the samples are independent across time.) Because of the time series dimension, we often allow different time intercepts. We might also interact time dummies with certain key variables to see how they have changed over time.

60
Q

What is a natural experiment?

A

natural experiment occurs when some exogenous event—often a change in government policy—changes the environment in which individuals, families, firms, or cities operate. A natural experiment always has a control group, which is not affected by the policy change, and a treatment group, which is thought to be affected by the policy change. Unlike a true experiment, in which treatment and control groups are randomly and explicitly chosen, the control and treatment groups in natural experiments arise from the particular policy change.

61
Q

What is parallel trends assumption?

A

assume that average health trends would be the same for the low-income and middle-income families in the absence of the intervention.


63
Q

What is the difference between first differencing and fixed effects estimator?

A

Compared with first differencing, the fixed effects estimator is efficient when the idiosyncratic errors are serially uncorrelated (as well as homoskedastic), and we make no assumptions about correlation between the unobserved effect a_i and the explanatory variables. As with first differencing, any time-constant explanatory variables drop out of the analysis. Fixed effects methods apply immediately to unbalanced panels, but we must assume that the reasons some time periods are missing are not systematically related to the idiosyncratic errors.

64
Q

When is the random effects estimator appropriate?

A

when the unobserved effect is thought to be uncorrelated with all the explanatory variables.

65
Q

What is a cluster sample?

A

A cluster sample has the same appearance as a cross-sectional data set, but there is an important difference: clusters of units are sampled from a population of clusters rather than sampling individuals from the population of individuals. In the previous examples, each family is sampled from the population of families, and then we obtain data on at least two family members. Therefore, each family is a cluster.

66
Q

What are different analyses used for panel data?

A

Panel data combines both the cross-sectional and time series elements

  1. cross-sectional data format: 1 time period but several entities (horizontal)
  2. time series data format: 1 entity, several time periods (vertical)
67
Q

What is a balanced panel?

A

Equal number of observations for each firm for the entire period. (every firm has 3 years of observations)

When there are no missing observations.

68
Q

What is an unbalanced panel?

A

Number of observations are NOT the same for all subjects

69
Q

What is the residual error term called in panel data regression?

A

epsilon (ε), the idiosyncratic error. The composite error term also contains eta (η_i), a time-invariant individual-specific component. (Time-invariant = does not change over time, stays the same.)

70
Q

A common way of estimating a panel regression is by using the fixed-effects regression. What is another name of it?

A

Within regression

71
Q

How is fixed effects regression used?

A
  • the data is transformed to remove the individual-specific average.
  • no constant term

Subtract each individual's average over time from every variable (time-demeaning). It is called the within estimator because we focus on changes within an entity over time: estimation in terms of deviations from means.
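A numpy sketch of time-demeaning on a synthetic panel; the entity effect cancels when each entity's time average is subtracted:

```python
import numpy as np

rng = np.random.default_rng(11)
n_ent, T = 50, 4
alpha = 5.0 * rng.normal(size=(n_ent, 1))            # entity fixed effects
x = rng.normal(size=(n_ent, T))
y = alpha + 1.0 * x + 0.1 * rng.normal(size=(n_ent, T))  # true slope 1.0

# subtract each entity's mean over time: alpha_i cancels out
y_dm = (y - y.mean(axis=1, keepdims=True)).ravel()
x_dm = (x - x.mean(axis=1, keepdims=True)).ravel()
beta_fe = (x_dm @ y_dm) / (x_dm @ x_dm)   # the within (fixed effects) estimator
```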

72
Q

An alternative way of estimating a panel data model is by taking differences. What does this mean?

A
  • the problematic time-invariant individual specific unobserved heterogeneity is removed by first differencing the data
  • we lose one time period when taking differences
  • focus on 2 points in time
73
Q

Why is the standard OLS regression a bad model for panel data?

A

It does not take into account the panel structure of the data. It fails because it does not take into account unobserved time-invariant heterogeneity

74
Q

How is FE and FD similar?

A

both can deal with the problem of eta(i), the time-invariant individual-specific component.

75
Q

What is another word for panel data?

A

Longitudinal data

76
Q

What does panel data consist of?

A

time series and cross-section observations

77
Q

What is cross-sectional data?

A

Observations of the subjects are obtained at the same point in time

78
Q

What is cross-sectional regression?

A

For each year, we estimate a separate cross-sectional regression. With 3 years we get 3 different cross-sectional regressions. Very limiting: it restricts the degrees of freedom required to perform a meaningful analysis.

79
Q

What is time-series data?

A

Observations are generated over time

80
Q

What is time-series regression?

A

Estimate a time-series model for each firm using OLS

But we end up with disparate pieces of information, which would not enable a comprehensive assessment of how X1 and X2 jointly affect Y.

Also ignores information about other firms operating in the same environment. Serial correlation might be a problem because of the time-dependent nature of Y

81
Q

So how do we estimate the panel data model?

A

Consider
1. pooled OLS
2. Fixed effects model
- LSDV
- within group
- first differencing
3. Random effects model

82
Q

What is an event study?

A

an event study typically tries to examine return behavior for a sample of firms experiencing a common type of event; it examines the behavior of firms' stock prices around corporate events. Event studies focusing on announcement effects over a short horizon around an event provide evidence relevant for understanding corporate policy decisions. Event studies are joint tests of market efficiency and a model of expected returns

83
Q

How does event studies test market efficiency?

A

event studies also test market efficiency, since systematically nonzero abnormal security returns that persist after a particular type of corporate event are inconsistent with market efficiency. Event studies focusing on long horizons following an event can provide key evidence on market efficiency; examination of post-event returns provides information on market efficiency

84
Q

What is a model of normal returns?

A

must be specified before an abnormal return can be defined. A variety of expected return models have been used in event studies, like the capital asset pricing model or the constant mean return model. The approaches can be grouped into two categories, statistical and economic:

Statistical

  1. Constant Mean Return Model
  2. Market Model
  3. Factor Model

Economic

  1. Capital Asset Pricing Model
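A sketch of the market model on synthetic returns: fit R_it = a + b·R_mt over an estimation window, then compute abnormal returns in the event window. The planted 5% announcement effect and all data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(12)
# estimation window: fit the market model R_it = a + b * R_mt + e
n_est = 120
rm_est = 0.01 * rng.normal(size=n_est)
r_est = 0.001 + 1.2 * rm_est + 0.005 * rng.normal(size=n_est)
X = np.column_stack([np.ones(n_est), rm_est])
(a, b), *_ = np.linalg.lstsq(X, r_est, rcond=None)

# event window: abnormal return = actual minus normal (model) return
rm_evt = 0.01 * rng.normal(size=3)
r_evt = 0.001 + 1.2 * rm_evt + 0.005 * rng.normal(size=3)
r_evt[1] += 0.05                      # planted announcement-day effect
ar = r_evt - (a + b * rm_evt)         # abnormal returns
car = ar.sum()                        # cumulative abnormal return (CAR)
```

The CAR recovers the planted effect up to estimation noise; in a real study it would then be tested against zero.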
85
Q

What is the Type I error?

A

occurs when the null hypothesis is falsely rejected

86
Q

What is type II error?

A

occurs when the null is falsely accepted

87
Q

What is the joint test problem?

A

all tests are joint tests; that is, event study tests are well-specified only to the extent that the assumptions underlying their estimation are correct. This poses a significant challenge because event study tests are joint tests of whether abnormal returns are zero and of whether the assumed model of expected returns (CAPM etc.) is correct

88
Q

What is the Brown-Warner simulation?

A

to directly address the issue of event study properties, the standard tool in event study methodology research is simulation procedures that use actual security return data. The basic idea is simple and intuitive: different event study methods are simulated by repeated application of each method to samples that have been constructed through a random (or stratified random) selection of securities and random assignment of an event date to each. If performance is measured correctly, these samples should show no abnormal performance, on average. This makes it possible to study test statistic specification, that is, the probability of rejecting the null hypothesis when it is known to be true. Further, various levels of abnormal performance can be artificially introduced into the samples. This permits direct study of the power of event study tests, that is, the ability to detect a given level of abnormal performance. The evidence in Brown and Warner is that analytical and simulation methods yield similar power functions for a well-specified test statistic.
