EF Exam Questions Flashcards
Explain the difference between PROBIT and LOGIT models
The probit model uses the standard normal CDF to map x'β into a probability, while the logit model uses the logistic CDF (the logit link) to do this.
Nikki states that the Yieldspread has a higher economic impact in the LOGIT model than in the PROBIT model, as the corresponding coefficients are equal to 0.31 and 0.17 respectively.
c) Explain why Nikki is wrong. What should Nikki then do to assess the economic impact of YLDSPREAD on the bullbear market variable in both models? (2 pt)
Since probit and logit models are non-linear models with different link functions, you cannot compare the estimated coefficients directly. (1 pt) [important: different LINK function]
To compare the economic impact, Nikki should consider the marginal effects of both models. (1 pt)
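A minimal sketch of the comparison, using the 0.31 and 0.17 coefficients from the question; the evaluation point x'β = 0 is an assumption made for illustration:

```python
import numpy as np
from scipy.stats import norm

# Coefficients from the question; the evaluation point x'B = 0 is an assumption
beta_logit, beta_probit = 0.31, 0.17
xb = 0.0

# Marginal effect = derivative of the link function at x'B, times the coefficient
lam = np.exp(xb) / (1 + np.exp(xb))           # logistic CDF Lambda(x'B)
me_logit = lam * (1 - lam) * beta_logit       # logit ME: Lambda(1 - Lambda) * B
me_probit = norm.pdf(xb) * beta_probit        # probit ME: phi(x'B) * B

print(me_logit, me_probit)   # the two are far closer than the raw coefficients suggest
```

The marginal effects (roughly 0.078 vs. 0.068 at this point) are directly comparable, unlike the raw coefficients.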
How can we evaluate whether an estimated ARMA model is ok?
Several possibilities exist. Many “misspecification tests” aim
to test whether the residuals of the ARMA model satisfy
the white noise properties:
-Test of no residual autocorrelation
-Test of homoskedasticity (constant variance), often based
on autocorrelations of squared residuals
-Test of normality: Skewness = 0, Kurtosis = 3.
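The three residual checks above can be sketched as follows; this is a minimal numpy/scipy version with a hand-rolled Ljung-Box statistic, using simulated white noise in place of actual ARMA residuals:

```python
import numpy as np
from scipy import stats

def ljung_box(resid, lags=10):
    """Ljung-Box Q-statistic and p-value for no autocorrelation up to `lags`."""
    resid = np.asarray(resid) - np.mean(resid)
    n = len(resid)
    acf = [np.sum(resid[k:] * resid[:-k]) / np.sum(resid**2) for k in range(1, lags + 1)]
    q = n * (n + 2) * sum(r**2 / (n - k) for k, r in enumerate(acf, start=1))
    return q, stats.chi2.sf(q, df=lags)

rng = np.random.default_rng(0)
e = rng.standard_normal(500)            # stand-in for the ARMA residuals

q, p_acorr = ljung_box(e)               # 1) test of no residual autocorrelation
q2, p_hetero = ljung_box(e**2)          # 2) homoskedasticity: LB on squared residuals
jb, p_normal = stats.jarque_bera(e)     # 3) normality: skewness = 0, kurtosis = 3
print(p_acorr, p_hetero, p_normal)      # small p-values signal misspecification
```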
What is the difference between a CAP and a ROC curve?
See slide 38. The CAP curve plots the proportion of data (j/N) against the hit rate, while the ROC curve plots the false alarm rate F_j/F̄ against the hit rate (1 pt)
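A minimal sketch of the two x-axes, using a tiny hypothetical default data set already sorted from riskiest to safest model score:

```python
import numpy as np

# Toy data: 8 firms sorted from riskiest to safest score; y marks actual defaults
y = np.array([1, 1, 0, 1, 0, 0, 0, 0])
N, D = len(y), y.sum()                 # N firms, D defaults

hit = np.cumsum(y) / D                 # hit rate after inspecting the j riskiest firms
cap_x = np.arange(1, N + 1) / N        # CAP x-axis: proportion of data j/N
roc_x = np.cumsum(1 - y) / (N - D)     # ROC x-axis: false alarm rate F_j / F-bar
print(hit, cap_x, roc_x)               # both curves share the same y-axis (the hit rate)
```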
You would like to investigate the Capital structure of 250 firms listed at the NYSE from 1970 – 2018.
As a dependent variable, you use the leverage (𝐿𝑖𝑡) whereas the independent variables consist of the logarithm of Size (ln𝑆𝑖𝑡), the logarithm of 1 + the age of the firm (ln(1+𝐴𝑔𝑒𝑖𝑡)), the Profits/sales ratio (𝑃𝑆𝑖𝑡), the asset tangibility (𝑇𝑎𝑛𝑖𝑡) and the R&D/Sales ratio (𝑅𝐷𝑆𝑖𝑡) with t in years. The 250 firms of the sample can be classified into 10 different industries.
a) Write down the pooled regression model for 𝐿𝑖𝑡 ? (1 pt)
b) What is the economic Interpretation of the coefficient corresponding to the ln Size variable? (1 pt)
𝐿𝑖𝑡=𝛼+ 𝛽1ln𝑆𝑖𝑡+𝛽2ln(1+𝐴𝑔𝑒𝑖𝑡)+ 𝛽3𝑃𝑆𝑖𝑡+𝛽4𝑇𝑎𝑛𝑖𝑡+ 𝛽5𝑅𝐷𝑆𝑖𝑡+ 𝜖𝑖𝑡
b)
If the size of firm i at time t increases by 1%, then the leverage changes by β1/100 units (1 pt)
Note: I would like to see the relative change of x versus the absolute effect on y!
You expect in the pooled regression model that for each firm the (in)dependent variable(s) do not change a lot through time.
d) Which standard OLS assumption will be violated? Also provide the EXACT solution to this problem? (2 pt)
The assumption of uncorrelated errors, Cov(ε_it, ε_i,t+1) = 0, will be violated: the errors are correlated over time within a firm. (1 pt)
Exact solution is to use clustered standard errors, and cluster per firm. (1 pt)
Suppose that the relationship between leverage and the independent variables changes drastically after 2008 (the Global Financial Crisis).
e) Write down the complete procedure to test this. That is: write the equation(s) to be estimated, the H(0) and H(A) of the test, and the corresponding test statistic. Be as complete as possible! (3 pt)
This is the Chow break test
Define 𝐷𝑡 as a dummy that equals 1 after 2008 and zero elsewhere.
The model now reads:
𝐿𝑖𝑡=𝛼+ 𝛽1ln𝑆𝑖𝑡+𝛽2ln(1+𝐴𝑔𝑒𝑖𝑡)+ 𝛽3𝑃𝑆𝑖𝑡+𝛽4𝑇𝑎𝑛𝑖𝑡+ 𝛽5𝑅𝐷𝑆𝑖𝑡+ 𝛽6ln𝑆𝑖𝑡𝐷𝑡+𝛽7ln(1+𝐴𝑔𝑒𝑖𝑡)𝐷𝑡+ 𝛽8𝑃𝑆𝑖𝑡𝐷𝑡+𝛽9𝑇𝑎𝑛𝑖𝑡𝐷𝑡+ 𝛽10𝑅𝐷𝑆𝑖𝑡𝐷𝑡+ 𝛽11𝐷𝑡+ 𝜖𝑖𝑡
Now perform an F-test
H0: 𝛽𝑖 =0 for i = 6,…, 11.
HA: at least one 𝛽𝑖 (i = 6,…11) is non-zero. (equation and H0: 2 pt)
Procedure:
1) estimate the pooled regression model of a), and store 𝑆𝑆𝑅0.
2) Estimate the model above, and store 𝑆𝑆𝑅1
3) The F-stat is now given by
F(k1 − k0, T − k1) = [(SSR0 − SSR1) / (k1 − k0)] / [SSR1 / (T − k1)]
With T = 250 * 49, 𝑘1 = 12 and 𝑘0=6 . (test stat 1pt)
Note: it is also ok if you omit 𝛽11𝐷𝑡.
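A minimal sketch of the test computation; T, k0 and k1 come from the question, but the SSR values are assumed for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical SSR values from the restricted and the dummy-interaction model
ssr0, ssr1 = 1520.0, 1480.0      # assumed numbers for illustration
T = 250 * 49                     # 250 firms, 1970-2018 = 49 years
k0, k1 = 6, 12                   # parameters in restricted / unrestricted model

f_stat = ((ssr0 - ssr1) / (k1 - k0)) / (ssr1 / (T - k1))
p_value = stats.f.sf(f_stat, k1 - k0, T - k1)   # reject H0 for small p-values
print(f_stat, p_value)
```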
You regress leverage on the other variables (hence without any dummies) using a random effects specification, and obtain very different results from a fixed effects specification.
f) How do you interpret this result, and what will you do next? (2pt)
I conclude that the random effect might be correlated with the regressors and there might be an endogeneity concern. (1pt)
I could do a Hausman test to see whether this is actually the case, but for sure would trust the fixed effect result more than the random effects result. (1pt)
a) What is the crucial difference between CONDITIONAL and UNCONDITIONAL volatility? (1 pt)
Unconditional volatility does not take into account any information, while conditional volatility is the volatility given some information set (usually information up to one period earlier in time). (1 pt)
Note: the CONDITIONING is important, NOT if it is fixed through time or not.
Question 4 (Volatility modeling; 10 pt) Sanne would like to model the volatility of stock returns. She has daily stock returns (in percentages) of the Bank of America Inc. (BAC) from 2001 – 2014. In addition, she also has daily values of the VIX index. Figure 4.1 plots the daily returns.
Sanne hypothesizes that the VIX could be related to the volatility of BAC stock returns.
b) Explain why Sanne is possibly right? (1 pt)
The VIX is a forward-looking indicator of the volatility of the S&P 500 index. If the general stock market volatility goes up, the volatility of a big bank (BAC) will probably also go up. (1 pt)
Andre does not believe in the effect of the VIX on the variance of BAC returns and estimates a simple GARCH(1,1) model, assuming a conditional Normal distribution for the returns. After estimating the parameters, he obtains the fitted variances 𝜎𝑡2̂.
d) Describe the full procedure how to test if the volatility model is correctly specified given the fitted volatilities? [Be explicit] (2 pt)
1) Compute the standardized residuals û_t = (r_t − μ)/σ̂_t
(1 pt)
2) Check if there is any autocorrelation left in the squared values of 𝑢̂𝑡 with a Ljung Box test. (1 pt)
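A minimal sketch of the two-step procedure, using simulated returns and fitted volatilities (all inputs assumed) and a hand-computed Ljung-Box statistic:

```python
import numpy as np
from scipy import stats

# Toy inputs: returns r, constant mean mu, fitted volatilities sigma (all assumed)
rng = np.random.default_rng(1)
sigma = np.sqrt(0.5 + 0.3 * rng.random(1000))   # stand-in for the fitted sigma_t-hat
mu = 0.05
r = mu + sigma * rng.standard_normal(1000)      # returns consistent with the model

u = (r - mu) / sigma                  # 1) standardized residuals u_t
u2 = u**2 - np.mean(u**2)             # 2) Ljung-Box test on the squared values
n, lags = len(u2), 10
acf = [np.sum(u2[k:] * u2[:-k]) / np.sum(u2**2) for k in range(1, lags + 1)]
q = n * (n + 2) * sum(rho**2 / (n - k) for k, rho in enumerate(acf, start=1))
p = stats.chi2.sf(q, df=lags)         # large p: no ARCH effects left in u_t^2
print(q, p)
```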
e) Explain what is meant by an omitted variables bias?
It is a bias in the coefficients of the regressors included in the model, caused by a regressor that impacts BM, is correlated with the included regressors, but is left out of the model.
a) What does the 𝑅^2 of this regression tell you?
The % of explained variation of the dependent variable by the regression model
Unbiasedness of OLS
THEOREM (normality of OLS β̂): If
Assumption 0a (correct specification)
Assumption 0b (no multi-collinearity)
Assumption 1 (zero mean errors)
Assumption 2 (homoskedasticity)
Assumption 3 (uncorrelated errors)
Assumption 4 (regressors not stochastic)
Assumption 5 (normality)
Violation 1: heteroskedasticity
Test: Engle LM test, LB on squared residuals (but also other possibilities)
Violation 2: autocorrelation
Test: Ljung-Box or Box-Pierce or others
Violation 2 (alt): non-normality, or even endogeneity (omitted variables)
Test (alt): Bera-Jarque, or adding additional risk factors
Looking at the results of your regression, you see that none of the coefficients has an absolute t-value
higher than 1.5.
c) Argue why you can/cannot conclude from this that the 5 risk factors jointly fail to adequately
describe the risk in “dedicated short bias” hedge fund returns.
You cannot conclude this. Testing for multiple restrictions at the same time requires an F-test.
This is particularly important if there are multicollinearity issues. You may expect these to be absent here given the 5 risk factors.
d) Using HAC standard errors, the significance of the individual coefficients drops further. Explain
how this drop in significance by the use of HAC standard errors may come about.
d)
HAC standard errors become larger than common ones if the regression residuals are large in absolute magnitude at the same time as the regressors are far from their mean, or if the errors are positively correlated over time.
You repeat your regression for the “event driven” hedge fund returns and obtain different
coefficients than for “dedicated short bias”.
e) Explain how you will test whether these differences are statistically significant
[hint: what alternative regression or what additional auxiliary regression is needed to do this]
e)
You can run a pooled regression for the two styles, store the SSR as SSR0. The sum of the SSR’s for the separate regressions is SSR1. Now you can make an F-test.
You can also make a joint regression model with a dummy interaction for the DSB style, and do an F-test. This amounts to doing a Chow break test.
Going back to the regression results for the “dedicated short bias”, you regress the “dedicated short bias” returns on a constant and on the market excess return M_t only. You find a regression
coefficient for the market return of -0.51. You worry that you might have forgotten an explanatory
variable measuring the illiquidity climate of the market. You expect this liquidity variable to have a
direct negative impact both on the hedge fund return and the market return.
f) Explain intuitively why this omitted regressor may bias the coefficient estimate for M_t.
g) Argue what is the direction of the bias, i.e., why you expect the true coefficient to be more negative than −0.51, or why you expect it to be less negative (or even positive).
Part of the effect of the omitted illiquidity on HF return now runs through the included market return variable.
(mention: effect via market return variable,)
h) Argue why in the above regressions it would / would not be recommended to use robust standard errors rather than common standard errors given the data set (returns (panel) and risk factors (time series)) at hand.
Returns are often known to be heteroskedastic due to volatility clustering, so heteroscedasticity robust standard errors make sense to prevent flawed inference.
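As a sketch of why robust standard errors can matter, here is a numpy-only Newey-West (HAC) computation on simulated data with positively autocorrelated errors; the lag length L = 4 and all data are assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n, L = 500, 4                          # sample size and HAC lag length (assumed)
x = rng.standard_normal(n)
e = np.convolve(rng.standard_normal(n + 4), np.ones(5) / 5, mode='valid')  # MA errors
y = 1.0 + 0.5 * x + e

X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
u = y - X @ beta
XtXi = np.linalg.inv(X.T @ X)

# HAC "meat": weighted autocovariances of the score x_t * u_t (Bartlett kernel)
score = X * u[:, None]
S = score.T @ score
for l in range(1, L + 1):
    G = score[l:].T @ score[:-l]
    S += (1 - l / (L + 1)) * (G + G.T)

se_hac = np.sqrt(np.diag(XtXi @ S @ XtXi))
se_ols = np.sqrt(np.diag(XtXi) * (u @ u) / (n - 2))
print(se_ols, se_hac)   # the intercept SE in particular rises under positive autocorrelation
```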
Suppose that you, as a company, have a portfolio that contains all 500 assets. You would like to compute the minimum required capital. You are given the EAD of each company, denoted by EAD_i. Assume an LGD of 50% and assume that the fitted probabilities of default from your Logit model are the true Basel-based probabilities.
You would like to account for parameter uncertainty while computing the minimum required capital.
f) Provide an argument why it makes sense to account for this uncertainty? You may use the output from Table 1.1 (1 pt)
g) Describe the full procedure how to account for parameter uncertainty when calculating this minimum requirement ? (3 pt)
E.g.: by neglecting the parameter uncertainty you do not take into account that some coefficients are estimated much more precisely than others. This affects the probabilities of default and, in the end, the distribution of the minimum required capital.
MRC_i = PD_i × EAD_i × LGD, and MRC_Port = Σ_i MRC_i
The procedure reads
1) Estimate the logit model and store the parameter vector β̂ and its covariance matrix V̂
2) Simulate β_j from N(β̂, V̂)
3) Compute PD_ij for each company via PD_ij = exp(x_i′β_j) / (1 + exp(x_i′β_j))
4) Compute MRC_ij = PD_ij × EAD_i × LGD
5) Compute MRC_Port,j = Σ_i MRC_ij
6) Repeat steps 2 to 5 N times (j = 1, …, N)
In the end, we get N simulated MRC values of the portfolio.
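The six steps can be sketched as follows; the coefficient estimates, covariance matrix, firm characteristics and EADs are all assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed inputs: logit estimates beta-hat, covariance V-hat, regressors x_i, EAD_i
beta_hat = np.array([-2.0, 0.8])
V_hat = np.array([[0.04, 0.0], [0.0, 0.01]])
X = np.column_stack([np.ones(500), rng.standard_normal(500)])   # 500 firms
EAD = rng.uniform(1.0, 10.0, 500)
LGD, N = 0.5, 10_000

mrc_port = np.empty(N)
for j in range(N):
    beta_j = rng.multivariate_normal(beta_hat, V_hat)   # 2) draw beta_j ~ N(beta-hat, V-hat)
    pd_j = 1 / (1 + np.exp(-X @ beta_j))                # 3) PD_ij via the logistic CDF
    mrc_port[j] = np.sum(pd_j * EAD * LGD)              # 4)-5) sum MRC_ij over firms
# 6) N simulated portfolio MRC values; e.g. report the mean and a high quantile
print(mrc_port.mean(), np.quantile(mrc_port, 0.99))
```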
The positive found relationship between ln(wage) and education can be a result of just correlation or a real causal effect.
c) Explain the difference between correlation and causality ? (1 pt)
Correlation is a statistical measure of dependency between two variables (without any reasoning/theory). (0.5 pt)
Causality means that y changes because x changes, due to a certain (economic) relationship between y and x. (0.5 pt)
The estimated coefficient β1 could be biased due to endogeneity problems. For example, we do not include Ability_i (since we cannot measure this variable).
d) Explain why the above example could lead to a biased coefficient β1? (1 pt)
Rosy states that by using panel data, the aforementioned bias problem could vanish.
e) Is Rosy right? Argue WHY or WHY NOT? (1 pt)
Ability has an effect on education such that we omit a variable here. This could lead to an endogeneity problem. (correlation between X and the error term) (0.5 pt)
Since Ability is correlated with education, we indeed will get a biased beta, (0.5 pt)
Note: you should mention that ability is correlated with education!!!
Yes, because Ability𝑖 does not change over time! (0.5 pt)
Hence by using panel data with individual-specific effects, this effect will be mopped up by the individual-specific intercepts. (0.5 pt)
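A small simulation can illustrate the argument: with a time-invariant Ability that is correlated with education, pooled OLS is biased while the within (fixed effects) estimator recovers the true coefficient. All numbers below are assumed:

```python
import numpy as np

rng = np.random.default_rng(4)
n_i, n_t = 200, 10                     # individuals and years (assumed)

ability = rng.standard_normal(n_i)                            # time-invariant, unobserved
educ = ability[:, None] + rng.standard_normal((n_i, n_t))     # correlated with ability
lnwage = 0.10 * educ + 0.50 * ability[:, None] + 0.10 * rng.standard_normal((n_i, n_t))

# Pooled OLS, omitting ability: education picks up part of ability's effect
x, y = educ.ravel(), lnwage.ravel()
Xp = np.column_stack([np.ones(x.size), x])
b_pooled = np.linalg.lstsq(Xp, y, rcond=None)[0][1]

# Within (fixed effects) estimator: demeaning per individual wipes out ability
xd = (educ - educ.mean(axis=1, keepdims=True)).ravel()
yd = (lnwage - lnwage.mean(axis=1, keepdims=True)).ravel()
b_fe = (xd @ yd) / (xd @ xd)
print(b_pooled, b_fe)   # pooled is biased upward; FE is close to the true 0.10
```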
You are given panel data of 100 S&P 500 listed firms, which could be grouped into 10 different industries. You study a famous corporate finance subject whether the composition of the board influences the leverage of the firm. A theory from sociology says that men would like to take more risk than women at a board.
You run the following regression model:
EQ 3.1 Lev_it = β0 + β1 BGD_it + x_it′β + ε_it
with t in years (1990 – 2017) and BGD_it the percentage of males on the board of company i at the end of year t, and x_it′ a row vector containing several control variables. Standard errors are computed using the normal OLS standard errors. You find a positive β1 of 0.50 with a standard error equal to 0.10.
a) Argue why the usual OLS standard errors are possibly wrong here? Also provide a possible solution? (2 pt)
OLS standard errors assume that there is no correlation between ε_i,t and ε_i,t+1. However, it could be that the leverage and/or BGD values of a certain company i do not vary much over time, so that ε_i,t and ε_i,t+1 are correlated! (1 pt)
Hence one should use clustered standard errors, and then cluster on firm (1 pt)
ALSO OK: Heteroskedasticity (1 pt): however, recall that robust standard errors (vce robust) do NOT solve the problem of correlation between ε_i,t and ε_i,t+1
Note: HAC/White is wrong here as we deal with panel data!
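A sketch of firm-clustered standard errors versus ordinary OLS ones, on simulated panel data with a firm effect in the error term (all inputs assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
n_i, n_t = 100, 28                                 # 100 firms, 28 years (1990-2017)
firm = np.repeat(np.arange(n_i), n_t)
a = rng.standard_normal(n_i)                       # firm effect -> within-firm correlation
x = rng.standard_normal(n_i * n_t)
y = 1.0 + 0.5 * x + a[firm] + rng.standard_normal(n_i * n_t)

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)
u = y - X @ beta
XtXi = np.linalg.inv(X.T @ X)

# Cluster-robust "meat": outer products of within-firm sums of the score x_t * u_t
S = np.zeros((2, 2))
for g in range(n_i):
    sg = (X[firm == g] * u[firm == g, None]).sum(axis=0)
    S += np.outer(sg, sg)

se_cluster = np.sqrt(np.diag(XtXi @ S @ XtXi))
se_ols = np.sqrt(np.diag(XtXi) * (u @ u) / (len(y) - 2))
print(se_ols, se_cluster)   # ignoring the within-firm correlation understates uncertainty
```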
You use the data until 2009 to estimate the parameters. The remaining data is used as out-of-sample observations. After generating 2-step ahead forecasts, you compute the associated forecast errors.
e) Do you expect that the MSPE of the AR(0) will be larger or lower than the MSPE of the ARMA(1,1) model? Explain your result! (1 pt)
Since there is a lot of autocorrelation, the ARMA(1,1) will capture part of it, so its forecasts are much better than those of a model with no autocorrelation (the AR(0)). Hence the MSPE of the ARMA(1,1) model is expected to be lower. (1 pt)
Keyword: autocorrelation, that is not captured by AR(0)
You discover that there is another risk factor out there in the literature, namely stock return differentials between portfolios of low minus high liquidity stocks. You expect this factor to carry a positive risk premium. You expect the stock liquidity to be correlated with market capitalization (i.e., with size): smallcap stocks are less liquid on average.
g) Explain the direction of the omitted variables bias you expect when regressing r_it on the four risk factors r_t^M, SMB_t, HML_t, UMD_t only, thus omitting the liquidity risk factor. (2pt)
You apparently expect a positive impact of the omitted risk factor, and a positive correlation between low liquidity and low market cap, so a positive correlation between SMB and the new risk factor. Therefore, SMB also captures part of the positive (1pt) impact of the omitted risk factor on the returns, and the SMB coefficient will be biased upwards (have a positive bias). (1pt)
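A small simulation of this bias direction, with an assumed data-generating process in which the omitted liquidity factor carries a positive premium and correlates positively with SMB:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 5000

liq = rng.standard_normal(n)                        # the omitted liquidity factor
smb = 0.6 * liq + rng.standard_normal(n)            # SMB positively correlated with liq
mkt, hml, umd = (rng.standard_normal(n) for _ in range(3))
r = -0.5 * mkt + 0.2 * smb + 0.4 * liq + 0.1 * hml + 0.1 * rng.standard_normal(n)

X_full = np.column_stack([np.ones(n), mkt, smb, hml, umd, liq])
X_short = X_full[:, :-1]                            # same regression, liquidity omitted
b_full = np.linalg.lstsq(X_full, r, rcond=None)[0]
b_short = np.linalg.lstsq(X_short, r, rcond=None)[0]
print(b_full[2], b_short[2])   # the SMB coefficient is biased upwards in the short model
```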