Financial Econometrics Flashcards
Efficient market implications
(1) Prices fully reflect all available information, so price changes are random
(2) Prices follow random walks
(3) Trade-off between risk and expected return, so average abnormal returns are zero.
(4) No free lunch, no arbitrage
(5) ‘Active’ management does not add value
Definition Market Efficiency
“A market where prices always ‘fully reflect’ available information is called efficient.” Fama (1970)
Relevance of the EMH (from Nobel Prize 2013):
“In the 1960s, Eugene Fama demonstrated that stock price movements are impossible
to predict in the short-term and that new information affects prices almost
immediately, which means that the market is efficient. The impact of Eugene Fama’s
results has extended beyond the field of research. For example, his results influenced
the development of index funds.”
Why is it difficult to test for market efficiency?
According to Fama (2014), we cannot test whether the market does what it is supposed to do unless we specify what it is supposed to do. This is the joint hypothesis problem. In particular,
(1) To test the EMH you need to specify expected prices or expected returns. That is, you need a model of how investors build expectations about future prices or returns.
(2) In testing the EMH, we really test whether the expected returns implied by a model are actually observed.
(3) If the test is rejected, it could be that the model is wrong, the EMH is wrong, or both.
What are the three common approaches for parameter estimation?
(1) OLS: Minimise the squared distance between the observed data and the model’s fitted values (the parameters to estimate are the intercept and slope coefficient)
(2) MLE: Maximise the likelihood of parameters for the observed data
(3) GMM: Generalised method of moments. Solves a set of moment conditions, e.g., for μ and σ.
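A minimal Python sketch (with simulated data, purely illustrative) of how each approach recovers μ and σ from a sample; in this simple normal example the three estimates essentially coincide:
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=0.05, scale=0.2, size=1000)  # simulated 'returns' (assumption)

# OLS: regress y on a constant; the intercept estimate is the sample mean
X = np.ones((len(y), 1))
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# MLE (normal likelihood): closed form mu_hat = mean, sigma_hat^2 = (1/n)*sum((y_i - mean)^2)
mu_mle = y.mean()
sigma_mle = np.sqrt(((y - mu_mle) ** 2).mean())

# GMM: solve the moment conditions E[y - mu] = 0 and E[(y - mu)^2 - sigma^2] = 0
# with their sample analogues (exactly identified, so this is the method of moments)
mu_gmm = y.mean()
sigma_gmm = np.sqrt(((y - mu_gmm) ** 2).mean())

print(beta_ols[0], (mu_mle, sigma_mle), (mu_gmm, sigma_gmm))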
Least square: Methodology
Least squares minimises the squared distance between the observed data and the fitted values of the model. It can be used to fit a line to a scatter of data. The line has the formula y = a + bx, where a is the intercept and b the slope.
“Fit”: explain / predict the value of y (dependent, endogenous variable) that is associated with a given value of x (explanatory, independent, exogenous variable). The least squares criterion is
S(a,b) = ∑e_i^2 =∑(y_i-a-bx_i )^2
We minimise S(a,b) with respect to a and b by setting the first derivatives with respect to both equal to zero.
We obtain
a = ȳ − b·x̄
b = ∑(x_i − x̄)(y_i − ȳ) / ∑(x_i − x̄)^2
where x̄ and ȳ denote the sample averages of x and y.
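A small Python sketch (made-up numbers, not from the course) applying these closed-form formulas:
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # illustrative data (assumption)
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # slope
a = y.mean() - b * x.mean()                                                # intercept
print(a, b)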
Least Square: Residuals
A residual is the error term associated with an individual observation. It represents the difference between the actual value of the dependent variable and the value predicted by the regression model for that particular independent variable value.
e_i = y_i - a - bx_i
The residuals e_i have a zero mean and are uncorrelated with the explanatory variable, that is
∑e_i = 0, ∑(x_i − x̄)·e_i = 0
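A quick numerical check of these two properties (the same illustrative data as above, repeated so the snippet is self-contained):
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
e = y - a - b * x                             # residuals
print(np.sum(e), np.sum((x - x.mean()) * e))  # both zero up to rounding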
Least Square: R^2
Measures the model performance by comparing the sum of squared residuals with the sum of squares of (y_i − ȳ).
Total sum of squares = explained sum of squares + sum of squared residuals
R^2 = SSE/SST = b^2·∑(x_i − x̄)^2 / ∑(y_i − ȳ)^2
R^2 is equal to the square of the correlation coefficient between x & y. Note that 0 ≤ R^2 ≤ 1. R^2 tells us how well the independent variable(s) explain the variability of the dependent variable around its mean. Further, the LS criterion is equivalent to maximising R^2.
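A short Python sketch (illustrative data) computing R^2 both as explained over total sum of squares and as the squared correlation between x and y:
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

sst = np.sum((y - y.mean()) ** 2)            # total sum of squares
sse = b ** 2 * np.sum((x - x.mean()) ** 2)   # explained sum of squares
r2 = sse / sst
print(r2, np.corrcoef(x, y)[0, 1] ** 2)      # identical up to rounding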
Least Square: Assumptions
A1: The data generating process is linear, so y_i=α+βx_i+ε_i
A2: The n observations of x_i are fixed numbers and not random variables
A3: The n error terms ε_i are random with E[ε_i ]=0
A4: The variance of the n errors is fixed, E[ε_i^2] = σ^2, so homoscedasticity
A5: The errors are uncorrelated, E[ε_i·ε_j] = 0 for j ≠ i, so no serial correlation
A6: α and β are unknown but fixed for all n observations.
A7: The errors ε_1,…,ε_n are jointly normally distributed
Conclusion
With A3–A5 and A7, the error terms are normally and independently distributed, i.e., ε_i∼NID(0,σ^2).
With A1-A7, the variation in y_i is partly the effect of variation in x_i, i.e., y_i∼NID(α+βx_i,σ^2 )
Least Square: Significance test
A regression model explains variation in y through variation in x. This requires that y is truly related to x, i.e., β ≠ 0.
The problem is that we may estimate b ≠ 0 even when β = 0. So we want to test the null hypothesis H_0: β = 0 against the alternative hypothesis H_a: β ≠ 0.
We reject H_0 if b differs significantly from zero. Under A1–A6, the estimate b is a random variable. To test whether it is significant, we need to take its uncertainty into account.
We use the estimate s^2 = 1/(n−2)·∑e_i^2 of the error variance σ^2.
The test statistic t_b = (b − β)/s_b is then no longer normally distributed but follows a t(n−2)-distribution.
For β = 0, t_b = b/s_b is called the t-value of b.
s_b = standard error of b = s/√(∑(x_i − x̄)^2)
s = standard error of the regression
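A small Python sketch (illustrative data) computing s^2, the standard error s_b and the t-value t_b from the formulas above:
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
n = len(y)

b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
e = y - a - b * x

s2 = np.sum(e ** 2) / (n - 2)                    # estimate of the error variance
s_b = np.sqrt(s2 / np.sum((x - x.mean()) ** 2))  # standard error of b
t_b = b / s_b                                    # t-value of b under H_0: beta = 0
print(t_b)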
Least Square: Rejection Criteria
The null hypothesis (β=0) is rejected for the alternative hypothesis if b is ‘too far’ from zero, that is, if |t_b | is larger than a critical value c (|t_b |>c). Then b is significant. For a given significance level (usually 5%), the critical value is obtained from the t(n-2) distribution.
E.g. With n=30, we have c_(5%,n=30)=2.05; for n=60 we have c_(5%,n=60)=2.00; for n→∞ the critical value converges to 1.96.
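These critical values can be reproduced from the quantile function of the t(n−2) distribution, e.g. with scipy:
from scipy import stats

# two-sided 5% critical values from the t(n-2) distribution
for n in (30, 60, 1_000_000):
    print(n, stats.t.ppf(0.975, df=n - 2))
# roughly 2.05, 2.00 and 1.96, matching the values quoted above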
Difference univariate and multivariate regression
Univariate regression model: we assume the dependent variable is related to only one explanatory variable.
Multivariate regression model: contains more than one explanatory variable
Reason: Many factors might influence stock returns. The effect of each potential factor could be estimated with a univariate regression model, but the explanatory variables might be mutually related. Such mutual dependence is accounted for in a multivariate regression model that contains more than one explanatory variable.
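A minimal Python sketch (simulated data, assumed coefficients) of a multivariate regression with a constant and two mutually related explanatory variables:
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)                       # related to x1 by construction
y = 1.0 + 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])                # constant plus both regressors
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)                                              # intercept and two slope estimates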
Adjusted R^2
When you include many explanatory variables in the model (large k), R^2 will increase. The adjusted R^2 (or R̄^2) contains a punishment for including many explanatory variables. The punishment also depends on the sample size. It is given as
R̄^2 = 1 − (1 − R^2)·(n − 1)/(n − (k−1) − 1) = 1 − (1 − R^2)·(n − 1)/(n − k),
where k counts the explanatory variables including the constant.
The adjusted R-squared can be negative and never larger than R-squared. It increases only when the increase in R-squared due to inclusion of a new explanatory variable is more than expected by chance.
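A small sketch of the adjusted R^2 as a Python function; following the notation of these notes, k is taken to include the constant, and the numbers in the example call are made up:
def adjusted_r2(r2: float, n: int, k: int) -> float:
    # k counts the explanatory variables including the constant
    return 1 - (1 - r2) * (n - 1) / (n - k)

print(adjusted_r2(r2=0.40, n=60, k=4))  # smaller than the unadjusted 0.40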
Multivariate Least Square Estimation - OLS in three steps
- Choice of variables: Choose the variable to be explained (y) and the explanatory variable(s) (x_1,…,x_k). If you have more than one explanatory variable, x_1 will often be the constant variable that replaces a from the univariate model
- Collect data: Collect n observations of y and related values of x_1,…,x_k. You will have y as an (n×1)-vector and the data on the explanatory variables in X, which is an (n×k)-matrix.
- Compute the estimates: Use a regression package that computes the least square estimates.
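A minimal Python sketch of the three steps with simulated data: build the (n×1) vector y and the (n×k) matrix X, then compute the least-squares estimate b = (X'X)^(-1)·X'y, which is the closed form that regression packages compute (usually with more stable numerics):
import numpy as np

rng = np.random.default_rng(2)
n, k = 100, 3                                   # n observations, k regressors incl. constant

X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])      # (n x k) data matrix
y = X @ np.array([1.0, 0.5, -0.2]) + rng.normal(scale=0.3, size=n)  # (n x 1) vector

b = np.linalg.solve(X.T @ X, X.T @ y)           # least-squares estimates
print(b)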
Maximum Likelihood Estimation: Intuition
The intuition behind maximum likelihood estimation (MLE) is to maximise the likelihood of the estimated parameters given the observed outcomes. E.g., for a random sample y_1,…,y_n from N(μ, σ^2), the normal distribution with the largest likelihood is given by μ̂ = ȳ and σ̂^2 = (1/n)·∑(y_i − ȳ)^2.
E.g., suppose you have actually observed outcomes of y and two different normal distributions. You then look at where more of your observed outcomes are located. If more of them lie under the distribution on the right, that distribution has a larger likelihood than the one on the left.
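A small Python sketch (simulated sample) that maximises the normal log-likelihood numerically and compares the result with the closed-form MLE above:
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
y = rng.normal(loc=0.05, scale=0.2, size=500)   # assumed sample

def neg_log_likelihood(params):
    mu, log_sigma = params                      # optimise over log(sigma) to keep sigma > 0
    sigma = np.exp(log_sigma)
    return -np.sum(-0.5 * np.log(2 * np.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2))

res = minimize(neg_log_likelihood, x0=[0.0, 0.0])
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print(mu_hat, y.mean())                                  # matches the sample mean
print(sigma_hat, np.sqrt(((y - y.mean()) ** 2).mean()))  # matches the closed-form sigma hat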
Comparison of Methods
If your model is a linear equation (y_i = μ + ε_i, where ε_i ∼ IID(0, σ^2)), OLS is the intuitive choice.
GMM (comparing population and sample moments) and OLS (comparing observed data and model predictions) are both based on the idea of minimising a distance function.
ML does not minimise a distance but maximises the likelihood of the parameter values with respect to the observed data. It assumes the joint distribution of the data reflects the true data generating process. If the data generating process is hard to determine, use a less efficient but more robust method → GMM
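A minimal sketch (simulated sample) of the GMM idea for μ and σ: solve the sample moment conditions numerically; in this exactly identified case it reproduces the method-of-moments estimates:
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(4)
y = rng.normal(loc=0.05, scale=0.2, size=500)   # assumed sample

def moment_conditions(params):
    mu, sigma = params
    # sample analogues of E[y - mu] = 0 and E[(y - mu)^2 - sigma^2] = 0
    return [np.mean(y - mu), np.mean((y - mu) ** 2 - sigma ** 2)]

mu_gmm, sigma_gmm = fsolve(moment_conditions, x0=[0.0, 0.1])
print(mu_gmm, sigma_gmm)                        # close to the sample mean and std. deviation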