Week 6 Flashcards
What are the steps to find the ML-estimator?
Write down the likelihood (the joint PDF as a function of the parameters), take its natural log if convenient, differentiate with respect to the parameters, equate the derivatives to 0, and solve.
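A minimal worked illustration (my example, not from the card): for an i.i.d. exponential sample y1, …, yn with density f(y) = λe^(−λy),
ln L(λ) = n ln λ − λ Σ yi, so ∂ ln L/∂λ = n/λ − Σ yi = 0, which gives λ hat = n/Σ yi, the reciprocal of the sample mean.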
What do we optimize in Least Squares, Best Linear Unbiased, and Maximum Likelihood estimation?
Least Squares: (minimize) the sum of squared deviations
BLUE: (minimize) the variance, subject to linearity and unbiasedness
MLE: (maximize) the likelihood function
What is the difference between the Likelihood function and the CDF?
The interpretation differs: with the likelihood function the data are held fixed and you vary the parameter to find the value that makes the observed data most likely, whereas with the CDF the parameters are fixed and you ask what the probability is for given data.
What are the ML-estimators for a linear regression with Normally distributed errors?
β hat = (X'X)⁻¹X'y
σ² hat = (y − Xβ hat)'(y − Xβ hat)/n
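A minimal numerical sketch of these two formulas (assuming numpy; the design matrix, coefficients, and noise level below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])          # design matrix with intercept
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)  # normal errors

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # beta hat = (X'X)^-1 X'y
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / n                 # ML estimator divides by n (not n - k)

print(beta_hat, sigma2_hat)
```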
What is invariance, and what does it have to do with the ML-estimator?
If θ hat is the ML-estimator of θ, then g(θ hat) is the ML-estimator of g(θ). For example, since σ² hat is the ML-estimator of σ², √(σ² hat) is the ML-estimator of σ.
How do you calculate the information matrix?
F = −E(H), where H is the matrix of second derivatives (the Hessian) of the log-likelihood function.
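For the normal linear regression model of the earlier card this gives the standard block-diagonal result (a textbook expression, not spelled out on the card):
F(β, σ²) = (X'X/σ²   0)
           (0        n/(2σ⁴))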
What is the first-order regularity condition?
E(q) = 0, where q = ∂ ln L/∂θ is the score (the first derivative of the log-likelihood).
What is the second-order regularity condition?
var(q) = −E(H) = F, where H = ∂² ln L/∂θ∂θ' is the Hessian of the log-likelihood.
How are the first- and second-order regularity conditions proven?
Let f be the PDF, so that
∫ f dy = 1.
Differentiating under the integral sign:
∫ ∂f/∂θ dy = ∂/∂θ ∫ f dy = ∂1/∂θ = 0, and
∫ ∂²f/∂θ² dy = 0 (by the same trick).
Now consider:
∂ ln f/∂θ = (1/f) ∂f/∂θ, and
∂² ln f/∂θ² = ∂/∂θ [(1/f) ∂f/∂θ] = (∂f⁻¹/∂θ)(∂f/∂θ) + (1/f) ∂²f/∂θ² = −(1/f²)(∂f/∂θ)² + (1/f) ∂²f/∂θ² = −(∂ ln f/∂θ)² + (1/f) ∂²f/∂θ².
Taking expectations (E(g) = ∫ g f dy):
E(∂ ln f/∂θ) = ∫ (∂ ln f/∂θ) f dy = ∫ ∂f/∂θ dy = 0, and
E(∂² ln f/∂θ²) = −E[(∂ ln f/∂θ)²] + ∫ ∂²f/∂θ² dy = −E[(∂ ln f/∂θ)²] = −var(∂ ln f/∂θ), using the first result.
That is, E(q) = 0 and var(q) = −E(H) = F.
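A quick Monte Carlo sanity check of both conditions (a sketch with numpy; the N(θ, 1) example and the sample sizes are my own choices, not from the card). For n i.i.d. draws from N(θ, 1) the score is q = Σ(yi − θ) and the Hessian is H = −n, so the simulated mean of q should be near 0 and its variance near n:

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 50, 100_000

y = rng.normal(loc=theta, scale=1.0, size=(reps, n))
q = (y - theta).sum(axis=1)   # score of the N(theta, 1) log-likelihood at the true theta

print(q.mean())               # ~ 0: first-order regularity, E(q) = 0
print(q.var(), n)             # ~ n: second-order regularity, var(q) = -E(H) = F
```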
What does the Cramer-Rao inequality say?
It says that the variance of an unbiased estimator has a lower bound given by the inverse of var(q), or equivalently by F⁻¹. If an unbiased estimator achieves the Cramer-Rao bound, it has the lowest possible variance among unbiased estimators, i.e. it is efficient.
Is the Least Squares beta estimator efficient?
Yes. Its variance attains the Cramer-Rao bound, so it is not only the best linear unbiased estimator, but also the best unbiased estimator.
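Concretely (standard expressions, not spelled out on the card): var(β hat) = σ²(X'X)⁻¹, which is exactly the inverse of the β-block of F, namely (X'X/σ²)⁻¹, so the bound is attained.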
Is the Least Squares sigma estimator efficient?
Although it doesn't attain the Cramer-Rao lower bound, it is efficient in the sense that no other unbiased estimator of σ² has a smaller variance.
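For comparison, taking the unbiased LS estimator s² = e'e/(n − k) with k regressors and normal errors (standard expressions, not given on the card): var(s²) = 2σ⁴/(n − k), while the Cramer-Rao bound is [n/(2σ⁴)]⁻¹ = 2σ⁴/n, so the bound is not attained whenever k > 0.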
What is the proof of the Cramer-Rao lower bound (for a single parameter)?
We know that E(q) = 0, var(q) = F, and (asymptotically) var(β hat, σ² hat) ≈ [var(q)]⁻¹ = F⁻¹.
Suppose we have an unbiased estimator t of a parameter θ, so that
E(t) = ∫ t f dy = θ, which implies ∂E(t)/∂θ = 1. Letting q = ∂ ln f/∂θ, this gives:
1 = ∂/∂θ ∫ t f dy = ∫ ∂(tf)/∂θ dy = ∫ (∂t/∂θ) f dy + ∫ t (∂f/∂θ) dy = ∫ t (∂f/∂θ) dy = ∫ t q f dy = E(tq) = cov(t, q).
We used the facts that ∂t/∂θ = 0 (t depends only on the data) and that E(q) = 0, so cov(t, q) = E(tq). Now consider z = t − θ − αq, where α is a constant to be chosen later. Since t is unbiased and E(q) = 0, we have E(z) = 0. Also:
0 ≤ var(z) = var(t) − 2α cov(t, q) + α² var(q) = var(t) − 2α + α² var(q). Choosing α = 1/var(q) gives 0 ≤ var(t) − 2/var(q) + 1/var(q) = var(t) − 1/var(q), thus var(t) ≥ 1/var(q).
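A quick illustration (my example, not the card's): for n i.i.d. draws from N(θ, 1), t = ȳ (the sample mean) is unbiased with var(t) = 1/n, and var(q) = n, so var(t) = 1/var(q) and the bound is attained.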
How are the ML-estimators found when linear restrictions Rβ = r are imposed?
First we define the Lagrangian: ψ(β, σ²) = ln L(β, σ²) − λ'(Rβ − r), where λ = (λ1, λ2, …, λm)' is a vector of m Lagrange multipliers. Differentiating and equating to 0 gives the following system:
(X'y − X'Xβ tilde)'/σ² tilde = λ tilde'R,
n/(2σ² tilde) = (y − Xβ tilde)'(y − Xβ tilde)/(2(σ² tilde)²),
Rβ tilde = r.
From this we can obtain:
β tilde = β hat − σ² tilde (X'X)⁻¹R'λ tilde, and thus:
r = Rβ tilde = Rβ hat − σ² tilde R(X'X)⁻¹R'λ tilde.
Solving for λ tilde and substituting back gives:
β tilde = β hat − (X'X)⁻¹R'(R(X'X)⁻¹R')⁻¹(Rβ hat − r), and
σ² tilde = (y − Xβ tilde)'(y − Xβ tilde)/n.
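A minimal numpy sketch of the closed-form solution (the data and the restriction Rβ = r below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)

R = np.array([[0.0, 1.0, 1.0]])   # hypothetical restriction: the two slopes sum to 1
r = np.array([1.0])

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y                    # unrestricted ML estimator
A = R @ XtX_inv @ R.T                           # A = R(X'X)^-1 R'
beta_tilde = beta_hat - XtX_inv @ R.T @ np.linalg.solve(A, R @ beta_hat - r)

resid = y - X @ beta_tilde
sigma2_tilde = resid @ resid / n                # restricted ML estimator of sigma^2
print(beta_tilde, R @ beta_tilde)               # R beta_tilde should equal r
```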
What is a much easier way of finding the restricted ML-estimators using vectors and matrices?
Find the inverse of
G = (X'X  R')
    (R    0 )
which is
G⁻¹ = ((X'X)⁻¹ − (X'X)⁻¹R'A⁻¹R(X'X)⁻¹    (X'X)⁻¹R'A⁻¹)
      (A⁻¹R(X'X)⁻¹                        −A⁻¹        )
where A = R(X'X)⁻¹R'.
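A sketch of the partitioned-matrix route (how G is used is my reading, since the card only gives its inverse): stacking the first-order conditions as G (β tilde, σ² tilde·λ tilde)' = (X'y, r)' and solving the block system reproduces the same β tilde as the closed-form expression above; the data and restriction are the same made-up ones as in the previous snippet:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 200, 1
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(scale=1.5, size=n)
R = np.array([[0.0, 1.0, 1.0]])   # hypothetical restriction: the two slopes sum to 1
r = np.array([1.0])

G = np.block([[X.T @ X, R.T],
              [R, np.zeros((m, m))]])   # G = [[X'X, R'], [R, 0]]
rhs = np.concatenate([X.T @ y, r])

sol = np.linalg.solve(G, rhs)           # stacks beta_tilde and sigma2_tilde * lambda_tilde
beta_tilde = sol[:X.shape[1]]
print(beta_tilde, R @ beta_tilde)       # R beta_tilde equals r
```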