Unit 7 Flashcards
Note
Worth reading pages 1-4 for understanding
If α is an unknown parameter, how is the likelihood of the data denoted?
f(X1,…,Xn|α) (the notation shows that the joint density, and hence the likelihood, depends on α)
Then define
L(α) = f(X1,…,Xn|α)
How can we simplify L(α) once we assume that the Xi are independent?
We can then see the likelihood is the product of the individual densities
L(α) = f(X1|α)…f(Xn|α)
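As a small numerical sketch of this product form (the exponential distribution and the data values here are my own illustration, not from the cards), assuming Xi ~ Exp(α) with density f(x|α) = α e^(−αx):

```python
import math

def exp_pdf(x, alpha):
    # density of Exp(alpha): f(x|alpha) = alpha * exp(-alpha * x)
    return alpha * math.exp(-alpha * x)

def likelihood(data, alpha):
    # L(alpha) = f(x1|alpha) * ... * f(xn|alpha), using independence
    L = 1.0
    for x in data:
        L *= exp_pdf(x, alpha)
    return L

# hypothetical observed sample
data = [0.5, 1.2, 0.3, 2.0]
print(likelihood(data, 1.0))
```

For this density the product collapses to α^n e^(−α Σxi), which is what makes the algebra tractable.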
What is the idea of MLE?
To choose the estimate of α to make the likelihood L(α) as large as possible!
I.e. to make the observed data as likely as possible!
How to optimise L(α)?
L=objective function
α=choice variable
Differentiate L with respect to α, set the derivative equal to 0, solve for the estimate of α, and check via the second-order condition that it is a maximum point
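A worked example of these steps (my own illustration using the exponential distribution, not from the cards): with Xi ~ Exp(α),

\[
L(\alpha) = \prod_{i=1}^{n} \alpha e^{-\alpha X_i} = \alpha^{n} e^{-\alpha \sum_i X_i},
\qquad
\frac{dL}{d\alpha} = \alpha^{n-1} e^{-\alpha \sum_i X_i}\Big(n - \alpha \textstyle\sum_i X_i\Big) = 0
\;\Rightarrow\;
\hat{\alpha} = \frac{n}{\sum_i X_i} = \frac{1}{\bar{X}}.
\]

The derivative is positive for α below n/ΣXi and negative above it, so this stationary point is indeed a maximum.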
What can we do to avoid algebraic mess?
Since log(x) is increasing in x, L(α) attains its maximum at the same value of α as logL(α); therefore we can take logs of both sides and maximise the log-likelihood function instead
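A quick numerical check that maximising the log-likelihood gives the same answer as the algebra (again an illustration I've made up using Exp(α) data; the golden-section search is just a generic 1-D maximiser, not anything from the cards):

```python
import math

def log_likelihood(data, alpha):
    # logL(alpha) = n*log(alpha) - alpha * sum(x_i) for Exp(alpha) data
    n = len(data)
    return n * math.log(alpha) - alpha * sum(data)

def argmax_golden(f, lo, hi, tol=1e-9):
    # golden-section search for the maximiser of a unimodal f on [lo, hi]
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2

# hypothetical observed sample
data = [0.5, 1.2, 0.3, 2.0]
alpha_hat = argmax_golden(lambda a: log_likelihood(data, a), 0.01, 10.0)
# analytic MLE for the exponential: alpha_hat = 1 / sample mean
print(alpha_hat, 1 / (sum(data) / len(data)))
```

Because logL is concave here, the numerical maximiser agrees with the closed-form 1/X̄.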
What can we say about MLE estimators in large samples?
They are approximately normally distributed with mean α (the true parameter value) and variance equal to the CRLB (Cramér-Rao Lower Bound) (don't need to know how to calculate it)
How do we know MLE are asymptotically unbiased?
As n → ∞, E(α̂) → α, i.e. the expected value of the estimator converges to the true parameter value
See optional material pp. 11-14, and the summary and other examples from p. 15 onwards