Statistical Inference: Section 3 Flashcards
What is the likelihood?(1)
We define the likelihood L(μ|x) for a particular value of μ, given a vector of observed data x, to be equal to the probability of getting that set of data given the candidate value of μ. We write: L(μ|x) ≡ f(x|μ), where f(·) refers to the probability mass function of the data.
What is important to note about the difference between the likelihood and the pmf?(1)
The switch in the direction of the conditioning is crucial. The probability mass function f(x|μ) is a function of x with μ fixed. The sum over all possible values of x is one. But the likelihood is a function of μ with x fixed. It is not a probability or probability density and it won’t sum/integrate to one over all possible μ.
How do you calculate a Poisson pmf, Pr(X = x | μ)?(1)
Pr(X = x | μ) = (μ^x e^(−μ)) / x!
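A minimal sketch of this formula (assuming Python with scipy is available; the values μ = 2.5 and x = 3 are purely illustrative):

```python
from math import exp, factorial
from scipy.stats import poisson

mu, x = 2.5, 3  # illustrative values

# Poisson pmf computed directly from the formula on this card
pmf_manual = (mu ** x) * exp(-mu) / factorial(x)

# Cross-check against scipy's built-in Poisson pmf
pmf_scipy = poisson.pmf(x, mu)

print(pmf_manual, pmf_scipy)  # both ≈ 0.2138
```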
How do you calculate likelihood?(1)
The product of all the individual pmfs/pdfs, provided the observations are independent.
This product becomes numerically very small as n grows, which is where the log-likelihood comes in!
The log is always the natural log (i.e. log to base e).
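A minimal sketch of the product-versus-sum-of-logs point, assuming independent Poisson observations (the data and μ below are purely illustrative):

```python
import numpy as np
from scipy.stats import poisson

x = np.array([2, 0, 3, 1, 4, 2, 1])  # illustrative i.i.d. Poisson data
mu = 2.0                             # candidate parameter value

# Likelihood: product of the individual pmfs (underflows for large n)
likelihood = np.prod(poisson.pmf(x, mu))

# Log-likelihood: sum of the log pmfs (numerically stable)
log_likelihood = np.sum(poisson.logpmf(x, mu))

print(likelihood, np.exp(log_likelihood))  # same value, two routes
```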
Why is the log-likelihood usually negative?(1)
Because we are usually taking the log of probabilities between 0 and 1, and these logs are negative. Density values can exceed 1, but that is unlikely in practice.
How do you find the maximum likelihood estimate?(3)
Find the log-likelihood from the pmfs/pdfs.
Differentiate it with respect to the parameter.
Set the derivative equal to zero and solve.
To check it is a maximum, find the second derivative; if it is negative at the solution, you have a maximum.
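As a sketch of this recipe: for i.i.d. Poisson data, setting the derivative of the log-likelihood to zero gives μ̂ = x̄, and a numerical optimiser should agree (scipy assumed available; the data are illustrative):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

x = np.array([2, 0, 3, 1, 4, 2, 1])  # illustrative data

# Analytic MLE: differentiating the Poisson log-likelihood and setting
# it to zero gives mu_hat = sample mean
mu_hat_analytic = x.mean()

# Numerical check: minimise the negative log-likelihood over mu > 0
neg_loglik = lambda mu: -np.sum(poisson.logpmf(x, mu))
res = minimize_scalar(neg_loglik, bounds=(1e-6, 20.0), method="bounded")

print(mu_hat_analytic, res.x)  # both ≈ 1.857
```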
Define the Score Statistic.(1)
U(X, θ)
Defined as the first derivative of the log-likelihood function with respect to θ.
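As an illustration (an i.i.d. Poisson(μ) sample of size n, not part of the original card):

```latex
% Score for an i.i.d. Poisson(mu) sample of size n
\ell(\mu) = \sum_{i=1}^{n} x_i \log\mu \;-\; n\mu \;-\; \sum_{i=1}^{n} \log(x_i!),
\qquad
U(\mathbf{x}, \mu) = \frac{\partial \ell}{\partial \mu} = \frac{\sum_i x_i}{\mu} - n .
```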
Define the Expected/Fisher information. Why is it important?(2)
The negative expectation of the second derivative of the log-likelihood is defined as the expected (Fisher) information:
I(θ) = −E(second derivative of the log-likelihood with respect to θ).
A larger I(θ) means the log-likelihood is more sharply peaked about its maximum, so the data carry more information and there is more certainty about the estimator.
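Continuing the illustrative Poisson example from the score card:

```latex
% Expected (Fisher) information for an i.i.d. Poisson(mu) sample of size n
\frac{\partial^2 \ell}{\partial \mu^2} = -\frac{\sum_i x_i}{\mu^2},
\qquad
I(\mu) = -E\!\left[\frac{\partial^2 \ell}{\partial \mu^2}\right]
       = \frac{E\left[\sum_i X_i\right]}{\mu^2}
       = \frac{n\mu}{\mu^2} = \frac{n}{\mu} .
```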
What is the Cramér-Rao inequality? Importance?(2)
Var(θ̂) ≥ 1/I(θ), subject to regularity conditions.
The right-hand side gives the minimum variance bound (the Cramér-Rao bound, CRB).
Define efficiency in terms of information. Asymptotically efficient?(2)
An estimator is efficient if Var(θ̂) = 1/I(θ), i.e. it attains the CRB.
It is asymptotically efficient if Var(θ̂)/CRB → 1 as n → ∞,
i.e. Var(θ̂) I(θ) → 1 as n → ∞.
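In the illustrative Poisson example, I(μ) = n/μ, so the CRB is μ/n, and the MLE μ̂ = x̄ attains it exactly:

```latex
% The Poisson MLE attains the Cramer-Rao bound (illustrative example)
\operatorname{Var}(\hat{\mu}) = \operatorname{Var}(\bar{X}) = \frac{\mu}{n} = \frac{1}{I(\mu)}
\quad\Longrightarrow\quad
\operatorname{Var}(\hat{\mu})\, I(\mu) = 1 ,
```

so the sample mean is efficient here, not just asymptotically efficient.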
What is the asymptotic distribution of the MLE as n → ∞? Importance?(3)
Under regularity conditions,
θ̂ ∼ N(θ, 1/I(θ)) approximately,
i.e. for n large enough (usually n > 30) the distribution of the maximum likelihood estimator tends to a normal with mean θ and variance 1/I(θ), so it is asymptotically unbiased and asymptotically efficient (it behaves optimally).
This allows normal approximations to be used for inference.
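A minimal simulation sketch of this result (numpy assumed available; the true μ, n and number of replications are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, n, reps = 2.0, 100, 5000  # illustrative settings

# The Poisson MLE for each simulated sample is its sample mean
mle = rng.poisson(mu_true, size=(reps, n)).mean(axis=1)

# Asymptotic theory: mu_hat ~ N(mu, 1/I(mu)) with I(mu) = n/mu here
print(mle.mean(), mle.var())  # ≈ 2.0 and ≈ mu/n = 0.02
```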
What is the standard error of the estimator?(1)
sqrt(1/I(θ)) = 1/sqrt(I(θ))
How do you get from the cdf to the pdf of a distribution?(1)
Differentiate the cdf with respect to x.
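For example, for the exponential distribution:

```latex
% Differentiating the exponential cdf recovers its pdf
F(x) = 1 - e^{-\lambda x}, \quad x \ge 0
\qquad\Longrightarrow\qquad
f(x) = \frac{d}{dx} F(x) = \lambda e^{-\lambda x} .
```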
How do you generally find a confidence interval (CI)?(1)
Estimate ± (critical value) × standard error,
where the standard error is sqrt(σ²/n) = σ/√n.
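A minimal sketch of a 95% normal-approximation CI (the data and the known σ are purely illustrative; scipy assumed available for the critical value):

```python
import numpy as np
from scipy.stats import norm

x = np.array([4.1, 5.3, 4.8, 5.0, 4.6, 5.4])  # illustrative data
sigma = 0.5                                   # assumed known population sd

se = sigma / np.sqrt(len(x))   # standard error = sqrt(sigma^2 / n)
z = norm.ppf(0.975)            # 95% two-sided critical value ≈ 1.96

ci = (x.mean() - z * se, x.mean() + z * se)   # estimate ± z * se
print(ci)
```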
How would you calculate a sample size for a known population variance, given you want a specific confidence interval width?(1)
Subtract the lower CI limit from the upper limit to get the interval width, set this equal to the desired width, and solve for n.
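A sketch of the rearrangement, with illustrative numbers (σ = 2, desired width w = 1, 95% confidence):

```latex
% Solving the CI width for the sample size n (numbers are illustrative)
w = 2\, z_{\alpha/2}\, \frac{\sigma}{\sqrt{n}}
\;\Longrightarrow\;
n = \left( \frac{2\, z_{\alpha/2}\, \sigma}{w} \right)^{2}
  = \left( \frac{2 \times 1.96 \times 2}{1} \right)^{2} \approx 61.5 ,
```

so round up to n = 62.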