8. Estimation of Parameters and Fitting of Probability Distributions Flashcards
Estimator
Definition
-suppose we have X_=(X1,X2,…,Xn) drawn from a distribution with some parameter θ
-an estimator θ^n of θ is a function of the observed data which (we hope) forms a useful approximation to the parameter:
θ^n = g(X1,X2,…,Xn)
-note that θ^n can depend only on the observed data and not on any unknown parameters
Estimator vs. Estimate
- given a random sample X_=(X1,X2,…,Xn), an estimator for parameter θ is given by θ^=g(X_)
- once we have a real observed sample x_=(x1,x2,…,xn), θ^=g(x_) is an estimate for the parameter of interest
- the estimator is the expression for the parameter as a function of the random sample; once we substitute in real data and obtain an actual number as a prediction for θ, it becomes an estimate
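- a minimal Python sketch of the distinction (the data and the choice g(X_) = Xbar are just illustrative assumptions):

    import numpy as np

    # estimator: a rule, i.e. a function of the (random) sample
    def theta_hat(sample):
        return np.mean(sample)   # here g(X_) = Xbar, estimating a population mean

    # estimate: the number obtained once real observed data are substituted in
    x = np.array([2.1, 3.4, 2.9, 3.0])   # hypothetical observed sample x_
    print(theta_hat(x))                  # 2.85, a single numerical estimate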
What are the two methods for point estimation of a parameter?
1) method of moments
2) method of maximum likelihood
Method of Moments
-the number of parameters determines how many moments you will use
1) write down expressions for the moment(s) in terms of the parameter(s)
2) rearrange for the parameter(s) in terms of the moment(s)
3) sub in the sample moments:
M1 = 1/n ΣXi = Xbar
M2 = 1/n ΣXi²
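-a short Python/numpy sketch of computing the sample moments (the data are made up for illustration):

    import numpy as np

    x = np.array([1.2, 0.7, 2.5, 1.9, 0.4])   # hypothetical observed sample

    M1 = np.mean(x)       # M1 = 1/n ΣXi = Xbar
    M2 = np.mean(x**2)    # M2 = 1/n ΣXi²
    print(M1, M2)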
Method of Moments
One Parameter
1) write out the expression for the first moment, M1, in terms of the unknown parameter
2) rearrange for the unknown parameter in terms of the first moment
3) sub in the sample moment M1 = 1/n ΣXi = Xbar to obtain an expression for the estimator
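-for example (a sketch assuming an Exponential(λ) sample, so M1 = E(X) = 1/λ and hence λ^ = 1/Xbar):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=2.0, size=1000)   # simulated sample with true λ = 0.5

    # step 1: M1 = E(X) = 1/λ ; step 2: λ = 1/M1 ; step 3: sub in the sample moment
    lam_hat = 1.0 / np.mean(x)
    print(lam_hat)   # close to the true value 0.5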
Method of Moments
Two Parameters
1) write down expressions for the first and second moments M1 and M2 in terms of the two unknown parameters
2) rearrange and sub in so you have both unknown parameters expressed only in terms of the first and second moments
3) sub in the sample moments M1 = 1/n ΣXi = Xbar and M2 = 1/n ΣXi²
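-for example (a sketch assuming a N(μ,σ²) sample, so M1 = μ and M2 = σ² + μ²):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(loc=5.0, scale=2.0, size=1000)   # simulated sample, true μ = 5, σ² = 4

    M1 = np.mean(x)
    M2 = np.mean(x**2)

    # rearranging: μ = M1 and σ² = M2 - M1²
    mu_hat = M1
    sigma2_hat = M2 - M1**2
    print(mu_hat, sigma2_hat)   # close to 5 and 4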
What is the limitation of the method of moments?
- the method of moments does not take into account the shape of the underlying distribution
- the method of maximum likelihood does take into account the probability function of the population
Maximum Likelihood
Outline
-the basic idea starts with the joint distribution of X_=(X1,X2,…,Xn) depending on the parameter θ:
f(x_;θ) = f(x1,x2,…,xn;θ)
-for fixed θ, probability statements can be made about X_
-if we have observations x_ but θ is unknown, we regard information about θ as being contained in the likelihood:
l(θ;x_) = f(x_;θ)
Likelihood
Definition
-we define the likelihood of parameter θ given an observed sample x_=(x1,x2,…,xn) as:
l(θ;x_) ∝ Π fX(xi;θ)
-that is, the likelihood of each possible value of θ is the probability (or probability density) of the observed sample x_ given that value of θ
Maximum Likelihood Estimator
-the maximum likelihood estimator (MLE) of θ is the value of θ that maximises the likelihood function, i.e. the value of θ that makes the observed data most probable
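-a small Python sketch of the idea (assuming a Bernoulli(p) sample; the grid search just makes the maximisation visible):

    import numpy as np

    x = np.array([1, 0, 1, 1, 0, 1, 1, 0, 1, 1])   # hypothetical Bernoulli sample

    # likelihood l(p; x_) = Π p^xi (1-p)^(1-xi), evaluated over a grid of p values
    p_grid = np.linspace(0.01, 0.99, 99)
    lik = np.array([np.prod(p**x * (1 - p)**(1 - x)) for p in p_grid])

    p_mle = p_grid[np.argmax(lik)]
    print(p_mle, np.mean(x))   # both ≈ 0.7: the MLE is the sample proportion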
Maximum Likelihood
Invariance Property
-if θ^ is the maximum likelihood estimator for θ, then:
[g(θ)]^ = g(θ^)
-i.e. the maximum likelihood estimator for g(θ) is g(θ^)
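-for example (a sketch assuming an Exponential(λ) sample, where the MLE is λ^ = 1/Xbar):

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.exponential(scale=2.0, size=1000)   # simulated sample, true λ = 0.5

    lam_hat = 1.0 / np.mean(x)         # MLE of λ
    surv_hat = np.exp(-lam_hat * 1.0)  # by invariance, the MLE of g(λ) = P(X>1) = e^(-λ) is g(λ^)
    print(lam_hat, surv_hat)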
Log Likelihood
-instead of maximising the likelihood, it is sometimes easier to maximise log likelihood
-this gives the same value of θ^ as ln() is a monotonically increasing transformation
-log likelihood is defined as:
L(θ;x_) = ln( l(θ;x_) )
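-a numerical sketch (assuming an Exponential(λ) sample; scipy is only used to maximise the log likelihood):

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(3)
    x = rng.exponential(scale=2.0, size=500)   # simulated sample, true λ = 0.5

    # L(λ; x_) = n ln(λ) - λ ΣXi ; maximise it by minimising its negative
    def neg_log_lik(lam):
        return -(len(x) * np.log(lam) - lam * np.sum(x))

    res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
    print(res.x, 1.0 / np.mean(x))   # numerical maximiser matches the closed form λ^ = 1/Xbar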
Unbiased Estimator
Definition
-an estimator is unbiased if:
E(θ^) = θ
-i.e. if we took many random samples x_ and calculated corresponding estimates θ^, on average we would get the true value θ
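-a quick simulation sketch of E(θ^) = θ (assuming a N(μ=5, σ=2) population and θ^ = Xbar):

    import numpy as np

    rng = np.random.default_rng(4)
    mu, n, reps = 5.0, 20, 10000

    # draw many samples, compute Xbar for each, and average the resulting estimates
    estimates = rng.normal(loc=mu, scale=2.0, size=(reps, n)).mean(axis=1)
    print(estimates.mean())   # ≈ 5: on average the estimator recovers the true μ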
Bias
Definition
Bias(θ^) = E(θ^) - θ
- if Bias(θ^)=0, the estimator is unbiased
- if Bias(θ^)>0, the estimator is positively biased, tends to overestimate θ
- if Bias(θ^)<0, the estimator is negatively biased, tends to underestimate θ
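-e.g. the estimator 1/n Σ(Xi - Xbar)² of σ² has Bias = -σ²/n < 0; a simulation sketch (assuming a N(0, σ²=4) population):

    import numpy as np

    rng = np.random.default_rng(5)
    sigma2, n, reps = 4.0, 10, 100000

    samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
    var_hats = samples.var(axis=1, ddof=0)   # the 1/n version of the sample variance

    print(var_hats.mean() - sigma2)   # ≈ -σ²/n = -0.4: negatively biased, underestimates σ²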
Mean Squared Error
Definition
-for a given estimator θ^=g(X_), we define the mean squared error:
MSE(θ^) = E[(θ^-θ)²]
-this is a measure of accuracy
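-a useful identity is MSE(θ^) = Var(θ^) + Bias(θ^)²; a simulation sketch checking it (same assumed setup as the bias example):

    import numpy as np

    rng = np.random.default_rng(6)
    sigma2, n, reps = 4.0, 10, 100000

    samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(reps, n))
    var_hats = samples.var(axis=1, ddof=0)   # the 1/n sample variance as estimator of σ²

    mse = np.mean((var_hats - sigma2)**2)
    bias = var_hats.mean() - sigma2
    print(mse, var_hats.var() + bias**2)     # the two agree: MSE = Var + Bias²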