Statistical Inference: Section 3 Flashcards

1
Q

What is the likelihood?(1)

A

We define the likelihood L(μ|x), for a particular value of μ given a vector of observed data x, to be equal to the probability of getting that set of data given the candidate value of μ. We write L(μ|x) ≡ f(x|μ), where f(·) refers to the probability mass function of the data.

2
Q

What is important to note about the difference between the likelihood and the pmf?(1)

A

The switch in the direction of the conditioning is crucial. The probability mass function f(x|μ) is a function of x with μ fixed. The sum over all possible values of x is one. But the likelihood is a function of μ with x fixed. It is not a probability or probability density and it won’t sum/integrate to one over all possible μ.

3
Q

How do you calculate a Poisson pmf, Pr(X = x | μ)?(1)

A

Pr(X = x | μ) = (μ^x · e^(−μ)) / x!
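
For a quick numerical check (my own sketch, with made-up values of μ and x rather than anything from the card), the formula can be evaluated directly and compared against a library pmf:

```python
import math
from scipy.stats import poisson  # library pmf, used only as a cross-check

mu, x = 3.5, 2  # hypothetical values for illustration
pmf_by_hand = (mu**x * math.exp(-mu)) / math.factorial(x)
print(pmf_by_hand)           # ~0.185
print(poisson.pmf(x, mu))    # should agree with the formula above
```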

4
Q

How do you calculate the likelihood?(1)

A

The product of the individual pdfs/pmfs, provided the observations are independent.

The product becomes numerically very small, which is where the log-likelihood comes in; the log is always the natural log (i.e. log to base e).
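
A minimal sketch of why the log is needed, assuming independent Poisson observations and hypothetical data (none of this is from the card):

```python
import numpy as np
from scipy.stats import poisson

mu = 4.0
x = np.random.default_rng(1).poisson(mu, size=500)  # hypothetical sample

likelihood = np.prod(poisson.pmf(x, mu))        # product of pmfs underflows to ~0
log_likelihood = np.sum(poisson.logpmf(x, mu))  # sum of natural-log pmfs stays usable
print(likelihood, log_likelihood)
```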

5
Q

Why is log likelihood usually negative?(1)

A

Because the likelihood typically takes values between 0 and 1, and the log of a value in (0, 1) is negative. Pdfs can exceed 1, but a likelihood above 1 is unlikely.

6
Q

How do you find the maximum likelihood estimate?(3)

A

Find the log-likelihood via the pdfs.
Differentiate with respect to the parameter.
Set the derivative equal to zero and solve.
To check it is a maximum, find the second derivative; if it is less than zero at the solution, it is a maximum (a worked sketch follows below).
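
As a concrete illustration (a standard Poisson example, not taken from the card): with ℓ(μ) = Σx·log(μ) − nμ − Σlog(x!), setting dℓ/dμ = Σx/μ − n = 0 gives μ̂ = x̄, and d²ℓ/dμ² = −Σx/μ² < 0 confirms a maximum. A numerical check with hypothetical data:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import poisson

x = np.array([2, 5, 3, 4, 6, 1, 3])              # hypothetical data
neg_loglik = lambda mu: -np.sum(poisson.logpmf(x, mu))

# The analytic MLE is the sample mean; the numerical optimum should match it.
result = minimize_scalar(neg_loglik, bounds=(0.01, 20), method="bounded")
print(result.x, x.mean())
```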

7
Q

Define the Score Statistic.(1)

A

U(X, θ): the first derivative of the log-likelihood function with respect to θ.

8
Q

Define the Expected/Fisher information. Why is it important?(2)

A

The (observed) information is minus the second derivative of the log-likelihood, and its expectation is the expected (Fisher) information:
I(θ) = −E(second derivative of the log-likelihood with respect to θ).

A higher value means the log-likelihood is more sharply concentrated about its maximum, so there is more information and more certainty about the estimator.
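
A small symbolic sketch of the Poisson case (a textbook example, not from the card), which also shows the score statistic from the previous card alongside the expected information:

```python
import sympy as sp

mu, n, sum_x = sp.symbols("mu n sum_x", positive=True)
# Poisson log-likelihood for n iid observations, dropping the -sum(log(x!)) term
# since it does not depend on mu
loglik = sum_x * sp.log(mu) - n * mu

score = sp.diff(loglik, mu)                 # U = sum_x/mu - n
info = -sp.diff(loglik, mu, 2)              # observed information: sum_x/mu**2
expected_info = info.subs(sum_x, n * mu)    # use E[sum_x] = n*mu
print(score, sp.simplify(expected_info))    # -> sum_x/mu - n, n/mu
```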

9
Q

What is the Cramér-Rao inequality? Why is it important?(2)

A
Var(θ̂) ≥ 1/I(θ),
subject to regularity conditions.

The right-hand side gives the minimum variance bound (the Cramér-Rao bound, CRB).

10
Q

Define efficiency in terms of information. Asymptotically efficient?(2)

A

An estimator is efficient if Var(θ̂) = 1/I(θ), i.e. it attains the Cramér-Rao bound.

It is asymptotically efficient if Var(θ̂)/CRB → 1 as n → ∞,

i.e. Var(θ̂)·I(θ) → 1 as n → ∞.
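
A minimal simulation sketch, assuming the Poisson example with invented values of μ and n: the MLE is the sample mean, I(μ) = n/μ, and its variance should sit at the Cramér-Rao bound, illustrating full efficiency.

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true, n, reps = 4.0, 50, 20000

# Simulate the MLE (the sample mean) many times and compare its variance to 1/I(mu)
mles = rng.poisson(mu_true, size=(reps, n)).mean(axis=1)
crb = mu_true / n                          # 1/I(mu), with I(mu) = n/mu
print(mles.var(), crb)                     # the two should be very close
```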

11
Q

What is the asymptotic distribution of the MLE as n → ∞? Why is it important?(3)

A

Under regularity conditions,
θ̂ ∼ N(θ, 1/I(θ)) approximately,
i.e. for n large enough (usually n > 30) the distribution of the maximum likelihood estimator tends to a normal with mean θ and variance 1/I(θ), so it is asymptotically efficient and asymptotically unbiased (it behaves optimally).

It allows for normal approximations, e.g. approximate confidence intervals.
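
A rough check of the normal approximation (continuing the hypothetical Poisson simulation, not from the card): about 95% of simulated MLEs should land within 1.96 standard errors of the true θ.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
mu_true, n = 4.0, 50
mles = rng.poisson(mu_true, size=(20000, n)).mean(axis=1)

se = np.sqrt(mu_true / n)                  # sqrt(1/I(mu)) with I(mu) = n/mu
z = norm.ppf(0.975)                        # ~1.96
coverage = np.mean(np.abs(mles - mu_true) <= z * se)
print(coverage)                            # should be close to 0.95
```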

12
Q

What is the standard error of the estimator?(1)

A

sqrt(1/I(θ)) = 1/sqrt(I(θ))

13
Q

How do you get from the cdf to the pdf of a distribution?(1)

A

Differentiate cdf.
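
A tiny symbolic sketch (I picked the exponential distribution as the example): differentiating the cdf 1 − e^(−λx) recovers the pdf λ·e^(−λx).

```python
import sympy as sp

x, lam = sp.symbols("x lambda", positive=True)
cdf = 1 - sp.exp(-lam * x)       # exponential cdf
pdf = sp.diff(cdf, x)            # differentiate the cdf to get the pdf
print(pdf)                       # lambda*exp(-lambda*x)
```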

14
Q

How do you generally find a confidence interval (CI)?(1)

A

Estimate ± (critical value) × standard error,

where, for a sample mean with known population variance, the standard error is sqrt(σ²/n) = σ/√n.
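
A minimal sketch with invented numbers, assuming a normal-based interval for a sample mean with known population standard deviation:

```python
import numpy as np
from scipy.stats import norm

x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])   # hypothetical data
sigma = 0.4                                     # assumed known population sd
se = sigma / np.sqrt(len(x))                    # standard error, sigma/sqrt(n)

z = norm.ppf(0.975)                             # critical value for a 95% CI
ci = (x.mean() - z * se, x.mean() + z * se)
print(ci)
```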

15
Q

How would you calculate a sample size for a known population variance, given you want a specific confidence interval width?(1)

A

Take the difference between the upper and lower confidence limits, set it equal to the desired interval width, and solve for n (a worked sketch follows below).
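
A worked sketch with invented numbers: for a normal-based interval the full width is 2·z·σ/√n, so setting this equal to the desired width w and solving gives n = (2·z·σ/w)², rounded up.

```python
import math
from scipy.stats import norm

sigma, width = 2.0, 0.5          # hypothetical known sd and desired CI width
z = norm.ppf(0.975)              # 95% critical value, ~1.96

# width = 2*z*sigma/sqrt(n)  =>  n = (2*z*sigma/width)**2
n = math.ceil((2 * z * sigma / width) ** 2)
print(n)                         # 246 for these numbers
```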
