Principles of data reduction Flashcards
Define a sufficient statistic
A sufficient statistic is a summary of the observed data that contains all the
information needed to infer the parameter of interest. In particular, it contains enough information to calculate the likelihood function, up to a factor that does not involve the parameter.
What is the Fisher-Neyman factorisation for sufficient statistics
A statistic T(X) is sufficient for θ if and only if there exist two functions g and
h such that:
L_x(θ) = h(x) · g(T(x), θ)
Note that x and θ can be vectors
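A small numerical check of the factorisation, as a sketch only. It assumes an i.i.d. Bernoulli(θ) sample, which is not part of the card; the functions `likelihood`, `T`, `h` and `g` are illustrative names, and `g` also takes the known sample size `n`.

```python
import math

def likelihood(x, theta):
    """Joint Bernoulli(theta) likelihood of a 0/1 sample x."""
    return math.prod(theta if xi == 1 else 1.0 - theta for xi in x)

def T(x):
    """Candidate sufficient statistic: the number of successes."""
    return sum(x)

def h(x):
    """Factor that depends on the data only (constant for this model)."""
    return 1.0

def g(t, theta, n):
    """Factor that depends on the data only through t = T(x)."""
    return theta ** t * (1.0 - theta) ** (n - t)

x = [1, 0, 1, 1, 0]
for theta in (0.2, 0.5, 0.9):
    lhs = likelihood(x, theta)
    rhs = h(x) * g(T(x), theta, len(x))
    # The two sides agree for every theta tried, as the factorisation requires.
    assert math.isclose(lhs, rhs)
```

Agreement on a few values of θ illustrates the factorisation; the proof is the algebraic identity itself.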
How to find a sufficient statistic
- write down the likelihood
- examine how it depends on the data
- determine from this what the sufficient statistic could be
- try to rewrite the likelihood in Fisher-Neyman factorised form using this sufficient statistic
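A worked instance of these steps, assuming (purely for illustration) an i.i.d. Poisson sample x_1, ..., x_n with rate θ. The data enter the θ-dependent part only through their sum, which suggests T(x) = Σ x_i as the candidate:

```latex
\begin{align*}
L_x(\theta) &= \prod_{i=1}^{n} \frac{e^{-\theta}\,\theta^{x_i}}{x_i!} \\
            &= \underbrace{\frac{1}{\prod_{i=1}^{n} x_i!}}_{h(x)}
               \cdot
               \underbrace{e^{-n\theta}\,\theta^{\sum_{i=1}^{n} x_i}}_{g(T(x),\,\theta)},
\qquad T(x) = \sum_{i=1}^{n} x_i .
\end{align*}
```

Since the likelihood can be written in this form, the sum of the observations is sufficient for θ under the assumed model.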
Define a minimal sufficient statistic
A sufficient statistic is called a minimal sufficient statistic if it can be written as
a function of any other sufficient statistic. The information it contains is enough to calculate the likelihood, but it is as summarised as possible: the least information needed
Are minimal sufficient or sufficient statistics unique
No. In fact,
they keep their property under any one-to-one transformation, for example an invertible linear transformation
What two statements have to be equivalent for T to be a minimal sufficient statistic for θ
P (X = x|θ) /P (X = y|θ) does not depend on θ;
T (x) = T (y)
How will you know in a question that "the likelihood ratio does not depend on θ" is equivalent to T(x) = T(y)
Write out the ratio: the terms involving θ cancel if and only if T(x) = T(y)
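A numerical illustration of the ratio criterion, as a sketch only. It assumes an i.i.d. Poisson(θ) model with T(x) = Σ x_i; the function names and the grid of θ values are hypothetical choices, and a check on a finite grid illustrates the criterion rather than proving it.

```python
import math

def poisson_likelihood(x, theta):
    """Joint Poisson(theta) likelihood of a sample of counts x."""
    return math.prod(
        math.exp(-theta) * theta ** xi / math.factorial(xi) for xi in x
    )

def ratio_is_constant(x, y, thetas=(0.5, 1.0, 2.0, 4.0)):
    """Check whether L_x(theta) / L_y(theta) takes the same value on the grid."""
    ratios = [
        poisson_likelihood(x, t) / poisson_likelihood(y, t) for t in thetas
    ]
    return all(math.isclose(r, ratios[0]) for r in ratios)

# Same sum (5 and 5): theta cancels from the ratio.
same_sum = ratio_is_constant([1, 4, 0], [2, 2, 1])
# Different sums (5 and 6): a power of theta is left over.
diff_sum = ratio_is_constant([1, 4, 0], [2, 2, 2])
```

Here `same_sum` is `True` and `diff_sum` is `False`, matching the statement that the ratio is free of θ exactly when the two samples share the same value of T.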
What theorem gives a strong rationale for basing estimators on sufficient
statistics, if they exist
Rao-Blackwell theorem
How does the Rao-Blackwell theorem suggest that sufficient statistics should be the basis for estimators
If an estimator is not constructed using a sufficient statistic, then
it can be theoretically improved by taking its expectation conditional on
the sufficient statistic; the resulting estimator has an MSE that is no larger, and may be smaller
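A simulation sketch of this improvement, under assumptions not in the card: an i.i.d. Bernoulli(θ) sample, the crude unbiased estimator X_1, and the sufficient statistic T = Σ X_i, for which E[X_1 | T] = T/n. The function name, seed and parameter values are illustrative.

```python
import random

def simulate_mse(theta=0.3, n=10, reps=20000, seed=0):
    """Monte Carlo MSE of the crude estimator X_1 and of E[X_1 | T] = T / n."""
    rng = random.Random(seed)
    se_crude = 0.0
    se_rb = 0.0
    for _ in range(reps):
        x = [1 if rng.random() < theta else 0 for _ in range(n)]
        crude = x[0]                 # ignores all but the first observation
        rao_blackwell = sum(x) / n   # conditional expectation given T
        se_crude += (crude - theta) ** 2
        se_rb += (rao_blackwell - theta) ** 2
    return se_crude / reps, se_rb / reps

mse_crude, mse_rb = simulate_mse()
```

In theory the two MSEs are θ(1 − θ) and θ(1 − θ)/n, so the simulated `mse_rb` should come out roughly n times smaller than `mse_crude`.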
What estimator should we use to incorporate the sufficient statistic
The maximum likelihood estimator: it is naturally a function of a
sufficient statistic.
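A sketch of why this holds in a simple case, assuming an i.i.d. Bernoulli(θ) sample: two different samples with the same number of successes have the same likelihood shape, so a grid search returns the same maximiser. The grid search and function names are illustrative choices, not a recommended way to compute an MLE.

```python
import math

def log_likelihood(x, theta):
    """Bernoulli(theta) log-likelihood of a 0/1 sample x."""
    return sum(
        math.log(theta) if xi == 1 else math.log(1.0 - theta) for xi in x
    )

def grid_mle(x, grid_size=999):
    """Maximise the log-likelihood over the grid 0.001, 0.002, ..., 0.999."""
    grid = [(k + 1) / (grid_size + 1) for k in range(grid_size)]
    return max(grid, key=lambda theta: log_likelihood(x, theta))

x = [1, 1, 0, 0, 0]   # two successes out of five
y = [0, 0, 1, 0, 1]   # a different sample with the same T = 2
mle_x = grid_mle(x)
mle_y = grid_mle(y)
```

Both searches land on the same grid point, T/n, because the likelihood depends on the data only through T.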