Parametric models Flashcards
Univariate families of distributions
Relationship between Poisson and binomial
The binomial and Poisson are particularly close cousins. A Bi(n, π) distribution (the number of heads in n independent flips of a coin with probability of heads π) approaches a Poi(nπ) distribution as n grows large and π grows small.
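A quick numerical sketch of this approximation (my own illustration, not from the source; the values n = 1000, p = 0.003 are arbitrary):

```python
# Sketch: comparing Bi(n, p) with Poi(n*p) for large n and small p.
import numpy as np
from scipy.stats import binom, poisson

n, p = 1000, 0.003          # illustrative values: large n, small p
lam = n * p                 # Poisson rate matching the binomial mean

ks = np.arange(0, 15)
binom_pmf = binom.pmf(ks, n, p)
poisson_pmf = poisson.pmf(ks, lam)

# The two pmfs agree closely; the gap shrinks as n grows with n*p held fixed.
print(f"max |Bi - Poi| over k=0..14: {np.abs(binom_pmf - poisson_pmf).max():.2e}")
```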
Multivariate normal distribution
Fisher’s information bound for multiparameter families
“Definition: A theoretical lower bound on the variance of unbiased estimators for multiple parameters, derived from the Fisher Information Matrix (FIM). It generalizes the Cramér-Rao bound to multiparameter cases.
Provides the minimum achievable variance for unbiased estimators in multiparameter settings.
The covariance matrix of any unbiased estimator θ̂ is at least the inverse of the Fisher Information Matrix, in the sense that Cov(θ̂) − I(θ)⁻¹ is positive semidefinite.
Only applies to unbiased estimators.
Achieving the bound depends on the estimator and the model; often, efficient estimators (e.g., Maximum Likelihood Estimators) are required.
Maximum likelihood, and in fact any form of unbiased or nearly unbiased estimation, pays a nuisance tax for the presence of “other” parameters. Modern applications often involve thousands of others; think of regression fits with too many predictors. In some circumstances, biased estimation methods can reverse the situation, using the others to actually improve estimation of a target parameter; see Chapter 6 on empirical Bayes techniques, and Chapter 16 on ℓ1 regularized regression models”
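A small numerical sketch of the nuisance tax (my own illustration, not from the source), using the Gamma(α, β) family in the rate parameterization: the Cramér–Rao bound for α̂ when β is unknown, [I⁻¹]₍αα₎, exceeds the bound 1/I₍αα₎ that would apply if β were known.

```python
# Sketch: the "nuisance tax" in a two-parameter family.
# Per-observation Fisher information matrix for Gamma(shape=alpha, rate=beta):
#   I = [[psi'(alpha), -1/beta],
#        [-1/beta,      alpha/beta**2]]
import numpy as np
from scipy.special import polygamma

alpha, beta, n = 2.0, 1.5, 100               # hypothetical parameters and sample size
trigamma = polygamma(1, alpha)               # psi'(alpha)

I = n * np.array([[trigamma, -1.0 / beta],
                  [-1.0 / beta, alpha / beta**2]])

bound_beta_known   = 1.0 / I[0, 0]           # CR bound for alpha-hat if beta were known
bound_beta_unknown = np.linalg.inv(I)[0, 0]  # CR bound for alpha-hat with beta unknown

print(bound_beta_known, bound_beta_unknown)  # the second is strictly larger: the nuisance tax
```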
The multinomial distribution
“Definition: A generalization of the binomial distribution that models the probabilities of outcomes across k categories from n independent trials, where each trial has a fixed probability for each category.
Assumptions:
Fixed n. Independent trials. The category probabilities p_1, …, p_k are constant across trials.
Limitations:
Assumes no overlap between categories (mutual exclusivity). Not suitable for dependent trials or variable probabilities.
Example: Rolling a die n = 10 times and counting the outcomes for each of the six faces (k = 6, each p_i = 1/6).
There is one more important thing to say about the multinomial family: it contains all distributions on a sample space X composed of L discrete categories. In this sense it is a model for nonparametric inference on X. The nonparametric bootstrap calculations of Chapter 10 use the multinomial in this way. Nonparametrics, and the multinomial, have played a larger role in the modern environment of large, difficult-to-model data sets.”
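A minimal sketch of the die-rolling example above (assumes NumPy; the seed is arbitrary):

```python
# Sketch: counts over k = 6 categories from n = 10 independent die rolls.
import numpy as np

rng = np.random.default_rng(0)
counts = rng.multinomial(10, [1/6] * 6)   # one draw from Mult(n=10, p=(1/6,...,1/6))
print(counts, counts.sum())               # the six counts always sum to n = 10

# The nonparametric bootstrap reuses this idea: resampling n observations with
# equal probabilities 1/n is a multinomial draw over the observed data points.
```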
Exponential families
“Why are we interested in exponential tilting rather than some other transformational form? The answer has to do with repeated sampling.
No matter how large n may be, the statistician can still compress all the inferential information into a p-dimensional statistic ŷ. Only exponential families enjoy this property.
All of the classic exponential families have closed-form expressions for the normalizing function and the carrying density, yielding pleasant formulas for the mean β and covariance V.
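A hedged sketch of the compression property (my own illustration, not from the source): for an i.i.d. Poisson sample the sufficient statistic is the sum, so two samples with the same sum give log-likelihood curves that differ only by a constant not involving the parameter.

```python
# Sketch: in the Poisson exponential family, sum(x) carries all inferential information about lam.
import numpy as np
from scipy.stats import poisson

x1 = np.array([2, 0, 3, 1, 4])   # two hypothetical samples...
x2 = np.array([1, 1, 1, 3, 4])   # ...with the same sum (10) but different values

lams = np.linspace(0.5, 5.0, 50)
ll1 = np.array([poisson.logpmf(x1, lam).sum() for lam in lams])
ll2 = np.array([poisson.logpmf(x2, lam).sum() for lam in lams])

# The two curves differ only by a constant (the log-factorial terms), so they peak
# at the same lam-hat = sum(x)/n and carry identical information about lam.
print(np.allclose(ll1 - ll2, (ll1 - ll2)[0]))   # True
```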
The exponential family of distributions is extremely useful for statistical analysis; often only exponential families have the properties below. Examples:
They are the only families with sufficient statistics that can summarize arbitrary amounts of independent, identically distributed data using a fixed number of values (Pitman–Koopman–Darmois theorem).
They admit conjugate priors, an important property in Bayesian statistics.
The posterior predictive distribution of a random variable with a conjugate prior can always be written in closed form, as in the sketch below.
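A hedged sketch of the conjugacy point using the Beta–Binomial pair (the numbers are made up): the posterior stays in the Beta family, so the update and the predictive probability are closed form.

```python
# Sketch: conjugate updating for a binomial likelihood with a Beta prior.
a0, b0 = 2.0, 2.0          # hypothetical Beta(a0, b0) prior on the success probability
n, successes = 20, 14      # hypothetical data: 14 successes in 20 trials

# Closed-form posterior: Beta(a0 + successes, b0 + n - successes)
a_post, b_post = a0 + successes, b0 + (n - successes)

# Closed-form posterior predictive probability that the next trial is a success:
pred_success = a_post / (a_post + b_post)
print(a_post, b_post, pred_success)
```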
- Generalized Linear Models (GLMs)
Exponential families form the basis of Generalized Linear Models (GLMs). In GLMs:
The response variable follows a distribution from the exponential family.
The link function transforms the mean of the response variable to a linear function of the predictors.
The exponential family includes a wide variety of distributions (normal, binomial, Poisson, etc.), making it flexible for diverse modeling tasks. Its form is adaptable to different types of data (continuous, discrete, binary, categorical).”
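A minimal GLM sketch (assuming statsmodels is installed; the data are simulated, not from the source): a Poisson response with the log link, the canonical choice for that family.

```python
# Sketch: a Poisson GLM with log link on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.uniform(0, 2, size=200)
y = rng.poisson(np.exp(0.5 + 0.8 * x))    # true linear predictor on the log scale

X = sm.add_constant(x)                    # design matrix with intercept
fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(fit.params)                         # estimates of intercept and slope (roughly 0.5, 0.8)
```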