Taylor and McGuire Flashcards
Mean and variance of exponential dispersion family (EDF) of distributions
mean = mu
variance = dispersion parameter * variance function
Variance function for the Tweedie sub-family of EDFs
variance function = V ( mu ) = mu ^ p
where p <= 0 or p >= 1 (no Tweedie distribution exists for 0 < p < 1)
in words: variance is proportional to a power of the mean
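A minimal Python sketch of the power variance function, for illustration only. The helper name `tweedie_variance_function` is hypothetical; the special cases in the comment (p = 0 normal, p = 1 Poisson, p = 2 gamma, p = 3 inverse Gaussian) are the standard named members of the family.

```python
def tweedie_variance_function(mu: float, p: float) -> float:
    """Return V(mu) = mu ** p (hypothetical helper, not from the source).

    Standard special cases: p = 0 normal, p = 1 Poisson,
    p = 2 gamma, p = 3 inverse Gaussian.
    """
    if mu <= 0 and p != 0:
        raise ValueError("mu must be positive unless p = 0")
    return mu ** p


# Same mean, different index p: the variance function grows with p when mu > 1.
for p in (0, 1, 2, 3):
    print(p, tweedie_variance_function(4.0, p))
```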
Relationship between p for the Tweedie distribution and tail heaviness
tail heaviness increases as p increases
Mean and variance of a Tweedie distribution
mean = mu = [ ( 1 - p ) * theta ] ^ ( 1 / ( 1 - p ) )
where theta = location parameter
variance = dispersion parameter * mu ^ p
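The two formulas on this card can be sketched in Python as follows. The function names are illustrative assumptions, and the mean formula is only usable for p != 1 (at p = 1, the Poisson case, the mean is exp(theta) instead).

```python
def tweedie_mean(theta: float, p: float) -> float:
    """mu = [(1 - p) * theta] ** (1 / (1 - p)); hypothetical helper name."""
    if p == 1:
        raise ValueError("formula is undefined at p = 1 (use mu = exp(theta))")
    base = (1 - p) * theta
    if base <= 0:
        raise ValueError("(1 - p) * theta must be positive")
    return base ** (1 / (1 - p))


def tweedie_variance(theta: float, p: float, dispersion: float) -> float:
    """variance = dispersion * mu ** p, with mu computed from theta."""
    return dispersion * tweedie_mean(theta, p) ** p


# Gamma case (p = 2): mu = -1 / theta, so theta = -0.5 gives mu = 2.
mu = tweedie_mean(-0.5, 2)
var = tweedie_variance(-0.5, 2, dispersion=0.1)
```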
General GLM format (in matrix notation)
h ( mu ) = X * beta, i.e. link function of the mean vector = covariate (design) matrix * parameter vector
where betas are the model parameters (regression coefficients), and the link function transforms the mean of each observation into a linear function of the parameters (betas)
Conditions for the structure of a GLM (3)
- each observation is a member of the EDF
- h(mu_i) = x_i^T * beta, where x_i is the covariate vector for observation i
- observations are stochastically independent
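As a rough illustration of the link-function condition, the sketch below computes the linear predictor for one observation and maps it back to the mean. The log link (inverse link = exp) is an assumption chosen for the example, and the function names are hypothetical.

```python
import math


def linear_predictor(x_i, beta):
    """x_i-transpose * beta: the sumproduct of covariates and parameters."""
    return sum(x * b for x, b in zip(x_i, beta))


def glm_mean(x_i, beta, inverse_link=math.exp):
    """Invert the link: mu_i = h^-1(x_i-transpose * beta).

    The default inverse link exp corresponds to a log link (an assumption);
    pass inverse_link=lambda eta: eta for the identity link.
    """
    return inverse_link(linear_predictor(x_i, beta))


x_i = [1.0, 2.0]    # first entry 1.0 acts as the intercept term
beta = [0.5, 0.25]
mu_i = glm_mean(x_i, beta)  # exp(0.5 + 0.5) = exp(1)
```

With the identity link the same function reduces to the standard linear regression mean.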
GLM version of a standard linear regression
mean, mu = sumproduct ( x, beta)
Underlying assumptions of a standard linear regression (3)
- errors are normally distributed
- errors have constant variance
- linear relationship
Difference b/w weighted linear regression and standard linear regression
weighted linear regression recognizes errors might have unequal variances
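A small sketch of the difference, assuming a single covariate plus intercept and weights that are inversely proportional to each error variance (a common convention, assumed here). The function name is hypothetical. Equal weights reproduce the standard regression.

```python
def weighted_least_squares(x, y, w):
    """Fit y = a + b * x by minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


x = [0.0, 1.0, 2.0, 3.0]
y = [1.2, 2.9, 5.1, 6.5]
standard = weighted_least_squares(x, y, [1.0, 1.0, 1.0, 1.0])
# Down-weight the last observation, e.g. because its error variance is larger.
weighted = weighted_least_squares(x, y, [1.0, 1.0, 1.0, 0.25])
```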
Model generalizations to get from a linear regression to a GLM (2)
- non-linear relationship
- non-normal errors
Common estimation method for GLM parameters
MLE
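To make the estimation step concrete, here is a minimal sketch of maximum likelihood by Fisher scoring for one specific case: Poisson errors, log link, one covariate plus intercept. These modelling choices and the function name are assumptions for illustration, not something prescribed by the source.

```python
import math


def fit_poisson_log_link(x, y, iterations=25):
    """Fisher scoring for mu_i = exp(b0 + b1 * x_i) with Poisson errors.

    Assumes the x values are not all identical (otherwise det is zero).
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the log-likelihood.
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix entries.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system information * delta = score.
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 9.0]
b0, b1 = fit_poisson_log_link(x, y)
```

At the maximum the score equations hold, so the fitted means reproduce the total of the observations.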
Requirements for selection of a GLM and purpose of each (4)
selection of:
- cumulant function (controls the shape of the distribution)
- index, p (controls relationship b/w mean and variance in an EDF)
- covariates (x’s = explanatory variables)
- link function (controls relationship b/w mean and covariates)
Measure of model goodness of fit
deviance
smaller = better
Deviance formula (unscaled)
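deviance = 2 * sum ( log-likelihood ( perfect model ) - log-likelihood ( actual model ) )
where the perfect (saturated) model fits each observation exactly, and the log-likelihoods are evaluated with dispersion parameter = 1 (otherwise the result is the scaled deviance)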
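A worked sketch of the deviance formula for the Poisson case, chosen as an assumption because its dispersion parameter is 1, so the scaled and unscaled deviance coincide. The function names are hypothetical.

```python
import math


def poisson_log_likelihood(y, mu):
    """Log-likelihood of one Poisson observation y with mean mu."""
    # y * ln(mu) is taken as 0 when y = 0 (covers the perfect model, mu = y = 0).
    term = y * math.log(mu) if y > 0 else 0.0
    return term - mu - math.lgamma(y + 1)


def poisson_deviance(y, fitted):
    """2 * sum(loglik(perfect model) - loglik(actual model))."""
    return 2 * sum(
        poisson_log_likelihood(yi, yi) - poisson_log_likelihood(yi, mi)
        for yi, mi in zip(y, fitted)
    )


observed = [1, 3]
fitted = [2.0, 2.0]
d = poisson_deviance(observed, fitted)  # zero only if fitted equals observed
```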
Scale parameter calculated from deviance and corresponding distribution
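scale parameter estimate = deviance / ( n - p )
where n = number of observations and p = number of parameters (not the Tweedie index p)
deviance / scale parameter is approximately Chi-square distributed w/ ( n - p ) df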
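The estimate is a one-line calculation; the sketch below only adds a guard on the degrees of freedom. The function name and the numbers in the example are made up for illustration.

```python
def estimate_scale(deviance_value: float, n_obs: int, n_params: int) -> float:
    """Scale parameter estimate = deviance / (n - p), p = number of parameters."""
    degrees_of_freedom = n_obs - n_params
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than parameters")
    return deviance_value / degrees_of_freedom


# Illustrative numbers only: deviance 12.0 from 10 observations and 4 parameters.
scale = estimate_scale(12.0, n_obs=10, n_params=4)
```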