Quiz 2 Flashcards
Six key regression assumptions
(1) Model linear in the parameters; (2) random sampling; (3) no perfect collinearity; (4) zero conditional mean; (5) homoscedasticity; (6) errors independent of covariates and normally distributed
How to evaluate goodness of fit in Poisson regression?
Use the deviance: -2 * log(likelihood of the fitted model / likelihood of the saturated model)
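A minimal sketch of the deviance above for a Poisson model, using only the standard library. The function name and the toy numbers are made up for illustration; in practice the fitted means would come from a fitted model.

```python
import math

def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum( y*log(y/mu) - (y - mu) ).

    This equals -2 * log(L_fitted / L_saturated), because the saturated
    model sets each fitted mean equal to the observed count.
    The term y*log(y/mu) is taken as 0 when y == 0.
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += log_term - (yi - mi)
    return 2.0 * total

# Hypothetical observed counts and fitted means
observed = [2, 0, 5, 3]
fitted = [2.5, 0.5, 4.0, 3.0]
print(poisson_deviance(observed, fitted))
```

A smaller deviance means the fitted model is closer to the saturated model; a perfect fit gives 0.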
How to do Poisson regression for rates?
Replace the mean count mu_i with lambda_i * t_i (the rate times the amount of time at risk); log(t_i) then enters the model as an offset
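Written out, assuming a log link and a covariate vector x_i with coefficients beta (notation not on the original card):

```latex
\mu_i = \lambda_i t_i
\quad\Longrightarrow\quad
\log \mu_i = \log t_i + \log \lambda_i = \log t_i + x_i^{\top}\beta
```

The term \log t_i has its coefficient fixed at 1, which is what makes it an offset rather than a covariate.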
Assumptions for Poisson regression for rates
For rates: we assume the rate is constant over time within an individual
For rates from summary data: in addition, we assume that each individual in a group follows a Poisson distribution with the same mean
How do we do Poisson regression with standardized rates?
Calculate an expected count for each group based on the standardization variables, and use this as an offset term
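A sketch of the expected-count step, assuming indirect standardization by a single stratifying variable (age group). The strata, person-time, and reference rates below are hypothetical.

```python
import math

def expected_count(person_time, reference_rates):
    """Expected events = sum over strata of (person-time * reference rate)."""
    return sum(person_time[s] * reference_rates[s] for s in person_time)

# Hypothetical group: person-years at risk by age stratum
person_time = {"<50": 1000.0, "50+": 500.0}
# Hypothetical reference-population rates (events per person-year)
reference_rates = {"<50": 0.002, "50+": 0.010}

expected = expected_count(person_time, reference_rates)  # about 2 + 5 = 7 events
offset = math.log(expected)   # log(expected count) is the offset term
observed = 14                 # hypothetical observed count
smr = observed / expected     # observed / expected ratio for this group
print(expected, offset, smr)
```

With log(expected) as the offset, the exponentiated coefficients are interpreted as ratios of observed to expected counts.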
How do we assess overdispersion in Poisson regression?
Use standardized residuals, which should be approximately independent with mean 0 and variance 1. Either inspect a residual plot or use a test statistic: the sum of squared residuals is approximately chi-square distributed with df = n-p-1. To estimate the magnitude of overdispersion, divide this test statistic by n-p-1; values well above 1 suggest overdispersion.
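A sketch of that calculation, assuming the standardized residuals are Pearson residuals, (y - mu) / sqrt(mu). The counts and fitted means are hypothetical, and p counts the covariates excluding the intercept.

```python
def pearson_dispersion(y, mu, p):
    """Return (chi2, df, phi) for a Poisson fit.

    chi2 = sum of squared Pearson residuals = sum((y - mu)^2 / mu)
    df   = n - p - 1
    phi  = chi2 / df, the estimated overdispersion factor
    """
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    df = len(y) - p - 1
    return chi2, df, chi2 / df

# Hypothetical observed counts and fitted means from a one-covariate model
y = [0, 4, 1, 9, 2, 8]
mu = [2.0, 2.0, 3.0, 5.0, 4.0, 6.0]
chi2, df, phi = pearson_dispersion(y, mu, p=1)
print(chi2, df, phi)
```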
What is a problem with the log binomial model?
Because the log link is non-canonical for the binomial, the model is less stable than the logistic model (fitted probabilities are not constrained to stay below 1) and may fail to converge, in which case you can fit a Poisson model with robust variance estimates.
Assumption for non-parametric survival analysis
Non-informative censoring
How to calculate the survivor function
Multiply 1 - h(t_j) over all periods up to and including the one being considered: S(t) = product over t_j <= t of (1 - h(t_j))
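A sketch of this product, assuming the hazard at each event time is estimated as events / number at risk (the Kaplan-Meier form). The data are hypothetical; events[i] is 1 for an observed event and 0 for a censored time.

```python
def survivor_function(times, events):
    """Return [(t_j, S(t_j))] at each distinct event time t_j."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    curve = []
    for tj in event_times:
        # Subjects still under observation just before t_j
        at_risk = sum(1 for t in times if t >= tj)
        # Events occurring exactly at t_j
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        hazard = d / at_risk       # h(t_j)
        surv *= 1.0 - hazard       # running product of 1 - h(t_j)
        curve.append((tj, surv))
    return curve

# Hypothetical follow-up times; the second subject at t=3 and the one at t=8 are censored
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = survivor_function(times, events)
print(curve)  # S drops to roughly 0.8, 0.6, 0.3 at t = 2, 3, 5
```

Censored subjects leave the risk set without triggering a drop in the curve.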
What happens to the mean survival time if the last observation is censored?
Results in an underestimate of the true mean
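A sketch of why, assuming the mean is computed as the area under the estimated survival curve up to the last observed time tau. The step curve below is hypothetical; its last observation (t = 8) is censored, so the curve never reaches 0.

```python
def restricted_mean(curve, tau):
    """Area under a step survival curve from 0 to tau.

    curve: [(t_j, S(t_j))] sorted by time, with S = 1 before the first t_j.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t > tau:
            break
        area += prev_s * (t - prev_t)   # rectangle up to the next drop
        prev_t, prev_s = t, s
    area += prev_s * (tau - prev_t)      # final rectangle up to tau
    return area

# Hypothetical curve: S is still 0.3 at the last observed time tau = 8
curve_example = [(2, 0.8), (3, 0.6), (5, 0.3)]
mean_to_tau = restricted_mean(curve_example, tau=8)
print(mean_to_tau)  # about 2 + 0.8 + 1.2 + 0.9 = 4.9
```

Any survival time beyond tau for the remaining 30% contributes nothing to this area, so the result falls short of the true mean.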
How to compare survival at a fixed time point?
Examine whether the confidence intervals of the survival estimates at that time point overlap
What is the Kaplan-Meier estimator used for?
Used to estimate the survivor function S(t) in the presence of censoring
What is the log-rank test used for?
Used to compare survival between groups across the entire distribution, rather than at a single time point
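A sketch of the two-group log-rank statistic in plain Python, assuming the usual observed-minus-expected form with a hypergeometric variance at each event time. The data are hypothetical, and the function does not guard against a zero variance.

```python
def logrank_statistic(times, events, groups):
    """Two-group log-rank chi-square statistic (1 df); groups are coded 0/1."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for tj in event_times:
        n = sum(1 for t in times if t >= tj)                        # at risk, both groups
        n1 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 1)
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n     # observed minus expected in group 1
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance

# Hypothetical data: group 1 has all the early events, no censoring
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0]
stat = logrank_statistic(times, events, groups)
print(stat)  # compare with a chi-square distribution on 1 df
```

Because every event time contributes, the test reflects differences over the whole follow-up period.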
When do we need a Cox proportional hazards model?
When we have more than a single binary covariate
What does the baseline hazard in a Cox model represent?
The underlying hazard when all covariates are equal to zero (left unspecified; not estimated when fitting the model)
What is the distinction for the likelihood estimation procedure used in a Cox model?
Partial maximum likelihood estimation – conditioned on the observed event times
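For reference, the standard form of the model and its partial likelihood, assuming no tied event times. The symbols x_i, beta, and R(t_j) (the set of subjects still at risk at event time t_j) are notation introduced here, not taken from the original card.

```latex
h(t \mid x_i) = h_0(t)\,\exp(x_i^{\top}\beta)

L_p(\beta) = \prod_{j:\ \text{event at } t_j}
  \frac{\exp(x_j^{\top}\beta)}{\sum_{k \in R(t_j)} \exp(x_k^{\top}\beta)}
```

The baseline hazard h_0(t) appears in both numerator and denominator of each factor and cancels, which is why beta can be estimated without estimating h_0(t).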