706 exam Flashcards
What is the precision method of determining a sample size?
Establishing a sample size that meets a requirement on the precision of the estimates (as measured by the width of confidence intervals).
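A minimal Python sketch of the precision approach for a mean. The standard deviation, the target margin and the function name `sample_size_for_mean` are illustrative assumptions, not values from the course; it uses the usual 95% margin of 1.96 × SE with SE = sd / sqrt(n).

```python
import math

def sample_size_for_mean(sd, margin, z=1.96):
    """Smallest n for which z * sd / sqrt(n) is no larger than the margin."""
    return math.ceil((z * sd / margin) ** 2)

# Hypothetical inputs: sd of 10 units, CI half-width of at most 2 units.
n_required = sample_size_for_mean(sd=10.0, margin=2.0)
print(n_required)
```

Halving the margin roughly quadruples the required sample size, because n depends on the square of sd / margin.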
What is the equation for SE?
How do you calculate the sample mean of a binary variable?
The sample mean of a binary variable Y estimates the probability that Y=1. For binary variables the sample mean is a proportion and the proportion estimates the probability.
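A small Python illustration with made-up 0/1 data: the mean of the values is the proportion of ones.

```python
# Hypothetical binary outcomes (1 = event, 0 = no event).
y = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]

# The sample mean of a 0/1 variable is the proportion of ones,
# which estimates P(Y = 1).
p_hat = sum(y) / len(y)
print(p_hat)
```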
If you code ethnicity as 1, 2 and 3, can you then include this coded variable as a predictor in a regression model?
No. Doing this forces a structure on the model that is unlikely to be true: it assumes the 1 v 3 effect is twice the 1 v 2 effect.
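The usual alternative is dummy (indicator) coding, sketched below in Python. The helper `dummy_code` and the column names are hypothetical; group 1 is arbitrarily taken as the reference category.

```python
def dummy_code(values, reference):
    """Return one 0/1 indicator per non-reference category for each value."""
    levels = sorted(set(values) - {reference})
    return [{f"group_{lv}": int(v == lv) for lv in levels} for v in values]

ethnicity = [1, 2, 3, 2, 1]  # hypothetical coded data
rows = dummy_code(ethnicity, reference=1)
for code, row in zip(ethnicity, rows):
    print(code, row)
```

Each indicator gets its own coefficient, so the 1 v 2 and 1 v 3 effects are estimated separately rather than being tied together.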
What is logistic regression and what is the statistical framework?
The outcome in logistic regression is binary, and the model uses counts of occurrence. The binomial model provides the statistical framework.
What are the problems associated with missing data?
- Loss of statistical power
- Distortion of analyses
- Bias
How do you calculate a confidence interval?
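q +/- 1.96 × SE(q), where q is the estimate and SE(q) is its standard error. This gives an approximate 95% confidence interval.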
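A worked Python sketch for a proportion. The numbers are made up, and the standard error formula sqrt(p(1 - p) / n) is the standard one for a proportion, assumed here because the card does not state which SE is meant.

```python
import math

p_hat = 0.4  # hypothetical sample proportion
n = 100      # hypothetical sample size

# Standard error of a proportion (assumed form, see lead-in).
se = math.sqrt(p_hat * (1 - p_hat) / n)

lower = p_hat - 1.96 * se
upper = p_hat + 1.96 * se
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```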
What is a t-statistic?
The estimated difference expressed in units of standard error. The greater the magnitude of t, the greater the evidence against the null hypothesis.
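As an illustration, a one-sample version in Python: the difference between the sample mean and a null value, divided by the standard error of the mean. The data and the null value are invented for the example.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
null_mean = 4                    # hypothetical null-hypothesis value

mean = statistics.mean(data)
# Standard error of the mean: sample standard deviation / sqrt(n).
se = statistics.stdev(data) / math.sqrt(len(data))

t_stat = (mean - null_mean) / se
print(round(t_stat, 3))
```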
How do you calculate relative risk?
Probability of an event occurring for group A divided by the probability for group B.
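A short Python sketch with hypothetical counts for the two groups:

```python
# Hypothetical counts: events and group sizes.
events_a, n_a = 30, 100
events_b, n_b = 15, 100

risk_a = events_a / n_a
risk_b = events_b / n_b

relative_risk = risk_a / risk_b
print(relative_risk)
```

A relative risk above 1 means the event is more likely in group A than in group B.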
What is a chi-square test?
The chi-square test is used to assess whether two categorical variables are unrelated to each other. The ‘chi-square statistic’ χ² is a measure of the discrepancy between expected and observed cell values. If χ² is large it indicates a big discrepancy between what we observed and what we would have expected under the hypothesis of independence.
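A minimal Python sketch of the statistic for a 2 × 2 table of made-up counts. Expected cell values under independence are taken as row total × column total / grand total, and the statistic sums (observed - expected)² / expected over the cells (no continuity correction).

```python
# Hypothetical 2 x 2 table: rows are groups, columns are outcome yes/no.
observed = [[30, 70],
            [15, 85]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the two variables were independent.
        expected = row_totals[i] * col_totals[j] / total
        chi2 += (obs - expected) ** 2 / expected

print(round(chi2, 3))
```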
How do you calculate the odds ratio from a coefficient in a regression analysis?
Exponentiate the coefficient: odds ratio = exp(coefficient).
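In Python, assuming a hypothetical logistic regression coefficient of 0.7 on the log-odds scale:

```python
import math

beta = 0.7  # hypothetical coefficient (log-odds scale)

odds_ratio = math.exp(beta)
print(round(odds_ratio, 2))
```

A coefficient of 0 gives an odds ratio of 1 (no association); positive coefficients give odds ratios above 1.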
How do you get the probability of two things happening simultaneously?
Multiply their probabilities (this holds when the two events are independent).
How does p relate to x in a logistic model?
p is always between 0 and 1
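A Python sketch of the standard logistic relationship, p = 1 / (1 + exp(-(a + b·x))). The intercept and slope values and the function name `logistic_probability` are illustrative assumptions.

```python
import math

def logistic_probability(x, intercept, slope):
    """Probability implied by a logistic model with one predictor."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))

# Hypothetical coefficients; x ranges over -10..10.
probs = [logistic_probability(x, intercept=-1.0, slope=0.5) for x in range(-10, 11)]
print(min(probs), max(probs))
```

However far x moves in either direction, the predicted p approaches but never reaches 0 or 1.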
What is the central limit theorem?
The sampling distribution of the sample mean tends to a normal distribution as n gets large, regardless of the underlying distribution.
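A small simulation sketch in Python. It draws repeated samples from a skewed exponential distribution (mean 1, standard deviation 1; an arbitrary choice for illustration) and looks at the spread of the sample means, which should be close to 1 / sqrt(n).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50       # size of each sample
reps = 2000  # number of repeated samples

sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(round(mean_of_means, 3), round(sd_of_means, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual observations are strongly skewed.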
What does a root MSE of 17 tell you for a model?
For a given combination of factors, roughly 95% of the actual values will lie within +/- 34 units (2 × 17) of the mean value.
What are the key assumptions of ordinary regression?
- n observations are independent of each other
- the effects add together
- the residuals are normally distributed with constant variance. You can check this with a Q-Q plot
What judgements do you make when doing a regression model?
Modelling requires judgements about how to include variables: whether continuous variables should be categorized, whether dummy variables should be used for ordinal scales, which variables should be included in the model, whether those that are not statistically significant should be dropped, and whether interaction terms should be included.
How do you increase the power of a study?
- Sample size
- The size of the effect
- Significance level
- The endpoint being studied
- The statistical test being used, i.e. if the assumptions of a parametric statistical test hold, it will generally be more powerful than a non-parametric one. Parametric tests are tests of the parameters of a normal distribution, so they rely on the assumption that the underlying distribution is normal.
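The first three factors can be illustrated with a rough Python sketch. It assumes a two-sided z-test comparing two equal-sized groups with a known common standard deviation, and uses the normal approximation power ≈ Φ(|delta| / SE - z), where SE = sd × sqrt(2 / n). The function name and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (normal approximation)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(2 / n_per_group)  # SE of the difference in means
    return NormalDist().cdf(abs(delta) / se - z_crit)

# Hypothetical scenario: true difference 5, sd 10.
for n in (16, 32, 64, 128):
    print(n, round(approx_power(delta=5, sd=10, n_per_group=n), 3))
```

Under these assumptions, power rises with a larger sample, a larger effect size, or a less strict significance level.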