Stats exam Flashcards
Confidence interval definition
A 95% confidence interval is a range of values that contains the true, unknown population mean with a probability of 0.95.
In repeated sampling, 95% of the confidence intervals calculated would include the true mean.
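The repeated-sampling statement can be checked with a small simulation. This is only a sketch: the population values (mean 10, standard deviation 2), the sample size and the seed are made-up illustration values, and the population standard deviation is assumed known so that a normal-based interval can be used.

```python
import random
import statistics

def coverage(n_repeats=2000, n=30, mu=10.0, sigma=2.0, seed=1):
    """Fraction of 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    se = sigma / n ** 0.5  # standard error, sigma assumed known
    hits = 0
    for _ in range(n_repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        # Does this sample's interval include the true mean?
        if mean - z * se <= mu <= mean + z * se:
            hits += 1
    return hits / n_repeats

print(coverage())  # should be close to 0.95
```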
Reference range definition
A reference range is the range we would expect to contain 95% of values that an individual measurement from the population could take.
P value definition
The P value is the probability of having observed our data (or more extreme data) when the null hypothesis is true.
Standard error vs standard deviation
The standard deviation is a measure of the variability of individual observations in the population, whereas the standard error is a measure of the uncertainty in the sample mean as an estimate of the population mean.
Give three properties of standard error
- it is smaller for large samples than for small samples
- it is less than the standard deviation (i.e. the variability of the individual observations in the population)
- it will increase as the standard deviation increases (i.e. as the variability among the individual values in the population increases)
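All three properties follow from the usual formula for the standard error of a mean, SE = SD / sqrt(n). A minimal sketch, using the sample standard deviation as the estimate of SD (the data values are made up for illustration):

```python
import statistics

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
print(statistics.stdev(data), standard_error(data))
```

Dividing by sqrt(n) is why the standard error shrinks as the sample grows and is below the standard deviation whenever n > 1.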
Correlation coefficient definition
A measure of the strength and direction (positive or negative) of the linear association between two continuous variables. It takes a value between -1 and +1, where -1 or +1 means the points lie exactly on a straight line.
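As a sketch, assuming the card refers to Pearson's correlation coefficient, it can be computed from the sums of squares and cross-products about the means:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

print(pearson_r([1, 2, 3], [2, 4, 6]))  # perfect positive linear association
print(pearson_r([1, 2, 3], [6, 4, 2]))  # perfect negative linear association
```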
Three criteria of confounding
- A confounder is associated with the exposure of interest
- A confounder is independently associated with the outcome (i.e. a risk factor)
- A confounder is NOT on the causal pathway
Association definition
Two variables are associated if the distribution of one variable varies according to the value of the other variable.
Categorical data that is independent should be analysed via… if the assumptions of this test are not met the data should be analysed by…
chi-squared test… Fisher’s exact test
Categorical data that is paired should be analysed via… if the assumptions of this test are not met, the data should be analysed by…
McNemar’s test… binomial based exact test
Format of test steps:
- set up a null hypothesis
- calculate a test statistic
- refer the value of the test statistic to the appropriate statistical table to obtain a P value
- calculate a confidence interval
For chi-squared how do you calculate degrees of freedom?
(r-1)(c-1)= degrees of freedom
chi-squared assumptions
- each subject is independent of all other subjects
- All expected cell counts are ≥1 (no expected cell count is below 1).
- No more than 20% of expected cell counts are <5.
If the expected count conditions above are not met, you would do Fisher’s exact test.
NOTE THAT IT IS EXPECTED FREQUENCY COUNTS
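A rough sketch of how the expected counts, the test statistic, the degrees of freedom and the expected-count checks fit together for an r x c table of observed counts. The function name and the example table are made up for illustration; the P value would still come from a chi-squared table.

```python
def chi_squared(table):
    """Return (statistic, df, expected counts, expected-count checks met)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    cells = [e for row in expected for e in row]
    # Checks apply to EXPECTED counts, not observed counts
    checks_met = min(cells) >= 1 and sum(e < 5 for e in cells) <= 0.2 * len(cells)
    return stat, df, expected, checks_met

stat, df, expected, checks_met = chi_squared([[10, 20], [20, 10]])
print(stat, df, checks_met)
```

Independence of subjects cannot be checked from the table itself; it depends on the study design.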
What is required for McNemar’s test to be valid?
b + c must be at least 5, i.e. there must be at least 5 discordant pairs.
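A sketch of both routes for paired data, where b and c are the two discordant-pair counts. The statistic shown is the version without a continuity correction, which is an assumption here (some courses use a corrected form), and the exact P value is a two-sided binomial calculation with probability 0.5.

```python
from math import comb

def mcnemar_statistic(b, c):
    """Uncorrected McNemar statistic, referred to chi-squared with 1 df."""
    return (b - c) ** 2 / (b + c)

def exact_binomial_p(b, c):
    """Two-sided exact P value: discordant pairs ~ Binomial(b + c, 0.5)."""
    n = b + c
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

b, c = 15, 5  # illustrative discordant-pair counts
if b + c >= 5:
    print(mcnemar_statistic(b, c))
else:
    print(exact_binomial_p(b, c))
```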
Drawbacks of chi-squared et al.
However, these tests have drawbacks:
-If there is evidence against the null hypothesis of no association, they do not indicate the direction of the difference
-We cannot obtain an effect size, such as a risk difference, risk ratio or odds ratio, when dealing with a binary outcome (e.g., 2x2 tables or more generally 2 x R tables)
-We obtain no measure of uncertainty when dealing with a binary outcome - confidence intervals cannot be obtained
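The effect sizes named above have to be computed separately from the 2x2 table. A minimal sketch with made-up counts; the argument names are illustrative, and confidence intervals are left out.

```python
def effect_sizes(exposed_events, exposed_no_events,
                 unexposed_events, unexposed_no_events):
    """Risk difference, risk ratio and odds ratio from a 2x2 table."""
    risk_exposed = exposed_events / (exposed_events + exposed_no_events)
    risk_unexposed = unexposed_events / (unexposed_events + unexposed_no_events)
    risk_difference = risk_exposed - risk_unexposed
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (exposed_events * unexposed_no_events) / (
        exposed_no_events * unexposed_events
    )
    return risk_difference, risk_ratio, odds_ratio

# Illustrative counts: 20/100 events in exposed, 10/100 in unexposed
print(effect_sizes(20, 80, 10, 90))
```

Unlike the test statistic, these values show the direction of the difference (e.g. a risk ratio above 1 means higher risk in the exposed group).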
Requirements for binomial distribution
- a fixed number of trials
- two outcomes
- p(success) is constant
- trials independent
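When these requirements hold, the probability of k successes in n trials is C(n, k) p^k (1 - p)^(n - k). A short sketch (the values of n and p are illustrative):

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative values
probabilities = [binomial_pmf(n, k, p) for k in range(n + 1)]
print(sum(probabilities))  # the probabilities over k = 0..n sum to 1
```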