statistics Flashcards
How do you calculate incidence percentage? How do you calculate incidence per N
Usually reported over a time period
– E.g. 1% per year or 1 in 100 per year
– 2 per 100,000 per year = 0.002% per year
* (new cases / population) * 100 = %age
* (new cases / population) * n = rate per n
– E.g. (new cases / population) * 100,000 = rate per 100,000
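A minimal sketch of the incidence formulas above. The function name `incidence_rate` and the figures are illustrative assumptions, not part of the original card.

```python
def incidence_rate(new_cases: int, population: int, per: int = 100) -> float:
    """New cases divided by the population, scaled to `per` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases / population * per


# Hypothetical figures: 2 new cases in a population of 100,000 over one year.
pct = incidence_rate(2, 100_000)                    # as a percentage
per_100k = incidence_rate(2, 100_000, per=100_000)  # per 100,000

print(f"{pct:.3f}% per year")                  # 0.002% per year
print(f"{per_100k:.1f} per 100,000 per year")  # 2.0 per 100,000 per year
```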
Prevalence
-What is this
-How do you calculate prevalence at one point in time
-How do you calculate prevalence over an interval of time
Proportion of population with a disease:
– at a point in time (point prevalence)
– over an interval of time (period prevalence)
* Calculation of point prevalence
– (all cases / population) * 100 = %age
* Calculation of period prevalence
– (all cases during a given time / population over that period) * 100 = %age
Relationship
prevalence = incidence * duration of condition
* in chronic diseases the prevalence is much greater than the incidence
* in acute diseases the prevalence and incidence are similar
* for conditions such as the common cold the incidence may be greater than the prevalence
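A rough sketch of the prevalence calculation and of prevalence = incidence * duration. The function names and all figures are hypothetical, and the relationship is treated as an approximation that assumes a stable population in which the condition is fairly uncommon.

```python
def prevalence_pct(all_cases: int, population: int) -> float:
    """All existing cases as a percentage of the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return all_cases / population * 100


def approx_prevalence(incidence_per_year: float, mean_duration_years: float) -> float:
    """Approximate prevalence (as a proportion) = incidence * duration."""
    return incidence_per_year * mean_duration_years


# Hypothetical point prevalence: 500 existing cases in 10,000 people.
point = prevalence_pct(500, 10_000)
print(f"point prevalence: {point:.1f}%")  # point prevalence: 5.0%

# Hypothetical chronic disease: 0.01 new cases per person-year, lasting 20 years.
chronic = approx_prevalence(0.01, 20)
# Hypothetical cold-like illness: 2 episodes per person-year, each lasting a week.
acute = approx_prevalence(2.0, 7 / 365)

print(f"chronic: prevalence {chronic:.2f} vs incidence 0.01 per year")
print(f"cold: prevalence {acute:.3f} vs incidence 2.00 per year")
```

With these made-up numbers the chronic condition has a prevalence far above its yearly incidence, while the short-lived one has an incidence far above its prevalence.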
P values
-What does a P value of 0.1 mean?
-What does a p value of 0.01 mean?
- P = 0.1 = 1 in 10 probability of seeing a result at least this extreme by chance alone (i.e. if the null hypothesis is true)
- P = 0.01 = 1 in 100 due to chance
– “highly statistically significant”
- P = 0.05 = 1 in 20 due to chance
– often arbitrarily set as the threshold for statistical significance
- P = 0.001 = 1 in 1000 due to chance
– “very highly statistically significant”
- P value does not tell us ANYTHING about CLINICAL significance
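To make "due to chance" concrete, here is a small sketch that computes a one-sided P value exactly from the binomial distribution. The scenario (9 of 10 patients preferring one of two treatments, with a null hypothesis of no preference) and the function name are invented for illustration.

```python
from math import comb


def one_sided_binomial_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 9 of 10 patients prefer treatment A; null = no preference (0.5).
p = one_sided_binomial_p(9, 10)
print(round(p, 4))  # 0.0107
```

If there were truly no preference, a split of 9 or more out of 10 would turn up about 1 time in 100, which is what a P value of roughly 0.01 expresses.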
What is null hypothesis? when is this rejected? when is this accepted?
Null Hypothesis – H0
* Null hypothesis – that there is no difference between groups
* If the p value is statistically significant, we REJECT the null hypothesis and accept the finding as genuine
* If the p value is NOT statistically significant, we ACCEPT the null hypothesis (strictly, we fail to reject it: a non-significant result does not prove that there is no difference)
What is type 1 error vs type 2 error
* Type 1 error is a false positive
– Positive result actually due to chance
– False rejection of null hypothesis
* Type 2 error is a false negative
– Negative result even though a real difference exists (the study failed to detect it)
– False acceptance of null hypothesis
What is a confidence interval?
* Gives us an idea of where the TRUE population value lies rather than the observed value
* Expressed as a range
* Usually interested in 95% CI
* This means that there is a 95% chance the TRUE value falls within this range (strictly: 95% of intervals calculated this way from repeated samples would contain the TRUE value)
* As sample size increases, 95% CI range decreases
What are properties of normal distribution?
Properties of the Normal distribution
* symmetrical i.e. mean = mode = median
* 68.3% of values lie within 1 SD of the mean
* 95.4% of values lie within 2 SD of the mean
* 99.7% of values lie within 3 SD of the mean
* this is often reversed, so that within 1.96 SD of the mean lie 95% of the sample values
* the range of the mean - (1.96 * SD) to the mean + (1.96 * SD) contains 95% of observations, i.e. if a repeat sample of 100 observations is taken from the same group, 95 of them would be expected to lie in that range
* this range is sometimes loosely called the 95% confidence interval, but the 95% confidence interval for the mean is calculated with the SEM (SD / square root (n)) rather than the SD
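The percentages above can be checked with the standard library's `statistics.NormalDist`. This is only a quick sketch; the helper name `within` is an assumption made for the example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution: mean 0, SD 1


def within(k: float) -> float:
    """Proportion of values lying within k SD of the mean."""
    return z.cdf(k) - z.cdf(-k)


for k in (1, 2, 3, 1.96):
    print(f"within {k} SD: {within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
# within 1.96 SD: 95.0%
```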
Standard deviation
the standard deviation (SD) is a measure of how much dispersion exists from the mean
SD = square root (variance)
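A short worked example of SD = square root (variance), using made-up observations. It uses the population form (dividing by n); a sample SD divides by n - 1 instead, which is what `statistics.stdev` does.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

mu = mean(values)
# Population variance: average squared distance from the mean.
variance = sum((x - mu) ** 2 for x in values) / len(values)
sd = sqrt(variance)

print(f"mean={mu:.1f} variance={variance:.1f} SD={sd:.1f}")
# mean=5.0 variance=4.0 SD=2.0
```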
How do you calculate sensitivity?
Sensitivity
TP / (TP + FN)
Proportion of patients with the condition who have a positive test result
How do you calculate specificity?
Specificity
TN / (TN + FP)
Proportion of patients without the condition who have a negative test result
what is a positive predictive value?
Positive predictive value
TP / (TP + FP)
The chance that the patient has the condition if the diagnostic test is positive
What is a negative predictive value?
Negative predictive value
TN / (TN + FN)
The chance that the patient does not have the condition if the diagnostic test is negative
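A small sketch that computes sensitivity, specificity, PPV and NPV from the four cells of a 2x2 table. The function name and the counts are hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Test performance from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical study: 100 people with the condition, 200 without.
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=170)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.900
# specificity: 0.850
# ppv: 0.750
# npv: 0.944
```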
How do you calculate likelihood ratio for positive test result?
Likelihood ratio for a positive test result
sensitivity / (1 - specificity)
How much the odds of the disease increase when a test is positive
How do you calculate Likelihood ratio for a negative test result?
Likelihood ratio for a negative test result
(1 - sensitivity) / specificity
How much the odds of the disease decrease when a test is negative
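A sketch of both likelihood ratios, plus the usual way they are applied: post-test odds = pre-test odds * likelihood ratio. The sensitivity, specificity and 20% pre-test probability are assumed values, and the function names are illustrative.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple:
    """Return (LR for a positive result, LR for a negative result)."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability: float, lr: float) -> float:
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * lr
    return post_odds / (1 + post_odds)


# Hypothetical test: sensitivity 0.90, specificity 0.85; pre-test probability 0.20.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.85)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")                  # LR+ = 6.00, LR- = 0.12
print(f"after positive: {post_test_probability(0.20, lr_pos):.2f}")  # after positive: 0.60
print(f"after negative: {post_test_probability(0.20, lr_neg):.2f}")  # after negative: 0.03
```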
Describe the 4 phases of a clinical trial
Phase 1
* small studies (e.g. 100) on healthy volunteers
* used to assess pharmacodynamics and pharmacokinetics
Phase 2
* small studies (e.g. 100-300) on actual patients
* examines efficacy, adverse effects
Phase 3
* larger studies (e.g. 500-5,000 patients)
* examines efficacy, adverse effects
* may compare drug with existing treatments
* studies of special groups e.g. renal, elderly
* if the drug is shown to be safe and effective, it may be approved for marketing
Phase 4
post-marketing surveillance
What is the hawthorne effect?
Hawthorne effect - describes a group changing its behaviour due to the knowledge that it is being studied
What is selection bias?
Error in assigning individuals to groups leading to differences which may influence the outcome. Subtypes include sampling bias where the subjects are not representative of the population. This may be due to volunteer bias. An example of volunteer bias would be a study looking at the prevalence of Chlamydia in the student population. Students who are at risk of Chlamydia may be more, or less, likely to participate in the study. A similar concept is non-responder bias. If a survey on dietary habits was sent out in the post to random households it is likely that the people who didn’t respond would have poorer diets than those who did.
Other examples include
* loss to follow up bias
* prevalence/incidence bias (Neyman bias): when a study is investigating a condition that is characterised by early fatalities or silent cases. It results from missed cases being omitted from calculations
* admission bias (Berkson's bias): cases and controls in a hospital case control study are systematically different from one another because the combination of exposure to risk and occurrence of disease increases the likelihood of being admitted to the hospital
* healthy worker effect
What is recall bias?
Difference in the accuracy of the recollections retrieved by study participants, possibly due to whether they have the disorder or not. E.g. a patient with lung cancer may search their memories more thoroughly for a history of asbestos exposure than someone in the control group. A particular problem in case-control studies.
What is publication bias?
Failure to publish results from valid studies, often as they showed a negative or uninteresting result. Important in meta-analyses where studies showing negative results may be excluded.
What is work-up / verification bias?
In studies which compare new diagnostic tests with gold standard tests, work-up bias can be an issue. Sometimes clinicians may be reluctant to order the gold standard test unless the new test is positive, as the gold standard test may be invasive (e.g. tissue biopsy). This approach can seriously distort the results of a study, and alter values such as specificity and sensitivity. Sometimes work-up bias cannot be avoided; in these cases it must be adjusted for by the researchers.
What is expectation bias? pygmalion bias?
Only a problem in non-blinded trials. Observers may subconsciously measure or report data in a way that favours the expected study outcome.
What is late look bias?
Gathering information at an inappropriate time e.g. studying a fatal disease many years later when some of the patients may have died already
What is procedure bias?
Occurs when subjects in different groups receive different treatment
Lead-time bias
Occurs when two tests for a disease are compared: the new test diagnoses the disease earlier, but there is no effect on the outcome of the disease. Survival appears longer only because it is measured from an earlier diagnosis.
How do you calculate standard error of the mean? how do you calculate upper limit and lower limit of a 95% confidence interval?
The standard error of the mean (SEM) is a measure of the spread expected for the mean of the observations - i.e. how ‘accurate’ the calculated sample mean is as an estimate of the true population mean
Key point
SEM = SD / square root (n)
* where SD = standard deviation and n = sample size
* therefore the SEM gets smaller as the sample size (n) increases
A 95% confidence interval:
* lower limit = mean - (1.96 * SEM)
* upper limit = mean + (1.96 * SEM)
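A minimal sketch of the SEM and 95% confidence interval formulas. The function names and the sample figures (mean 100, SD 15) are assumptions chosen to keep the arithmetic easy to follow.

```python
from math import sqrt


def sem(sd: float, n: int) -> float:
    """Standard error of the mean: SD / square root (n)."""
    return sd / sqrt(n)


def ci95(sample_mean: float, sd: float, n: int) -> tuple:
    """Lower and upper limits of the 95% CI: mean -/+ 1.96 * SEM."""
    margin = 1.96 * sem(sd, n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical sample: mean 100, SD 15, measured with n = 25 and then n = 100.
for n in (25, 100):
    low, high = ci95(100, 15, n)
    print(f"n={n}: SEM={sem(15, n):.1f}, 95% CI {low:.2f} to {high:.2f}")
# n=25: SEM=3.0, 95% CI 94.12 to 105.88
# n=100: SEM=1.5, 95% CI 97.06 to 102.94
```

Quadrupling the sample size halves the SEM, so the interval narrows.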
What is the power of a study?
The power of a study is the probability of (correctly) rejecting the null hypothesis when it is false, i.e. the probability of detecting a statistically significant difference
* power = 1 - the probability of a type II error
* power can be increased by increasing the sample size
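To illustrate how power rises with sample size, here is a rough sketch using a normal approximation for comparing two group means. The approximation, the function name and every number are assumptions that do not come from the card: equal group sizes, a known common SD, a two-sided test at alpha = 0.05, and the negligible opposite tail ignored.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(difference: float, sd: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    se = sd * sqrt(2 / n_per_group)    # standard error of the difference in means
    return z.cdf(abs(difference) / se - z_crit)


# Hypothetical true difference of 5 units with SD 10 in each group.
for n in (16, 64, 128):
    print(n, round(approx_power(5, 10, n), 2))
# 16 0.29
# 64 0.81
# 128 0.98
```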
what may limit use of a randomised control trial?
practical or ethical problems
what is the usual outcome of a cohort study?
relative risk
what is the usual outcome of a case-control study?
odds ratio
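For reference, a sketch of how relative risk and odds ratio are usually computed from a 2x2 table. The cell layout (a, b = exposed with and without disease; c, d = unexposed with and without disease) and the counts are assumptions made for the example.

```python
def relative_risk(a: int, b: int, c: int, d: int) -> float:
    """Risk of disease in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
rr = relative_risk(30, 70, 10, 90)
odds = odds_ratio(30, 70, 10, 90)
print(f"relative risk = {rr:.2f}, odds ratio = {odds:.2f}")
# relative risk = 3.00, odds ratio = 3.86
```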
Cross-sectional surveys - what are they also known as, and what do they provide?
Provide a ‘snapshot’, sometimes called prevalence studies
Provide weak evidence of cause and effect