Statistics and studies Flashcards
What is selection bias?
Error in assigning individuals to groups leading to differences which may influence the outcome.
What is recall bias?
A difference in the accuracy of the recollections retrieved by study participants, possibly due to whether or not they have the disorder.
This is a particular problem in case-control studies.
What is publication bias?
Failure to publish results from valid studies, often as they showed a negative or uninteresting result. Important in meta-analyses where studies showing negative results may be excluded.
What is work up bias?
In studies which compare new diagnostic tests with gold standard tests, work-up bias can be an issue. Sometimes clinicians may be reluctant to order the gold standard test unless the new test is positive, as the gold standard test may be invasive (e.g. tissue biopsy).
What is expectation bias?
Only a problem in non-blinded trials. Observers may subconsciously measure or report data in a way that favours the expected study outcome.
What is the Hawthorne effect?
Describes a group changing its behaviour due to the knowledge that it is being studied
What is late look bias?
Gathering information at an inappropriate time, e.g. studying a fatal disease many years later, when some of the patients may already have died
What is procedure bias?
Occurs when subjects in different groups receive different treatment
What is lead time bias?
Occurs when two tests for a disease are compared: the new test diagnoses the disease earlier, but there is no effect on the outcome of the disease
What are the phases of a clinical trial?
Phase 1 - Determines pharmacokinetics, pharmacodynamics and side-effects prior to larger studies (first in human; normally in healthy volunteers, but not in cancer trials)
Phase 2 - Assesses efficacy and dosage
- 2a: Dosing
- 2b: Efficacy
Phase 3 - Assesses effectiveness, normally with an RCT
Phase 4 - Post-marketing surveillance. Monitors for long-term effectiveness and side-effects
What do confidence intervals show?
A range of values within which the true effect of an intervention is likely to lie
What does the standard error of the mean show?
A measure of the spread expected for the mean of the observations, i.e. how close the calculated sample mean is likely to be to the true population mean
How is the standard error of the mean calculated?
SEM = SD / square root (n)
where SD = standard deviation and n = sample size
therefore the SEM gets smaller as the sample size (n) increases
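The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `sem` and the SD and sample sizes are made up for the example.

```python
import math

def sem(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Hypothetical values: the same SD of 10 with two different sample sizes.
sem_25 = sem(10.0, 25)    # 10 / 5  = 2.0
sem_100 = sem(10.0, 100)  # 10 / 10 = 1.0
print(sem_25, sem_100)
```

Quadrupling the sample size halves the SEM, which is the "gets smaller as n increases" point above.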
How is the lower limit of the 95% confidence interval calculated?
lower limit = mean - (1.96 * SEM)
*If the sample size is < 100, use the critical value from Student's t distribution instead of 1.96
How is the upper limit of the 95% confidence interval calculated?
upper limit = mean + (1.96 * SEM)
*If the sample size is < 100, use the critical value from Student's t distribution instead of 1.96
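As a rough sketch, both limits can be computed together. The function name `confidence_interval`, its `critical` parameter and the sample figures are assumptions made for illustration. The value 2.064 is the two-sided 95% critical value of Student's t with 24 degrees of freedom.

```python
import math

def confidence_interval(mean, sd, n, critical=1.96):
    """Return (lower, upper) limits as mean -/+ critical * SEM."""
    sem = sd / math.sqrt(n)   # standard error of the mean
    margin = critical * sem
    return mean - margin, mean + margin

# Hypothetical large sample: mean 120, SD 15, n = 100 (SEM = 1.5).
lower, upper = confidence_interval(120.0, 15.0, 100)

# Hypothetical small sample (n = 25, SEM = 3): replace 1.96 with the
# t critical value for 24 degrees of freedom (about 2.064).
lower_t, upper_t = confidence_interval(120.0, 15.0, 25, critical=2.064)

print(round(lower, 2), round(upper, 2))
print(round(lower_t, 3), round(upper_t, 3))
```

The smaller sample gives a wider interval for two reasons: the SEM is larger and the critical value is larger.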
What is a confounder?
In statistics, confounding refers to a variable that correlates with other variables within a study, leading to spurious results.
What is correlation?
Correlation is used to test for association between variables
What is regression?
Once correlation has been demonstrated, regression can be performed.
Regression can be used to predict the value of a dependent variable from one or more independent variables.
What is nominal data?
Observed values can be put into set categories which have no particular order or hierarchy.
e.g. Birthplace
What is ordinal data?
Observed values can be put into set categories which themselves can be ordered (for example NYHA classification of heart failure symptoms)
What is discrete data?
Observed values are confined to certain values, usually a finite number of whole numbers (for example the number of asthma exacerbations in a year)
What is continuous data?
Data can take any value within a certain range (for example weight)
What is binomial data?
Data may take one of two values (for example gender)
What is a hazard ratio?
The hazard ratio (HR) is similar to relative risk but is used when risk is not constant over time. It is typically used when analysing survival over time
What is incidence?
The incidence is the number of new cases per population in a given time period.
What is the prevalence?
The prevalence is the total number of cases per population at a particular point in time. It can be divided into two types:
- point prevalence = number of cases in a defined population / number of people in a defined population at the same time
- period prevalence = number of identified cases during a specified period of time / total number of people in that population
What is intention to treat analysis?
All patients assigned to one arm of an RCT are analysed in that arm
Regardless of whether they completed the assigned treatment or not
What % of values lie within one standard deviation of the mean in a normal distribution?
68.3% of values lie within 1 SD of the mean
What % of values lie within two standard deviations of the mean in a normal distribution?
95.4% of values lie within 2 SD of the mean
What % of values lie within three standard deviations of the mean in a normal distribution?
99.7% of values lie within 3 SD of the mean
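These three percentages can be checked numerically. For a normal distribution, the proportion of values within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below uses Python's standard `math.erf`; the helper name `within_k_sd` is made up for the example.

```python
import math

def within_k_sd(k: float) -> float:
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

# Percentages for 1, 2 and 3 SD, rounded to one decimal place.
coverage = {k: round(100 * within_k_sd(k), 1) for k in (1, 2, 3)}
print(coverage)
```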
What is the number needed to treat principle?
The average number of patients who need to be treated for one to benefit, compared with a control, in a clinical trial.
How is the number needed to treat calculated?
1/(Absolute risk reduction) and is rounded to the next highest whole number
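A minimal sketch of the calculation, taking absolute risk reduction as control event rate minus experimental event rate. The function name and the event rates are hypothetical.

```python
import math

def number_needed_to_treat(cer: float, eer: float) -> int:
    """NNT = 1 / absolute risk reduction, rounded up to a whole number."""
    arr = cer - eer  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is not defined")
    return math.ceil(1 / arr)

# Hypothetical trial: 20% of controls and 12% of treated patients have the
# bad outcome. ARR = 0.08, and 1 / 0.08 = 12.5, which rounds up to 13.
nnt = number_needed_to_treat(cer=0.20, eer=0.12)
print(nnt)
```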
What is the experimental event rate?
Experimental event rate (EER) = (Number who had particular outcome with the intervention) / (Total number who had the intervention)
What is the control event rate?
Control event rate (CER) = (Number who had a particular outcome with the control) / (Total number who had the control)
What is an odds ratio?
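Odds are a ratio of the number of people who incur a particular outcome to the number of people who do not incur the outcome.
odds = number with the outcome / number without the outcome
The odds ratio is the odds of the outcome in one group divided by the odds of the outcome in the other group.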
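An odds ratio compares the odds in one group with the odds in another. The sketch below uses made-up counts and hypothetical function names.

```python
def odds(with_outcome: int, without_outcome: int) -> float:
    """Odds = number with the outcome / number without the outcome."""
    return with_outcome / without_outcome

def odds_ratio(exp_with, exp_without, ctl_with, ctl_without) -> float:
    """Odds in the experimental group divided by odds in the control group."""
    return odds(exp_with, exp_without) / odds(ctl_with, ctl_without)

# Hypothetical counts: 40 of 100 treated patients have the outcome (odds
# 40/60), compared with 20 of 100 controls (odds 20/80).
treated_odds = odds(40, 60)
control_odds = odds(20, 80)
ratio = odds_ratio(40, 60, 20, 80)
print(round(treated_odds, 3), control_odds, round(ratio, 3))
```

Note that the odds in the treated group (about 0.667) differ from the risk in that group (0.4), so the odds ratio is not the same number as the relative risk.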
What is relative risk?
Relative risk (RR) is the ratio of risk in the experimental group (experimental event rate, EER) to risk in the control group (control event rate, CER). The term relative risk ratio is sometimes used instead of relative risk.
To recap
EER = rate at which events occur in the experimental group
CER = rate at which events occur in the control group
What does a relative risk > 1 mean?
If the risk ratio is > 1 then the rate of an event (for example, experiencing significant pain relief) is increased compared to controls.
What does a relative risk < 1 mean?
If the risk ratio is < 1 then the rate of an event (for example, experiencing significant pain relief) is decreased compared to controls.
How do you calculate the relative risk increase or relative risk reduction?
Relative risk reduction (RRR) or relative risk increase (RRI) is calculated by dividing the absolute risk change by the control event rate
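The event rates, relative risk and relative risk change fit together as in the sketch below. The counts and variable names are hypothetical. The sign convention used here (positive means an increase, negative means a reduction) is an assumption chosen for the example.

```python
def event_rate(events: int, total: int) -> float:
    """Proportion of a group that had the outcome."""
    return events / total

# Hypothetical trial: 30 of 100 treated patients and 20 of 100 controls
# experience the event.
eer = event_rate(30, 100)  # experimental event rate
cer = event_rate(20, 100)  # control event rate

relative_risk = eer / cer

# Absolute risk change divided by the control event rate.
# Positive: relative risk increase. Negative: relative risk reduction.
relative_change = (eer - cer) / cer

print(round(relative_risk, 2), round(relative_change, 2))
```

Here the relative risk is 1.5, so the event is more frequent with the intervention, and the relative risk increase is 50%.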
What is reliability?
Reliability is used in statistics to imply consistency of a measure.
What is validity?
Validity is determined by whether a test accurately measures what it is supposed to measure.
What is a true positive?
Disease is present AND Positive test
What is false positive?
Disease absent BUT Positive test
What is a true negative?
Disease not present AND Negative test
What is false negative?
Disease present BUT Negative test
What is specificity?
Proportion of patients without the condition who have a negative test result
True negative / (True negative + False Positive)
What is sensitivity?
Proportion of patients with the condition who have a positive test result
True positive / (True positive + False negative)
What is the positive predictive value?
The chance that the patient has the condition if the diagnostic test is positive
True positive / (True positive + False positive)
What is the negative predictive value?
The chance that the patient does not have the condition if the diagnostic test is negative
True negative / (True negative + False negative)
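The four formulas above all come from the same 2x2 table, so they can be computed together. This is a small sketch with invented counts; the function and key names are hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute the four standard measures from a 2x2 table of counts."""
    return {
        "sensitivity": tp / (tp + fn),  # positive tests among the diseased
        "specificity": tn / (tn + fp),  # negative tests among the healthy
        "ppv": tp / (tp + fp),          # diseased among positive tests
        "npv": tn / (tn + fn),          # healthy among negative tests
    }

# Hypothetical study: 100 people with the disease (90 test positive) and
# 100 without it (80 test negative).
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Sensitivity and specificity are calculated within the diseased and healthy groups respectively, whereas the predictive values are calculated within the positive and negative test groups, which is why they change with disease prevalence.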
What are the ten criteria for a screening test (Wilson and Jungner criteria)?
- The condition should be an important public health problem
- There should be an acceptable treatment for patients with recognised disease
- Facilities for diagnosis and treatment should be available
- There should be a recognised latent or early symptomatic stage
- The natural history of the condition, including its development from latent to declared disease should be adequately understood
- There should be a suitable test or examination
- The test or examination should be acceptable to the population
- There should be agreed policy on whom to treat
- The cost of case-finding (including diagnosis and subsequent treatment of patients) should be economically balanced in relation to the possible expenditure on medical care as a whole
- Case-finding should be a continuous process and not a ‘once and for all’ project
What does a null hypothesis state?
A null hypothesis (H0) states that two treatments are equally effective
What is the P value?
Probability of obtaining a result by chance at least as extreme as the one that was actually observed, assuming that the null hypothesis is true.
What is a normal distribution?
Normal (Gaussian) distributions: mean = median = mode
What is a positively skewed distribution (right skewed) ?
Positively skewed distribution: mean > median > mode
What is a negatively skewed distribution (left skewed) ?
Negatively skewed distribution: mean < median < mode
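The ordering of mean, median and mode can be seen on a small data set. The numbers below are invented to have a long right tail, and the second list is simply their mirror image. This illustrates the typical pattern rather than proving it for every skewed distribution.

```python
import statistics

def summarise(values):
    """Return (mean, median, mode) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.mode(values))

# Hypothetical right-skewed data: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]
# Mirror image of the same data, giving a long left tail.
left_skewed = [15 - x for x in right_skewed]

pos_mean, pos_median, pos_mode = summarise(right_skewed)
neg_mean, neg_median, neg_mode = summarise(left_skewed)

print(pos_mean, pos_median, pos_mode)  # mean > median > mode
print(neg_mean, neg_median, neg_mode)  # mean < median < mode
```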
What does a funnel plot demonstrate?
The existence of publication bias in meta-analyses.
a symmetrical, inverted funnel shape indicates that publication bias is unlikely
an asymmetrical funnel indicates a relationship between treatment effect and study size. This indicates either publication bias or a systematic difference between smaller and larger studies (‘small study effects’)
What determines the type of significance testing you can use?
Whether the data are parametric (something that can be measured and that usually follows a normal distribution)
Or
Non-parametric
When should a Student's paired t-test be done?
Parametric data
paired data refers to data obtained from a single group of patients, e.g. Measurement before and after an intervention.
When should a Student's unpaired t-test be done?
Parametric data
Unpaired data comes from two different groups of patients, e.g. Comparing response to different interventions in two groups
When should a Mann-Whitney U test be done?
Non-parametric
compares ordinal, interval, or ratio scales of unpaired data
When should a Wilcoxon signed-rank test be done?
Non-parametric
compares two sets of observations on a single sample, e.g. a ‘before’ and ‘after’ test on the same population following an intervention
When should a Chi squared test be done?
Non-parametric
used to compare proportions or percentages e.g. compares the percentage of patients who improved following two different interventions
What type of study can be used to estimate the prevalence of a condition?
Cross sectional study
What is the relationship between prevalence and incidence?
Prevalence = incidence X duration of disease
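A quick numerical sketch of this relationship follows. It is an approximation that works best for a stable condition that is not very common. The function name and the figures are made up for the example.

```python
def approximate_prevalence(incidence_per_year: float,
                           mean_duration_years: float) -> float:
    """Prevalence is approximately incidence multiplied by disease duration."""
    return incidence_per_year * mean_duration_years

# Hypothetical condition: 2 new cases per 1,000 people per year, lasting
# 10 years on average, gives roughly 20 existing cases per 1,000 people.
prevalence = approximate_prevalence(0.002, 10)
print(round(prevalence, 3))
```

This is why a chronic condition can have a high prevalence despite a low incidence, and a short-lived illness the reverse.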
What correlation tests are there?
Parametric: Pearson's correlation coefficient
Non-parametric: Spearman's rank correlation
What are the three study designs for a new drug?
Superiority
Equivalence
Non-inferiority (only the lower confidence interval needs to lie in range - not the upper)
What is the problem with running a superiority trial?
Large sample size needed
How does an equivalence trial work?
Equivalence margin is defined (-delta to +delta) on a specified outcome. If the confidence interval of the difference between the two drugs lies within the equivalence margin then the drugs may be assumed to have a similar effect
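The decision rule can be written as a simple comparison of the confidence interval with the margin. The sketch assumes the difference is measured as new drug minus standard drug, with higher values being better. The function names and numbers are hypothetical.

```python
def is_equivalent(ci_lower: float, ci_upper: float, delta: float) -> bool:
    """Equivalence: the whole CI lies between -delta and +delta."""
    return -delta < ci_lower and ci_upper < delta

def is_non_inferior(ci_lower: float, delta: float) -> bool:
    """Non-inferiority: only the lower CI limit has to stay above -delta."""
    return ci_lower > -delta

# Hypothetical margin of 5 units on the chosen outcome.
delta = 5.0

# CI of the difference (-2.0 to 3.5) sits inside the margin.
equivalent = is_equivalent(-2.0, 3.5, delta)

# CI of -1.0 to 8.0 crosses +delta: not equivalent, but still non-inferior.
not_equivalent = is_equivalent(-1.0, 8.0, delta)
non_inferior = is_non_inferior(-1.0, delta)

print(equivalent, not_equivalent, non_inferior)
```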
What is a type 1 statistical error?
The null hypothesis (H0) is rejected, when it is true
What is a type 2 statistical error?
The null hypothesis (H0) is accepted (not rejected) when it is not true
What is the power of the study?
The probability of rejecting the null hypothesis, when the null hypothesis is not true
Power = 1 - the probability of a type 2 error
What is the advantage of a non-inferiority study?
Low sample size needed
Levels of evidence?
Ia - evidence from meta-analysis of randomised controlled trials
Ib - evidence from at least one randomised controlled trial
IIa - evidence from at least one well designed controlled trial which is not randomised
IIb - evidence from at least one well designed experimental trial
III - evidence from case, correlation and comparative studies
IV - evidence from a panel of experts
How do you calculate absolute risk reduction?
ARR = control event rate - experimental event rate
How are standard deviation and variance related?
SD = square root of variance
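This can be confirmed with Python's standard `statistics` module. The data below are made up, and the population versions of the functions are used to keep the numbers round.

```python
import math
import statistics

# Hypothetical observations with a mean of 5.
values = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(values)  # mean squared deviation from the mean
sd = statistics.pstdev(values)           # population standard deviation

# The SD is the square root of the variance.
print(variance, sd, math.sqrt(variance))
```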
How is standard error of the mean calculated?
Standard error of the mean = standard deviation / square root (number of patients)
How to calculate relative risk reduction?
Relative risk reduction = (CER - EER) / CER
What is the most usual statistical measure in a case-control study?
Odds ratio
What is numbers needed to treat?
The number needed to treat to prevent one bad outcome
What does the positive predictive value mean?
the chance that the patient has the condition if the diagnostic test is positive
What does the negative predictive value mean?
probability that subjects with a negative screening test don’t have the disease.
What should be calculated with specificity?
Confidence intervals