Statistics Flashcards
What is incidence
Number of new cases emerging in a designated period and population
What is prevalence
Proportion of people in the entire population who are found to be with disease at a certain point in time
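What is sensitivity
Proportion of people with the disease who test positive (true positive rate)
What is specificity
Proportion of people without the disease who test negative (true negative rate)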
What is efficacy
The effect of something under ideal or laboratory conditions
What does western blot detect
Identify proteins
What does northern blot detect
mRNA
What does southern blot detect
DNA
Describe a cohort study
Subjects with a risk factor are recruited.
Two groups are followed up, one with the risk and one without.
Other names for cohort study
Prospective or follow up
Name two observational descriptive studies
Case report and case series
Downsides of cohort studies
Expensive as can be long
Dropout can lead to bias
What is a case control study
Subjects who have the outcome (cases) are matched with those who do not (controls). All subjects are asked about past exposure to risk(s)
Other names for case control studies
Case comparison
Retrospective
Benefits of case control studies
Speedy
Useful when new diseases emerge
Drawbacks of case control
Recruiting matched controls
Rely on recall
What is a cross sectional study
The prevalence of an exposure and an outcome in a population at one point in time
Tools for reviewing effectiveness of cross sectional studies
GRACE, STROBE
What is an experimental study
The researcher intervenes in some way to measure the impact of a treatment
What makes a trial uncontrolled
Same treatment given to everyone
What makes a trial controlled
Subjects are given one of two treatments
What is the gold standard design for studying treatment effects
Randomised controlled trial
What is a crossover trial
Subjects receive one treatment then switch to another.
Benefits of crossover trial
Can be used to study rare diseases where lack of recruitment could make a trial underpowered.
Subjects are their own control
Downsides of a crossover design
Comparison takes place at two different points in time
What is a factorial trial
Assess the impact of more than one intervention
What is an audit
Aspects of service provision are assessed against a gold standard
What is a systematic review
Access and review of all pertinent articles in a field
What is a meta analysis
Combines results of several studies and produces a quantitative assessment
What is a pilot study
Miniature replica of the proposed trial
What are the benefits of a pilot study
Helps create efficient well designed studies. Increases successes, can show if a trial would be unable to recruit
What happens in phase 0 clinical trials
Microdose humans
What happens in phase 1 clinical trials
Drugs given to healthy people
What happens in phase 2 clinical trials
Treatments given to people with the relevant illness
What happens in phase 3 clinical trials
Treatment given to large groups of people in the clinical setting
What happens in phase 4 clinical trials
Post market surveillance studies.
Data collected about the drug in different populations.
Definition of evidence based medicine
The conscientious, explicit, and judicious use of current best evidence in making decisions about the care of the individual patient
What is internal validity
Does the study answer the question? Do the research methods used work?
What is external validity
Can the study results be used in real life. To what extent?
What is a study’s efficacy?
The impact of its intervention under optimum conditions
What is a study’s effectiveness
Intended/expected effect under ordinary circumstances
Does efficacy show internal or external validity
Internal
Does effectiveness show internal or external validity
External validity
Benefits of a peer reviewed paper
Forces authors to meet standards.
Can be anonymous
Downsides of peer review
Longer to print
Harder to get controversial opinions published
Possible to guess anonymous authors
What is the primary hypothesis
The hypothesis (often closely related to the clinical Q) which is to be proven true or false.
What is the downside of too much subgroup analysis
Over comparing data (data dredging) leads to increased false statistical significance.
What must confounders be associated with
The exposure (but cannot be a consequence of)
The outcome, independently
How to deal with confounders
Eliminate
Nullify (spread equally amongst groups)
Account for with stats
What is positive confounding
Creates an apparent association between two variables that are not truly associated
What is a negative confounder
Masks an association that is present
(Eg exercise cancelling out smoking)
What does an observational descriptive study do
Reports what is seen
What does an observational analytical study do
Reports on similarities and differences between experimental and control groups
What does an experimental study do
Researchers intervene and report differences between experimental and control group
What does a longitudinal study do
Deals with subjects at more than one point in time
What does a cross sectional study do
Snapshot of a group at one point in time
What is a parallel study
Groups receive different interventions and the experiment proceeds
What does a prospective study do
Deals with now and later (looks forward). Data collected as it goes
What does a retrospective study do
Deals with now and the past. Pre existing data collected (cheaper!)
What is an ecological study
Population study. All information at population level.
What is an explanatory study
Ideal setting to see if something works. (New drug, homogeneous subjects, placebo use, efficacy data)
Pragmatic study
Ordinary setting, see if something works in real life
(Effectiveness data, often new and old tx compared)
What is an (observational analytical) cohort study
Recruit subjects with a risk factor. Investigates exposure to the risk.
Can be long + therefore expensive.
What does an (observational analytical) cross sectional study do
Looks at prevalence of exposure and outcome at one point in time
What bias is created if poor recruitment or allocation techniques are used
Selection bias
Name three main categories for exclusion criteria
Too unwell (ie other serious illness, consent issues etc)
May become unwell (Inc pregnancy)
Confounding factors
Issues with too many exclusion criteria?
Harder to recruit a sample population (risk type II error)
Diagnostic purity bias (results may not be generalisable to gen pop)
Name 5 sampling methods
- Random
- Systematic
- Stratified
- Cluster
- Convenience (easiest, highest risk!)
What is selection bias
When the recruited sample is not representative of the target population
Who introduces sampling bias
The researcher
Who introduces response bias
The study population
Berkson’s bias/admission rate bias
Arises from sample being taken from a hospital setting, but hospital admission does not reflect severity or rate within the community
Diagnostic purity bias
Arises from exclusion of comorbidities.
Complexity of cases may not be reflected
Incidence/prevalence bias (Neyman’s bias)
Usually due to time gap between onset and selection - severe disease may kill off people leaving only mild disease behind, meaning the data does not contain the sickest individuals.
Membership bias?
Group membership used to recruit - the group may not be representative.
Historical control bias
When subjects and controls are chosen across time, changing definitions may mean the comparison doesn’t work
Response bias occurs when?
Individuals volunteering for studies may differ from general pop - ie more health conscious
Other than at recruitment, when else can selection bias occur
At allocation to trial arms - should be blind!
Who uses cluster randomisation
Public health (primary care sometimes too)
Does true randomisation or adaptive minimisation work better in small studies
Adaptive minimisation as it allows better matched groups
What is concealed allocation
The researchers cannot predict with any accuracy which group the next recruit will be placed in
Difference between concealed allocation and blinding
Concealed allocation is being unaware which group someone enters at recruitment
Blinding is being unaware what treatment is received.
What is propensity score matching
When randomisation can’t be used, a propensity score is calculated per entrant and a match found before recruiting. No match no recruit.
What is publication bias
Research that does find a difference between two groups is more likely to be published than research that doesn’t.
What increases the placebo effect
Larger pills
More pills
Capsules over tablets
What is interviewer bias
A non blinded researcher may ask questions differently if they know what group someone is in
Response bias
The subjects answer questions the way they think the researcher wants the answer, rather than with their true beliefs
What is the Hawthorne effect
Subjects alter their behaviour because they are being watched
What is recall bias
What is remembered may be selective not full truth
Name 4 observation biases
Interview bias
Response bias
Hawthorne effect
Recall bias
What is triple blinding
The analyst processing the results is also blinded
What is a clinical endpoint
A measure of a direct clinical outcome such as morbidity, mortality or survival
What is a surrogate endpoint
A biomarker substitute for a clinical endpoint (such as LDL reduction)
What is validity
The extent to which a test measures what it is supposed to measure
What is reliability
How consistent a test is on repeated measurements
When does a test have incremental validity
When it helps more than if it were not used
What is a systematic measurement error
A consistent error made in a series of repetitive tests
Eg a calibration error
What do systematic errors reduce
Accuracy
What is a random error
A variable error that occurs in a set of repeated tests
What does a random error reduce
Precision
What are random errors and systematic errors examples of
Reliability
What is a good correlation coefficient in a test-retest reliability
> 0.7-0.8
What is parallel form reliability
Repeating the test with an equivalent alternate test
What is a parameter
Any numerical quantity that characterises a population
What is a variable
Any entity which can take on different values (such as gender, drug dose)
What is an independent variable
The variable manipulated in the study
What is a dependent variable
What is affected by the change in the independent variable - also known as outcome variable
What is a descriptive statistic
Summary of data from the sample population
What is an inferential statistic
Using sample population data to make generalisations about the target population
What is incidence
The rate of occurrence of new cases over a period of time in a defined population
What is mortality rate
Risk of death over a period of time in a given population
What is morbidity rate
The rate of occurrence of new non fatal cases of a disease in a defined population over a given length of time
What is a standardised morbidity/mortality rate
When the rate is adjusted for a confounder
What is point prevalence
The proportion of a defined population having the disease at any given point in time
What is period prevalence
The proportion of a population that has the disease over a given timeframe
Relationship between incidence and prevalence
Prevalence = incidence x average duration of disease
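The relationship on this card can be sketched with a short calculation. This is a minimal illustration, assuming a stable population and a fairly rare disease (the usual conditions for the approximation); the function name and the figures are hypothetical.

```python
def steady_state_prevalence(incidence_per_person_year: float,
                            mean_duration_years: float) -> float:
    """Approximate prevalence as incidence x average disease duration.

    Assumption: the population is stable and the disease is rare,
    so the simple product is a reasonable approximation.
    """
    return incidence_per_person_year * mean_duration_years


# Hypothetical figures: 2 new cases per 1,000 person-years,
# with the disease lasting 5 years on average.
prevalence = steady_state_prevalence(0.002, 5)
print(f"Approximate prevalence: {prevalence:.1%}")
```

With these made-up numbers about 1% of the population would have the disease at any one time.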
What is lifetime prevalence
Proportion of the population that either has or has had the disease
What is qualitative data
Categorical or non numerical data - eg gender, colour
What is quantitative data
Numerical data
Can either be discrete (counts) or continuous
What is a nominal scale
Organised categories with no relationships to each other (ie hair colour)
What is an ordinal scale
Categories with an inherent rank, but no number value so the gap between categories is meaningless
What is an interval scale
Organised in a meaningful way, with differences between points being equal across the scale. DOES NOT have a true zero.
What is a ratio scale
Differences between points equal across the scale but does have a true zero (zero is the start IE Kelvin)
What scales use parametric tests
Interval and ratio scales
What scales use non parametric tests
Nominal and ordinal
Does a normal distribution have the same mean, median and mode
Yes (bell curve)
What are binomial and Poisson distributions?
For use with discrete numbers (IE coin toss, baby born)
What is mode
Most common value
What is frequency
Number of values in each category
What is a gaussian distribution
A normal distribution (bell curve)
What are the axes on the bell curve
X = variable (IE grade)
Y = frequency
What is the mean
The sum of all the values divided by number of values
What is variance
The average of the squared distances by which each individual observation differs from the mean
What is standard deviation
The degree of data spread about the mean
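A small worked example may help tie the mean, variance and standard deviation cards together. This sketch uses Python's standard `statistics` module and treats the data as a whole population (dividing by n); the observations are invented for illustration.

```python
import math
import statistics

# Illustrative values only, chosen so the arithmetic is easy to follow.
observations = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean: sum of all values divided by the number of values.
mean = statistics.mean(observations)

# Population variance: average squared distance of each value from the mean.
variance = statistics.pvariance(observations)

# Standard deviation: positive square root of the variance.
sd = math.sqrt(variance)

print(mean, variance, sd)
```

For these values the mean is 5, the variance is 4 and the standard deviation is 2. For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.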
What does standard deviation measure
Precision
How many observations are included in 1,2,3 standard deviations?
68%
95%
>99%
What is a z score
Converts the value of an observation into the number of standard deviations the observation lies from the mean of the distribution
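The z score and the 68/95/99.7 rule can be checked numerically. The sketch below assumes a normal distribution and uses `statistics.NormalDist` from the Python standard library; the helper names and the example values (mean 100, SD 15) are illustrative.

```python
from statistics import NormalDist


def z_score(value: float, mean: float, sd: float) -> float:
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / sd


# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k: float) -> float:
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Hypothetical observation of 130 on a scale with mean 100 and SD 15.
print(z_score(130, 100, 15))

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 3))
```

The loop gives roughly 0.683, 0.954 and 0.997, matching the 68%, 95% and >99% figures on the previous card.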
What is the coefficient of variation for
Comparing studies using different units
What is a standard normal distribution used for
A special case where the mean is 0 and the SD is 1 and area under the curve is 1. Used for comparisons of different means by showing on the same scale
Easy ways to tell if a distribution is normal
Plotted it looks bell shaped
Mean median and mode are the same
What is the median
The middle number in the data
What is the range
Difference between the lowest and highest values
What is interquartile range
Used in non-normal data, looks at middle 50%
Benefits of interquartile range over range
Not influenced by outliers
What is a coefficient of skewness
A measure of symmetry
What is a positively skewed distribution
Extended tail to the right, mean is larger than median is larger than mode
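The ordering of mean, median and mode in a positively skewed data set can be seen in a tiny example. The scores below are invented, with one large value forming the right-hand tail.

```python
import statistics

# Illustrative data with a long tail to the right (the value 12).
scores = [1, 2, 2, 3, 4, 5, 12]

mean = statistics.mean(scores)      # pulled upwards by the tail
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most common value

print(mean, median, mode)
```

Here the mean (29/7, about 4.1) is larger than the median (3), which is larger than the mode (2).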
What is central limit theorem
The mean of a large number of random variables is distributed approximately normally.
What is the standard error of the mean
Often just standard error, it is the standard deviation of the sample means.
What is a 95% confidence interval
A range in which we are 95% sure the true population results lie.
What does it mean if a confidence interval for a difference includes zero
The results are not statistically significant
(Ie 3cm +/- 4cm)
What does a relative risk value of 1 mean?
No difference in risk between the groups (no effect)
What is the positive square root of the variance
The standard deviation (standard variance)
What sort of data can a t-test be used for
Normal distribution data ONLY
What is parametric data
Normally distributed data
What is non-parametric data
Non-normal distribution data
What is a per-protocol analysis
Only data from those who complied with trial protocol through to completion are considered
What bias can be included with use of a per-protocol analysis
Attrition/exclusion bias
What is the issue with attrition bias
People may drop out due to intolerable side effects
A drug may look more effective than it actually is
What is an intention to treat analysis
All subjects are included in the analysis, regardless of whether they completed the study
What is risk
The probability of something happening
What is odds
Another way of expressing chance
What is absolute risk
Incidence rate of the outcome
What is the absolute risk reduction
The drop between the at risk group and the control group
What does an odds ratio of 1 reflect
No effect
What does a log odds ratio of zero mean
No effect
What is the null hypothesis
That any difference is due to chance
What is the alpha level
The threshold for how rare results would have to be before they are unlikely to be explained by chance (the null hypothesis)
What number is alpha usually set to
0.05 (or 5%)
What is a p-value
The probability of getting a result at least as extreme as the one observed by chance alone, if the null hypothesis is true
When can the null hypothesis be rejected
When the p value is so small it is below the alpha value - the results are statistically significant
What is a type 1 error
A false positive result (wrongful rejection)
What is a type 2 error
A false negative - a wrongful acceptance
What contributes to type 1 errors
(false positives)
Bias, confounding, multiple hypothesis testing (data dredging)
What can cause type 2 errors
(false negatives)
Sample size too small
What type of error should you consider when accepting the null hypothesis
Type 2 error (false negatives)
What type of error should you consider when rejecting the null hypothesis
Type 1 error (false positive)
What is power
The probability that a type 2 error will not be made
What is the range for power
0-1
What power is generally accepted as being adequate?
0.8
What does a power of 0.8 mean?
There is an 80% probability of finding a significant difference with a given sample size, if the difference exists
What is the risk of a type 2 error in a study with a power of 0.8
20%
What increases power
Decreased variability
Larger sample sizes
What is a two tailed test
Results of interest can go in either direction
When should a one tailed test be used
When a result can only go in one direction
What does bonferroni’s correction do
Safeguards against multiple tests of statistical significance which might give a false significance
Is statistical significance the same as clinical significance
No.
Clinical significance assesses whether treatment effects are worth it in real life
What is a confidence interval
The range of values around a summary statistic within which we are 95% sure the population summary statistic lies
When comparing two groups, if the confidence intervals don’t overlap this is a significant result
How to calculate bonferroni’s correction
Alpha level (the significance threshold) divided by number of analyses
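To see why a correction is needed, it can help to compute how quickly the chance of at least one false positive grows with repeated testing. This sketch uses the conventional form of Bonferroni's correction (the significance threshold alpha divided by the number of analyses) and assumes the tests are independent; the function names are illustrative.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni's correction."""
    return alpha / n_tests


alpha = 0.05
n_tests = 10  # hypothetical number of subgroup analyses

fwer = familywise_error_rate(alpha, n_tests)
threshold = bonferroni_threshold(alpha, n_tests)
print(f"Uncorrected chance of a false positive: {fwer:.0%}")
print(f"Corrected per-test threshold: {threshold}")
```

With ten analyses at alpha 0.05 the uncorrected chance of at least one false positive is about 40%, and the corrected threshold for each individual test is 0.005.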
What is the absolute risk reduction
The difference in event rate between the treatment group and the control group
What is paired data
Data from the same individual at different points in time
What tests can you use with categorical data (contingency tables)
Chi-squared (χ²) and McNemar’s
What is another name for non-parametric data
Distribution free statistics
What is a useful comparison test for parametric data
T-test
How to avoid a type 1 error when using student t-test multiple times
Use ANOVA instead
What does a low fragility index mean
A less robust trial outcome
What is correlation
The strength of a relationship between two variables
Explain correlation coefficient R
R=0 means no relationship
R between -1 and 1
Positive r means a positive relationship, negative r a negative relationship
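The behaviour of r described on this card can be reproduced with a hand-rolled Pearson coefficient. This is a minimal sketch rather than a library implementation; the function name and data are made up, and it assumes both lists have the same length and non-zero spread.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4]
print(pearson_r(xs, [2, 4, 6, 8]))    # perfect positive relationship
print(pearson_r(xs, [8, 6, 4, 2]))    # perfect negative relationship
print(pearson_r(xs, [1, -1, -1, 1]))  # no linear relationship
```

The three calls give r of 1, -1 and 0 respectively. The last case shows that r = 0 only rules out a linear relationship; the points still follow a curve.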
What is regression
Regression expresses the relationship between variables
What is a superiority trial
Head to head drug trial - new tx better than old?
What is sensitivity
True positives / (true positives + false negatives)
What is specificity
True negatives / (true negatives + false positives)
What is the positive predictive value
The proportion of people who have a positive test who do have the disorder.
(True positives / (true positives + false positives))
What is the accuracy of a test
The proportion of people with the correct result
Spin and snout?
In SPecific tests, Positive results rule IN the disorder.
In SeNsitive tests, Negative results rule OUT the disorder
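The diagnostic test measures on these cards all come from the same 2x2 table, which a short sketch can make concrete. The class name, field names and counts below are hypothetical; negative predictive value is included for completeness using its standard definition.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing a test against a gold standard."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        # Of those with the disorder, the proportion testing positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Of those without the disorder, the proportion testing negative.
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Of those testing positive, the proportion with the disorder.
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Of those testing negative, the proportion without the disorder.
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        # Proportion of all people given the correct result.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Invented counts: 100 people with the disorder, 900 without.
table = TwoByTwo(tp=90, fp=30, fn=10, tn=870)
print(table.sensitivity(), table.specificity())
print(table.ppv(), table.npv(), table.accuracy())
```

With these counts sensitivity is 0.90, specificity about 0.97, positive predictive value 0.75, negative predictive value about 0.99 and accuracy 0.96.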
What is the median survival time
The time from the start of the study that coincides with a 50% probability of survival
What does a forest plot provide
Visual evidence of heterogeneity (if present)
What sort of paper is a forest plot used in
Meta analysis
What is a reporting bias
Group of biases that can lead to over representation of significant or positive studies in a meta-analysis
What study type is most likely to be missed from a systematic review?
Small study with negative findings as least likely to be published.
This gives rise to PUBLICATION BIAS
How to calculate odds ratio
(a/c)/(b/d)
(a is positive cases in group 1 over c negative cases in group 1)
Divided by
(b is positive cases in group 2 over d negative cases in group 2)
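The formula on this card can be tried out with a few invented counts. The sketch follows the card's own lettering (a and c for positive and negative cases in group 1, b and d for group 2); the function name is illustrative and it assumes no cell is zero.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio using the card's layout.

    a = positive cases in group 1, c = negative cases in group 1
    b = positive cases in group 2, d = negative cases in group 2
    """
    odds_group_1 = a / c
    odds_group_2 = b / d
    return odds_group_1 / odds_group_2


# Hypothetical counts: 20 of 100 positive in group 1, 10 of 100 in group 2.
print(odds_ratio(a=20, b=10, c=80, d=90))
```

These counts give an odds ratio of 2.25, meaning the odds of a positive case are a little over twice as high in group 1 as in group 2.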
What is the best test to use if there are two variables to be compared
Pearson’s coefficient
What is the standard error of the mean
The accuracy with which a sample measures a population.
Divide the standard deviation by the square root of the number of patients in the study
Negative predictive value?
The chance you don’t have the disease if you test negative
So true negatives / (true negatives + false negatives)
What is a chi square test used for
Used to compare proportions or percentages
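A chi-squared comparison of two proportions can be worked through by hand for a 2x2 table. The sketch below computes the plain Pearson statistic without a continuity correction, which is an assumption on my part; the function name and counts are invented.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic for the table [[a, b], [c, d]].

    Rows are the two groups, columns are outcome present / absent.
    No continuity correction is applied.
    """
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the proportions were the same in both groups.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic


# Hypothetical trial: 20/100 events in one group versus 10/100 in the other.
statistic = chi_square_2x2(20, 80, 10, 90)
print(round(statistic, 2))
```

For one degree of freedom the critical value at alpha 0.05 is 3.84, so the statistic here (about 3.92) would just reach significance.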
What sort of study is the odds ratio associated with
Case control study
What does a risk ratio less than 1 mean
The measured exposure is protective of the outcome
What does a risk ratio >1 mean
Measured exposure increases risk of outcome
What is the NNT?
1/absolute risk reduction
How to calculate the absolute risk reduction?
Risk in group A - risk in group B
How is absolute risk calculated?
Number of events in a group/individuals in a group
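The last few cards chain together: absolute risk in each group, then absolute risk reduction, then NNT. The sketch below works through that chain with invented trial numbers and also shows the risk ratio; the function and variable names are illustrative.

```python
def absolute_risk(events: int, group_size: int) -> float:
    """Number of events in a group divided by individuals in the group."""
    return events / group_size


# Hypothetical trial: 20 of 100 controls and 10 of 100 treated have the event.
risk_control = absolute_risk(20, 100)
risk_treated = absolute_risk(10, 100)

# Absolute risk reduction: difference in risk between the two groups.
arr = risk_control - risk_treated

# Risk ratio: below 1 suggests the treatment is protective.
risk_ratio = risk_treated / risk_control

# Number needed to treat: 1 / absolute risk reduction.
nnt = 1 / arr

print(arr, risk_ratio, nnt)
```

Here the absolute risk reduction is 0.1, the risk ratio is 0.5 and the NNT is 10. In practice a fractional NNT is usually rounded up to the next whole person.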
How to calculate the confidence interval
Mean +/- (1.96 x standard error of the mean) for a 95% confidence interval
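The standard error and confidence interval formulas can be combined in a few lines. This sketch assumes the data are a sample (so the sample standard deviation is used) and that a normal approximation with the 1.96 multiplier is acceptable; with small samples a t-distribution multiplier would normally be preferred. The helper names and values are invented.

```python
import math
import statistics


def standard_error(values: list[float]) -> float:
    """Sample standard deviation divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))


def confidence_interval_95(values: list[float]) -> tuple[float, float]:
    """Mean +/- 1.96 x standard error of the mean."""
    mean = statistics.mean(values)
    margin = 1.96 * standard_error(values)
    return mean - margin, mean + margin


# Illustrative measurements only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sem = standard_error(sample)
lower, upper = confidence_interval_95(sample)
print(round(sem, 3), round(lower, 2), round(upper, 2))
```

For this sample the mean is 5, the standard error is about 0.756 and the 95% confidence interval runs from roughly 3.52 to 6.48.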