Stats and CAT stuff Flashcards
What is a 95% confidence interval?
The range of values within which we can be 95% confident that the true value lies
What is standard deviation?
A value that shows how much variation there is from the mean
What is a P-value, & what is the cut off?
- The probability of seeing results at least this extreme if only chance were at work (i.e. if the null hypothesis were true)
- By convention, P>0.05 is not statistically significant
What is power?
The probability of correctly rejecting the null hypothesis when it is false
What is the purpose of blinding?
- To reduce conscious or subconscious bias
What is concealment of allocation, and what bias does it reduce?
- When someone running a trial doesn’t know what group a participant will be put in
- Reduces selection bias
What is intention to treat analysis, and give an example of when it may be used?
- Analysis based on the initial intended intervention rather than what was ultimately given
- Can be used when someone drops out of a study
What is treatment fidelity?
How well an intervention is reproduced from a protocol or animal model
What is the purpose of randomisation?
- To equally distribute possible confounding factors between different study groups
- Reduces selection bias
What is internal validity?
- Accuracy
- How well a study is conducted, taking into account confounders and removing bias
What is external validity?
- Generalisability
- How well the results of a study can be applied to different patients, situations and environments in the real world
What is a type 1 error?
- Rejecting the null hypothesis when it is true
- False positive
What is a type 2 error?
- Failing to reject the null hypothesis when it is false
- False negative
What is absolute risk?
The risk of disease/death in the population being studied
What is relative risk?
The risk of disease/death in the exposed, compared to that in the non-exposed
What is the risk ratio?
Probability of disease/death in the at-risk group divided by that in the not-at-risk group
What is the odds ratio?
Ratio of odds of disease/death in exposed, compared to that of the non-exposed
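Worked example (a minimal sketch, not a definitive method): the risk ratio and odds ratio calculated from the same 2x2 table. The counts and the function names `risk_ratio` and `odds_ratio` are made up for illustration.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds in the exposed divided by odds in the unexposed."""
    return (a / b) / (c / d)


# Hypothetical data: 20 of 100 exposed and 10 of 100 unexposed develop disease
rr = risk_ratio(20, 80, 10, 90)
odds_r = odds_ratio(20, 80, 10, 90)
print(f"risk ratio = {rr:.2f}, odds ratio = {odds_r:.2f}")
```

With these made-up counts the two measures differ (the odds ratio is the larger of the two), which is why they should not be used interchangeably when the outcome is common.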
What is absolute risk reduction?
Rate of disease/death in the unexposed minus the rate in the exposed
What is the formula for absolute risk reduction?
CER-EER
What is attributable risk?
- Rate of disease/death in exposed minus the rate in unexposed
- (opposite of ARR)
- (think of it as the actual risk that can be ATTRIBUTED to a risk factor by removing the natural rate of developing a disease in the unexposed)
What is the formula for attributable risk?
EER-CER
What is a hazard ratio, and when might it be used?
- Similar to relative risk, but used when the risk isn’t constant over time
- E.g. when looking at survival time
What is the number needed to treat, and how should you deal with a decimal NNT figure?
- The number of people required to receive an intervention to produce 1 positive outcome
- Always round up
What is the number needed to harm, and how should you deal with a decimal NNH figure?
- The number of people required to receive an intervention to produce 1 adverse outcome
- Always round down
What is the formula for NNT?
NNT = 1/ARR = 1/(CER-EER)
What is the formula for NNH?
NNH = 1/attributable risk = 1/(EER-CER)
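Worked example (a sketch with made-up trial numbers): NNT and NNH from the control event rate (CER) and experimental event rate (EER), using the rounding rules above. The function names are hypothetical. Exact fractions are used because floating-point subtraction can push a whole-number answer over a rounding boundary.

```python
from fractions import Fraction
import math


def nnt(cer, eer):
    """Number needed to treat: 1 / (CER - EER), always rounded up."""
    arr = cer - eer  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is not defined")
    return math.ceil(1 / arr)


def nnh(cer, eer):
    """Number needed to harm: 1 / (EER - CER), always rounded down."""
    attributable_risk = eer - cer
    if attributable_risk <= 0:
        raise ValueError("no excess risk, so NNH is not defined")
    return math.floor(1 / attributable_risk)


# Hypothetical benefit: death in 20/100 controls vs 13/100 treated
example_nnt = nnt(Fraction(20, 100), Fraction(13, 100))  # 1 / 0.07 = 14.3

# Hypothetical harm: side effect in 2/100 controls vs 9/100 treated
example_nnh = nnh(Fraction(2, 100), Fraction(9, 100))    # 1 / 0.07 = 14.3

print(example_nnt, example_nnh)
```

The same raw figure (about 14.3) rounds to 15 as an NNT and 14 as an NNH, so both estimates err on the cautious side.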
What is the definition of a P-value?
The probability of obtaining a result by chance that is at least as extreme as the observed value, assuming the null hypothesis is true
What does it mean if the P-value of a study that found a new chemo drug to be 50% more effective than cisplatin was 0.1, and what would you do with this result?
- That if the null hypothesis were true, there would be a 10% probability of seeing an improvement of at least 50% due to chance alone
- As P>0.05, the result is not statistically significant, so we fail to reject the null hypothesis
How are type 1 errors affected by sample size?
They aren’t
How are type 1 errors affected by the number of possible outcomes/end-points?
- Type 1 errors increase with more possible outcomes/end-points
- (More likely one will occur due to chance, therefore more likely to have false positive)
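Worked example (a sketch under a stated assumption): if each end-point is tested independently at a significance level of 0.05, the chance of at least one false positive is 1 - (1 - 0.05)^k for k end-points. Independence is an assumption made here for simplicity; real end-points are often correlated. The function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one type 1 error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    rate = familywise_error_rate(0.05, k)
    print(f"{k} end-points: {rate:.2f} chance of at least one false positive")
```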
How are type 2 errors affected by sample size?
- Type 2 errors are decreased by a larger sample size
- (occur when study doesn’t have much power)
- (larger sample means more likely difference is seen, therefore reducing likelihood of false negative)
What is the formula for power?
1 - probability of type 2 error
How is power affected by sample size?
Power increases with increased sample size
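Worked example (a sketch, not a general power calculator): power of a two-sided one-sample z-test with a known standard deviation, to show power rising with sample size. The test type, the effect size and the function name are assumptions chosen for illustration.

```python
from statistics import NormalDist


def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known SD).

    effect: true difference from the null value
    sd:     population standard deviation
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    shift = effect * n ** 0.5 / sd              # true effect in SEM units
    # Probability that the test statistic falls in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for n in (10, 32, 100):
    power = z_test_power(effect=0.5, sd=1.0, n=n)
    print(f"n = {n}: power = {power:.2f}, type 2 error = {1 - power:.2f}")
```

Under these assumptions a sample of about 32 gives roughly 80% power, a commonly quoted target.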
What is sensitivity?
- The proportion of people with a disease who get a positive test result
- “how good is a test at detecting disease?”
What is specificity?
- The proportion of people without a disease who receive a negative result
- “how good is a test at excluding disease?”
What is the formula for sensitivity?
TP/(TP+FN)
What is the formula for specificity?
TN/(TN+FP)
What is positive predictive value?
The chance that a patient has a disease if a test is positive
What is negative predictive value?
The chance that a patient doesn’t have a disease if a test is negative
What is the formula for PPV?
TP/(TP+FP)
What is the formula for NPV?
TN/(TN+FN)
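Worked example (a minimal sketch with made-up counts): the four formulas above applied to one 2x2 table of test results. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),  # of those with disease, how many test positive
        "specificity": tn / (tn + fp),  # of those without disease, how many test negative
        "ppv": tp / (tp + fp),          # of positive tests, how many have disease
        "npv": tn / (tn + fn),          # of negative tests, how many are disease-free
    }


# Hypothetical screening study: 100 people with disease, 900 without
metrics = diagnostic_metrics(tp=90, fp=30, fn=10, tn=870)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```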
What is the likelihood ratio for a positive test?
How much the odds of a disease increase with a positive test result
What is the likelihood ratio for a negative test?
How much the odds of a disease decrease with a negative test
What is the formula for the positive likelihood ratio?
Sensitivity/(1-specificity)
What is the formula for the negative likelihood ratio?
(1-sensitivity)/specificity
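Worked example (a sketch with made-up test characteristics): both likelihood ratios, and how a likelihood ratio converts pre-test odds into post-test odds. The function names and the 20% pre-test probability are assumptions for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, multiply by the LR, convert back."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.9, specificity=0.8)
after_positive = post_test_probability(0.2, lr_pos)
after_negative = post_test_probability(0.2, lr_neg)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
print(f"20% pre-test -> {after_positive:.0%} after a positive, {after_negative:.0%} after a negative")
```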
How are PPV and NPV affected by prevalence?
- Increased prevalence increases PPV and decreases NPV
- Vice versa for decreased prevalence
How are likelihood ratios affected by prevalence?
They aren’t
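Worked example (a sketch with made-up figures): the same test, with sensitivity and specificity fixed at 90%, applied at two different prevalences. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test used in a population with the given prevalence."""
    # Expected proportions of the whole population in each cell of the 2x2 table
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


for prevalence in (0.01, 0.20):
    ppv, npv = predictive_values(0.9, 0.9, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.2f}, NPV = {npv:.3f}")
```

The likelihood ratios depend only on sensitivity and specificity, so they are identical in both rows even though PPV and NPV change.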
What does the correlation coefficient indicate?
- How closely the plotted points lie to a line drawn through the plotted data
- Essentially the strength of correlation between 2 variables
What does the correlation coefficient not tell you?
How much one variable will change in relation to another
What is the correlation coefficient denoted by, and what is the range of possible values?
- Denoted by the r-value
- -1≤r≤1
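Worked example (a minimal sketch with made-up data): Pearson's r for two data sets with very different gradients. Both lie perfectly on a line, so both give r = 1, which shows that r measures the strength of the relationship and not how much one variable changes with the other. The function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance_sum / sqrt(squares_x * squares_y)


xs = [1, 2, 3, 4]
r_shallow = pearson_r(xs, [2, 4, 6, 8])       # y = 2x
r_steep = pearson_r(xs, [10, 20, 30, 40])     # y = 10x
r_negative = pearson_r(xs, [8, 6, 4, 2])      # y falls as x rises
print(r_shallow, r_steep, r_negative)
```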
What can linear regression be used to predict?
How much one variable will change in relation to another being changed
What is an equation for linear regression?
- y = mx + c
- Essentially the equation for a line
What do the letters stand for in the regression equation?
- y: the variable being studied
- m: the gradient of the line
- x: the variable being changed
- c: the y-intercept (y value when x=0)
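Worked example (a sketch, assuming ordinary least squares, which is one common way of fitting the line): estimating m and c from made-up data and using them to predict y. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of gradient m and intercept c for y = mx + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    c = mean_y - m * mean_x  # the fitted line passes through (mean_x, mean_y)
    return m, c


# Made-up data lying exactly on y = 2x + 1
m, c = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
predicted = m * 10 + c  # predicted y when x = 10
print(f"m = {m}, c = {c}, predicted y at x=10: {predicted}")
```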
What is odds?
- The ratio of the number of people who develop an outcome compared to those who don’t
- Can be >1
What is risk?
- The ratio of the number of people who develop an outcome compared to the total number of people
- Remember risk=probability
- Must be ≤1
When are odds often used?
- Case-control studies
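Worked example (a minimal sketch with made-up counts): odds and risk calculated from the same group, showing that odds can exceed 1 while risk cannot. The function names are hypothetical.

```python
def risk(events, non_events):
    """People with the outcome divided by everyone."""
    return events / (events + non_events)


def odds(events, non_events):
    """People with the outcome divided by people without it."""
    return events / non_events


# Hypothetical group: 75 develop the outcome, 25 do not
group_risk = risk(75, 25)
group_odds = odds(75, 25)
print(f"risk = {group_risk}, odds = {group_odds}")
```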
What is variance?
The average of the squared differences from the mean
What is standard deviation?
The square root of the variance
What is the standard error of the mean, and how is it affected by sample size?
- SD/square root of sample size (n)
- SEM gets smaller as sample size increases
How do you calculate the lower limit of a confidence interval?
Mean - (1.96 x SEM)
How do you calculate upper limits of a confidence interval?
Mean + (1.96 x SEM)
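Worked example (a sketch with made-up measurements): variance, SD, SEM and the 95% confidence interval, following the definitions above. The variance here divides by n, matching the "average of the squared differences" definition; many packages divide by n - 1 for a sample instead. The function name `summarise` is hypothetical.

```python
from math import sqrt


def summarise(values):
    """Return mean, variance, SD, SEM and 95% CI limits for a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n  # average squared difference
    sd = sqrt(variance)
    sem = sd / sqrt(n)
    return {
        "mean": mean,
        "variance": variance,
        "sd": sd,
        "sem": sem,
        "ci_lower": mean - 1.96 * sem,
        "ci_upper": mean + 1.96 * sem,
    }


summary = summarise([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```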
What is parametric data?
Data that follow a normal distribution (a bell-shaped curve)
What is paired data, and give an example?
- Comes from a single group of participants
- E.g. measurement before and after intervention
What is non-paired data, and give an example?
- Comes from 2 groups of participants
- E.g. 2 groups receiving different interventions
What statistical test can be used for paired and unpaired parametric data?
Student’s t-test (paired or unpaired version as appropriate)
What statistical test can be used to show correlation in parametric data?
Pearson’s product-moment correlation coefficient
What statistical test can be used for paired non-parametric data?
Wilcoxon signed rank test
What statistical test can be used for unpaired non-parametric data?
Mann-Whitney U test
What statistical test can be used to compare proportions or percentages in non-parametric data?
Chi-squared test
What statistical test can be used to show correlation in non-parametric data?
Spearman’s/Kendall’s rank test
What is Phase 1 of a clinical trial?
- Determines pharmacokinetics and pharmacodynamics, and side effects
- Tested on healthy volunteers
What is Phase 2 of a clinical trial?
- Assesses dosage (2a) and efficacy (2b)
- Tested on small number of affected individuals
What’s Phase 3 of a clinical trial?
- Assesses effectiveness
- Tested on 100-1000s of affected individuals
What is Phase 4 of a clinical trial?
- Post-marketing surveillance
- Monitors for long term side effects
What are the characteristics of the averages in positively skewed data?
- Mean > median > mode
- Reverse order for negatively skewed data: mean < median < mode
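Worked example (a minimal sketch with made-up data): the three averages for a small positively skewed data set, where a few large values pull the mean to the right.

```python
from statistics import mean, median, mode

# Made-up positively skewed data: most values are small, with a long right tail
skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

skewed_mean = mean(skewed)
skewed_median = median(skewed)
skewed_mode = mode(skewed)
print(f"mean = {skewed_mean}, median = {skewed_median}, mode = {skewed_mode}")
```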
What is a box and whisker plot?
A graphical representation of the sample minimum value, lower quartile, median, upper quartile, and sample maximum value for a set of data
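Worked example (a sketch with made-up data): the five values a box and whisker plot displays. Quartile conventions vary between textbooks and software; the "inclusive" method of Python's `statistics.quantiles` is used here as one reasonable choice. The function name `five_number_summary` is hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    lower_quartile, middle, upper_quartile = quantiles(values, n=4, method="inclusive")
    return min(values), lower_quartile, middle, upper_quartile, max(values)


box = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(box)
```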
What is a funnel plot used for?
To demonstrate the existence of publication bias in meta-analyses
What funnel plot shape suggests no publication bias?
An inverted symmetrical funnel
An asymmetrical funnel plot suggests what?
- A relationship between study size and treatment effect
- Either publication bias, or a systematic difference between smaller and larger studies
What is a histogram?
A graphical display of continuous data that has been put into categories
Where are forest plots usually found?
In meta-analyses
What do forest plots show?
The strength of evidence of the constituent trials in a meta-analysis
What do the squares in a forest plot show?
- Squares are centred on the result of the trial
- The size of each square represents the study’s weight in the meta-analysis
What does the line running through the middle of the squares on a forest plot show?
- The confidence intervals
- Usually 95%
What does the diamond at the bottom of a forest plot show?
- Diamond is centred on the pooled (weighted average) result of all the studies
- The lateral edges show the confidence intervals
- If the confidence intervals cross the line of no effect, the meta-analysis shows the intervention’s effect isn’t significant
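Worked example (a sketch under stated assumptions): how the diamond could be calculated from the individual trials. This assumes fixed-effect inverse-variance weighting, which is one common scheme and not the only one, and uses made-up effect sizes where 0 is the line of no effect. The function name `pooled_estimate` is hypothetical.

```python
from math import sqrt


def pooled_estimate(effects, standard_errors):
    """Inverse-variance weighted average of trial effects with a 95% CI."""
    # More precise trials (smaller standard error) get more weight: a bigger square
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    return pooled, (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)


# Three hypothetical trials: effect estimate and its standard error
pooled, (ci_low, ci_high) = pooled_estimate(
    effects=[-0.5, -0.2, -0.3],
    standard_errors=[0.25, 0.1, 0.2],
)
crosses_no_effect = ci_low <= 0 <= ci_high
print(f"pooled = {pooled:.3f}, 95% CI {ci_low:.3f} to {ci_high:.3f}")
print("crosses line of no effect:", crosses_no_effect)
```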
What is a scatter plot?
- A graphical representation using Cartesian coordinates to display values for 2 variables for a set of data
What is a Kaplan-Meier survival plot?
- A plot of the Kaplan-Meier estimate of the survival function
- Shows the proportion of participants still surviving as it falls over time
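Worked example (a minimal sketch with made-up follow-up data): the Kaplan-Meier estimate calculated by hand. At each time a death occurs, survival is multiplied by the fraction of those still at risk who survive. Censored participants (lost to follow-up or still alive at the end of the study) leave the at-risk group without counting as deaths. The function name `kaplan_meier` is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, estimated survival)] at each time a death is observed.

    times:  follow-up time for each participant
    events: True if the participant died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, died in zip(times, events) if ti == t and died)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Five hypothetical participants: months of follow-up, and whether death was observed
curve = kaplan_meier(
    times=[2, 3, 3, 5, 8],
    events=[True, True, False, True, False],
)
for month, surviving in curve:
    print(f"month {month}: {surviving:.2f} surviving")
```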
What 2 graphs are you likely to find in a meta-analysis?
- Forest plot
- Funnel plot
What is selection bias?
- An error in assigning individuals to groups leading to differences between the groups that may affect the outcome of the study
- Subjects are not representative of the population
What are the 3 types of selection bias?
- Sampling bias
- Volunteer bias
- Non-responder bias
When is selection bias a particular problem?
In cohort studies
What is recall bias?
- Difference in the accuracy of recollection of study participants
- May be due to whether they have an outcome or not
What is an example of recall bias?
People with mesothelioma try harder to remember asbestos exposure than those without
When is recall bias a particular problem?
In case-control studies
What is publication bias?
Failure to publish/include results from valid studies due to them having negative or uninteresting results
When is publication bias seen?
In systematic reviews and meta-analyses
What is work-up bias?
Refers to a gold-standard test being performed more frequently in patients who already have a positive result from a new test
When is work-up bias seen?
In studies trying to validate a new diagnostic test
What is expectation bias?
When observers subconsciously measure or report data in a way that favours the expected outcome of a study
When is expectation bias seen?
Non-blinded trials
What is the Hawthorne effect?
When a study group changes their behaviour due to knowledge they are being studied
What is length time bias?
When screening over-represents less aggressive disease
What is an example of length time bias?
Less aggressive tumours are often picked up more successfully by screening, but already have better outcomes, so it looks like screening has improved survival when in fact survival was better anyway
What is lead-time bias?
When earlier diagnosis makes survival appear longer, even though the time of death is unchanged