Medical Statistics Flashcards
Standard deviation
Standard error
Standard deviation - describes the spread of data around the mean
Standard error - standard deviation of the sample mean
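As an illustration of how the two quantities relate, here is a minimal sketch using Python's standard library. The blood pressure readings are made-up values, and the standard error is taken as SD / sqrt(n), the usual formula for the standard error of a mean.

```python
import math
import statistics

# Hypothetical sample: systolic blood pressure readings (mmHg)
readings = [118, 122, 130, 125, 135, 128, 120, 132]

mean = statistics.mean(readings)
sd = statistics.stdev(readings)        # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(len(readings))     # standard error of the mean

print(f"mean = {mean:.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```

The SE is always smaller than the SD for n > 1, and it shrinks as the sample grows while the SD does not.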
Confidence interval (CI)
Measures the uncertainty in a measurement.
The confidence interval gives the range in which the true mean value is likely to be.
95% CI = range that would contain the true population value in 95% of repeated samples (it is not the range in which 95% of the population lies).
A CI for a difference that includes 0 is not statistically significant, e.g. a 95% CI of -5 to +30 for the change in blood pressure after antihypertensives includes 0, so no change in BP cannot be excluded (P > 0.05).
The size of the CI is related to sample size - larger samples have a smaller CI (narrower range).
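The effect of sample size on the CI can be sketched as follows. This assumes the normal approximation (multiplier 1.96); small samples would normally use a t multiplier instead. The function name `ci95` and the numbers are illustrative only.

```python
import math

def ci95(mean, sd, n):
    """Approximate 95% CI for a mean, using the normal multiplier 1.96."""
    se = sd / math.sqrt(n)
    return (mean - 1.96 * se, mean + 1.96 * se)

# Same mean change in BP and same SD, different sample sizes (hypothetical values)
small = ci95(mean=12.5, sd=20.0, n=5)
large = ci95(mean=12.5, sd=20.0, n=500)

print("n = 5:   %.1f to %.1f" % small)   # interval includes 0
print("n = 500: %.1f to %.1f" % large)   # interval excludes 0
```

With n = 5 the interval spans 0, so no change cannot be excluded. With n = 500 the same mean change gives an interval that lies entirely above 0.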
Prevalence
Incidence
Prevalence - the proportion of the population with the disease at a point in time
Incidence - the rate of new onset disease during a period of time
Odds
Odds ratio
Odds - the probability that an event occurs / the probability that it does not occur
Odds ratio - odds of the disorder in the experimental group / odds of the disorder in the control group
Risk
Risk ratio (RR)
Risk is the probability that the event will happen.
Risk = the number of participants who have the event / total number of participants in the group
Risk ratio = risk of an event in experimental group / risk of event in control group
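A short worked example may help separate odds from risk. The counts below are hypothetical, and the helper names `risk` and `odds` are for illustration only.

```python
def risk(events, total):
    """Risk: events divided by everyone in the group."""
    return events / total

def odds(events, total):
    """Odds: events divided by non-events."""
    return events / (total - events)

# Hypothetical trial: 10/100 events in the experimental arm, 20/100 in the control arm
exp_events, exp_total = 10, 100
ctl_events, ctl_total = 20, 100

risk_ratio = risk(exp_events, exp_total) / risk(ctl_events, ctl_total)
odds_ratio = odds(exp_events, exp_total) / odds(ctl_events, ctl_total)

print(f"RR = {risk_ratio:.2f}, OR = {odds_ratio:.2f}")
```

Here the risk ratio is 0.5 while the odds ratio is about 0.44. The two measures diverge more as the event becomes more common.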
What is the difference between standard deviation and confidence interval?
Standard deviation tells us about the spread (variability) of the data in a sample, and the CI tells us the range in which the true value (the mean if the sample were infinitely large) is likely to be.
What is the P - value?
The P value is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true.
P = 0.05 means that, if the null hypothesis were true, a difference this large would occur by chance 1 time in 20.
P < 0.05 is the conventional threshold of statistical significance.
What is the difference between statistical significance and clinical relevance?
If a study is too small, the results are unlikely to be statistically significant even if the intervention actually works.
Large studies may find a statistically significant difference that is too small to have any clinical significance/relevance.
Number needed to treat (NNT)
Number of patients who need to be treated for 1 patient to gain a benefit.
NNT = 1 / absolute risk reduction
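The number needed to treat is conventionally calculated as 1 divided by the absolute risk reduction (control risk minus treated risk), rounded up. A minimal sketch with hypothetical counts follows; `Fraction` is used only to avoid floating-point rounding surprises before rounding up.

```python
import math
from fractions import Fraction

def nnt(ctl_events, ctl_total, trt_events, trt_total):
    """Number needed to treat = 1 / absolute risk reduction, rounded up."""
    arr = Fraction(ctl_events, ctl_total) - Fraction(trt_events, trt_total)
    if arr <= 0:
        raise ValueError("no absolute risk reduction, so the NNT is undefined")
    return math.ceil(1 / arr)

# Hypothetical: 20% event rate on control, 10% on treatment
print(nnt(20, 100, 10, 100))
# Hypothetical: 20% on control, 13% on treatment (1 / 0.07 is about 14.3)
print(nnt(20, 100, 13, 100))
```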
Type 1 error
Type 2 error
Type 1 error - false positive
rejecting the null hypothesis when it is true
can be caused by bias and confounding factors
Type 2 error - false negative
failing to reject the null hypothesis when it is false
often due to small sample size
Intention to treat analysis
ALL participants' data are analysed in the arm to which they were randomised, regardless of whether they finished the study. This decreases attrition bias.
Dropouts increase the risk of Type 1 and Type 2 errors.
Sensitivity
Specificity
Sensitivity - true positive rate
Proportion of patients who have the disease and test positive
Specificity - true negative rate
Proportion of patients who do not have the disease and test negative
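Both measures come straight from a 2x2 table of test result against true disease status. A minimal sketch with hypothetical counts:

```python
def sensitivity(true_pos, false_neg):
    """Proportion of diseased patients who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of disease-free patients who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical screening study: 100 with the disease, 200 without
tp, fn = 90, 10
tn, fp = 160, 40

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```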
What is the POWER of a study?
The power of a study is its ability to detect a difference between the arms when a true difference exists.
Power of 0.8 - 80% chance that the study will find a difference if one exists.
The larger the sample size, the larger the power and the smaller the risk of Type 2 error.
Power = 1 - probability of Type 2 error (usually 0.2)
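The link between sample size and power can be sketched with a normal approximation for comparing two means. This is a simplification (it assumes a known common SD, equal arm sizes and a two-sided alpha of 0.05), and the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(delta, sd, n_per_arm, alpha=0.05):
    """Approximate power to detect a mean difference `delta` between two arms."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # about 1.96 for alpha = 0.05
    se_diff = sd * math.sqrt(2 / n_per_arm)          # SE of the difference in means
    return NormalDist().cdf(abs(delta) / se_diff - z_crit)

# Hypothetical: true difference 5 mmHg, SD 10 mmHg
for n in (16, 64):
    print(f"n per arm = {n}: power = {approx_power(5, 10, n):.2f}")
```

Under these assumptions, about 64 patients per arm give roughly 80% power, while 16 per arm give well under 50%.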
What is a parametric test?
Parametric tests are used to compare samples that follow a normal distribution.
(samples that follow a specific distribution)
e.g. ANOVA and Student's t-test
What is ANOVA?
This is used to compare the means of 2 or more samples to see whether they come from the same population.
(used in 2 or more samples + parametric)
What is Student's t-test?
This is used to compare 1-2 samples. It tests the probability that the samples come from the same population with the same mean.
(1-2 samples and parametric)
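For two independent samples, the t statistic can be computed by hand as below. This sketch assumes equal variances (the classic pooled form) and made-up data. Turning the statistic into a P value needs the t distribution, which is not in the standard library, so that step is left out.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / se, df

# Hypothetical pain scores in two arms
t_stat, df = pooled_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```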
Tell me a few types of bias.
Selection bias - sampling bias and response bias
Observation bias - failure to measure factors/outcomes accurately
Attrition bias - number of drop outs differs in separate arms, those who are left are not representative of the original study
Confounding factors - associated with both the exposure and the outcome, but not part of the causal pathway between them. See stratification
What is stratification?
This is a way of randomisation - participants are first grouped by the confounding factors, then randomised within each group so that the factors are spread evenly across the arms.
Methods of randomisation - simple, block and stratified
Tell me 5 points of cohort study.
- observational study of a group of people without the disease over a period of time to establish the incidence of the condition
- can be prospective/retrospective
- can be more influenced by confounding factors/more vulnerable to bias
- good at looking at risk factors
- differs from other study designs - it starts from the risk factors and follows participants to see who develops the outcome
Advantage and disadvantage in cohort
Advantage
1 good at looking at risk factors, latent periods and uncommon exposures
2 multiple outcomes (the Million Women Study)
Disadvantage
1 need large sample size
2 long follow up period
3 participants may be lost to follow-up (dropouts)
What is CONSORT statement?
1 Consolidated Standards of Reporting Trials
2 first published 1996, updated 2010; comprises a 25 item checklist and the CONSORT flow diagram
3 guides researchers in writing up RCTs and helps readers evaluate such studies
What is PRISMA statement?
1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses
2 guides researchers in writing up systematic reviews and meta-analyses
3 comprises an item checklist and a flow diagram
What is an impact factor (IF)?
- It is an objective measure of a journal’s relative importance in its field.
- Calculated each year based on a 3 year window (the current year plus the 2 previous years)
- Published in Journal Citation Reports.
- Average number of times that papers published in the previous 2 years are cited in the current year.
Levels of evidence - Tell me about Oxford Taxonomy
1a Systematic review of RCTs
1b High quality RCT
1c RCT
2a Systematic review of cohort studies
2b Cohort study
2c Outcome research
3a Systematic review of case control studies
3b Case control study
4 Case series
5 Expert opinion
Mode
Median
Mean
Mode - most common value of the data set
Median - middle value of the data set when the values are placed in order
Mean - sum of all values / number of values in the data set
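All three measures are available in Python's `statistics` module. A quick sketch on a made-up data set:

```python
import statistics

# Hypothetical data set: length of stay in days
values = [2, 3, 3, 5, 7, 10]

print("mode   =", statistics.mode(values))     # most common value
print("median =", statistics.median(values))   # middle of the ordered values
print("mean   =", statistics.mean(values))     # sum / count
```

With an even number of values, the median is the average of the two middle values (here 3 and 5).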
What is a meta-analysis?
A meta-analysis is the use of statistical methods to combine results of individual studies, usually RCTs.
It allows researchers to have a more precise estimation of treatment effect.
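One common way of combining study results is inverse-variance (fixed-effect) pooling, where more precise studies get more weight. The flashcard does not say which method is meant, so the sketch below is an illustration with hypothetical effect estimates and standard errors.

```python
import math

def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# Hypothetical mean changes in BP (mmHg) from three trials
estimates = [-4.0, -6.0, -5.0]
standard_errors = [2.0, 1.0, 2.0]

pooled, pooled_se = pool_fixed_effect(estimates, standard_errors)
print(f"pooled effect = {pooled:.2f}, SE = {pooled_se:.2f}")
```

The pooled standard error is smaller than that of any single study, which is the sense in which the pooled estimate is more precise.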
What is a systematic review?
This is a secondary research study in which a literature review is conducted according to a strict protocol to identify all studies on the subject.
(Needs to be conducted under PRISMA guidelines.
Strict inclusion and exclusion criteria are used to decrease bias.)
What are the advantages and disadvantages of meta-analysis?
Advantage
- Greater statistical power to detect an effect than a single study
- Combine results from several studies hence less likely to be influenced by local factors.
Disadvantages
- Pooling results from several studies may not predict the results of a single large study.
- If the source studies are poorly designed, the meta-analysis will produce unreliable results.
What is a Forest plot?
- It is used to demonstrate the results of meta analysis
- vertical line of no effect
- Diamond = point estimate of pooled result
- Horizontal line - 95% CI; if the CI crosses the vertical line of no effect, the result is not statistically significant (possibly because the sample size is too small).
What is a randomised controlled trial (RCT)?
- Allows comparison of 2 or more possible interventions
- Participants are randomly allocated to separate arms
Advantage
Minimise the risk of allocation bias
Can be combined into systemic reviews/meta analysis to produce high level evidence
Disadvantage
Time and cost
Long-term follow-up is difficult
Usually focuses on a narrow aspect of the patient's condition
What are the different phases of a clinical trial?
- Phase 1: healthy people
- Phase 2: people with the relevant illness
- Phase 3: clinical setting
- Phase 4: postmarketing surveillance
What is allocation concealment?
A method used to prevent clinicians from working out the trial allocation of the next participant, e.g. sealed envelopes or central randomisation.
(Blinding - the clinician, patient and outcome assessors are unaware of which intervention has been allocated)
What is regression analysis?
It aims to model the relationship between multiple variables.
It is usually used to identify and control the effects of possible confounding variables in an experiment.
(In order for it to be adequately powered, it needs at least 10 patients for each variable in the model: if there are 10 variables, it needs at least 100 patients.)
What are ROC curves - receiver operating characteristic curves?
It shows the performance of a test.
Sensitivity on the Y axis, 1 - specificity on the X axis
Upper left corner - best test
If it is a diagonal line - the test performs no better than chance.
The area under the curve (AUC) summarises the performance of the test.
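To make the curve and its area concrete, the sketch below builds ROC points by sweeping a threshold over test scores and sums the area with the trapezium rule. It plots sensitivity against 1 - specificity, which is the usual convention. The scores and disease labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs, one per threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def area_under_curve(points):
    """Trapezium rule over consecutive ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical test scores; label 1 = has the disease, 0 = does not
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

auc = area_under_curve(roc_points(scores, labels))
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 corresponds to a curve through the upper left corner, and 0.5 to the diagonal line.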
What is survival analysis?
It compares time to event data.
Uses Kaplan Meier curve - shows event rates over a time period
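(The log-rank test is used to compare survival curves between groups.)
Censoring - accounts for participants whose event is not observed, e.g. those lost to follow-up or still event-free when the study ends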
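The Kaplan Meier estimate can be sketched in a few lines: at each time an event occurs, the running survival probability is multiplied by the fraction of those still at risk who survive. Censored participants count as at risk up to their censoring time but never count as events. The follow-up times below are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each observed event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for ti in times if ti >= t)
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up in months; participants 2 and 4 are censored
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1]

for t, s in kaplan_meier(times, events):
    print(f"month {t}: survival = {s:.2f}")
```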