Stats Flashcards
Likelihood ratio of positive test result
sensitivity / (1-specificity)
Median
middle item in a data set which has been arranged in numerical order
mode
most frequent item in a data set
mean
add all items in data set together and divide by the number of items
Relative risk reduction
ARR / CER
ARR: absolute risk reduction (the difference between the two rates in control and treatment group)
CER: event rate in the control group
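A quick worked sketch in Python (hypothetical event rates) showing how ARR and RRR relate:

```python
# Hypothetical event rates: 20% in the control arm, 15% in the treatment arm
cer = 0.20               # control event rate
eer = 0.15               # experimental (treatment) event rate

arr = cer - eer          # absolute risk reduction = 0.05
rrr = arr / cer          # relative risk reduction = 0.25, i.e. 25%
print(f"ARR = {arr:.2f}, RRR = {rrr:.0%}")
```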
What are funnel plots primarily used for?
Assess for potential publication bias in meta-analyses
Graph the size of the effects found in individual studies against a measure of the study’s precision or size
Chi-squared test (4)
Used to assess differences in categorical variables
Non-parametric test
Assumes the sample is large
Compares the observed frequencies against those that would have been expected if there was no difference and then produces a value which can be used to assess if the difference is significant (p<0.05)
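A minimal sketch of the test in practice, assuming SciPy is available; the 2x2 table of counts is made up for illustration:

```python
from scipy.stats import chi2_contingency

# Hypothetical observed counts: rows = exposed/unexposed, columns = outcome yes/no
observed = [[30, 70],
            [10, 90]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # compare p against 0.05
print("expected frequencies if no difference:", expected)
```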
Pearson’s correlation coefficient
- Measures linear correlation between 2 variables
- sign of the correlation coefficient tells us the direction of the linear relationship: negative then trend line slopes down, positive then trend line slopes up
- the size/magnitude of the correlation coefficient tells us the strength of a linear relationship: >0.90 = strong, 0.65-0.9 = moderate, <0.65 = weak
- parametric test
- if the data do not meet parametric assumptions, or if both variables are not ratio variables, then Spearman's should be used
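A short sketch comparing the two coefficients on made-up paired data (assuming SciPy):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical paired measurements
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

r, p_r = pearsonr(x, y)       # parametric: linear correlation
rho, p_rho = spearmanr(x, y)  # non-parametric alternative
print(f"Pearson r = {r:.3f} (p = {p_r:.4f})")
print(f"Spearman rho = {rho:.3f} (p = {p_rho:.4f})")
```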
The 3 types of t-test
- one sample t-test
- independent t-test
- paired t-test
one sample t-test
- used to see if there is a difference between a sample mean and the hypothesised population mean
independent t-test
- used when you want to compare means from independent groups
paired t-test
- used when comparing the means of two groups that are considered to be paired (matched, or dependent)
ANOVA
- statistical test to demonstrate statistically significant differences between the means of several groups
- similar to a student’s t-test apart from that ANOVA allows the comparison of more than just 2 means
- assumes that the variable is normally distributed
- works by comparing the variance of the means
- distinguishes between within group variance and between group variance
- under the null hypothesis (all group means equal), the between-group variance should be similar to the within-group variance
- the test is based on the ratio of these two variances, known as the F statistic
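A minimal one-way ANOVA sketch using SciPy's f_oneway (group values are hypothetical):

```python
from scipy.stats import f_oneway

# Three hypothetical treatment arms
group_a = [5.1, 4.9, 5.4, 5.0, 5.2]
group_b = [5.8, 6.1, 5.9, 6.3, 6.0]
group_c = [5.0, 5.3, 5.1, 4.8, 5.2]

# F statistic = between-group variance / within-group variance
f_stat, p = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p:.4f}")
```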
Relative risk
RR = EER / CER
EER: treatment group risk
CER: control group risk
NNT - number needed to treat
- used in assessing the effectiveness of a healthcare intervention
- represents the average number of patients who need to be treated to prevent one additional bad outcome or produce one additional good outcome
RISK
- a proportion
- probability with which an outcome will occur
- usually expressed as a decimal between 0-1
- often expressed as a number of individuals per 1000
- if risk is 0.1, in a sample of 100 people, the number of events observed will on average be 10
ODDS
- odds is a ratio
- the ratio of the probability that a particular event will occur to the probability that it will not occur
- can be any number 0-infinity
- commonly expressed as a ratio of 2 integers, eg odds of 0.01 would be 1:100
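A tiny sketch of converting between the two (hypothetical risk of 0.1):

```python
risk = 0.1                     # 10% probability of the event
odds = risk / (1 - risk)       # = 0.111..., roughly 1:9
risk_back = odds / (1 + odds)  # recover the risk from the odds
print(f"risk = {risk}, odds = {odds:.3f}, back to risk = {risk_back:.3f}")
```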
absolute risk
basic risk
in many studies it will just be the incidence rate
in experiments, will be the number of events in that group divided by the number of people in the group
risk difference / absolute risk reduction
the difference between the absolute risk of an event in the intervention group and the absolute risk in the control group
relative risk
the ratio of risk in the intervention group to the risk in the control group
1 = estimated effects are the same for both interventions
used in cohort, cross-sectional and randomised control trials
Positive predictive value (PPV)
the probability that subjects with a positive screening test truly have the disease
PPV = true positives / (true positives + false positives)
sensitivity
how well a test can identify true positives from all actual positives
sensitivity = number of true positives / (true positives + false negatives)
specificity
how accurately a test identifies those without a condition/disease
specificity = number of true negatives / (true negatives + false positives)
accuracy
how close measurements are to ‘true values’
negative predictive value (NPV)
likelihood that subjects with a negative screening test truly do not have the disease
NPV = number of true negatives / (true negatives + false negatives)
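All four test metrics (plus both likelihood ratios from earlier cards) fall out of one 2x2 table; the counts below are hypothetical:

```python
# Hypothetical screening-test counts
tp, fp, fn, tn = 90, 30, 10, 870

sensitivity = tp / (tp + fn)               # 0.90
specificity = tn / (tn + fp)               # ~0.97
ppv = tp / (tp + fp)                       # 0.75
npv = tn / (tn + fn)                       # ~0.99
lr_pos = sensitivity / (1 - specificity)   # LR of a positive result
lr_neg = (1 - sensitivity) / specificity   # LR of a negative result
print(sensitivity, specificity, ppv, npv, lr_pos, lr_neg)
```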
how to calculate NNT
NNT = 1 / (CER-EER)
or
NNT = 1 / absolute risk reduction
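A worked example with hypothetical rates:

```python
cer, eer = 0.20, 0.15   # hypothetical control and treatment event rates
arr = cer - eer         # absolute risk reduction
nnt = 1 / arr           # = 20: treat 20 patients to prevent one extra bad outcome
print(f"NNT = {nnt:.0f}")
```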
arithmetic mean
adding up all the values and dividing by the number of values
harmonic mean
calculated by dividing the number of observations by the sum of the reciprocals of the values
used when there is a time factor involved eg speed
generalised mean / power mean
involves raising each value to a specific power p, averaging the results, and then taking the p-th root of that average
range
difference between largest and smallest values
interquartile range
aka the mid spread
difference between the 3rd and 1st quartiles
ratio / continuous data
like interval but have true zero points
eg kelvin scale temp
interval data
measurement where the difference between 2 values is meaningful
eg temperature, pH
ordinal data
observed values can be put into set categories which themselves can be ordered
eg social class
nominal data
observed values can be put into set categories which have no particular order or hierarchy.
you can count but not order or measure nominal data
eg birthplace, eye colour
quantitative data
numeric values
can be further classified into discrete and continuous types
qualitative data
not numerical, usually names
AKA categorical or nominal variables
endemic
consistent presence and/or usual prevalence of a disease in a population within a geographical area
epidemic
refers to an increase, often sudden, in the number of cases of a disease above what is normally expected in that population in that area
pandemic
an epidemic that has spread over several countries or continents, usually affecting a large number of people
standard error of the mean
standard deviation / square root (number of patients)
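A small sketch with made-up values (note the n-1 denominator for the sample SD):

```python
import numpy as np

values = np.array([4.2, 5.1, 4.8, 5.5, 4.9, 5.0, 5.3, 4.7])  # hypothetical data
sd = values.std(ddof=1)          # sample SD (n-1 denominator)
sem = sd / np.sqrt(len(values))
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```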
GRADE system
Grading of Recommendations Assessment, Development and Evaluation
rates the quality of evidence in systematic reviews and guidelines
classified as high, moderate, low or very low
internal validity
the confidence that we can place in the cause and effect relationship in a study.
the confidence that we have that the change in the independent variable caused the observed change in the dependent variable (rather than due to poor control of extraneous variables)
external validity
the degree to which the conclusions in the study would hold for other persons in other places and at other times
ie. its ability to generalise
face validity
the general impression of a test
if it appears to test what it is meant to
content validity
the extent to which a test or measure assesses the full content of a subject or area.
criterion validity
concerns the comparison of tests
you may wish to compare a new test to see if it works as well as an old, accepted method
the correlation coefficient is used to test such comparisons
criterion validity (concurrent)
the predictor and criterion data are collected at or about the same time
criterion validity (predictive)
the predictor scores are collected first, and criterion data are collected at later point
want to know if the test predicts future outcomes
construct validity
the extent to which a test measures the construct it aims to
construct validity (convergent)
has convergent validity if it has a high correlation with another test that measures the same construct
construct validity (divergent)
demonstrated through a low correlation with a test that measures a different construct
cost effectiveness analysis (CEA)
compares a number of interventions by relating costs to a single clinical measure of effectiveness
cost effectiveness ratio = total cost / units of effectiveness
combines costs and effects - usually reported as an incremental cost-effectiveness ratio (ICER)
cost benefit analysis (CBA)
technique in which all the costs and benefits of an intervention are measured in terms of money
used to establish which of the alternatives has the greatest net benefit
requires that all the consequences of an intervention, such as life years saved, symptom relief etc, are allocated a monetary value
cost-utility analysis (CUA)
special form of CEA in which health benefits / outcomes are measured in broader, more generic ways enabling comparisons between treatments for different diseases and conditions
cost minimisation analysis (CMA)
economic evaluation in which the consequences of competing interventions are the same and in which only inputs (costs) are taken into consideration
the aim is to decide the least costly way of achieving the same outcome
test-retest reliability
assesses the stability of a measure over time by administering the same test to the same individual on two different occasions
split-half reliability
assesses the internal consistency of a test by dividing it into 2 halves and comparing the results of each half
ensures consistency within the test items, but does not address stability of the tool over time
parallel-forms reliability
involves administering two equivalent forms of a test to the same group and comparing results.
valuable for avoiding practice effects
internal consistency reliability
measures how consistently items within a test measure the same construct, often using statistical methods like Cronbach’s alpha.
does not assess stability over time
inter-rater reliability
assesses the consistency of scores when different raters or observers administer the test
critical in situations where multiple clinicians assess the same patient, but not relevant to determining whether tool yields stable results for the same individual across repeated administrations
forest plot weighting
indicates the influence an individual study has on the pooled result
generally, the bigger the sample size and the narrower the confidence interval, the higher the weight
shown by a larger box
heterogeneity in forest plots
refers to variability between studies and can affect the ability to combine the data of the individual studies
clinical heterogeneity
variability caused by differences in clinical variables, eg patient population, interventions etc
clinicians determine clinical heterogeneity - subjective
statistical heterogeneity
the variability in effect estimates between the studies and can be quantified by various statistics
forest plots only present the statistical heterogeneity
denominator for sample variance
n-1
Berkson’s bias
occurs when the selection of participants for a study is influenced by their likelihood to seek healthcare
may not be representative of the general population, and can lead to an overestimation of the association between diseases
observer bias
aka information or measurement bias
systematic differences in the way data is collected for different groups
could be due to the observer’s knowledge about participant’s exposure status influencing how they measure outcome variables
hawthorne effect
changes in behaviour that occur when individuals know they are being observed
verification bias
aka referral or test review bias
happens when subjects with positive results are more likely to have their test results confirmed than those with negative results
may influence the accuracy of diagnostic tests and overall study results; it isn't an example of selection bias since it doesn't affect who gets selected into a study
detection bias
arises from differential methods of detection amongst groups leading to an apparent difference in outcome rates between these groups.
often seen in studies where one group receives more frequent screening or follow-up than another group, increasing the chances of detecting the disease earlier or more frequently; it doesn't pertain to selection into a study, which is what defines selection bias
ethnography
qualitative research that seeks to understand and describe the culture or social phenomena from the perspective of the subject group
researchers immerse themselves in the setting, observing and participating in daily activities - deep understanding of behaviours/beliefs/experiences in that particular cultural context
bracketing
method used in qualitative research to mitigate the potential deleterious effects of preconceptions that may taint the research process
involves identifying and holding in abeyance preconceived beliefs and opinions about the phenomenon under study
grounded theory
research methodology that involves the collection and analysis of data with the aim of developing theories grounded in real-world observations
seeks to explain phenomena by generating new theories
phenomenology
aims to explore how individuals perceive their experiences
about understanding human behaviour from the individuals own subjective viewpoint
ROC curve
Receiver Operating Characteristic
illustrates diagnostic ability of a binary classifier system as its discrimination threshold is varied
plotted as sensitivity vs 1-specificity
helps in evaluating the performance of diagnostic tests and making informed decisions about cut-off points to maximise sensitivity and specificity
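A minimal sketch, assuming scikit-learn is available; labels and scores are invented for illustration:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])                  # hypothetical disease status
scores = np.array([0.1, 0.3, 0.4, 0.6, 0.5, 0.7, 0.8, 0.9])  # hypothetical test scores

fpr, tpr, thresholds = roc_curve(y_true, scores)  # tpr = sensitivity, fpr = 1 - specificity
print(f"AUC = {roc_auc_score(y_true, scores):.3f}")
```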
how to calculate standard error of the mean
SEM = standard deviation / square root (number of patients)
alpha level
the probability of rejecting a null hypothesis when it is true
it represents the threshold at which we decide to reject the null hypothesis
commonly set at 0.05
Type I errors
the null hypothesis is rejected when it is true
aka false positive
Type II error
the null hypothesis is accepted when it is false
aka false negative
P-values
the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true
a high p-value indicates a high chance that an observed difference is due to chance, and vice versa
if p-value is less than the pre-decided cut off, then you reject the null hypothesis
Randomisation
method used in the design phase of a study to reduce confounding factors
Cumulative incidence
The average risk of getting a disease over a certain period of time.
CI = the number of newly detected cases that develop during follow up / the number of disease free subjects available at the start of follow up
incidence rate
IR = I / PT
I: number of new cases in the cohort
PT: person-time - total time disease free individuals in the cohort are observed over the study period
prevalence
prevalence = incidence x duration of condition
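A one-line worked example (hypothetical steady-state figures):

```python
incidence = 0.002                         # 2 new cases per 1000 person-years
duration_years = 5                        # average duration of the condition
prevalence = incidence * duration_years   # = 0.01, i.e. 10 per 1000
print(f"prevalence = {prevalence:.3f}")
```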
point prevalence
number of cases in a defined population / number of people in a defined population at the same time
period prevalence
= number of identified cases during a specified period of time / total number of people in that population
area under the curve
the higher the AUC, the better the overall performance of the test (the higher the accuracy)
SQUIRE
Standards for Quality Improvement Reporting Excellence
19 item checklist
ensure all aspects of QI are thoroughly and transparently conveyed
MOOSE
Meta-analysis of Observational Studies in Epidemiology
for reporting meta-analyses of observational studies
STARD
Standards for reporting of diagnostic accuracy studies
reporting studies about diagnostic accuracy
PRISMA
Preferred reporting items for systematic reviews and meta-analyses
CONSORT
Consolidated standards of reporting trials
guidelines for reporting RCTs
PICO system
P - patient
I - intervention
C - comparison
O - outcome
Cochrane Library
collection of 6 databases:
CDSR
DARE
CENTRAL
CMR
HTA
NHS EED
Embase
European database
broader range than Medline
PsycINFO
database of abstracts of literature in the field of psychology
produced by American Psychological Association
CINAHL
Cumulative Index to Nursing and Allied Health Literature
references to journal articles from hundreds of nursing journals from UK, USA and other countries
OpenGrey
dedicated to grey literature
outside of traditional channels
Boolean Logic
AND, OR, NOT can be used to combine search terms
must be entered in uppercase letters
drug trial phases
1 - small number healthy people. safety, side effects and dose range
2 - larger group (100-300), effectiveness and further safety
3 - large groups (1000-3000), effectiveness, SE, compare to commonly used treatments or placebos
4 - after granted a license, eg safety in pregnancy, finding other potential uses for the drug
How many lie within +/-1SD
68.2%
How many lie within +/-2SD
95.4%
How many lie within +/-3SD
99.7%
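These coverage figures can be checked directly from the normal CDF (assuming SciPy):

```python
from scipy.stats import norm

# Share of a normal distribution within +/- k standard deviations
for k in (1, 2, 3):
    coverage = norm.cdf(k) - norm.cdf(-k)
    print(f"+/-{k} SD: {coverage:.2%}")   # 68.27%, 95.45%, 99.73%
```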
What is the Kappa statistic
aka Cohen’s kappa coefficient
gives quantitative measure of the magnitude of agreement between observers
can be any value between -1 and 1
0: agreement observed no better than chance
1: complete agreement
-1: complete disagreement
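A small sketch, assuming scikit-learn; the two raters' scores are hypothetical:

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings from two observers on the same 10 subjects
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]

kappa = cohen_kappa_score(rater_1, rater_2)  # 1 = complete agreement, 0 = chance-level
print(f"kappa = {kappa:.2f}")
```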
primary evidence
aka empirical research
sources that contain original data and analysis from research studies
secondary evidence
sources that interpret and analyse primary sources.
these sources are one or more steps removed from the event
how to calculate odds ratio
OR = (a/b)/(c/d)
a: exposure yes, outcome yes
b: exposure yes, outcome no
c: exposure no, outcome yes
d: exposure no, outcome no
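A worked example with a hypothetical 2x2 table:

```python
# Hypothetical case-control counts
a, b = 40, 60   # exposed: outcome yes / outcome no
c, d = 20, 80   # unexposed: outcome yes / outcome no

odds_ratio = (a / b) / (c / d)   # = (40/60) / (20/80), roughly 2.67
print(f"OR = {odds_ratio:.2f}")
```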
Fixed effect model
in meta-analysis, assumes all the included studies are estimating one true effect, with observed differences between studies due to chance alone
voluntary sampling
made of people who self-select
eg invited to participate in a poll
sample chosen by the participants and not by the survey administrator
Convenience sampling
made up of people who are easy to reach
eg approach at hospital cafe
Snowball sampling
one case identifies another of its kind
often done in marginalised groups eg IVDU or sex workers
Quota sampling
population divided into groups and then elements are selected
done to ensure that the sample reflects the characteristics of the population
eg proportionate representation of males and females
Types of random / probability sampling
Simple random sampling
Systematic sampling
Cluster sampling
Stratified sampling
Multistage sampling
Types of non-random / non-probability sampling
Voluntary sampling
Convenience sampling
Snowball sampling
Quota sampling
Simple random sampling
a sample in which every member of the population has an equal chance of being chosen
eg each member of population given unique ID number then randomly selected - often via number generator
Systematic sampling
every nth member of population gets selected for the sample
easier than simple random sampling, but more prone to bias if there is a pattern in the population that is consistent with the sampling frequency
Cluster sampling
Involves dividing a population into separate groups (clusters); a random sample of clusters is then selected and every element in the chosen clusters is included in the final sample
Stratified sampling
An entire population is first divided into groups (strata) and then a random sample taken from each
this ensures you can obtain equal numbers of individuals, eg male and female
Multi-stage sampling
more complex method of sampling that involves several steps
two or more sampling methods are combined
allows you to narrow down a large population
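A sketch contrasting two of the random methods above on a made-up population of 1000 IDs (assuming NumPy):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
population = np.arange(1, 1001)   # 1000 hypothetical ID numbers

# Simple random sampling: every member has an equal chance of selection
simple = rng.choice(population, size=50, replace=False)

# Systematic sampling: every nth member after a random start
n = len(population) // 50         # sampling interval (here 20)
start = rng.integers(0, n)
systematic = population[start::n]
print(simple[:5], systematic[:5])
```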
likelihood ratio for negative test result
(1-sensitivity)/specificity
Delphi method
method for achieving convergence of opinion concerning real-world knowledge solicited from experts within certain topic area
Background questions
general questions about conditions/illnesses/pathophysiology etc
Foreground questions
About issues of care - query specialised and distinct knowledge needed for specific and relevant clinical decision-making
Box and whisker plot - interquartile range
‘mid spread’
the difference between the 3rd and 1st quartiles
line in the box on box and whisker plots
median - Q2
left skewed
longer tail on the left of the box and whisker plot
negative skewness
right skewed
longer tail on the right of the box and whisker plot
positive skewness
how to calculate post-test odds
pre-test odds x likelihood ratio
how to calculate post test probability
post-test odds / (1 + post-test odds)
loss to follow up bias
when follow up cases are lost continuously - lost cases may have something in common resulting in an unrepresentative sample
disease spectrum bias / case-mix bias
when a treatment is studied in more severe forms of a disease
such results may then not apply to mild forms of the disease
sampling bias
the subjects are not representative of the population - may be due to volunteer bias
participation bias / non-response bias
those who participate may have shared characteristics resulting in an unrepresentative sample
incidence-prevalence bias (survival bias, Neyman bias)
occurs in case-control studies and is attributed to selective survival among the prevalent cases (ie. mild, clinically resolved or fatal cases excluded from the case group)
exclusion bias
occurs when certain patients are excluded for example if they are considered ineligible
publication or dissemination bias
many studies may not be published
may be due to the fact that papers with positive results and large sample sizes are more likely to get published
citation bias
highly cited articles are easier to find and have a higher chance of being included in a given study
berkson’s bias aka admission rate bias
a type of selection bias
can arise when the sample is taken not from the general population but from a subpopulation
eg when cases and controls both sampled from a hospital rather than from the community
detection bias
when exposure can influence diagnosis
eg women on the OCP have more frequent smears, so are more likely to have cervical cancer diagnosed
recall bias
in retrospective studies where participants are asked to remember past exposure to risk factors, cases are likely to have thought more about what in their past may have caused the disease, so controls are less likely to remember an exposure
lead time bias
lead time is the period between early detection of disease and the time of its usual clinical presentation.
the lead time must be subtracted from the overall survival time of screened patients to avoid lead time bias.
otherwise early detection merely increases the duration of the patients’ awareness of their disease without reducing their morbidity or mortality
interviewer/observer bias
interviewer or observer knowledge about the hypothesis in question and the disease and/or exposure can affect the collection and recording of data
verification and work-up bias
the results of a diagnostic test affect whether the gold standard procedure is used to verify the test result
more likely to occur when a preliminary diagnostic test is negative because many gold standard tests can be invasive, expensive and carry a higher risk
hawthorne effect
when participants alter their usual behaviour due to their awareness that they are being studied
ecological fallacy
when conclusions about individuals are based only on analyses of group data
expectation bias (pygmalion effect)
only a problem in non-blinded trials
observers may subconsciously measure or report data in a way that favours the expected study outcome
late-look bias
gathering information at an inappropriate time eg studying a fatal disease many years later when some of the patients may have died already
tests to check that distribution is normally distributed
the Kolmogorov-Smirnov test
Jarque-Bera test
Shapiro-Wilk test
P-P plot
Q-Q plot
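A sketch running the first three tests on simulated data (assuming SciPy); each test's null hypothesis is that the data are normal:

```python
import numpy as np
from scipy.stats import shapiro, jarque_bera, kstest

rng = np.random.default_rng(0)
sample = rng.normal(loc=0, scale=1, size=200)   # simulated normal data

for name, result in [("Shapiro-Wilk", shapiro(sample)),
                     ("Jarque-Bera", jarque_bera(sample)),
                     ("Kolmogorov-Smirnov", kstest(sample, "norm"))]:
    print(f"{name}: p = {result.pvalue:.3f}")   # p > 0.05: no evidence of non-normality
```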
purposive sampling
participants selected on purpose because the researcher already knows that they have characteristics of interest
triangulation
compares the results from either 2 or more different methods of data collection, or 2 or more data sources
respondent validation / aka member checking
includes techniques in which the investigator's account is compared with those of the research subjects to establish the level of correspondence between the two sets
bracketing
methodological device of phenomenological inquiry that requires deliberately putting aside one's own beliefs about the phenomenon under investigation, or what one already knows about the subject, prior to and throughout the phenomenological investigation
reflexivity
sensitivity to the ways in which the researcher and the research process have shaped the collected data, including the role of prior assumptions and experience, which can influence even the most avowedly inductive inquiries
content analysis
interviews (individual and group) are transcribed to produce texts that can be used to generate coding categories and test theories
can involve enumerating procedures such as counting word frequencies, sometimes aided by computer software
constant comparison
based on grounded theory
allows researchers to identify the themes that are important in a systematic way, providing an audit trail as they proceed
used by the researcher to develop concepts from the data by coding and analysing at the same time
calculate pre-test odds
pre test probability / (1 - pre test probability)
calculate post test odds
pre test odds x (likelihood ratio positive result)
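Chaining the last few cards into one worked example (numbers hypothetical):

```python
pre_test_prob = 0.10    # e.g. prevalence used as the pre-test probability
lr_pos = 9.0            # hypothetical LR+ of the test

pre_test_odds = pre_test_prob / (1 - pre_test_prob)     # 0.111
post_test_odds = pre_test_odds * lr_pos                 # 1.0
post_test_prob = post_test_odds / (1 + post_test_odds)  # 0.50
print(f"post-test probability = {post_test_prob:.2f}")
```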