Stats Flashcards
Likelihood ratio of positive test result
sensitivity / (1-specificity)
Median
middle item in a data set which has been arranged in numerical order (the mean of the two middle items if the number of items is even)
mode
most frequent item in a data set
mean
add all items in data set together and divide by the number of items
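A quick way to check the three averages above is Python's standard `statistics` module. This is only a sketch, and the data values are made up for illustration.

```python
import statistics

# Made-up example values, used only for illustration.
data = [2, 3, 3, 5, 7, 8, 14]

median = statistics.median(data)  # middle value after sorting
mode = statistics.mode(data)      # most frequent value
mean = statistics.mean(data)      # sum of the values divided by their count

print(median, mode, mean)
```

For this data set the median is 5, the mode is 3 and the mean is 42 / 7 = 6.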
Relative risk reduction
ARR / CER
ARR: absolute risk reduction (the difference between the two rates in control and treatment group)
CER: event rate in the control group
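Worked sketch of the formula above. The trial numbers are hypothetical, and `eer` (event rate in the treatment group) is a helper name used here to compute the ARR.

```python
def absolute_risk_reduction(cer, eer):
    """ARR: control event rate minus treatment (experimental) event rate."""
    return cer - eer


def relative_risk_reduction(cer, eer):
    """RRR = ARR / CER."""
    return absolute_risk_reduction(cer, eer) / cer


# Hypothetical trial: 20 of 100 control patients and 15 of 100 treated
# patients experience the event.
cer = 20 / 100
eer = 15 / 100

arr = absolute_risk_reduction(cer, eer)
rrr = relative_risk_reduction(cer, eer)
print(f"ARR = {arr:.2f}, RRR = {rrr:.2f}")  # ARR = 0.05, RRR = 0.25
```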
What are funnel plots primarily used for?
Assess for potential publication bias in meta-analyses
Graph the size of the effects found in individual studies against a measure of the study’s precision or size
Chi-squared test (4)
Used to assess differences in categorical variables
Non-parametric test
Applies assumption that the sample is large
Compares the observed frequencies against those that would have been expected if there was no difference and then produces a value which can be used to assess if the difference is significant (p<0.05)
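The observed-versus-expected comparison can be sketched for a 2x2 table as below. This is a minimal illustration, assuming no continuity correction and one degree of freedom; the counts are invented, and the p-value shortcut via `math.erfc` only holds for one degree of freedom.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (1 degree of freedom)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts: rows are two groups, columns are outcome yes / no.
chi2, p_value = chi_squared_2x2([[30, 20], [15, 35]])
print(f"chi-squared = {chi2:.2f}, significant at 0.05: {p_value < 0.05}")
```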
Pearson’s correlation coefficient
- Measures linear correlation between 2 variables
- sign of the correlation coefficient tells us the direction of the linear relationship: negative then trend line slopes down, positive then trend line slopes up
- the size/magnitude of the correlation coefficient tells us the strength of a linear relationship: >0.90 = strong, 0.65-0.9 = moderate, <0.65 = weak
- parametric test
- if the data is non-parametric or if both variables are not ratio variables then Spearman’s should be used
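A minimal sketch of how the coefficient is computed, written out by hand so the sign and magnitude are easy to trace. The data points are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # perfectly linear, slopes up
falling = [10, 8, 6, 4, 2]  # perfectly linear, slopes down

print(pearson_r(xs, rising), pearson_r(xs, falling))
```

The first value is close to +1 and the second close to -1, matching the direction of each trend line.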
The 3 types of t-test
- one sample t-test
- independent t-test
- paired t-test
one sample t-test
- used to see if there is a difference between a sample mean and the hypothesised population mean
independent t-test
- used when you want to compare means from independent groups
paired t-test
- used when comparing the means of two groups that are considered to be paired (matched, or dependent)
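The three t statistics can be sketched as follows. This is an illustration only: the independent version assumes equal variances (pooled variance), which is one common choice rather than the only one, and all sample values are invented. The resulting statistic would then be compared against a t distribution to get a p-value.

```python
import math
import statistics


def one_sample_t(sample, population_mean):
    """t = (sample mean - hypothesised mean) / standard error of the mean."""
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.mean(sample) - population_mean) / sem


def independent_t(group_a, group_b):
    """Two independent groups, assuming equal variances (pooled variance)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


def paired_t(before, after):
    """Paired data: a one sample t-test on the within-pair differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0)


print(one_sample_t([5, 6, 7, 8, 9], 6))
print(independent_t([1, 2, 3], [4, 5, 6]))
print(paired_t([10, 12, 14, 16], [11, 14, 15, 18]))
```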
ANOVA
- statistical test to demonstrate statistically significant differences between the means of several groups
- similar to a Student’s t-test, except that ANOVA allows the comparison of more than 2 means
- assumes that the variable is normally distributed
- works by comparing the variance of the means
- distinguishes between within group variance and between group variance
- the null hypothesis is that all the group means are equal, in which case the between group variance is expected to be about the same as the within group variance
- the test is based on the ratio of these two variances, known as the F statistic
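A hand-rolled sketch of the one-way F statistic, showing the between group and within group variances explicitly. The three groups are invented values chosen so the arithmetic is easy to follow.

```python
import statistics


def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat)
```

Here the between group sum of squares is 26 (2 degrees of freedom) and the within group sum of squares is 6 (6 degrees of freedom), so F is 13 / 1 = 13.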
Relative risk
RR = EER / CER
EER: treatment group risk
CER: control group risk
NNT - number needed to treat
- used in assessing the effectiveness of a healthcare intervention
- represents the average number of patients who need to be treated to prevent one additional bad outcome or produce one additional good outcome
RISK
- a proportion
- probability with which an outcome will occur
- usually expressed as a decimal between 0-1
- often expressed as a number of individuals per 1000
- if risk is 0.1, in a sample of 100 people, the number of events observed will on average be 10
ODDS
- odds is a ratio
- the ratio of the probability that a particular event will occur to the probability that it will not occur
- can be any number 0-infinity
- commonly expressed as a ratio of 2 integers, eg odds of 0.01 would be 1:100
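Risk and odds describe the same event on different scales, so each can be converted to the other. A small sketch, with function names chosen here for illustration:

```python
def risk_to_odds(risk):
    """Odds = probability of the event / probability of no event."""
    return risk / (1 - risk)


def odds_to_risk(odds):
    """Inverse conversion: risk = odds / (1 + odds)."""
    return odds / (1 + odds)


# A risk of 0.2 (20 events per 100 people) corresponds to odds of 0.25, i.e. 1:4.
print(risk_to_odds(0.2))
# Odds of 1:1 correspond to a risk of 0.5.
print(odds_to_risk(1.0))
```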
absolute risk
basic risk
in many studies it will just be the incidence rate
in experiments, will be the number of events in that group divided by the number of people in the group
risk difference / absolute risk reduction
the difference between the absolute risk of an event in the intervention group and the absolute risk in the control group
relative risk
the ratio of risk in the intervention group to the risk in the control group
1 = estimated effects are the same for both interventions
used in cohort studies, cross-sectional studies and randomised controlled trials
Positive predictive value (PPV)
the probability that subjects with a positive screening test truly have the disease
PPV = true positives / (true positives + false positives)
sensitivity
how well a test can identify true positives from all actual positives
sensitivity = number of true positives / (true positives + false negatives)
specificity
how accurately a test identifies those without a condition/disease
specificity = number of true negatives / (true negatives + false positives)
accuracy
how close measurements are to ‘true values’
negative predictive value (NPV)
likelihood that subjects with a negative screening test truly do not have the disease
NPV = number of true negatives / (true negatives + false negatives)
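The sensitivity, specificity, PPV and NPV formulas, plus the positive likelihood ratio, can all be read off one 2x2 table. A sketch with hypothetical screening counts:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Test performance measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Likelihood ratio of a positive result: sensitivity / (1 - specificity).
        "lr_positive": sensitivity / (1 - specificity),
    }


# Hypothetical screening study: 100 people with the disease, 900 without.
metrics = diagnostic_metrics(tp=90, fp=45, fn=10, tn=855)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts sensitivity is 0.90 and specificity is 0.95, yet the PPV is only about 0.67 because the disease is uncommon in the sample.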
how to calculate NNT
NNT = 1 / (CER-EER)
or
NNT = 1 / absolute risk reduction
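Worked sketch of the NNT formula with hypothetical event rates. In practice the result is usually reported rounded up to a whole number of patients.

```python
def number_needed_to_treat(cer, eer):
    """NNT = 1 / (CER - EER), i.e. 1 / absolute risk reduction."""
    arr = cer - eer
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is not defined")
    return 1 / arr


# Hypothetical rates: 50% of controls and 25% of treated patients have the event.
nnt = number_needed_to_treat(cer=0.5, eer=0.25)
print(nnt)  # 4.0
```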
arithmetic mean
adding up all the values and dividing by the number of values
harmonic mean
calculated by dividing the number of observations by the sum of the reciprocals of the values
used when there is a time factor involved eg speed
generalised mean / power mean
involves raising each value to a specific power, adding together, taking average and then taking the root of that average
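A sketch relating the three means above: the power mean with exponent 1 gives the arithmetic mean and exponent -1 gives the harmonic mean. `power_mean` is a helper written for this illustration; `statistics.harmonic_mean` is from the standard library.

```python
import statistics


def power_mean(values, p):
    """Generalised mean: raise to power p, average, then take the p-th root (p != 0)."""
    return (sum(v ** p for v in values) / len(values)) ** (1 / p)


# Two legs of equal distance driven at 40 and 60 (speed units are arbitrary).
speeds = [40, 60]

print(statistics.mean(speeds))           # arithmetic mean
print(statistics.harmonic_mean(speeds))  # average speed over the whole trip
print(power_mean(speeds, -1))            # same as the harmonic mean
print(power_mean([1, 7], 2))             # quadratic mean: sqrt((1 + 49) / 2)
```

The harmonic mean of 40 and 60 is 48, lower than the arithmetic mean of 50, because more time is spent on the slower leg.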
range
difference between largest and smallest values
interquartile range
aka the mid spread
difference between the 3rd and 1st quartiles
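A short sketch of range and interquartile range. Quartile conventions differ between textbooks and software, so the IQR for small data sets can vary slightly; this uses the default method of `statistics.quantiles`, and the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7]  # made-up values

value_range = max(data) - min(data)

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(value_range, q1, q3, iqr)
```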
ratio / continuous data
like interval but have true zero points
eg kelvin scale temp
interval data
measurement where the difference between 2 values is meaningful
eg temperature in degrees Celsius, pH
ordinal data
observed values can be put into set categories which themselves can be ordered
eg social class
nominal data
observed values can be put into set categories which have no particular order or hierarchy.
you can count but not order or measure nominal data
eg birthplace, eye colour
quantitative data
numeric values
can be further classified into discrete and continuous types
qualitative data
not numerical, usually names
AKA categorical or nominal variables
endemic
consistent presence and/or usual prevalence of a disease in a population within a geographical area
epidemic
refers to an increase, often sudden, in the number of cases of a disease above what is normally expected in that population in that area
pandemic
an epidemic that has spread over several countries or continents, usually affecting a large number of people
standard error of the mean
standard deviation / square root (number of patients)
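Worked sketch of the formula, with hypothetical numbers:

```python
import math


def standard_error_of_mean(standard_deviation, n):
    """SEM = standard deviation / square root of the sample size."""
    return standard_deviation / math.sqrt(n)


# Hypothetical sample: standard deviation of 10 measured in 25 patients.
sem = standard_error_of_mean(10, 25)
print(sem)  # 2.0
```

Quadrupling the sample size to 100 halves the SEM to 1.0.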
GRADE system
Grading of Recommendations Assessment, Development and Evaluation
rates the quality of evidence in systematic reviews and guidelines
classified as high, moderate, low or very low
internal validity
the confidence that we can place in the cause and effect relationship in a study.
the confidence that we have that the change in the independent variable caused the observed change in the dependent variable (rather than due to poor control of extraneous variables)
external validity
the degree to which the conclusions in the study would hold for other persons in other places and at other times
ie. its ability to generalise
face validity
the general impression of a test
if it appears to test what it is meant to
content validity
the extent to which a test or measure assesses the full content of a subject or area.
criterion validity
concerns the comparison of tests
you may wish to compare a new test to see if it works as well as an old, accepted method
the correlation coefficient is used to test such comparisons
criterion validity (concurrent)
the predictor and criterion data are collected at or about the same time
criterion validity (predictive)
the predictor scores are collected first, and criterion data are collected at a later point
want to know if the test predicts future outcomes
construct validity
the extent to which a test measures the construct it aims to
construct validity (convergent)
has convergent validity if it has a high correlation with another test that measures the same construct
construct validity (divergent)
demonstrated through a low correlation with a test that measures a different construct
cost effectiveness analysis (CEA)
compares a number of interventions by relating costs to a single clinical measure of effectiveness
cost effectiveness ratio = total cost / units of effectiveness
combines costs and effects - usually reported as an incremental cost-effectiveness ratio (ICER)
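A sketch of the incremental cost-effectiveness ratio: the extra cost of the new intervention divided by the extra effect it produces. The costs and effect units below are hypothetical.

```python
def icer(cost_new, effect_new, cost_old, effect_old):
    """ICER = (difference in cost) / (difference in effectiveness)."""
    delta_effect = effect_new - effect_old
    if delta_effect == 0:
        raise ValueError("interventions are equally effective, so ICER is undefined")
    return (cost_new - cost_old) / delta_effect


# Hypothetical: new treatment costs 12000 for 6 units of effect,
# the existing treatment costs 8000 for 5 units of effect.
print(icer(12000, 6, 8000, 5))  # 4000.0 per extra unit of effect
```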
cost benefit analysis (CBA)
technique in which all the costs and benefits of an intervention are measured in terms of money
used to establish which of the alternatives has the greatest net benefit
requires that all the consequences of an intervention, such as life years saved, symptom relief etc., are allocated a monetary value
cost-utility analysis (CUA)
special form of CEA in which health benefits / outcomes are measured in broader, more generic ways enabling comparisons between treatments for different diseases and conditions
cost minimisation analysis (CMA)
economic evaluation in which consequences of competing interventions are the same and in which only inputs (costs) are taken into consideration
the aim is to decide the least costly way of achieving the same outcome
test-retest reliability
assesses the stability of a measure over time by administering the same test to the same individual on two different occasions
split-half reliability
assesses the internal consistency of a test by dividing it into 2 halves and comparing the results of each half
ensures consistency within the test items, but does not address stability of the tool over time
parallel-forms reliability
involves administering two equivalent forms of a test to the same group and comparing results.
valuable for avoiding practice effects
internal consistency reliability
measures how consistently items within a test measure the same construct, often using statistical methods like Cronbach’s alpha.
does not assess stability over time
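A sketch of Cronbach's alpha using the standard formula: k / (k - 1) multiplied by (1 - sum of item variances / variance of total scores), where k is the number of items. The score table is invented for illustration.

```python
import statistics


def cronbach_alpha(scores):
    """scores: one row per respondent, one column per test item."""
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [statistics.variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Invented questionnaire: 4 respondents answering 3 items.
scores = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 5, 4],
]
print(round(cronbach_alpha(scores), 2))
```

When every item ranks respondents identically, the items move together perfectly and alpha reaches 1.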
inter-rater reliability
assesses the consistency of scores when different raters or observers administer the test
critical in situations where multiple clinicians assess the same patient, but not relevant to determining whether a tool yields stable results for the same individual across repeated administrations
forest plot weighting
indicates the influence an individual study has on the pooled result
generally, the bigger the sample size and the narrower the confidence interval, the higher the weight
shown by a larger box
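One common weighting scheme is inverse-variance weighting, where each study's weight is 1 / (standard error squared). The sketch below assumes that scheme (the card does not say which method is used) and the standard errors are hypothetical.

```python
def inverse_variance_weights(standard_errors):
    """Percentage weight of each study, assuming inverse-variance weighting."""
    raw = [1 / se ** 2 for se in standard_errors]
    total = sum(raw)
    return [100 * w / total for w in raw]


# Hypothetical standard errors: smaller SE means a narrower confidence interval.
weights = inverse_variance_weights([0.1, 0.2, 0.4])
print([round(w, 1) for w in weights])
```

The most precise study (smallest standard error) takes most of the weight, which is why it would be drawn with the largest box.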
heterogeneity in forest plots
refers to variability between studies and can affect the ability to combine the data of the individual studies