Medical Statistics Flashcards
P-value
probability, given H0 is true, of obtaining a test statistic at least as extreme as the value obtained from the data (a small p-value is evidence against H0)
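A minimal sketch of the calculation for a two-sided z-test, assuming the test statistic is standard normal under H0. The function name and the example value 1.96 are illustrative and not taken from the cards.

```python
from statistics import NormalDist


def two_sided_p_value(z):
    """P(|Z| >= |z|) under H0, where Z is standard normal."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Illustrative observed test statistic.
p = two_sided_p_value(1.96)
print(round(p, 3))
```

A statistic further from zero gives a smaller p-value, and the sign of z does not matter for the two-sided version.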
Confidence interval
a range of plausible values for an unknown parameter, calculated from the sample; its width reflects the degree of uncertainty due to sampling (e.g. a 95% interval is built by a method that covers the true value in 95% of repeated samples)
Odds ratio
describes the strength of the association between two events; for a 2x2 table, OR = ad/bc
the ratio of the odds of A in the presence of B to the odds of A in the absence of B: (p1/(1 - p1)) / (p2/(1 - p2))
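The two forms of the odds ratio can be checked against each other with a short sketch. The table layout is an assumption inferred from the risk formulas used elsewhere in these cards (a = exposed with outcome, b = unexposed with outcome, c = exposed without outcome, d = unexposed without outcome), and the counts are made up.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio ad/bc for a 2x2 table (assumed layout, see above)."""
    return (a * d) / (b * c)


def odds_ratio_from_probs(p1, p2):
    """Same quantity from the outcome probabilities p1 and p2."""
    return (p1 / (1 - p1)) / (p2 / (1 - p2))


# Hypothetical counts, for illustration only.
a, b, c, d = 20, 10, 80, 90
or_counts = odds_ratio(a, b, c, d)
or_probs = odds_ratio_from_probs(a / (a + c), b / (b + d))
print(or_counts, or_probs)
```

Both calls give the same value because p1/(1 - p1) reduces to a/c and p2/(1 - p2) reduces to b/d.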
Confounding
a variable that influences both the independent variable and the dependent variable, so that the true effect of X on Y is hidden or distorted by the confounding variable C (we want to eliminate or adjust for this in a study)
Interaction
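when the effect of one variable on the outcome depends on the level of another variable (so it is a relationship between three or more variables)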
Z distribution and t distribution
z is used when the population variance is known (or the sample is large); t is used when the variance is unknown and estimated from the sample, which matters most when the sample is small
Tables, figures and listings
the standard outputs used to present and analyse clinical trial data, typically produced with SAS (statistical software widely used in pharmaceutical companies, especially for clinical trials)
Efficacy
the effect in ‘perfect conditions’
Effectiveness
the effect in real world conditions
Crossover trial
each subject receives a sequence of different treatments; the order in which they are received is usually randomised
ANCOVA
analysis of covariance: compares one variable across two or more populations while adjusting for other variables (covariates); ANOVA does not adjust for other variables
End points
outcome measures referring to the occurrence of a disease, symptom, sign, or laboratory abnormality that constitutes a target outcome
Equivalence trial
trial which aims to show that two treatments do not differ by more than a pre-specified amount (so are not too different), rather than that they are identical
Non-inferiority trial
demonstrates that the test product is not worse than the competitor by more than a pre-specified amount (the non-inferiority margin); it does not need to be shown to be better
Block randomisation
randomising patients in blocks such that an equal number are assigned to each treatment
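A minimal sketch of how a block-randomised allocation list could be generated, assuming two arms labelled "A" and "B" and a block size of 4. The function name, labels and seed are hypothetical choices made for illustration.

```python
import random


def block_randomise(n_blocks, block_size=4, treatments=("A", "B"), seed=None):
    """Return an allocation list built from randomly permuted blocks."""
    if block_size % len(treatments) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)
    per_arm = block_size // len(treatments)
    allocation = []
    for _ in range(n_blocks):
        # Each block contains every treatment equally often, in random order.
        block = list(treatments) * per_arm
        rng.shuffle(block)
        allocation.extend(block)
    return allocation


schedule = block_randomise(n_blocks=3, block_size=4, seed=1)
print(schedule)
```

After every complete block the arms are exactly balanced, which is the point of the design.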
Stratification
partitioning of subjects and results by a factor other than the treatment given (e.g. age group or sex)
Drop-out
when patients leave the trial prematurely
Power
probability of avoiding a type 2 error (where a type 2 error is failing to reject a false null hypothesis): 1 - P(type 2 error)
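As a worked illustration, power can be computed directly for a two-sided one-sample z-test. This sketch assumes a known standard deviation sigma and a true mean shift delta; the function and parameter names are illustrative.

```python
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (sigma assumed known)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Mean of the test statistic when the true shift is delta.
    shift = delta * n ** 0.5 / sigma
    # Probability the statistic falls in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


power = z_test_power(delta=0.5, sigma=1.0, n=32)
print(round(power, 2))
```

With delta = 0 the function returns alpha, the type 1 error rate, and power rises towards 1 as n or delta increases.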
Classification
by purpose or by phase in drug development
Phase I
focus upon the pharmacokinetics (absorption, distribution, metabolism and excretion of a drug or vaccine), pharmacodynamics and toxicity (drug safety); Maximum Tolerated Dose (MTD)
Phase II
initial clinical investigation into doses and dose schedules (e.g. a dose-finding study), and early indications of efficacy
Phase III
aimed at full-scale evaluation of the efficacy of a new, experimental treatment compared with a standard therapy or placebo acting as a control
Phase IV
effectiveness; post-marketing surveillance (information on uncommon side effects and long-term effects)
Effect
difference between what happened to the patient as a result of treatment and what would have happened if treatment had been denied
Efficiency
the economics of treatment
International Guidelines
Guidelines for Good Clinical Practice, particularly for
assessing safety and efficacy of medicinal products
Trialists' objectives
minimise bias and maximise efficiency
Key design issues include:
replication, control, randomisation, blocking, treatment blinding/masking, ethics, choice of analysis set
aim
to yield treatment groups which are indeed
comparable in terms of extraneous factors
Block Randomisation
balance numbers of participants in each
group
Stratified Randomisation
use block randomisation in each stratum
Adaptive randomisation: Minimisation
use simple randomisation when
groups are balanced: when imbalanced allocate
next patient to treatment so that imbalance is
minimised (via minimisation score)
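A minimal sketch of a minimisation rule, under one common choice of score that the cards do not specify: for each arm, count the already-allocated patients in that arm who share each of the new patient's factor levels. The lowest score wins, and ties fall back to simple randomisation. All names and the example factors are hypothetical.

```python
import random


def minimisation_assign(patient, allocated, arms=("A", "B"), rng=random):
    """Pick an arm for `patient` (a dict mapping factor -> level).

    `allocated` is a list of (patient_dict, arm) pairs already in the trial.
    """
    scores = {}
    for arm in arms:
        # Assumed score: matches on each factor level within this arm.
        scores[arm] = sum(
            1
            for prev, prev_arm in allocated
            if prev_arm == arm
            for factor, level in patient.items()
            if prev.get(factor) == level
        )
    best = min(scores.values())
    candidates = [arm for arm in arms if scores[arm] == best]
    # Balanced scores: simple randomisation between the tied arms.
    return rng.choice(candidates)


allocated = [
    ({"sex": "F", "age": "<50"}, "A"),
    ({"sex": "F", "age": ">=50"}, "A"),
    ({"sex": "M", "age": "<50"}, "B"),
]
next_arm = minimisation_assign({"sex": "F", "age": "<50"}, allocated)
print(next_arm)
```

Here arm A scores 3 and arm B scores 1, so the new patient goes to B. Practical schemes often add a random element even when the scores differ; that is omitted here.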
Treatment Blinding: single blind
patient does not know which
treatment is being received
Treatment Blinding: double-blind
neither patient nor physician knows the treatment allocation (gold standard)
Treatment Blinding: triple-blind
as double-blind, and in addition the monitoring group and data analyst do not know which group receives the experimental and which the control treatment
Treatment Blinding: open
all are knowledgeable about the treatment allocation
placebo
dummy copy of treatment
double-dummy
method for comparison of 2 active treatments with different appearance (i.e. two treatments in two different forms: each patient receives either drug A with placebo B, or drug B with placebo A)
Central Limit Theorem
when independent random variables are added, their properly normalized sum tends toward a normal distribution even if the original variables themselves are not normally distributed
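A small simulation can illustrate the theorem. This sketch assumes exponential data with mean 1 (clearly non-normal) and looks at the distribution of sample means; the sample size, number of replicates and seed are arbitrary.

```python
import random
import statistics


def sample_means(n, reps, seed=0):
    """Means of `reps` samples, each of n exponential(1) draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


n = 50
means = sample_means(n=n, reps=2000)
centre = statistics.fmean(means)
spread = statistics.stdev(means)
# Share of sample means within 1.96 standard errors of the true mean 1.
within = sum(abs(m - 1.0) <= 1.96 / n ** 0.5 for m in means) / len(means)
print(centre, spread, within)
```

If the normal approximation holds, the spread should be close to 1/sqrt(n) and roughly 95% of the sample means should fall within 1.96 standard errors of 1.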
risk difference
RD = a/(a + c) - b/(b + d)
the absolute additional risk of the outcome in those taking the drug (exposed) over those not taking it (unexposed)
relative risk
RR = (a/(a + c))/(b/(b + d))
the ratio of the probability of an outcome in an exposed group to the probability of an outcome in an unexposed group
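The RD and RR formulas can be applied to a 2x2 table as follows. The layout is the one implied by the formulas (a and c are the exposed with and without the outcome, b and d the unexposed with and without the outcome); the counts are hypothetical.

```python
def risk_difference(a, b, c, d):
    """RD = a/(a + c) - b/(b + d)."""
    return a / (a + c) - b / (b + d)


def relative_risk(a, b, c, d):
    """RR = (a/(a + c)) / (b/(b + d))."""
    return (a / (a + c)) / (b / (b + d))


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
a, b, c, d = 20, 10, 80, 90
rd = risk_difference(a, b, c, d)
rr = relative_risk(a, b, c, d)
print(rd, rr)
```

With these counts the risk is 10 percentage points higher and twice as large in the exposed group.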
parallel group design
different groups of patients are studied concurrently (in parallel). Patients receive a single therapy (or combination of therapies), so the estimate of treatment effect is based upon so-called 'between-subject' comparisons. We used the two independent samples t-test for inference
paired design
each patient receives both treatments, for example on matching parts of anatomy (e.g. limbs, eyes, kin etc), so the estimate of treatment effect is based upon 'within-subject' comparison. We used a paired t-test (a one-sample t-test on the within-subject differences). We noted that asymmetry can be problematic
wash-out period
A wash-out period is a period in a trial during
which the effect of a treatment given previously is believed to disappear. If no treatment
is given during the wash-out period then the
wash-out is passive. If a treatment is given
during the wash-out period then the wash-out
is active
Epidemiology
study of the distribution
and determinants of disease in human populations.
target population
population about which we wish to draw inferences
study population
population from which data are collected
generalisability
can we use the study population results to draw accurate conclusions about the target?
incidence
number of new cases of the disease within a specified period of time
prevalence
number of existing cases of the disease at a particular point in time:
P = (number of people with the disease at a specified time t)/(number in the population at risk at the specified time t)
incidence rate
I = (number of people who develop disease in a specified time period)/(sum of the length of time during which each person in the population is at risk)
cumulative incidence risk
CI = (Number of people who get a disease during a specified period)/(Number of people free of the disease at the beginning of the period)
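The three measures above differ mainly in their denominators. A minimal sketch with made-up numbers, where the function names and the follow-up data are illustrative only:

```python
def prevalence(existing_cases, population_at_risk):
    """P: existing cases at time t over the population at risk at time t."""
    return existing_cases / population_at_risk


def incidence_rate(new_cases, person_time_at_risk):
    """I: new cases over the total person-time at risk."""
    return new_cases / sum(person_time_at_risk)


def cumulative_incidence(new_cases, disease_free_at_start):
    """CI: new cases over those disease-free at the start of the period."""
    return new_cases / disease_free_at_start


# Hypothetical cohort of 5 people; years each person spent at risk.
follow_up_years = [2.0, 5.0, 5.0, 1.5, 5.0]
rate = incidence_rate(2, follow_up_years)  # cases per person-year
risk = cumulative_incidence(2, len(follow_up_years))
prev = prevalence(30, 1000)
print(rate, risk, prev)
```

The incidence rate has units of cases per person-year, while prevalence and cumulative incidence are proportions.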
crude mortality rate
number of deaths in a specified period of time, divided by the person-years at risk (the average population at risk during that period multiplied by the length of the study period in years)
sensitivity
the proportion of truly diseased persons in the tested population who are identified as diseased by the screening test (probability of diagnosing a true case as diseased): P(test positive given truly diseased)
specificity
the proportion of truly non-diseased persons who are so identified by the screening test (probability of diagnosing a truly non-diseased person as non-diseased): P(test negative given truly non-diseased)
positive predictive value
the proportion of persons who are in fact diseased among those who test positive
negative predictive value
the proportion of persons who are in fact non-diseased among those
who test negative
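All four screening measures come from the same 2x2 table of test result against true disease status. A minimal sketch, with hypothetical counts tp, fp, fn, tn (true/false positives and negatives):

```python
def screening_measures(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from a 2x2 screening table."""
    return {
        "sensitivity": tp / (tp + fn),  # positives among the truly diseased
        "specificity": tn / (tn + fp),  # negatives among the truly non-diseased
        "ppv": tp / (tp + fp),          # truly diseased among test positives
        "npv": tn / (tn + fn),          # truly non-diseased among test negatives
    }


# Hypothetical screening results for 1000 people, 100 of them diseased.
measures = screening_measures(tp=90, fp=50, fn=10, tn=850)
for name, value in measures.items():
    print(name, round(value, 3))
```

Sensitivity and specificity condition on true disease status, whereas the predictive values condition on the test result and so depend on how common the disease is in the tested population.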
receiver operating characteristic
sensitivity, specificity and their sum can be plotted to
evaluate the various cut-points but typically a plot
of sensitivity against (1 - specificity) is produced
for the various cutoff values
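A sketch of how the points of such a plot could be computed, assuming a higher test score indicates disease and that a score at or above the cutoff counts as a positive test. The scores, disease labels and cutoffs are invented.

```python
def roc_points(scores, diseased, cutoffs):
    """Return (1 - specificity, sensitivity) for each cutoff."""
    points = []
    for cut in cutoffs:
        pairs = list(zip(scores, diseased))
        tp = sum(1 for s, d in pairs if d and s >= cut)
        fn = sum(1 for s, d in pairs if d and s < cut)
        fp = sum(1 for s, d in pairs if not d and s >= cut)
        tn = sum(1 for s, d in pairs if not d and s < cut)
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        points.append((1 - specificity, sensitivity))
    return points


# Hypothetical scores and true disease status for six people.
scores = [1, 2, 3, 4, 5, 6]
diseased = [False, False, True, False, True, True]
points = roc_points(scores, diseased, cutoffs=[2.5, 4.5])
print(points)
```

Raising the cutoff trades sensitivity for specificity, which is what moves the point along the curve.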
likelihood ratio
sensitivity/(1 - specificity) (the likelihood ratio for a positive test result)
observational studies
in which the investigator's role is passive in that exposures are not manipulated
intervention studies
in which the investigator's role is active: groups are exposed to interventions of interest such as treatments. These are experimental studies: clinical trials
cross-sectional studies
surveys also feature, but since they provide information (on both exposure and disease) at a single snapshot in time they are less useful and can only be used to measure disease prevalence
Cohort Studies
A cohort study tracks two or more groups
forward from exposure to outcome
Case-Control Studies
case and control groups defined and selected
according to their disease status: diseased /
non-diseased (outcome to exposure)
matching
selecting controls to be similar to
cases in terms of confounders (e.g. age, gender,
smoking habits)
stratification
examine exposure-disease associations within strata (e.g. age groups, smoking groups) and calculate a pooled estimate of the association measure, adjusting for the confounding effect
standardisation
controlling confounding using an external population to adjust for age, gender etc yielding standardised rates
multivariate analysis/regression models
include the confounding variable in the model (model adjustment)
Standardisation
to compare the incidence of disease or mortality between two or more populations
direct standardisation
the disease rates in the population of interest are applied to the standard population counts
indirect standardisation
the disease rates in the standard population are applied to the population of interest
direct standardised event rate
the expected event rate in the 'standard' population if the age-specific event rates in the study population prevailed
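The direct standardised rate is a weighted average of the study population's age-specific rates, weighted by the standard population's age structure. A minimal sketch with invented age bands, rates and counts:

```python
def directly_standardised_rate(study_rates, standard_counts):
    """Expected event rate in the standard population under the study rates."""
    expected_events = sum(
        study_rates[band] * standard_counts[band] for band in standard_counts
    )
    return expected_events / sum(standard_counts.values())


# Hypothetical age-specific rates (events per person) in the study population.
study_rates = {"<40": 0.001, "40-64": 0.005, "65+": 0.02}
# Hypothetical standard population counts per age band.
standard_counts = {"<40": 5000, "40-64": 3000, "65+": 2000}

dsr = directly_standardised_rate(study_rates, standard_counts)
print(dsr)
```

Here 60 events would be expected among the 10,000 people in the standard population. Applying the same standard to several study populations makes their rates comparable despite different age structures.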
survival analysis
analysis of data in the form of times from some well-defined time origin to occurrence of some event or endpoint
3 types of event in Survival Analysis
Positive (such as discharge from hospital or time
to conception)
Adverse (such as death or recurrence of disease)
Neutral (such as cessation of breast feeding)
Right Censoring
the event time exceeds the last follow-up time
Left Censoring
the event occurred before observation began (or before the first follow-up time), so the event time is only known to precede that time
survivor function
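S(t) = P(T > t): the probability that the survival time T exceeds t, where the survival time of individual i is a realisation of a non-negative random variable T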
hazard function
h(t): the instantaneous rate at which the event occurs at time t, given survival up to time t
empirical survivor function
S(t) = (number of individuals with survival times > t) / (total number of participants), valid when there is no censoring
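The empirical survivor function is a simple proportion when every survival time is fully observed. A minimal sketch, assuming no censoring and using invented times:

```python
def empirical_survivor(times, t):
    """Proportion of individuals whose survival time exceeds t."""
    return sum(1 for x in times if x > t) / len(times)


# Hypothetical uncensored survival times (e.g. in months).
times = [2, 3, 3, 5, 8]
s0 = empirical_survivor(times, 0)
s3 = empirical_survivor(times, 3)
s8 = empirical_survivor(times, 8)
print(s0, s3, s8)
```

The function starts at 1, steps down at each observed time, and reaches 0 at the largest time.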
Type 1 error
H0 is rejected but H0 is true
Type 2 error
H0 is not rejected but H0 is false
Mantel-Haenszel Method
technique that generates an estimate of an association between an exposure and an outcome after adjusting for or taking into account confounding
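For an odds ratio, the Mantel-Haenszel pooled estimate is sum(a_i d_i / n_i) / sum(b_i c_i / n_i) over strata i of the confounder. A minimal sketch, assuming each stratum is a 2x2 table (a, b, c, d) with a = exposed with outcome, b = unexposed with outcome, c = exposed without outcome, d = unexposed without outcome; the counts are hypothetical.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata given as (a, b, c, d) tuples."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum size
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Two hypothetical strata of a confounder (e.g. younger and older patients).
strata = [(10, 5, 20, 40), (15, 10, 10, 20)]
or_mh = mantel_haenszel_or(strata)
print(round(or_mh, 3))
```

If every stratum has the same odds ratio, the pooled estimate equals that common value. Comparing it with the crude odds ratio from the collapsed table gives an indication of how much confounding was present.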
Hazard function
f(x) / (1 - F(x))
Kaplan-Meier function
The Kaplan-Meier estimator is a non-parametric estimator of the survival function that allows for censored observations. The Kaplan-Meier curve is the visual representation of this function: a step plot of the estimated probability of surviving beyond each time
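A minimal sketch of the product-limit calculation behind the Kaplan-Meier estimate, assuming right-censored data where an event indicator of 1 means the event was observed and 0 means censored. It uses the usual convention that individuals censored at a time are still at risk for events at that time. The data are invented.

```python
def kaplan_meier(times, events):
    """Return [(event time, estimated survival just after that time)]."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up times; 0 marks a right-censored observation.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

The estimate only steps down at observed event times; censored individuals leave the risk set without causing a step.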
Survival analysis
analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems.
Non parametric models
Non-parametric models do not assume that the data distribution can be defined in terms of a finite set of parameters. They can often be defined by assuming an infinite-dimensional θ.
Logistic regression
like linear models but for a binary (or other discrete) outcome: models the log-odds of the outcome as a linear function of the predictors
Case control study
type of observational study in which two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute. (outcome to exposure)
Cohort study
exposure to outcome: groups are selected according to their exposure status, then followed for a period of time before the outcomes in the groups are compared
type of test to use when input is nominal and output is normal
t test
type of test to use when input is continuous and output is nominal
logistic regression
test of whether there is an association between two categorical variables
Chi squared test
Period Effect
Period effects may arise where patients may do better in a subsequent period because their state has changed, for example, their mental or health status has changed, independent of treatment.
Carry over
If the effect of a treatment carries on after the treatment is withdrawn then the response to a second treatment may well be due in part to the previous treatment
Gold standard
In medicine and statistics, a gold standard test is usually the diagnostic test or benchmark that is the best available under reasonable conditions. Other times, a gold standard is the most accurate test possible without restrictions.
Receiver operating characteristic
a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied