Biostats Flashcards
Selection bias
How were patients selected, are groups similar?
Big with case control, also see with meta analysis
Allocation bias
Group assignments, randomization
*groups not representative of population being studied
Misclassification bias
Participant placed in the wrong category
Measurement bias
AKA detection bias
Data collection issue
Attrition bias
Patient drop out
Compliance bias
Was compliance assessed/ could compliance affect results
As treated analysis
All patients analyzed according to therapy they actually received
Per protocol
Only pts who followed protocol were analyzed
ITT
Analyzed according to intended therapy
Nominal data examples
ADR rate Yes/no, gender, race, presence or absence of dx, death, hospitalization
Ordinal examples
Likert, NYHA functional class, years of therapy 0-5 5-10 10-20, age <50 50-75 75-100
Continuous data examples
Lab values, age, weight, time to event
HR vs RR
HR is for survival analysis
HR is a weighted RR over time
Kaplan-Meier
How survival analysis (log rank test) is presented - paired with HR
Cox proportional
Most common stat test for survival analysis - used for multivariate analyses
Mx comparisons procedure issue and fix
Increased risk type 1 error
Correct: Bonferroni, Tukey, Scheffe, Dunnett, Hochberg
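Quick sketch of why multiple comparisons inflate type 1 error and what Bonferroni does about it. Assumes independent tests; the function names and the 5-comparison example are made up for illustration.

```python
def familywise_error(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one false positive across independent, uncorrected tests."""
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


# Hypothetical study with 5 comparisons, each tested at alpha = 0.05
print(round(familywise_error(0.05, 5), 3))  # about 0.226, i.e. ~23% chance of a false positive
print(bonferroni_alpha(0.05, 5))            # each comparison is tested at about 0.01 instead
```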
Funnel plot shows what?
Publication and selection bias
Want symmetry around the middle (shows no bias)
Cohort v case control
Cohort is prospective, case control is retrospective
Cohort: starts w/ risk factor
Case: starts w/ cases
Hawthorne effect
People modify behavior bc they know they are being observed
Cost of illness
Cost of dx for defined population
Cost minimization
Intervention cost differences b/w similar alternatives
Identifies least costly alternative when consequences are the same
Ex. Losartan vs valsartan
Cost benefit
Identifies net cost impact of an intervention
Compares programs or agents with different objectives
Strengths and weakness of interventions (cost of intervention vs cost that we get back via benefit)
Example: building cost $100 to make but will yield $200 in profit ($200-$100)
Cost effectiveness
Net cost divided by health outcomes
Ex. Cost per case of dx prevented or per death averted,
Ex. Years of life saved, number symptom free days, BG, BP
Cost utility
Subset of cost effectiveness
When tx affects quality of life QALY
Which type of data should not report means and SDs?
Ordinal
Types of quantitative data
Discrete : 1, 2, 3 etc
Continuous:
-interval: zero is arbitrary (degrees F)
-ratio: absolute zero (HR, BP, distance, kelvin)
When can you use mean
Continuous and normally distributed
NOT ordinal!
When to use median?
Ordinal or continuous
Good for skewed data
When to use SD
Continuous and normally distributed
SD meaning
1: 68%
2: 95%
3: 99.7%
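The percentages on this card come from the normal distribution. A small check, assuming perfectly normal data (the function name is just for illustration):

```python
import math


def coverage_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2))


for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4 and 99.7 percent
    print(k, round(coverage_within(k) * 100, 1))
```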
Kolmogorov-smirnov
Formal test for normal distribution
CI vs p-values
*important slide!
CI helps us determine importance of findings! Clinical significance
P-value tells us nothing of importance, only of certainty
F test
Difference in variance
Mantel haenszel
For independent nominal 3+ groups
Controls for confounders
Pearson v spearman
Pearson: correlation of continuous normally distributed
Spearman: correlation of non-normally distributed continuous data or ordinal data
r vs r2
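r is the correlation coefficient and r2 is the coefficient of determination (reported with regression)
-Correlation cannot show causality; regression allows prediction but does not prove causality on its own
-r can be negative or positive but r2 is never negative, so it cannot show direction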
Log rank test
For survival analysis of two independent groups (presented via Kaplan Meier curve)
Cox proportional hazards
Survival analysis for 3+ groups or paired groups
Or for prediction..?
Which type of bias do we see with case control studies?
Recall bias
Also selection
When can you not assess NNT /NNH?
When results are not significant!
Observation or information bias
Incorrect determination of outcomes or exposure
Ex. Inaccurate recording of risk factor
Recall bias
Recollection of past events. Cases more likely to recall exposure than controls (big for case-control studies)
Interview bias
Interviews not conducted in uniform manner
Publication bias
Positive results more likely to be published
Differential vs non-differential bias
Differential affects one group more
Non-diff affects both equally (systematic error)
Observational studies
Case control and cohort
*shows correlation but not causation
Cross sectional
Snapshot in time- prevalence study
Not as good as case control or cohort
Incidence vs prevalence
Incidence: # of new cases per time period
Prevalence: number of cases at a given time
Pragmatic trial
More real life conditions- less internal validity
ITT, per protocol, as treated
ITT: underestimates benefit. Preferred in superiority trial
Per protocol: overestimates benefit (results only for adherent pts- reduced external validity). Preferred in non-inferiority trial
As tx: interpret with caution! Destroys randomization for non-adherent patients
Systematic review vs meta analysis
Meta analysis uses mathematic analysis
SEM calculation
SD/sqrt of n
How to get 95% CI from SEM
SEM x1.96
Add and subtract that from the mean
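Worked version of the two cards above (SEM, then 95% CI). The mean, SD and n are invented numbers, and 1.96 assumes a normal approximation.

```python
import math


def sem(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


def ci95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """95% CI: mean plus or minus 1.96 x SEM (normal approximation)."""
    margin = 1.96 * sem(sd, n)
    return (mean - margin, mean + margin)


# Hypothetical sample: mean SBP 120, SD 15, n = 100
low, high = ci95(120, 15, 100)
print(sem(15, 100))                    # SEM = 15 / 10 = 1.5
print(round(low, 2), round(high, 2))   # 117.06 122.94
```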
When to use Fisher's exact over chi squared
All expected cell counts in the 2x2 table must be at least 5 to use chi squared- otherwise use Fisher's exact
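Sketch of the "at least 5" check, assuming the rule is applied to the expected cell counts (row total x column total / grand total), which is how it is usually stated. Function names and the example tables are hypothetical.

```python
def expected_counts(a: int, b: int, c: int, d: int) -> list[float]:
    """Expected counts for a 2x2 table laid out as rows (a, b) and (c, d)."""
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return [row1 * col1 / n, row1 * col2 / n, row2 * col1 / n, row2 * col2 / n]


def suggested_test(a: int, b: int, c: int, d: int) -> str:
    """Chi-squared if every expected count is at least 5, else Fisher's exact."""
    return "chi-squared" if min(expected_counts(a, b, c, d)) >= 5 else "Fisher's exact"


print(suggested_test(3, 7, 8, 2))      # smallest expected count is 4.5 -> Fisher's exact
print(suggested_test(20, 30, 25, 25))  # smallest expected count is 22.5 -> chi-squared
```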
Kendall
Correlation of ordinal values
Linear vs logistic regression
Linear: continuous outcome
Logistic: categorical outcome
Why are composite outcomes even used?
Increase power and decrease sample size requirements
Note: components should be physiologically linked!!!
Effect size (delta) relationship to power
Increased effect size increases power and allows for smaller sample size
Another benefit of survival analysis
Loss to follow up- you can still use data up til the loss to follow up
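To see how censored patients still contribute, here is a bare-bones Kaplan-Meier (product-limit) sketch. The data are invented, event = 1 means the event happened and 0 means lost to follow up (censored); a real analysis would use a stats package.

```python
def kaplan_meier(times: list[float], events: list[int]) -> list[tuple[float, float]]:
    """Return (time, survival probability) at each time where an event occurred."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        leaving = sum(1 for tt in times if tt == t)  # events plus censored at time t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored patients counted in the risk set up to t, then drop out here
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; patients at 3 (one of two) and 8 were censored
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 0, 1, 1, 0]):
    print(t, round(s, 2))  # 2 0.8, then 3 0.6, then 5 0.3
```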
Why are systematic reviews highest level of evidence
They reduce bias
Confounding by indication
Lipid lowering drug compared to non lipid lowering drug…therefore actually kinda comparing high cholesterol to low cholesterol
People who take a drug are inherently different from those who don’t d/t dx it’s treating, not the drug itself
95% CI equation
2x SEM in both directions
SEM= SD/sqrt of n
Coefficient of variation equation
SD/mean X 100
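Tiny example of the CV formula, assuming the sample SD (n - 1 denominator) is the one intended; the values are made up.

```python
import statistics


def coefficient_of_variation(values: list[float]) -> float:
    """CV as a percentage: sample SD divided by the mean, times 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical measurements: mean 12, sample SD 2
print(round(coefficient_of_variation([10, 12, 14]), 1))  # 16.7
```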
USPSTF
Recommends health screening
FEMA
Federal Emergency Management Agency
For emergency preparedness
Emergency response groups
American Red Cross, National Pharmacy Response Team, Disaster Medical Assistance Team
ETHNICS framework
For cultural competence
Explanation, treatment, healers, negotiation, intervention, collaboration, spirituality
CLAS
Culturally and linguistically appropriate services
When to use mantel-haenszel
It controls for confounding- like ANCOVA
Regression- simple vs mx, linear vs logistical
Simple - 1 independent variable
Mx- 2+ independent variables
Linear- continuous
Logistic- categorical
Measure of association for cohort vs case control
Cohort: RR
Case control: OR
Reporting guidelines for clinical studies
CONSORT (RCTs), STROBE (observational), PRISMA (meta analysis; previously QUOROM), EQUATOR
OR
(A/B)/(C/D)
                   Outcome   No outcome
Intervention          A          B
No intervention       C          D
RR
[A/(A+B)] / [C/(C+D)]
                   Outcome   No outcome
Intervention          A          B
No intervention       C          D
ARR
[C/(C+D)] - [A/(A+B)]
                   Outcome   No outcome
Intervention          A          B
No intervention       C          D
RRR
1-RR
NNT/ NNH
1/ARR; round up for NNT and down for NNH
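The OR, RR, ARR, RRR and NNT cards all run off the same 2x2 table (A, B = intervention with/without outcome; C, D = no intervention with/without outcome). A sketch with invented counts, assuming no cell is zero and assuming the usual convention of rounding NNT up to the next whole patient:

```python
import math


def two_by_two(a: int, b: int, c: int, d: int) -> dict:
    """Effect measures from a 2x2 table: rows = intervention / no intervention."""
    risk_tx = a / (a + b)      # event rate with the intervention
    risk_ctrl = c / (c + d)    # event rate without it
    arr = risk_ctrl - risk_tx
    rr = risk_tx / risk_ctrl
    return {
        "OR": (a / b) / (c / d),
        "RR": rr,
        "ARR": arr,
        "RRR": 1 - rr,
        # round() guards against float noise before rounding up; None if no benefit
        "NNT": math.ceil(round(1 / arr, 9)) if arr > 0 else None,
    }


# Hypothetical trial: 10/100 events on drug vs 20/100 on placebo
result = two_by_two(10, 90, 20, 80)
print(result)  # OR about 0.44, RR 0.5, ARR 0.1, RRR 0.5, NNT 10
```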
Cochran Q
Traditional test for heterogeneity
P< 0.05= high heterogeneity
I^2 (I squared)
Degrees of heterogeneity
0-25% = low
26-50%= moderate
>50%= high
Don’t want high heterogeneity in a meta analysis
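I squared can be derived from Cochran Q and its degrees of freedom (number of studies minus 1). Sketch with invented Q values; the band cut-offs follow the card, with values between 25 and 26 treated as moderate (my assumption).

```python
def i_squared(q: float, df: int) -> float:
    """I^2 as a percentage: (Q - df) / Q, floored at 0."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


def heterogeneity_label(i2: float) -> str:
    """Bands from the card: up to 25% low, up to 50% moderate, above 50% high."""
    if i2 <= 25:
        return "low"
    if i2 <= 50:
        return "moderate"
    return "high"


# Hypothetical meta analysis of 10 studies (df = 9) with Q = 20
i2 = i_squared(20, 9)
print(round(i2, 1), heterogeneity_label(i2))  # 55.0 high
```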
Fixed effect vs random effect
Use fixed effect if low heterogeneity and random effect if high heterogeneity
Non-inferiority margins questions
Words question: to be NI, CI cannot cross the bound of the NI margin
-for superiority, CI can’t contain 0 (for differences) or 1 (for ratios)
Pictures:
-superiority: doesn’t cross middle line
-NI: can cross middle line but not the NI margin
-equivalent: upper and lower limits of the CI both fall within the upper and lower equivalence margins
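Rough sketch of reading a CI against the margins. Assumptions (mine, not from the slides): the CI is for the difference new minus standard on a scale where higher favors the new drug, the middle line is 0, and the NI margin is given as a positive number. All values are invented.

```python
def interpret_ci(lower: float, upper: float, ni_margin: float) -> str:
    """Classify a CI for (new - standard), where higher values favor the new drug."""
    if lower > 0:
        return "superior"              # whole CI is above the middle line
    if lower > -ni_margin:
        return "non-inferior"          # may cross 0 but stays above the NI margin
    return "non-inferiority not shown" # CI crosses the NI margin


print(interpret_ci(0.01, 0.06, 0.04))   # superior
print(interpret_ci(-0.02, 0.05, 0.04))  # non-inferior
print(interpret_ci(-0.06, 0.03, 0.04))  # non-inferiority not shown
```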
Specificity and sensitivity
Specificity: True negative rate, low number of false positives
D/(B+D)
Sensitivity: True positive rate, low number of false negatives
A/(A+C)
              Truth (+)   Truth (-)
Test (+)         A           B
Test (-)         C           D
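Quick check of the two formulas using the same A/B/C/D layout (A = true positives, B = false positives, C = false negatives, D = true negatives). Counts are hypothetical.

```python
def sensitivity(a: int, c: int) -> float:
    """True positive rate: A / (A + C)."""
    return a / (a + c)


def specificity(b: int, d: int) -> float:
    """True negative rate: D / (B + D)."""
    return d / (b + d)


# Hypothetical screening test: A = 90, B = 30, C = 10, D = 170
print(sensitivity(90, 10))   # 0.9  -> 10 of 100 with the condition are missed
print(specificity(30, 170))  # 0.85 -> 30 of 200 without the condition test positive
```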
Positive and negative predictive value
PPV: probability that person with positive test has the condition
A/(A+B)
NPV: probability that person with negative test does not have the condition
D/(C+D)
              Truth (+)   Truth (-)
Test (+)         A           B
Test (-)         C           D
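Same idea for the predictive values, which read the table by row (test result) instead of by column (truth). Counts are hypothetical.

```python
def ppv(a: int, b: int) -> float:
    """Positive predictive value: A / (A + B)."""
    return a / (a + b)


def npv(c: int, d: int) -> float:
    """Negative predictive value: D / (C + D)."""
    return d / (c + d)


# Hypothetical screening test: A = 90, B = 30, C = 10, D = 170
print(ppv(90, 30))             # 0.75
print(round(npv(10, 170), 3))  # 0.944
```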
PFS bias types
-measurement bias
-assessment time bias- difficult to determine exact progression date
Oncology endpoints
ORR (objective response rate): dx activity (in metastatic setting)- does drug affect the dx after
pCR (pathological complete response): determines dx activity (in neoadjuvant setting)- does drug affect dx before
RFS (relapse free) or DFS: determination of dx recurrence or death
PFS: determination of dx progression (in metastatic setting)- time until progression or death
OS- survival
Correlation coefficient and coefficient of determination
Correlation coefficient: r (correlation)- strength and direction of relationship
Coefficient of determination: r2 (regression)- estimates the variation in the outcome due to the independent variable
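Hand-rolled Pearson r on made-up data to show the difference: r keeps the sign of the relationship, while r2 only gives the share of variation explained. Assumes a single continuous predictor, for which r2 is simply r squared.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical: outcome falls as dose rises
dose = [1, 2, 3, 4, 5]
outcome = [9, 8, 6, 5, 2]
r = pearson_r(dose, outcome)
print(round(r, 3))       # -0.981: strong, negative relationship
print(round(r ** 2, 3))  # 0.963: about 96% of variation explained, direction lost
```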