Biologic methods and stats Flashcards
hormones for which measurement by standard assays is influenced by binding proteins
1) T4 and T3: thyroid binding globulin can alter measurement of total T4 and T3, so free T4 and free T3 can be measured instead
2) Testosterone: binds to albumin and SHBG. Free testosterone can be measured or calculated
3) Estrogen: binds to albumin and SHBG
4) Cortisol: binds to corticosteroid binding globulin (CBG) and albumin. 90% of cortisol is bound to CBG. Free cortisol can be measured in a 24-hour urine collection
5) Vitamin D (1,25 and 25): binds to vitamin D binding protein (DBP) and albumin.
6) Growth hormone: growth hormone binding protein
what can interfere with assays
how to tell
- autoantibodies - bind to the ligand and prevent it from binding to the assay
- heterophile antibodies - bind to the capture Ab/assay and block the hormone binding sites
both of these make it appear as if there is less substrate/hormone present in the sample
if you dilute the sample it will not dilute in a linear fashion
hook effect
solid state assay
when large amounts of substrate are present
excess substrate binds the reporter Ab directly, preventing the reporter Ab from binding to the surface
ex: macroPRL
serial dilution
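A minimal sketch of the serial dilution check described above: multiply each measured value by its dilution factor and see whether the corrected values agree. The function names, the 20% tolerance, and the sample numbers are all assumptions made for illustration, not values from the cards.

```python
def dilution_corrected(results):
    """results: list of (dilution_factor, measured_value) pairs."""
    return [factor * measured for factor, measured in results]


def is_linear(results, tolerance=0.20):
    """True if every corrected value is within `tolerance` of the first one.

    The 20% tolerance is an assumed cutoff for illustration only.
    """
    corrected = dilution_corrected(results)
    reference = corrected[0]  # first entry is taken as the reference sample
    return all(abs(c - reference) <= tolerance * reference for c in corrected)


# Hypothetical data: (dilution factor, measured value)
hooked = [(1, 50.0), (10, 400.0), (100, 45.0)]  # value jumps after dilution
clean = [(1, 200.0), (2, 98.0), (4, 51.0)]      # dilutes in a linear fashion

print(dilution_corrected(hooked), is_linear(hooked))
print(dilution_corrected(clean), is_linear(clean))
```

In the hypothetical hooked sample the corrected value rises sharply once the sample is diluted, which is the pattern that suggests a hook effect or an interfering antibody.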
what are normal ranges in parameters
central 95% of unaffected population
2SD above and below mean
2.5% of unaffected people fall above the range and 2.5% fall below it, even though they are normal
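As a rough illustration of the mean ± 2SD rule, the range can be computed from values measured in an unaffected group. This sketch assumes the values are approximately normally distributed; the function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def reference_range(values):
    """Return (lower, upper) as mean - 2*SD and mean + 2*SD.

    Assumes an approximately normal distribution in the unaffected group.
    """
    m = mean(values)
    sd = stdev(values)  # sample standard deviation
    return (m - 2 * sd, m + 2 * sd)


# Hypothetical measurements from unaffected people
values = [4.0, 5.0, 6.0]
low, high = reference_range(values)
print(low, high)
```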
what is the null hypothesis
“ there is no difference in the frequency of ‘x’ between 2 groups”
what is type 1 error
alpha
Incorrectly rejecting null hypothesis
i.e., when you find there is a difference but it’s just by chance
what is type 2 error
beta
Stating there is no difference when you just didn’t have enough subjects to detect the difference
what is power
1-beta
Probability the study could have detected a difference
Directly related to sample size and magnitude of the difference
what increases your chances of having a type 1 error
Making multiple individual comparisons (≥20) can generate a p value ≤ 0.05 by chance alone
how do you correct for multiple comparisons
Divide the desired type I error rate by the number of comparisons to be made.
For example, with 3 comparisons:
.05/3 = .017
The new significant p value for comparisons is 0.017 and not 0.05.
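The division above is commonly called the Bonferroni correction (a name the cards do not use). A short sketch, with hypothetical function names and p values:

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold after correction."""
    return alpha / n_comparisons


def significant_after_correction(p_values, alpha=0.05):
    """Flag each p value as significant or not at the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


print(round(bonferroni_threshold(0.05, 3), 3))
# Hypothetical p values from 3 comparisons
print(significant_after_correction([0.01, 0.03, 0.2]))
```

A p value of 0.03 would pass the usual 0.05 cutoff but fails the corrected 0.017 cutoff.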
what does p = 0.05 mean
If the null hypothesis is true, there is a 5% chance of finding a difference at least this large by chance alone
Therefore if you make 20 comparisons, you should expect about one to reach statistical significance by chance
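To put numbers on the 20-comparison point, the sketch below assumes the comparisons are independent and that there is no true difference in any of them. The function names are hypothetical.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false positive across independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


def expected_false_positives(alpha, n_comparisons):
    """Average number of false positives when every null hypothesis is true."""
    return alpha * n_comparisons


print(round(familywise_error_rate(0.05, 20), 2))
print(expected_false_positives(0.05, 20))
```

Under these assumptions, 20 comparisons give about one false positive on average and roughly a 64% chance of at least one.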
what do you typically set beta at
.20 (power = 1 – β) or a 20% chance that you will fail to find a difference that actually does exist.
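A rough sketch of how power depends on sample size and the size of the difference. It uses a normal (z-test) approximation for comparing two group means with equal group sizes and a known common SD. These are simplifying assumptions, and the function name and inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(difference, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z = NormalDist()
    standard_error = sd * sqrt(2 / n_per_group)
    z_critical = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(difference) / standard_error - z_critical)


# Hypothetical study: detect a difference of 0.5 SD
print(round(approx_power(0.5, 1.0, 64), 2))   # about 0.80 (beta about 0.20)
print(round(approx_power(0.5, 1.0, 20), 2))   # fewer subjects, lower power
```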
if you have 2 independent samples with normal distribution, what kind of test should you use to analyze?
what if it is more than 2?
t-test
ANOVA
if you have 2 independent samples with NOT normal distribution, what kind of test should you use to analyze?
what if it is more than 2?
Mann-Whitney U (also called the Wilcoxon Rank Sum Test)
Kruskal-Wallis
what is odds ratio
Odds ALWAYS implies a ratio of two probabilities.
Probability of event happening over probability of event not happening
OR is a ratio of two ratios.
if something has a probability of 80%, what’s the odds?
80:20 = 4
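The conversion between probability and odds can be sketched as follows (function names are hypothetical):

```python
def odds_from_probability(p):
    """Odds = probability of the event / probability of no event."""
    return p / (1 - p)


def probability_from_odds(odds):
    """Inverse conversion: probability = odds / (1 + odds)."""
    return odds / (1 + odds)


print(odds_from_probability(0.80))   # approximately 4, i.e. 4 to 1
print(probability_from_odds(4))      # back to 0.8
```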
relative risk
ratio of the proportion of those with a risk factor who have the disease to the proportion of those without the risk factor who have the disease.
when to use RR and when to use OR
RR
Useful in large prospective cohort studies
OR
case control
Retrospective studies
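A small worked example of both measures from the same 2x2 table. The cell labels used here (a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease) and the counts are assumptions for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk of disease in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)


# Hypothetical counts: 20/100 exposed and 10/100 unexposed have the disease
print(relative_risk(20, 80, 10, 90))  # about 2.0
print(odds_ratio(20, 80, 10, 90))     # about 2.25
```

The two values move closer together as the disease becomes rarer, which is why the OR from a case control study is used as an estimate of the RR.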
Number Needed to Treat (NNT)
The NNT is the number of patients who need to be treated in order to prevent one additional “outcome”.
how to calculate NNT
NNT = 1/ARR
what is ARR
how to calculate
Attributable (Absolute) Risk Reduction
ARR = risk of outcome in non-intervention group – risk of outcome in intervention group
The difference between the control group’s event rate and the experimental group’s event rate.
RRR
Relative Risk Reduction = ARR/placebo or non-intervention group rate
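The three formulas above (ARR, RRR, NNT) can be worked through together. The event rates below are hypothetical and the function names are illustrative only.

```python
def absolute_risk_reduction(control_rate, treated_rate):
    """ARR = event rate without intervention - event rate with intervention."""
    return control_rate - treated_rate


def relative_risk_reduction(control_rate, treated_rate):
    """RRR = ARR / event rate in the non-intervention group."""
    return absolute_risk_reduction(control_rate, treated_rate) / control_rate


def number_needed_to_treat(control_rate, treated_rate):
    """NNT = 1 / ARR."""
    return 1 / absolute_risk_reduction(control_rate, treated_rate)


# Hypothetical trial: 20% of controls and 15% of treated patients have the outcome
print(round(absolute_risk_reduction(0.20, 0.15), 2))  # 0.05
print(round(relative_risk_reduction(0.20, 0.15), 2))  # 0.25
print(round(number_needed_to_treat(0.20, 0.15)))      # 20
```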
95% confidence interval
A 95% confidence interval (95% CI) is the range of values which we can be 95% confident includes the true value for the population from which the study sample was drawn
how to interpret confidence intervals:
what is the null value for a mean?
what is the null value for OR/RR/Hazard Ratio?
Null value is 0
If 95% CI includes 0, not statistically significant
Null value is 1
If 95% CI includes 1, not statistically significant
What is a bias
“any systematic error in an epidemiologic study that results in an incorrect estimate of the association between exposure and risk of disease”
what is selection bias
Patient selection is not uniformly performed.
Patients selected are different than those not selected.
Recall bias
- Differences in the accuracy of recalling past events/exposures between cases and controls
Measurement bias
Systematic error in the measurement of data.
Misclassification Bias
-Wrongly classifying a subject/mislabeling them
what is a confounding variable
A confounding variable is associated with both the risk factor or ‘exposure’ and the disease being studied.
Can either inflate or deflate the true magnitude of the association
Should not be an intermediate link between exposure and disease.
what is a correlation
Determine the strength of the linear relationship between two continuous variables.
Its value can range from -1 to +1.
values closer to -1 or +1 indicate a stronger relationship
0 indicates no correlation
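A minimal sketch of the Pearson correlation coefficient, which is one common measure of linear correlation (the cards do not name a specific coefficient, so this choice is an assumption). The data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))    # close to +1
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))    # close to -1
print(pearson_r([1, 2, 3, 4], [1, -1, -1, 1]))  # 0 despite a U-shaped pattern
```

The last example is a reminder that a value of 0 only rules out a linear relationship.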
what is regression
what are the types
Any statistical technique which focuses on the relationship between a dependent variable and one or more independent variables.
Linear regression – dependent variable is continuous or interval variable
Logistic regression – dependent variable is dichotomous (yes/no, dead/alive)
Prevalence
Proportion (or fraction) of a group possessing a clinical condition at a given point of time.
Incidence
Proportion (or fraction) of a group initially free of the condition that develop it over a period of time.
sensitivity
a/(a+c) or TP/(TP + FN)
Proportion of people with the disease who have a positive test for the disease
if high, Very few false negatives
R/O disease
Specificity
d/(b+d) or TN/(TN + FP)
Proportion of people without the disease who have a negative test
if high, Very few false positives
R/I disease
Positive Predictive Value
a/(a+b) or TP/(TP + FP)
Probability of disease in a patient with a positive or abnormal test.
Negative Predictive Value
d/(c+d) or TN/(FN + TN)
Probability of not having the disease when the test result is normal or negative
Positive Likelihood ratio
Probability of a positive test result in the presence of disease
OVER
Probability of a positive test result in people without disease
Positive Likelihood Ratio = Sensitivity / (1 - Specificity)
Divides the probability that a patient with the disease will test positive by the probability that a patient without the disease will test positive
what do LR mean
> 10 suggests large and conclusive change in pretest to posttest probability
5 - 10 suggests moderate change
2 - 5 suggests small, although occasionally important changes in probability
1 - 2 suggests small, rarely important changes in probability
< 1 decreases the probability of disease
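One way to see how a likelihood ratio shifts pretest to posttest probability is to convert the pretest probability to odds, multiply by the LR, and convert back. The function name and the example numbers are hypothetical.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Posttest odds = pretest odds x LR, then convert odds back to probability."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


# Hypothetical patient with a 20% pretest probability
print(round(posttest_probability(0.20, 10), 2))   # large change
print(round(posttest_probability(0.20, 1.5), 2))  # small change
print(round(posttest_probability(0.20, 0.1), 2))  # probability goes down
```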
Receiver Operating Characteristic (ROC) Curves
what are on the axes
Sensitivity on the y-axis
1- specificity on the x-axis
Plotting the true positive rate against the false positive rate
Describes accuracy of test over a range of cut-off points
Overall accuracy of test described by the area under the curve (the larger the area the better the test)
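A sketch of estimating the area under the curve with the trapezoid rule, given (1 - specificity, sensitivity) points from several cut-offs. The points below are hypothetical, and the trapezoid rule is one common approach rather than the only one.

```python
def auc_trapezoid(points):
    """points: list of (false_positive_rate, true_positive_rate) pairs,
    including the (0, 0) and (1, 1) end points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid
    return area


# Hypothetical ROC points from three cut-offs plus the end points
roc_points = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.8), (1.0, 1.0)]
print(round(auc_trapezoid(roc_points), 2))              # about 0.80
print(auc_trapezoid([(0.0, 0.0), (1.0, 1.0)]))          # 0.5, no better than chance
```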
Case Reports
Presentations of single case or handful of cases
Important way that unusual diseases or unusual presentations of disease are brought to attention
Case series
Prevalence survey of a group of individuals with a particular disease at one point in time
Describes clinical manifestations of disease including both purported causes and effects
Cross-Sectional or Prevalence Studies
Prevalence: fraction or proportion of the group who are diseased
All people examined, including cases and noncases
Single point in time
Cohort Studies
Group of people (cohort) is assembled none of whom has experienced the outcome of interest
People are classified according to characteristics that might be related to outcome
People are observed over time
Relate initial characteristics to subsequent outcome events
Advantages and Disadvantages Of Cohort Studies
Advantages:
Establishes incidence
Follows logic, if people are exposed, will they get the disease?
Exposure elicited without bias since outcome is not known
Can assess relationship between exposure and many diseases
Disadvantages:
Inefficient as many more subjects must be enrolled than will experience the outcome of interest
Expensive
Results not available for a long time
Can only assess relationship between disease and exposure to relatively few factors recorded at the outset of study
Case control study
Patients with the disease and a group of otherwise similar people who do not have the disease are selected
Researchers look backward to determine the frequency of exposure in the two groups
Estimate the relative risk of disease related to exposure
Advantages and disadvantages of Case Control
Advantages
Cases can be identified unconstrained by the natural frequency of disease
Good for rare disease
Look at many exposures at the same time
Do not need to wait a long time for the answer
Able to address important questions rapidly and efficiently
Disadvantages
Can only estimate relative risk
Incidence rates not measured
Fraught with bias
Selection of controls – controls and cases must have an equal chance of being exposed to risk factor
Measuring exposure affected by presence of disease
advantages and disadvantages of RCT
Advantages
Minimize selection bias
Equal distribution of known and unknown risk factors for disease (confounders).
Often both provider and patient are blinded to study.
Disadvantages
Expensive
Time consuming
Often, patients and/or providers figure out which arm they are in (placebo pill tastes different)
Systematic Review versus Meta-Analysis
Systematic review answers a defined research question by collecting and summarizing all empirical evidence that fits pre-specified eligibility criteria.
Meta-analysis is the use of statistical methods to summarize the results of these studies.
An article describing a test - what are criteria for it to be significant
i. P value
ii. 95% CI
iii. Sens & Spec
iv. PPV & NPV
v. Power: ability of test to detect a true difference
What is the statistical term to describe the precision of the relative risk
confidence interval
what is lead time bias
Lead time is the interval between the diagnosis of a disease at screening and when it would have been detected due to development of symptoms.
Represents the amount of time by which the diagnosis has been advanced as a result of screening.
Lead time bias occurs when a screened population appears to have longer survival compared to an unscreened population because the diagnosis was simply made earlier because of screening (vs. actually prolonging disease survival).
what is length time bias
Overestimation of survival duration due to the relative excess of cases detected that are slowly progressing
- Screening is more likely to detect cases of disease in individuals with a longer preclinical phase, and therefore whose disease is progressing more slowly. These people are likely to have a better prognosis
- This means that individuals detected by screening are likely to have longer survival because their disease is likely to progress more slowly vs. those who go to MD with symptomatic disease
- Results in apparent increase in survival among people with disease detected by screening, i.e. overestimation of the benefit of screening.
Negative likelihood ratio
= (1–sensitivity)/specificity
i. Divides the probability that a patient with the disease will test negative by the probability that a patient without the disease will test negative
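Both likelihood ratios follow directly from sensitivity and specificity. A short sketch with hypothetical test characteristics and function names:

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


# Hypothetical test: sensitivity 90%, specificity 80%
print(round(positive_likelihood_ratio(0.90, 0.80), 3))  # 4.5
print(round(negative_likelihood_ratio(0.90, 0.80), 3))  # 0.125
```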