Bias and Confounding Flashcards
Overarching categories of bias
Information bias
Selection bias
Selection bias is especially common in ___.
Selection bias is especially common in case-control studies.
Social desirability bias
People tend to systematically overreport things that make them look good, and underreport or underestimate things that make them look bad.
Rule of thumb for confounding
If the effect estimate changes by at least 10% when accounting for the potential confounding variable, it can be assumed that the variable is indeed confounding.
Ways to account for confounding in study design
- Restriction (restrict to only one stratum, eliminating the confounding variable entirely)
- Matching (design a paired study and do paired t tests)
- Randomization of exposure
Effect modification
There is a different level of relationship between the exposure and outcome due to the presence of the effect modifier.
Can something be both an effect modifier and confounder?
Yes!
In this case, the stratum specific OR or RR are different from one another, AND different from the OR and RR overall, in the same direction.
Simple vs complex regression
Simple = 1 independent variable
Complex = 2 or more independent variables
Logistic regression
Used for binary dependent variables. Essentially, you convert the raw data into a percentage likelihood of binary variable x given an independent variable.
The correlation coefficient
r
Ranges from -1 to 1. Absolute value determines strength of the relationship, sign determines direction.
Interpretation of thresholds of r magnitude
r > | 0.6 | implies a strong correlation
r > | 0.8 | implies a very strong correlation
Format of an equation derived from linear regression
y = β0 + β<span>1</span> x + e
β0 = intercept
β1 = slope
e = error term / residuals
“Goodness of fit” measure
r2
When testing whether or not a relationship determined by linear regression is statistically significant, the null hypothesis is. . .
. . . that the predicted value of y should be the average value of y for all sample datapoints regardless of the value of x.
A simple linear regression model for a binary independent variable is effectively the same as . . .
. . . a two sample t test.
Nondifferential bias
The frequency of errors is approximately the same in the groups being compared.
In general, nondifferential misclassification tends to result in estimates of effect that are closer to “null” than the true effect.
hazard ratio
Expression of relative risk which quantifies the probability of an event (e.g. dying) during a particular time interval, given that a subject has survived until that time
Multivariate linear regression
y = intercept + b1 x1 + b2x2 + residual error
Multivariate logistic regression
ln(p/1-p) = intercept + b1 x1 + b2x2 + residual error
where p/1-p is the odds ratio of condition y.