Lecture 18: Multivariable Models Flashcards
Review: What will happen if confounding is not controlled?
-We will get a biased estimate of the association of the exposure variable of interest with outcome
-ex the RR or OR for the association b/w exposure of interest and outcome will be incorrect
Which method to control confounding will be useful in this lecture and why?
- Multivariable modeling
-bc it can be used to control for multiple potential confounding variables (CFVs) at the same time
What is a recap from the video shown in class?
-line of best fit = regression line
- But instead of y = mx + b, where m is the slope and b is the y-intercept, the notation Y = B0 + B1X1 is used
EX
Y (income) = B0 (the intercept) + B1 (the slope) x X1 (education)
-if B1 is a pos number: positive relationship with outcome
-if B1 is neg number: negative relationship with outcome
-if B1 = 0 no relationship
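A minimal sketch of the line of best fit, in Python with only the standard library. The education and income numbers are made up for illustration (they are not from the lecture), and `fit_line` and `describe_slope` are hypothetical helper names. It estimates B0 and B1 by ordinary least squares and reads the sign of B1 the way the card above describes.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit of Y = B0 + B1*X1; returns (b0, b1)."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = sum of cross-deviations / sum of squared deviations of X
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar  # the fitted line passes through the means
    return b0, b1

def describe_slope(b1):
    """Interpret the sign of B1."""
    if b1 > 0:
        return "positive relationship"
    if b1 < 0:
        return "negative relationship"
    return "no relationship"

# Made-up data: years of education (X1) and income in thousands (Y),
# constructed so that income = 5 + 3 * education exactly.
education = [10, 12, 14, 16, 18]
income = [35, 41, 47, 53, 59]

b0, b1 = fit_line(education, income)
print(f"B0 = {b0:.2f}, B1 = {b1:.2f}: {describe_slope(b1)}")
```

Real data will not fall exactly on a line, so B1 would be an estimate with a standard error rather than an exact value.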
Why do we use multivariable models?
-Predict the value of a dependent variable (Y) from the explanatory variables (X’s) using the optimal equation that relates Y to the X values
-Determine which explanatory variables are important predictors of the dependent variable and the extent to which they influence the dependent variable
-Study the relationship b/w the dependent variable and each of the explanatory variables while controlling for the effect of other explanatory variables
-Multiple regression allows the investigation of multiple explanatory factors at once
What are the dependent/outcome variables (Y)?
-Can be continuous –> use LINEAR regression
ex body weight, milk production, blood glucose concentration
-Can be dichotomous –> use LOGISTIC regression
ex 2 possibilities pregnant or not, diseased or not
*the explanatory variables can be continuous, dichotomous or categorical with either regression approach; the choice of model depends on the outcome, not the exposure
What are the independent/explanatory variables (X)?
Continuous: age, weight, gestation length (weeks)
Dichotomous (yes/no): male/female, vax/not vax, etc
Categorical (2 or more levels): breed, nationality, education level, age (if grouped into ranges, ex 10-20 yrs)
What should be presented in a regression model?
-outcome should be stated
-All explanatory variables in the model should be listed, as well as their:
-Unit of increase (for continuous variables): ex age = 1 year
-Referent category (for dichotomous/categorical variables): ex type of animal, 0 = dairy cow, 1 = beef cow
For each explanatory variable:
-regression coefficient, and its standard error
-p-value and/or, better yet, the 95% confidence interval
-Model intercept (B0)
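A rough sketch of how the reported pieces fit together for one explanatory variable. The coefficient and standard error are made-up numbers, `summarize_coefficient` is a hypothetical helper, and the normal (Wald) approximation is an assumption; statistical software may use a t distribution for linear models, so its output can differ slightly.

```python
from statistics import NormalDist

def summarize_coefficient(coef, se):
    """Return (lower, upper, p_value) for a regression coefficient,
    assuming a normal (Wald) approximation."""
    z_crit = NormalDist().inv_cdf(0.975)            # about 1.96
    lower = coef - z_crit * se
    upper = coef + z_crit * se
    z = coef / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-sided test of coef = 0
    return lower, upper, p_value

# Made-up model output for one explanatory variable (not from the lecture)
coef, se = 0.50, 0.10
lower, upper, p_value = summarize_coefficient(coef, se)
print(f"coefficient = {coef}, SE = {se}")
print(f"95% CI: {lower:.3f} to {upper:.3f}, p = {p_value:.2g}")
```

The confidence interval is usually the more informative of the two because it shows both the size of the estimate and its precision; if the interval excludes 0, the two-sided p-value is below 0.05.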
What are some examples of linear regressions for continuous outcomes?
Standard equation: Y = B0 + B1X1 + B2X2 + B3X3 ….
-allows estimation of the linear effect of an explanatory variable (X), on the outcome (Y), after controlling for the potential confounding effects of the other variables in the model
-So B1 represents the average change in Y for a unit change in X1 when the other explanatory variables (X2, X3 etc) are held constant
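The "held constant" idea can be sketched with made-up data. Everything here is an assumption for illustration: the variables (age, feed, body weight), the numbers, and the helper names `solve` and `fit_linear`. The data are built so that weight = 1 + 2 x age + 3 x feed exactly, with older animals also eating more, so feed confounds the crude age effect. In practice a statistics package would do the fitting; this version solves the least-squares normal equations by hand.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]    # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_linear(xs, y):
    """Least-squares coefficients [B0, B1, ...] for Y = B0 + B1X1 + B2X2 + ..."""
    rows = [[1.0] + list(r) for r in xs]            # leading 1 gives the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data: X1 = age (years), X2 = feed (kg/day), Y = body weight
age = [1, 2, 3, 4, 5, 6]
feed = [2, 3, 5, 4, 7, 6]
weight = [1 + 2 * a + 3 * f for a, f in zip(age, feed)]

crude = fit_linear([[a] for a in age], weight)       # age only
adjusted = fit_linear(list(zip(age, feed)), weight)  # age and feed together

print(f"crude B1 for age:    {crude[1]:.2f}")
print(f"adjusted B1 for age: {adjusted[1]:.2f}")
print(f"adjusted B2 for feed: {adjusted[2]:.2f}")
```

The crude age coefficient is larger than 2 because it absorbs part of the feed effect; once feed is in the model, B1 returns to the change in weight per year of age with feed held constant.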
What are some examples of logistic regressions for dichotomous outcomes?
Standard equation: Y = B0 + B1X1 + B2X2 + B3X3 etc…., where Y = ln(odds of the outcome)
-Because Y is the natural log of the odds of the outcome, e^coefficient = the odds ratio of the outcome for a unit increase in that explanatory variable, keeping the values of the other variables in the model constant
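A small sketch of turning logistic regression coefficients into odds ratios. The coefficients below are invented for illustration (outcome: diseased or not; explanatory variables: age in years and vaccination status), and `odds_ratio` and `predicted_probability` are hypothetical helper names.

```python
import math

def odds_ratio(coef, increase=1.0):
    """Odds ratio for a given increase in X: e^(coef * increase)."""
    return math.exp(coef * increase)

def predicted_probability(log_odds):
    """Convert Y = ln(odds) back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Made-up coefficients (not from the lecture)
b0 = -2.0             # intercept
b_age = math.log(2)   # chosen so the OR per 1-year increase in age is 2
b_vax = -0.5          # vaccinated (1) vs not vaccinated (0, the referent)

or_age = odds_ratio(b_age)
or_vax = odds_ratio(b_vax)
print(f"OR per year of age: {or_age:.2f}")
print(f"OR vaccinated vs not: {or_vax:.2f}")

# Log odds and probability of disease for a 3-year-old vaccinated animal
y = b0 + b_age * 3 + b_vax * 1
print(f"ln(odds) = {y:.3f}, probability = {predicted_probability(y):.3f}")
```

A positive coefficient gives an OR above 1, a negative coefficient gives an OR below 1, and a coefficient of 0 gives an OR of exactly 1 (no association).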