Quiz 4 Flashcards
Linear regression vs logistic regression
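Linear regression assesses strength of relationship between 2+ continuous variables (a continuous dependent variable and 1+ independent variables)
Logistic regression produces an odds ratio for a categorical (usually binary) dependent variable and 1+ independent variables, which can be continuous or categorical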
Uses of linear regression
Can be used for prediction (the regression line estimates the dependent variable from the independent variable) and for identifying confounders by comparing simple and multiple linear regression
What is the null hypothesis in linear regression?
Null hypothesis (H0) is that the slope β1 = 0 (no linear association)
P value < 0.05 rejects H0 and indicates a statistically significant association
Multiple linear regression answers what question?
Which independent variables are predictors of the dependent variable?
r value vs R^2 value
r value = correlation coefficient, the strength and direction of the linear correlation (-1 to 1) between 2 variables
R^2 = coefficient of determination, the proportion of variance in the dependent variable that is explained by the independent variable(s); reflects the overall model and is given as a percent
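A minimal sketch of the difference between r and R^2, using made-up data and hypothetical helper names (pearson_r, r_squared). It assumes a simple regression with one predictor, where R^2 equals r squared; note that r can be negative for an inverse relationship while R^2 cannot.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared(x, y):
    """R^2 of the least-squares line y = b0 + b1*x, as 1 - SS_res / SS_tot."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b0 = my - b1 * mx
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Made-up data with an inverse relationship (y falls as x rises)
x = [1, 2, 3, 4, 5]
y = [10, 8, 7, 4, 2]

r = pearson_r(x, y)    # negative: inverse relationship
r2 = r_squared(x, y)   # between 0 and 1; multiply by 100 to report as a percent
print(f"r = {r:.3f}, R^2 = {r2 * 100:.1f}%")
```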
Linear regression outputs
Model summary: R^2 value
ANOVA table: sum of squares, df, mean square, F and p value
Parameter estimates (coefficients): unstandardized, standardized
Assumptions for linear regression (5)
Residuals are normally distributed
Linear relationships between variables in question
No extreme outliers
Minimum 10 cases per independent variable
No multicollinearity between independent variables (r > 0.7 between two predictors suggests multicollinearity)
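One way to screen for the multicollinearity assumption is to compute r for every pair of independent variables and flag pairs above the 0.7 cutoff. The sketch below uses made-up data; the function name collinear_pairs and the variable names are hypothetical, and it compares the absolute value of r so strong negative correlations are also flagged (an assumption beyond the card).

```python
import math
from itertools import combinations

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def collinear_pairs(predictors, threshold=0.7):
    """Return (name_a, name_b, r) for each predictor pair with |r| > threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged

# Made-up predictors: age and years_smoked rise together, bmi does not
predictors = {
    "age": [25, 32, 41, 50, 63, 70],
    "years_smoked": [2, 8, 15, 22, 35, 40],
    "bmi": [22, 31, 24, 29, 21, 27],
}

flagged = collinear_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```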
What are confounders?
Confounders are variables related to both exposure and outcome that distort (strengthen or weaken) the estimate of the predictor-outcome association
Visualized: directed acyclic graphs (DAGs)
Can be controlled if measured during data collection
How to identify a confounder: run the linear regression with and without the suspected confounder → if the association is significant without it but not with it, or the β coefficient changes by >10% → it is a confounder
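The >10% change rule can be sketched by fitting the model with and without the suspected confounder and comparing the exposure's β. Everything here is illustrative: the data are made up, the function names are hypothetical, and the percent change is taken relative to the crude (unadjusted) β, which is one common convention and an assumption here.

```python
def simple_slope(x, y):
    """Crude β for x in the simple regression y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)

def adjusted_slope(x, z, y):
    """Adjusted β for x in y ~ x + z (closed-form least squares, two predictors)."""
    n = len(x)
    mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    szz = sum((c - mz) ** 2 for c in z)
    sxz = sum((a - mx) * (c - mz) for a, c in zip(x, z))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    szy = sum((c - mz) * (b - my) for c, b in zip(z, y))
    return (szz * sxy - sxz * szy) / (sxx * szz - sxz ** 2)

def percent_change(crude, adjusted):
    """Absolute change in β as a percent of the crude β (assumed convention)."""
    return abs(crude - adjusted) / abs(crude) * 100

# Made-up data: outcome is driven mostly by the confounder, which tracks exposure
exposure = [2, 1, 4, 3, 6, 5]
confounder = [1, 2, 3, 4, 5, 6]
outcome = [2 * c + 0.5 * e for e, c in zip(exposure, confounder)]

crude = simple_slope(exposure, outcome)
adjusted = adjusted_slope(exposure, confounder, outcome)
change = percent_change(crude, adjusted)
is_confounder = change > 10
print(f"crude β = {crude:.3f}, adjusted β = {adjusted:.3f}, change = {change:.1f}%")
```

This covers only the β-change criterion; the significance comparison on the card would need p values from statistical software.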
Uses for logistic regression
Case-control, cross-sectional (measure prevalence at one time point), and prospective cohort studies
Odds ratio vs relative risk in logistic regression
Odds ratio: odds that exposed has outcome / odds that unexposed has outcome
(a/b)/(c/d)
Relative risk: incidence of outcome among exposed / incidence of outcome among unexposed
[a/(a + b)] / [c/(c+d)]
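A worked example of both formulas on a made-up 2x2 table, assuming the usual layout: a = exposed with outcome, b = exposed without outcome, c = unexposed with outcome, d = unexposed without outcome. The function names are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds of outcome in exposed divided by odds of outcome in unexposed."""
    return (a / b) / (c / d)

def relative_risk(a, b, c, d):
    """Incidence of outcome in exposed divided by incidence in unexposed."""
    return (a / (a + b)) / (c / (c + d))

# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome
a, b, c, d = 20, 80, 10, 90

or_value = odds_ratio(a, b, c, d)     # (20/80) / (10/90) = 2.25
rr_value = relative_risk(a, b, c, d)  # (20/100) / (10/100) = 2.0
print(f"OR = {or_value:.2f}, RR = {rr_value:.2f}")
```

With these counts the odds ratio (2.25) is further from 1 than the relative risk (2.0); the two move closer together as the outcome becomes rarer.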
SPSS output for logistic regression
Exp(B) = odds ratio of outcome
B value (coefficient on the log-odds scale) less relevant
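Exp(B) is the exponential of the B coefficient, which is why the odds ratio is the easier number to interpret. A small sketch with a hypothetical helper name and made-up B values:

```python
import math

def odds_ratio_from_b(b):
    """Convert a logistic regression coefficient B (log-odds scale) to Exp(B)."""
    return math.exp(b)

# Made-up coefficients: B = 0 means no association (odds ratio of 1)
no_effect = odds_ratio_from_b(0.0)
doubled_odds = odds_ratio_from_b(math.log(2))    # B of about 0.693
halved_odds = odds_ratio_from_b(-math.log(2))    # negative B gives odds ratio below 1

print(no_effect, round(doubled_odds, 3), round(halved_odds, 3))
```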
Logistic regression modelling for covariate assessment
Either:
Only include significant predictors
Or choose covariates based on literature whether they are significant predictors or not (limits hypothesis testing)