2. Multiple Regression Flashcards
What is multiple regression?
Regression with multiple predictors (which are likely to correlate with one another) = Finds the optimal prediction of the outcome from several predictors, taking into account their redundancy with one another
What is multiple regression used for?
Prediction (leads to improved prediction)
Theory Testing (Theories often suggest multiple variables)
Covariate Control (If we want to assess the effect of a specific predictor while controlling for the influence of others)
How would we interpret the intercept in multiple regression?
Predicted value of y when all x's are 0
How would we interpret a coefficient in multiple regression?
Predicted change in y for a one-unit change in that x, when all other x's remain constant
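The two interpretations above can be checked numerically. This is a minimal sketch on simulated data, assuming NumPy is available; the variable names, the simulated coefficients and the `predict` helper are illustrative and not part of the original cards.

```python
import numpy as np

# Simulated (hypothetical) data with two correlated predictors.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: a column of ones for the intercept, then the predictors.
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coefs


def predict(x1_val, x2_val):
    """Model-predicted y for given predictor values."""
    return b0 + b1 * x1_val + b2 * x2_val


# Intercept: predicted y when every predictor is 0.
at_zero = predict(0.0, 0.0)

# Coefficient: change in predicted y for a one-unit change in x1,
# with x2 held at the same value (here 3.0) in both predictions.
change_in_y = predict(1.0, 3.0) - predict(0.0, 3.0)
```

Here `at_zero` equals `b0` and `change_in_y` equals `b1`, whatever value x2 is held at.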
What does ‘holding constant’ mean?
Holding the effect of x2 constant
Controlling for differences in x2
Partialling out the effects of x2
Holding x2 equal
Accounting for the effects of x2
Lets us identify the unique contribution of a predictor, as it means finding the effect of that predictor when the other predictors are fixed
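One way to see what "partialling out" means is to remove x2 from both x1 and y, then regress what is left of y on what is left of x1. A minimal sketch on simulated data, assuming NumPy; the `ols` helper and all variable names are hypothetical.

```python
import numpy as np

# Simulated (hypothetical) data: x1 and x2 are correlated.
rng = np.random.default_rng(1)
n = 300
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)


def ols(X, y):
    """Least-squares coefficients for design matrix X."""
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


ones = np.ones(n)

# Coefficient for x1 from the full model y ~ x1 + x2.
b1_full = ols(np.column_stack([ones, x1, x2]), y)[1]

# Partial out x2: keep only the part of x1 and of y that x2 cannot explain.
Z = np.column_stack([ones, x2])
x1_resid = x1 - Z @ ols(Z, x1)
y_resid = y - Z @ ols(Z, y)

# Slope of residualised y on residualised x1.
b1_partial = (x1_resid @ y_resid) / (x1_resid @ x1_resid)
```

The two estimates agree, so the multiple regression coefficient for x1 reflects only the part of x1 that is not shared with x2.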
What are model predicted values and how do they help us?
Model predicted values are the values of the outcome predicted by the fitted regression model for each observation
Can be compared to the actual values (obtained by observation); the difference between them is the residual
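Predicted values and residuals can be computed directly from a fitted model. A minimal sketch, assuming NumPy; the small data set below is made up for illustration.

```python
import numpy as np

# Made-up data: two predictors and one outcome.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])

# Fit y ~ x1 + x2 by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)

# Model predicted value for each observation.
y_hat = X @ coefs

# Residual = observed value minus predicted value.
residuals = y - y_hat
```

Because the model includes an intercept, the residuals sum to (numerically) zero.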
What are marginal distributions and how do we use them?
The distribution of each variable on its own, without reference to the values of the other variables
Plot each variable individually via a density plot or histogram
Numerically explore via summary stats (e.g. mean, SD)
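A marginal distribution can be summarised with a few lines of code. A minimal sketch, assuming NumPy; the `scores` values are made up, and the histogram is returned as bin counts rather than drawn.

```python
import numpy as np

# Made-up values for a single variable.
scores = np.array([52.0, 61.0, 58.0, 70.0, 66.0, 49.0, 63.0, 57.0])

# Summary statistics for the marginal distribution.
mean = scores.mean()
sd = scores.std(ddof=1)  # ddof=1 gives the sample standard deviation

# Histogram as counts per bin; a plotting library could draw these.
counts, bin_edges = np.histogram(scores, bins=4)
```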
What are bivariate associations and what are they used for?
Describe the relationship between two numeric variables
Plot the association between two variables (e.g. scatterplot)
Numerically explore by reporting the correlation
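The correlation between two numeric variables can be computed as follows. A minimal sketch, assuming NumPy; `hours` and `score` are made-up variables.

```python
import numpy as np

# Made-up data for two numeric variables.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
score = np.array([50.0, 55.0, 53.0, 62.0, 66.0, 71.0])

# Pearson correlation from the 2x2 correlation matrix.
r = np.corrcoef(hours, score)[0, 1]

# The same value from deviations about the means.
dx = hours - hours.mean()
dy = score - score.mean()
r_manual = (dx @ dy) / np.sqrt((dx @ dx) * (dy @ dy))
```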
Fill in the blank:
The values of the intercept and slope that _______ are our estimated coefficients from our data
Minimise the sum of squared residuals
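To illustrate, the sum of squared residuals can be compared at the least-squares estimates and at nearby values. A minimal sketch, assuming NumPy; the data and the `ssr` helper are hypothetical.

```python
import numpy as np

# Made-up data with one predictor.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 4.1, 5.8, 8.3, 9.9])


def ssr(b0, b1):
    """Sum of squared residuals for a candidate intercept and slope."""
    return float(np.sum((y - (b0 + b1 * x)) ** 2))


# Least-squares estimates of the intercept and slope.
X = np.column_stack([np.ones_like(x), x])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b0_hat, b1_hat = coefs

best = ssr(b0_hat, b1_hat)

# Nudging either coefficient away from the estimate never lowers the SSR.
shifts = (-0.1, 0.0, 0.1)
nearby = [ssr(b0_hat + d0, b1_hat + d1) for d0 in shifts for d1 in shifts]
```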
How do you calculate the intercept?
Estimated b0 = mean of y - (estimated slope b1 x mean of x), for a model with a single predictor
How do you calculate the slope?
Sum of cross products / sum of squared deviations of x
b1 = sum[(xi - mean of x)(yi - mean of y)] / sum[(xi - mean of x)^2]
where xi and yi are the values for individual participant i
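The slope and intercept calculations for a single predictor can be checked against a library fit. A minimal sketch, assuming NumPy; the data are made up, and `np.polyfit` with degree 1 is used only as an independent check.

```python
import numpy as np

# Made-up data: one predictor x and one outcome y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 4.1, 5.8, 8.3, 9.9])

# Deviations of each observation from its variable's mean.
dx = x - x.mean()
dy = y - y.mean()

# Slope: sum of cross products / sum of squared deviations of x.
b1 = np.sum(dx * dy) / np.sum(dx ** 2)

# Intercept: mean of y minus slope times mean of x.
b0 = y.mean() - b1 * x.mean()

# Independent check: a degree-1 polynomial fit returns [slope, intercept].
slope_check, intercept_check = np.polyfit(x, y, 1)
```

The intercept formula means the fitted line always passes through the point (mean of x, mean of y).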