Multiple Correlation and Regression Flashcards
Multiple Regression and Multiple Correlation
Tools that can be used to examine combined relations between multiple predictors and a dependent variable
Expand on the ideas of correlation and bivariate regression to include multiple independent variables in the prediction of a single dependent variable; the major advantage of using multiple IVs is that, if we choose our variables wisely, we can increase the amount of information available for understanding and predicting the dependent (criterion) variable
Multiple Correlation Coefficient
Symbolized using a capital R to distinguish it from the bivariate Pearson r; can only vary from 0 to 1.0 (as opposed to r, which ranges from -1.00 to +1.00)
What does R = 0 indicate?
That the IVs have no relationship with the DV/Criterion
What does R = 1.0 indicate?
That the X variables (IVs) have a perfect relationship with Y (DV/criterion)
The squared multiple correlation coefficient (R2)
Typically more useful than the multiple correlation coefficient (R) because we can interpret R2 values as the multivariate coefficient of determination; if R2 between Y and the X variables is .50, we can interpret that as 50% of the variance in Y being accounted for by (or shared with) the X variables
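A minimal Python sketch (not from the original slides; the data values and variable labels are hypothetical) of how R and R2 can be computed for two predictors:

```python
# Hypothetical data: computing the multiple correlation (R) and R^2 for two predictors.
import numpy as np

y  = np.array([110.0, 142.0, 130.0, 168.0, 155.0, 178.0])   # criterion (e.g., torque, Nm)
x1 = np.array([ 62.0,  75.0,  70.0,  88.0,  81.0,  93.0])   # predictor 1 (e.g., weight, kg)
x2 = np.array([ 21.0,  24.0,  19.0,  30.0,  27.0,  33.0])   # predictor 2 (e.g., age, yr)

X = np.column_stack([np.ones_like(y), x1, x2])               # intercept column plus predictors
b, *_ = np.linalg.lstsq(X, y, rcond=None)                    # least-squares weights
y_pred = X @ b

ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
R2 = 1 - ss_res / ss_tot                                     # multivariate coefficient of determination
R  = np.sqrt(R2)                                             # multiple correlation coefficient (0 to 1)
print(f"R = {R:.3f}, R^2 = {R2:.3f}")
```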
Partial Correlation Coefficient
Allows us to examine the relationship between Y and X1 after removing the influence of X2 from both Y and X1
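One common way to compute a partial correlation is to residualize both Y and X1 on X2 and correlate the residuals; a minimal sketch of that approach (array names are placeholders, not from the source):

```python
# Partial correlation of y and x1, controlling for x2, via residualization.
import numpy as np

def residualize(v, control):
    """Return residuals of v after removing the linear influence of control."""
    X = np.column_stack([np.ones_like(control), control])
    b, *_ = np.linalg.lstsq(X, v, rcond=None)
    return v - X @ b

def partial_r(y, x1, x2):
    ry  = residualize(y,  x2)    # y with the influence of x2 removed
    rx1 = residualize(x1, x2)    # x1 with the influence of x2 removed
    return np.corrcoef(ry, rx1)[0, 1]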
Multiple Regression Formula
Yp = a + b1X1 + b2X2 + … + bkXk (b1, b2, b3, …, bk are the slope coefficients that weight the IV/predictor variables according to their relative contributions to the prediction of Y; k represents the number of predictors; a is a constant that is similar to the Y-intercept in bivariate regression)
Multiple Regression
used to find the most satisfactory solution to the prediction of Y, which is the solution that produces the lowest standard error of the estimate (SEE); each predictor variable is weighted so that the set of b values, taken together, gives the most accurate prediction of Y
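A minimal sketch of fitting such an equation by ordinary least squares and reporting the SEE (helper name and data are hypothetical, not from the slides):

```python
# Fit Yp = a + b1*X1 + b2*X2 + ... by least squares and report the standard error of the estimate.
import numpy as np

def fit_mr(y, *predictors):
    X = np.column_stack([np.ones_like(y)] + list(predictors))
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)           # [a, b1, b2, ...]
    y_pred = X @ coeffs
    n, k = len(y), len(predictors)
    see = np.sqrt(np.sum((y - y_pred) ** 2) / (n - k - 1))   # standard error of the estimate
    return coeffs, see
```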
Two different approaches for developing MR equations
Let the computer do the work using a pre-existing algorithm (Forward Selection, Backward Elimination, Stepwise)
Specifically tell the computer how to construct the equation
(Hierarchical multiple regression: set up a hierarchical order for inclusion of the IVs/predictors; useful when some IVs are easier to measure than others, or when some are more acceptable to use than others)
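A minimal sketch of the hierarchical idea: predictors are entered in blocks in a researcher-specified order and the cumulative R2 is tracked after each block (block order and helper names are hypothetical, not from the source):

```python
# Hierarchical entry: researcher-specified blocks, cumulative R^2 after each block.
import numpy as np

def r_squared(y, predictors):
    X = np.column_stack([np.ones_like(y)] + list(predictors))
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return 1 - resid.var() / y.var()

def hierarchical_r2(y, blocks):
    """blocks: list of lists of predictor arrays, in the order they are to be entered."""
    entered, results = [], []
    for block in blocks:
        entered.extend(block)
        results.append(r_squared(y, entered))   # cumulative R^2 after this block
    return results                              # R^2 change between steps = variance added by that block
```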
Forward selection
A computer-driven method that starts with no variables in the model; X variables are then added one at a time to the regression equation (the bottom row of the SPSS correlations output displays the bivariate correlations between the DV and the IVs)
At Step #1, SPSS will select the X variable to be entered into the equation first by picking the variable with the highest Pearson’s r with the DV
So what variable do we anticipate will be selected first?
Body weight, because it has the highest bivariate correlation with the DV (isokinetic torque)
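A minimal sketch of the Step #1 choice (pick the candidate with the largest absolute Pearson r with the DV); function and variable names are placeholders, not SPSS syntax:

```python
# Forward selection, Step #1: choose the predictor most strongly correlated with the DV.
import numpy as np

def pick_first_predictor(y, candidates):
    """candidates: dict mapping predictor name -> 1-D array of values."""
    correlations = {name: np.corrcoef(x, y)[0, 1] for name, x in candidates.items()}
    first = max(correlations, key=lambda name: abs(correlations[name]))
    return first, correlations
```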
Analysis of Step #1 shows that the Y-intercept (a) has a value of -27.92 Nm, and the slope coefficient (b) has a value of 2.18 Nm/kg
So prediction equation becomes: isokinetic torque (Nm) = -27.92 + 2.18 (weight)
isokinetic torque (Nm) = -27.92 + 2.18 (weight). At Step #1, we do not yet have an MR equation because we only have one X variable. However, we've accounted for ~86% of the variance in isokinetic torque with body weight as the only predictor, so ~14% of the variance is left unexplained
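Plugging a value into the Step #1 equation gives a prediction; the 80 kg body weight below is an arbitrary example, not from the source data:

```python
# Prediction from the Step #1 equation reported in the flashcards.
def predict_torque(weight_kg):
    return -27.92 + 2.18 * weight_kg   # isokinetic torque (Nm)

print(predict_torque(80.0))            # about 146.5 Nm for an 80 kg subject
```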
At step #2, the selection algorithm selects the variable that increases R2 the most (the one that adds the most unique variance to the equation)
So, after removing the effects of weight, how much variance in isokinetic torque is accounted for by fat-free weight, age, percent fat, etc.?
We find that it is Age that adds the most unique variance
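A minimal sketch of the Step #2 logic: among the remaining candidates, pick the one that raises R2 the most over the model that already contains body weight. It reuses the r_squared helper sketched under hierarchical regression above; names are hypothetical:

```python
# Forward selection, Step #2: choose the predictor that adds the most unique variance.
def pick_next_predictor(y, entered, candidates):
    """entered: list of predictor arrays already in the equation;
    candidates: dict of name -> array for predictors not yet entered."""
    base_r2 = r_squared(y, entered)
    gains = {name: r_squared(y, entered + [x]) - base_r2 for name, x in candidates.items()}
    best = max(gains, key=gains.get)
    return best, gains[best]   # e.g., age, if it adds the most unique variance
```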