Exam 2 Flashcards
Explain what is meant by the tolerance of a predictor (from Study Guide 1). Explain how the measure of tolerance is computed from a regression model with more than two predictors.
The tolerance of a predictor Xi is [1 - R^2 i.other], 1 minus the squared multiple correlation of Xi with all the other predictors: the proportion of the variance of predictor Xi that is independent of all other predictors. In a regression model with more than two predictors, it is computed by regressing Xi on all of the remaining predictors and subtracting the resulting squared multiple correlation from 1; for example, the tolerance of X1 in a three-predictor model is [1 - r^2 1.23].
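As a sketch with hypothetical data (pure Python): in the special case of only two predictors, the tolerance of X1 reduces to 1 minus the squared correlation between the two predictors.

```python
def corr(a, b):
    # Pearson correlation of two equal-length lists
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Hypothetical scores on two predictors
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]

r12 = corr(x1, x2)            # 0.8
tolerance_x1 = 1 - r12 ** 2   # 0.36: 36% of X1 is independent of X2
```

With three or more predictors you would instead regress Xi on all the remaining predictors and take 1 minus the resulting squared multiple correlation.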
When predictors overlap in their association with the criterion, what approach is used to apportion variance accounted for to individual predictors?
The approach used to apportion the variance accounted for by one versus another predictor is to specify an order of priority of the variables. We assign to the first variable all the overlap of that variable with the criterion. We assign to the second variable its overlap with the criterion that is unique (not redundant with the overlap of the first variable). The decision as to which variable comes first is made based on theory.
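A minimal numeric sketch, using hypothetical R^2 values, of how the apportionment depends on the order of entry:

```python
# Hypothetical squared multiple correlations (illustration only)
r2_x1 = 0.30    # R^2 for X1 alone
r2_x2 = 0.25    # R^2 for X2 alone
r2_both = 0.40  # R^2 for X1 and X2 together

# X1 entered first: X1 keeps all of its overlap with the criterion,
# and X2 is credited only with its unique contribution
x1_first = (r2_x1, r2_both - r2_x1)   # (0.30, 0.10)

# X2 entered first: the apportionment changes with the order
x2_first = (r2_x2, r2_both - r2_x2)   # (0.25, 0.15)
```

Note that the shares assigned to each predictor differ across the two orderings, which is why the order of entry must be justified on theoretical grounds.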
How can you use the approach you described above in question 2 to make theoretical arguments about the importance of particular predictors or sets of predictors, taking into account other predictors?
If I were interested in how a person’s beliefs about health determine his or her willingness to engage in a preventive health behavior, over and above some recommendation from their physician, I would make the first variable the physician’s recommendation and the second variable the person’s health beliefs. I would be answering the question of the unique contribution of the psychological factor of health beliefs to health behavior, “with physician recommendation partialed out”, “held constant”, “taken into account”, or “over and above physician recommendation.” It is a stronger argument for psychology to say that psychological factors account for health protective behavior above and beyond what the doctor recommends than to say that psychological factors account for health protective behavior without taking into account what physicians recommend.
What is measured by the squared semi-partial correlation of a predictor with the criterion?
The squared semi-partial correlation measures the gain in the proportion of criterion variance accounted for when another predictor (or set of predictors) is added to a regression equation already containing at least one other predictor.
What is measured by the squared partial correlation of a predictor with the criterion?
The squared partial correlation is the proportion of variance not accounted for by the first predictor that is accounted for by the added predictor. Put another way, it is the proportion of the residual variance not accounted for by the first predictor that is accounted for by the second predictor.
Know how to compute the squared semi-partial correlation from two squared multiple correlations
Say you had r^2 y.123 and r^2 y.12. Then the squared semi-partial correlation of X3 with the criterion, over and above X1 and X2 is
r^2 y(3.12) = r^2 y.123 - r^2 y.12
where r^2 y(3.12) is the squared semipartial correlation of predictor X3 with the criterion above and beyond predictors X1 and X2. The subscript notation y(3.12) indicates that predictors X1 and X2 are partialed out of X3, but not partialed out of Y. Thus the squared semipartial correlation r^2 y(3.12) is the squared correlation between Y and the part of X3 that is independent of X1 and X2 (i.e., does not overlap with X1 and X2).
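A quick arithmetic check with hypothetical R^2 values:

```python
# Hypothetical squared multiple correlations (illustration only)
r2_y_123 = 0.55  # criterion predicted from X1, X2, and X3
r2_y_12 = 0.40   # criterion predicted from X1 and X2 only

# Squared semi-partial correlation of X3: the gain in R^2 from adding X3
sr2_x3 = r2_y_123 - r2_y_12   # 0.15
```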
Know how to compute the squared partial correlation from two squared multiple correlations
r^2 y3.12 = (r^2 y.123 - r^2 y.12) / (1-r^2 y.12)
The subscript notation y.12 denotes the proportion of variance accounted for by X1 and X2, so (1 - r^2 y.12) is the proportion of variance not accounted for by X1 and X2. r^2 y3.12 is the squared correlation between the part of Y with X1 and X2 partialed out and the part of X3 that also has X1 and X2 partialed out.
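The same hypothetical R^2 values illustrate how the squared partial differs from the squared semi-partial:

```python
# Hypothetical squared multiple correlations (illustration only)
r2_y_123 = 0.55  # criterion predicted from X1, X2, and X3
r2_y_12 = 0.40   # criterion predicted from X1 and X2 only

# Squared partial correlation of X3: the gain in R^2 taken relative to
# the variance left unexplained by X1 and X2
pr2_x3 = (r2_y_123 - r2_y_12) / (1 - r2_y_12)   # 0.15 / 0.60 = 0.25
```

The squared partial (0.25) is larger than the squared semi-partial (0.15) because its denominator is only the residual variance, not the total variance.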
Explain Horst’s definition of a suppressor variable
Variables classically termed SUPPRESSOR VARIABLES (Paul Horst, 1941) are uncorrelated with the criterion but are correlated with other predictors; these variables increase the squared multiple correlation when they are added to a regression equation. For these variables, the higher the absolute value of the correlation with the other predictor, the higher the multiple correlation. That is, the more they are correlated with another predictor, the better the overall prediction.
Will the regression coefficient for the suppressor be zero?
Regression weights will not be zero for a suppressor variable; they may be either negative or positive. The sign of regression weights for suppressors depends upon the sign of the validities of the other variables and the sign of the correlation between the predictors and the suppressor.
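A numeric sketch using the standard two-predictor formulas for the squared multiple correlation and standardized regression weights (hypothetical correlations; X2 is a Horst-style suppressor with zero validity):

```python
# Hypothetical correlations: X2 is a classic suppressor -- it is
# uncorrelated with the criterion but correlates .6 with predictor X1
r_y1, r_y2, r_12 = 0.5, 0.0, 0.6

# Standard two-predictor formula for the squared multiple correlation
r2_multiple = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
# 0.390625 -- larger than r_y1**2 = 0.25 for X1 alone

# Standardized regression weights: the suppressor's weight is not zero
beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12**2)   #  0.78125
beta_2 = (r_y2 - r_y1 * r_12) / (1 - r_12**2)   # -0.46875
```

Here the suppressor's weight is negative because its validity is zero and its correlation with X1 is positive, consistent with the sign rule stated above.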
Explain the designation of Type I versus Type II partialing in SAS (sequential versus unique).
Type I sums of squares are generated sequentially: the effect of each predictor is computed with all previously listed predictors on the model statement partialed out.
Type II sums of squares are what we expect in multiple regression, the effect of each predictor with all other predictors partialed out.
What is a conditional distribution?
distribution of a variable at one value of another variable
What is a conditional mean?
the mean of a variable at one value of another variable
What does it mean if a regression equation is “linear in the coefficients”?
Linearity in the coefficients means that the predicted score is a linear combination of the predictors, where the weights are regression coefficients.
The regression equation is in the form of a linear combination (weight times variable + weight times variable…):
Yhat = b1 X1 + b2 X2 + ... + bp Xp + b0
What does it mean if a regression is “linear in the variables”?
Linearity in the variables means that the regression of Y on X is constant across the range of X: the conditional means fall on a straight line.
What does additivity mean in a regression equation containing predictors X and Z and the criterion Y?
Additivity means that the relationship of one predictor to the criterion does not depend on the specific values of the other variables. It means that the regression of Y on X is the same at every value of Z, so that if you talk about the regression of Y on X with Z held constant, you do not have to indicate the specific value of Z.
What is the general form of a polynomial equation?
Polynomial equations contain a series of higher order functions of a single variable X; the general form is Yhat = b1X + b2X^2 + ... + bpX^p + b0. The polynomial of order p has (p - 1) bends or inflections. A 1st order polynomial has no bends (a straight line); a 2nd order polynomial has one bend (a parabola, quadratic).
How can a polynomial regression equation be used to test a prediction of a curvilinear relationship of a predictor to the criterion?
You can use a second order polynomial of a predictor. For instance, Yhat = b1X + b2X^2 + b0. We would be testing whether the second order predictor adds significant predictability to the equation containing the first order predictor. If it does, then it lends support for the curvilinear hypothesis.
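A toy illustration with hypothetical data (a real analysis would test the significance of the R^2 increment): when the relationship is purely quadratic, the 1st order term alone predicts nothing, while the squared term carries all the prediction.

```python
def corr(a, b):
    # Pearson correlation of two equal-length lists
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

# Hypothetical data with a purely quadratic (U-shaped) relationship
x = [-2, -1, 0, 1, 2]
y = [4, 1, 0, 1, 4]   # y = x^2 exactly

r_linear = corr(x, y)                    # 0.0: linear term predicts nothing
r_quad = corr([xi**2 for xi in x], y)    # 1.0: X^2 captures the curve
```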
What do we mean by “higher” order terms in regression analysis? Give an example of a regression equation containing a higher order curvilinear term and a higher order interaction term.
Higher order terms are terms beyond the 1st order (linear) terms in a regression equation. Higher order terms can be interactions, quadratic, cubic, quartic, etc. An example of a regression equation with a curvilinear term is: Yhat = b1X + b2X^2 + b0. An example of an equation with an interaction term is: Yhat = b1X1 + b2X2 + b3X1X2 + b0