Exam 3 Flashcards
The overall Fobserved in an Analysis of Regression is F(4,203) = 5.89.
a. How many predictors are there in this regression analysis?
4
The overall Fobserved in an Analysis of Regression is F(4,203) = 5.89.
b. How many degrees of freedom are there for the t-test for the significance of each predictor?
203 (the residual degrees of freedom, i.e. the denominator df of the overall F)
Suppose the observed value of a t-test for a single regression coefficient is t(39) = 3.00, with alpha = .05, two-tailed. If the test of this regression coefficient were reported as an F test instead, what would be the numerical value of Fobserved? What would be the degrees of freedom of this F test?
F (1, 39) = 9.00
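The answer rests on the fact that a t-test on a single coefficient and the F-test on that same coefficient are equivalent: F equals t squared, with 1 numerator degree of freedom and the t-test's degrees of freedom in the denominator. A minimal sketch (the function name is made up for illustration):

```python
def t_to_f(t_value, df_residual):
    """Convert a t-test on one coefficient to the equivalent F-test.

    F has 1 numerator df; its denominator df is the t-test's df.
    """
    return t_value ** 2, (1, df_residual)


f_value, f_df = t_to_f(3.00, 39)
print(f_value, f_df)
```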
Suppose you have a categorical variable with four groups, e.g. four regions of the country. You wish to use it as a predictor in a regression analysis. What is the general strategy for employing a categorical variable as a predictor in a regression analysis?
We create g-1 code variables to represent the g groups, so that each group has its own unique pattern of codes. The code variables carry the information about the group membership of each case. There are numerous coding schemes that we can use. Each coding system carries all the group information and represents the same nominal variable.
In any coding scheme for g groups, how many codes are required to characterize the g groups?
g - 1
A dummy variable coding scheme with group 3 as the baseline group
Group # C1 C2
1 1 0
2 0 1
3 (base) 0 0
an unweighted effects coding scheme with group 3 as the baseline group. interpret coefficients:
Yhat = b1UE1 + b2UE2 + b0
Group # C1 C2
1 1 0
2 0 1
3 (base) -1 -1
b1 = mean of group 1 minus grand mean
b2 = mean of group 2 minus grand mean
a series of orthogonal contrast codes. For example, be able to code these contrasts: contrast code that contrasts the mean of the first group with the average of the means of the second and third groups; a contrast code that contrasts the mean of the second group with the mean of the third group.
Group # C1 C2
1 -2 0
2 1 1
3 1 -1
or
Group # C1 C2
1 1 0
2 -.5 .5
3 -.5 -.5
For dummy coding, be able to take the general regression equation and the codes and explain what each of the coefficients in the equation is measuring
b0 = the mean of the baseline group (the group coded 0 on every dummy code)
b1 = the mean of the group coded 1 on the first dummy code minus the mean of the baseline group
b2 = the mean of the group coded 1 on the second dummy code minus the mean of the baseline group
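The interpretation of dummy-code coefficients can be checked numerically. The sketch below uses made-up scores for three equal-sized groups (the data and variable names are assumptions, not from the course materials), fits the regression by least squares, and compares the coefficients with the group means.

```python
import numpy as np

# Hypothetical scores for three groups of three cases each (made-up data).
y = np.array([4.0, 6.0, 5.0, 9.0, 11.0, 10.0, 1.0, 3.0, 2.0])
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])

# Dummy codes with group 3 as the baseline (0 on both codes).
d1 = (group == 1).astype(float)
d2 = (group == 2).astype(float)

# Design matrix: intercept column first, then the two dummy codes.
X = np.column_stack([np.ones_like(y), d1, d2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

means = {g: y[group == g].mean() for g in (1, 2, 3)}
print("b0 vs baseline mean:       ", b0, means[3])
print("b1 vs group 1 - baseline:  ", b1, means[1] - means[3])
print("b2 vs group 2 - baseline:  ", b2, means[2] - means[3])
```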
What does it mean if two codes from a coding scheme are orthogonal? Under what condition will the following relationship hold:
r2multiple = r2y,c1 + r2y,c2
Orthogonal means that each code accounts for a portion of the variance in Y that does not overlap at all with the portions accounted for by the other codes in the set. The relationship above holds only when the contrast codes are orthogonal and the group sizes are equal.
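As a rough numerical check of this additivity, the sketch below applies the contrast codes (-2, 1, 1) and (0, 1, -1) to made-up data with equal group sizes (the data are an assumption for illustration) and compares the squared multiple correlation with the sum of the squared correlations of Y with each code.

```python
import numpy as np

# Hypothetical scores, three groups of equal size (made-up data).
y = np.array([4.0, 6.0, 5.0, 9.0, 11.0, 10.0, 1.0, 3.0, 2.0])
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])

# C1: group 1 vs. the average of groups 2 and 3; C2: group 2 vs. group 3.
c1_codes = {1: -2.0, 2: 1.0, 3: 1.0}
c2_codes = {1: 0.0, 2: 1.0, 3: -1.0}
c1 = np.array([c1_codes[int(g)] for g in group])
c2 = np.array([c2_codes[int(g)] for g in group])

# With equal n, each code sums to zero and their cross-product is zero.
cross_product = c1 @ c2

# Squared multiple correlation from the two-code regression.
X = np.column_stack([np.ones_like(y), c1, c2])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ coef
r2_multiple = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

# Squared correlation of Y with each code taken alone.
r2_c1 = np.corrcoef(y, c1)[0, 1] ** 2
r2_c2 = np.corrcoef(y, c2)[0, 1] ** 2
print(cross_product, r2_multiple, r2_c1 + r2_c2)
```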
Are dummy codes centered?
No. Dummy codes are not centered: the mean of a 0/1 code is the proportion of cases coded 1, not zero.
Are the pairs of dummy codes in a dummy variable coding scheme orthogonal?
No, the dummy codes are correlated with one another. They share the same base group. They account for overlapping proportions of variance in Y, so you can't simply sum the squared correlations of each dummy code with the criterion to get r2multiple.
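A quick check of both points, using a made-up group layout with three equal-sized groups (an assumption for illustration): the dummy codes have nonzero means, and the two codes are negatively correlated with each other.

```python
import numpy as np

# Three groups of equal size; group 3 is the baseline (made-up layout).
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])
d1 = (group == 1).astype(float)
d2 = (group == 2).astype(float)

# Not centered: each mean is the proportion of cases coded 1.
mean_d1, mean_d2 = d1.mean(), d2.mean()

# Not orthogonal: the codes are correlated with each other.
r_d1_d2 = np.corrcoef(d1, d2)[0, 1]
print(mean_d1, mean_d2, r_d1_d2)
```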
What sort of data configuration lends itself to coding with dummy codes?
A configuration in which there is one definite control group or base group.
give the set of unweighted effects codes with group 3 as the baseline group
Group # C1 C2
1 1 0
2 0 1
3 (base) -1 -1
For unweighted effects coding, be able to take the general regression equation and the codes and explain what each of the coefficients in the equation is measuring if the groups are of equal size
The intercept = the grand mean of all group means
The regression coefficient for each unweighted effects code is the mean of the group coded 1 minus the grand mean.
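The same kind of check works for unweighted effects codes. The sketch below uses made-up equal-n data (an assumption, not course data) and compares the intercept with the mean of the group means, and each coefficient with that group's deviation from it.

```python
import numpy as np

# Hypothetical scores, three groups of equal size (made-up data).
y = np.array([4.0, 6.0, 5.0, 9.0, 11.0, 10.0, 1.0, 3.0, 2.0])
group = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3])

# Unweighted effects codes: the base group (3) is coded -1 on every code.
base = (group == 3).astype(float)
e1 = (group == 1).astype(float) - base
e2 = (group == 2).astype(float) - base

# Design matrix: intercept column first, then the two effects codes.
X = np.column_stack([np.ones_like(y), e1, e2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

group_means = np.array([y[group == g].mean() for g in (1, 2, 3)])
grand_mean = group_means.mean()  # unweighted mean of the group means
print("b0 vs grand mean:            ", b0, grand_mean)
print("b1 vs group 1 - grand mean:  ", b1, group_means[0] - grand_mean)
print("b2 vs group 2 - grand mean:  ", b2, group_means[1] - grand_mean)
```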
In what way are unweighted effects codes, applied to equal group size data, intimately related to ANOVA?
In ANOVA, all the contrasts that go into computing the SStreatment are of each group mean with the grand mean. Each unweighted effect code in regression provides a measure of the difference between a group mean and the grand mean.
Are unweighted effects codes centered for equal group size? for unequal group size?
Unweighted effects codes are centered only when the group sizes are equal. With unequal group sizes they are not centered; they can be adjusted to weighted effects codes.
Are unweighted effects codes orthogonal?
No. Unweighted effects codes are not orthogonal because the sum of the squared validities does not equal r2multiple.
Return to thinking about dummy codes. Consider gender as a dummy coded variable, 1=male, 0=female. Suppose you have the coefficients for the overall regression equation, where X is continuous and D is a dummy code:
Yhat= b1 X + b2 D + b3 XD + b0
Yhat= .4X + .3 D + .2XD + 1.5
Explain what each of the four coefficients (including the intercept) measure.
B0: The intercept for females (the intercept for the group coded zero)
B1: the regression of Y on X for females is .4 (the regression of Y on X in the group coded zero)
B2: the difference in intercepts for males minus females is .3 (the difference in the intercepts for the group coded 1 minus for the group coded 0)
B3: the difference in slopes for males minus females is .2 (the difference in the slopes for the group coded 1 minus for the group coded 0)
Be able to write the simple regression equation for the group coded zero (female) from the overall equation.
Yhat= .4X + .3 D + .2XD + 1.5
Setting D = 0 drops the D and XD terms, so Yhat = .4X + 1.5.
How would you get a simple regression for the group coded one (males)?
The easiest thing would be to reverse the codes, so that 1=female, 0=male, and rerun the regression equation
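The simple equation for either group can also be read off the overall equation by substituting the group's code for D: the slope is b1 + b3 D and the intercept is b0 + b2 D. A small sketch using the coefficients from this card (the function name is made up for illustration):

```python
def simple_regression(b0, b1, b2, b3, d):
    """Return (slope, intercept) of Y on X for the group whose dummy code is d.

    Derived from Yhat = b1*X + b2*D + b3*X*D + b0 by fixing D = d.
    """
    slope = b1 + b3 * d
    intercept = b0 + b2 * d
    return slope, intercept


b0, b1, b2, b3 = 1.5, 0.4, 0.3, 0.2
female_slope, female_intercept = simple_regression(b0, b1, b2, b3, d=0)
male_slope, male_intercept = simple_regression(b0, b1, b2, b3, d=1)

print(f"females: Yhat = {female_slope:.1f}X + {female_intercept:.1f}")
print(f"males:   Yhat = {male_slope:.1f}X + {male_intercept:.1f}")
```

Rerunning the regression with the codes reversed gives the same male equation and additionally supplies standard errors for the male intercept and slope.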
In coding of a two-group variable, we considered the use of unweighted effect codes (+1,-1) versus the codes (+.5, -.5). In the equation
Yhat= b1 X + b2 C + b3 XC + b0 when using the contrast code versus
Yhat= b1 X + b2 UE + b3 X*UE + b0 when using the unweighted effects codes,
explain how the numerical values of coefficients will change when you switch coding systems. Will the significance of the coefficients change when you change between these two coding systems?
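In the case of the contrast codes (+.5, -.5), the two groups are one unit apart on the code, so b2 and b3 represent the full difference between the two groups in intercept and in slope.
In the case of unweighted effects codes (+1, -1), the groups are two units apart, so a one unit change only gets us halfway from one group to the other (to the value of 0). The coefficients b2 and b3 are therefore half as large as with the contrast codes, while b1 and b0 stay the same. The significance of the coefficients does not change, because rescaling a code changes its coefficient and its standard error by the same factor.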
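The effect of switching between the two coding systems can be illustrated with simulated data. In the sketch below, the data, the seed, and the helper name `fit` are all assumptions made for illustration; the t statistics are computed from the usual OLS standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
x = rng.normal(size=n)
c = np.repeat([0.5, -0.5], n // 2)   # contrast code (+.5, -.5)
ue = 2 * c                           # unweighted effects code (+1, -1)
# Simulated criterion (made-up population values).
y = 1.5 + 0.4 * x + 0.3 * c + 0.2 * x * c + rng.normal(scale=0.5, size=n)


def fit(code):
    """OLS fit of Yhat = b1*X + b2*code + b3*X*code + b0.

    Returns the coefficients (b1, b2, b3, b0) and their t statistics.
    """
    X = np.column_stack([x, code, x * code, np.ones(n)])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])          # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return coef, coef / se


coef_c, t_c = fit(c)
coef_ue, t_ue = fit(ue)

print("contrast codes:  ", np.round(coef_c, 3))
print("effects codes:   ", np.round(coef_ue, 3))
print("t (contrast):    ", np.round(t_c, 3))
print("t (effects):     ", np.round(t_ue, 3))
```

In this sketch b1 and b0 agree across the two fits, b2 and b3 under the (+1, -1) codes are half their values under the (+.5, -.5) codes, and the t statistics match.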
What is Type III partialing in SAS GLM, in SPSS GLM? Regression or “unique” partialing in SPSS MANOVA?
When we have unequal n’s, there are options for how we can analyze the data. We can use “unique” partialing (in SPSS MANOVA) or Type III partialing (in SPSS GLM and SAS GLM), where each sum of squares is reported with all other effects partialed out:
SSA with B and AB partialed out
SSB with A and AB partialed out
SSAB with A and B partialed out.
Show the regression equation for a two-group experiment with a continuous variable included. In the analysis of covariance, what is the categorical variable called? the continuous control variables?
The categorical variable is called the variate (the coded treatment variable). The continuous control variables are called covariates.
Yhat = b1X1 + b2X2 … bpC + b0
C is the categorical variate
X is a covariate, continuous or categorical.
Why is there no correlation between the covariate and the variate in the true experiment? With what two things can the covariate be correlated in a quasi-experiment with nonrandom assignment?
In the true experiment, random assignment to conditions means the covariate is, on average, unrelated to the variate (treatment); the covariate can still be correlated with the DV. In a quasi-experiment, subjects are not randomly assigned, so characteristics of the subjects can be related to assignment to treatment. The covariate can therefore be correlated with two things: the variate (treatment assignment) and the criterion Y, with which it may be more or less strongly correlated.
How is power for the test of treatment affected by the covariate in the true experiment (increase power)? How is power in the quasi-experiment affected by the covariate (may increase or decrease power depending on the relationship of the covariate to treatment and criterion)?
In the true experiment, power is increased because the covariate partials out from the criterion a source of variation that is irrelevant to the treatment, which reduces error variance and increases the power of the test for an effect. The ANCOVA with a quasi-experiment might increase or decrease power for the test of the effect, depending on the relationship of the covariate to treatment and criterion. It’s possible that the covariate partials out error variation in the criterion, or it may partial out pre-existing between-group differences on the criterion that should not be attributed to the treatment.
What is a within class regression line in the ANCOVA? What assumption is made about within class regression lines in analysis of covariance? If the assumption is met, is the treatment effect constant over all levels of the covariate?
Within class regression lines are regressions of the criterion on the covariate within each of the conditions. ANCOVA assumes homogeneity of within class regression, meaning that the b1s are equal in the two groups (the slopes are the same). If that assumption is met, the treatment effect is constant over all levels of the covariate (no interaction).
Show an alternative arrangement of equation
Yhat = b1 X + b2 C + b3 XC + b0
into a simple regression equation that shows the regression of Y on C at different values of X. This is an arrangement that focuses on differences between the means of the two groups on the dependent variable as a function of X.
Yhat = b1 X + b2 C + b3 XC + b0
Yhat = b2 C + b3 XC + b1 X + b0
Yhat = (b2 + b3 X) C + (b1X + b0)
The simple regression coefficient here (b2 + b3 X) gives the difference between the predicted score of the group coded 1 and the predicted score of the group coded 0 at each specific value of X. At X = 0 it reduces to b2, the difference in intercepts.
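The rearranged equation can be evaluated at any value of X to see how the group difference changes. The sketch below reuses b2 = .3 and b3 = .2 from the earlier gender card purely as illustrative values; the function name and the chosen X values are assumptions.

```python
def group_difference(b2, b3, x):
    """Simple regression coefficient of Y on C at a given X: b2 + b3*X.

    With a 0/1 code, this is the predicted score for the group coded 1
    minus the predicted score for the group coded 0 at that X.
    """
    return b2 + b3 * x


b2, b3 = 0.3, 0.2  # illustrative values borrowed from the earlier card
differences = {x: group_difference(b2, b3, x) for x in (0, 1, 5)}
for x, diff in differences.items():
    print(f"X = {x}: group difference = {diff:.1f}")
```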