MR Chapter 2 Flashcards

1
Q

How do you start a MR question?

A
  1. First look at the data: the range of scores and the means and frequencies are in the ‘statistics’ table above. Are they reasonable for the context?
    • Next look at the regression. Is it simultaneous?
    • Note the correlations. The correlation between Homework and Grades (.327) is only slightly higher than that between Math HW and Achievement in chapter 1. Parent edu, however, is correlated with both time spent on HW (.277) and GPA (.294).
2
Q

What is the Multiple Regression coefficient?

A

Capital R (sometimes mult R)

3
Q

How is R2 interpreted?

A

R2 (.152) shows that the two explanatory variables together account for 15.2% of the variance in students’ GPAs.
  • You cannot simply add the squared correlation coefficients, which is why the figure is not larger (R2 ≠ r2ParEd·GPA + r2HW·GPA); see the check below.
  • This is because the two explanatory variables are also correlated with each other: Parent edu accounts for some of the same variance as HW.
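A quick arithmetic check of that point, a minimal sketch using only the two correlations reported on these cards (rHW·GPA = .327, rParEd·GPA = .294):

    # Adding the squared bivariate correlations overstates R2,
    # because the two predictors share variance with each other.
    r_hw_gpa, r_pared_gpa = 0.327, 0.294
    naive_sum = r_hw_gpa ** 2 + r_pared_gpa ** 2
    print(round(naive_sum, 3))   # ~.193, noticeably larger than the actual R2 of .152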

4
Q

How do we know if MR is statistically significant?

A

The ANOVA table tells us. A statistically significant F means that, taken together in some optimally weighted combination, HW and Parent Education predict or explain students’ GPA to a statistically significant degree.
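That overall F can also be recovered from R2 itself. A minimal sketch using the standard identity F = (R2/k) / ((1 - R2)/(N - k - 1)) and the R2 = .152, N = 100, k = 2 values from these cards:

    # Overall F for the regression, computed from R2
    R2, N, k = 0.152, 100, 2
    F = (R2 / k) / ((1 - R2) / (N - k - 1))   # df = (k, N - k - 1) = (2, 97)
    print(round(F, 2))                        # ~8.7; p < .001 with (2, 97) df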

5
Q

What is the formula for df residual?

A

N - k - 1 (k = the number of predictor variables); here 100 - 2 - 1 = 97.

6
Q

How is MR different from Simple Regression?

A
  • With simple regression there was one b, and its probability was the same as that of the overall regression equation. The corresponding ß was equal to the original correlation.

With MR, each IV has its own regression coefficient: b for Parent Education is .871 and b for time spent on HW is .988. The regression equation is Y = 63.227 + .871X1 + .988X2 + error, or, for grades: Grades(predicted) = 63.227 + .871ParEd + .988HWork.
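A minimal sketch of using that equation to generate a prediction; the student values (12 years of parent education, 5 hours of homework per week) are made up purely for illustration:

    # Predicted GPA from the fitted equation on this card
    def predicted_grade(par_ed, hw_hours):
        return 63.227 + 0.871 * par_ed + 0.988 * hw_hours

    # Hypothetical student: values are illustrative, not from the chapter's data
    print(round(predicted_grade(12, 5), 1))   # ~78.6 on the 100-point scale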

7
Q

How do we test Statistical Significance for MR?

A
  • We test each IV for statistical significance separately (see the sketch below).
  • Note that t (t = b/SEb) for ParEd is 2.266 (p = .026), and the 95% CI for its b is .108 to 1.633.
  • The range does not include zero, so with alpha = .05, ParEd is a statistically significant predictor of GPA.
  • For each additional year of Parent Edu, GPA rises by .871 (controlling for HW).
  • What about HW? b = .988, so close to 1 point for each hour of homework (controlling for Parent Edu). Also statistically significant, p = .007.
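A minimal sketch of the t and CI calculations for ParEd. The standard error is not printed on the card, so it is backed out here from b and t (SE = b/t ≈ .384); treat that as an inference, not a value from the SPSS output:

    from scipy import stats

    b, t_obs, df = 0.871, 2.266, 97
    se = b / t_obs                              # ~0.384, inferred from b and t
    t_crit = stats.t.ppf(0.975, df)             # ~1.985 for 97 df
    ci = (b - t_crit * se, b + t_crit * se)     # ~(.108, 1.634), matching the card's CI
    p = 2 * stats.t.sf(abs(t_obs), df)          # ~.026
    print([round(x, 3) for x in ci], round(p, 3))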
8
Q

How do we know which IV has the strongest effect on the DV in MR?

A
  • If we want to compare hours of homework with years of parent schooling, we need to standardise them and compare the ßs, the standardised regression coefficients. The higher ß belongs to the more effective variable: ßHW = .266, ßPE = .220, so yes, homework is the stronger predictor.
  • Each SD increase in HW will lead to a .266 SD increase in grades (a sketch of where ßs come from follows below). We will postpone asking whether this difference between the ßs of HW and PE is statistically significant.
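For reference, a standardised coefficient is just b rescaled by the standard deviations of the predictor and the outcome (ß = b × SDx / SDy). The SDs below are made-up placeholders, since the cards do not report them; a minimal sketch only:

    # beta = b * SD_x / SD_y
    def standardise(b, sd_x, sd_y):
        return b * sd_x / sd_y

    # Hypothetical SDs, purely for illustration (not from the chapter's data)
    print(round(standardise(0.988, 1.5, 7.0), 3))   # ~.212 with these assumed SDs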
9
Q

Remember: HW can be manipulated, PE cannot. Is the design nonexperimental?

A

FIX CARD when Sabina gets back to you via email

10
Q

Possible statistical interpretations for the Multiple Regression of GPA on ParEd and HW

A
    1. What the research was designed to determine.
    2. What was regressed on what (the DV, GPA, was regressed on the first IV and the second IV).
    3. Whether the overall multiple regression was statistically significant or not (R2 = …, F[2, 97] = …, p < .001),
    4. and that the two variables (Homework and Parent Education) accounted for 15% of the variance in Grades.
    5. The unstandardized regression coefficient (b) for Parent Education was .871 (t[97] = 2.266, p = .026), meaning that for each additional year of schooling… etc.
    6. Of more direct interest was the b associated with time spent on Homework (b = .988, t[97] = 2.737, p = .007). This finding suggests…
11
Q

What do we mean by Controlling for…

A
  • We add these clarifications to indicate that we have taken into account variables other than the single predictor and single outcome being interpreted, and thus to differentiate this interpretation from one focused on zero-order correlations or simple, bivariate regression coefficients.
12
Q

Will the simple regression coefficient and the MR coefficient be the same?

A
  • The simple regression coefficient and the MR coefficient will rarely be the same;
  • the MR coefficient will often be smaller.
13
Q

What does ‘within levels’ mean?

A

Same as ‘controlling for’

  • “Grade-point average will increase by .988 points, within levels of parent education”
14
Q

What is the problem with controlling for variables?

A

Controlling for other variables assumes we have included the right variables to control for, which is hard.

  • But all in all, MR is better than simple regression because we can explain the phenomenon and the DV more completely.
15
Q

What are Partial correlations?

A
  • A partial correlation can be thought of as the correlation between two variables with the effects of another variable controlled, or removed, or “partialed” out.
16
Q

Can you partial out multiple IVs?

A

Yes; we can have several control variables, calculating, for example, the partial correlation of Homework and Grades while controlling for both Parent edu and previous Achievement.

17
Q

What is Semipartial Correlation?

A
  • aka Part Correlation.
  • It is possible to remove the control variable from only one of the two variables as well, e.g. the correlation of HW (with Parent edu controlled) with Grades: Parent edu is removed from the HW variable, not from Grades (see the sketch below).
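A minimal sketch of the partial and semipartial correlations for this example, computed from the chapter's correlations (.327, .294, .277) using the standard partial- and semipartial-correlation formulas (the formulas themselves are not on these cards, so treat this purely as an illustration):

    from math import sqrt

    # Correlations from the chapter's example
    r_gh, r_gp, r_hp = 0.327, 0.294, 0.277   # Grades-HW, Grades-ParEd, HW-ParEd

    # Partial correlation: ParEd removed from both Grades and HW
    partial = (r_gh - r_gp * r_hp) / sqrt((1 - r_gp ** 2) * (1 - r_hp ** 2))

    # Semipartial (part) correlation: ParEd removed from HW only
    semipartial = (r_gh - r_gp * r_hp) / sqrt(1 - r_hp ** 2)

    print(round(partial, 3), round(semipartial, 3))   # ~.267 and ~.256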
18
Q

b versus ß: Uses of b

A
  1. When the variables are measured in a meaningful metric.

    Meaningful metric: HW is measured in hours per week, and grades on a 100-point scale, so ‘each hour of HW increases the GPA by .988 points’ makes sense.

    Non-meaningful metric: ParEd in years is not common; more often you would see a coding such as 1 = completed year 12, 2 = completed a degree, etc. In that case, interpreting b as a point increase in GPA for each year of ParEd doesn’t work.

  2. To develop intervention or policy implications.
  3. To compare effects across samples or studies.
19
Q

When to Interpret ß:

A
  1. When variables are not measured in a meaningful metric.
  2. To compare the relative effects of different predictors in the same sample.
    e.g., ß1 = .44 and ß2 = .22; can we conclude that the first effect is twice as strong as the second?
    Generally speaking we can, BUT we must remember that (among other things) ß’s are affected by the variability of the variables that are included and NOT included in the model.
20
Q

How to make comparisons across samples with NHST:

Say we redo the HW research on high school students. We now have two estimates: b = 1.143 for high school students and .988 for 8th graders. Are they significantly different?

A
  • Look at the 95% CI for the regression coefficient for the 8th graders: .272 to 1.704. The value of 1.143 (the new, high-school estimate) falls within this (previous) range, so we can say it is not significantly different from our previous estimate (.988).
  • Do a t-test: is the current value different from some value other than zero? We change the formula to t = (b - value)/SEb, where ‘value’ is the other value to which we are comparing b. A t of 2 or greater is often significant.
  • Compare the two regression estimates, taking the standard errors of both into account. The formula z = (b1 - b2)/√(SE2b1 + SE2b2) can be used to compare regression coefficients from two separate (independent) regression equations. It doesn’t matter which b goes first; it’s easiest to make the larger one b1 (see the sketch below).
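A minimal sketch of that z comparison. The 8th-grade SE (≈ .36) is backed out from b = .988 and t = 2.737; the high-school SE is a made-up placeholder, since the card does not report it:

    from math import sqrt

    b_hs, se_hs = 1.143, 0.42     # high-school estimate; this SE is hypothetical
    b_8th = 0.988
    se_8th = b_8th / 2.737        # ~0.36, inferred from the reported t

    z = (b_hs - b_8th) / sqrt(se_hs ** 2 + se_8th ** 2)
    print(round(z, 2))            # ~0.3, well below 1.96, so no significant difference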
21
Q

How to directly calculate ß

A

With only two IVs

  • ß1 = (ry1 - ry2 × r12) / (1 - r²12) and ß2 = (ry2 - ry1 × r12) / (1 - r²12)

So we have ßgrades*HW = (rgrades*HW - rgrades*Pared × rPared*HW) / (1 - r²Pared*HW)

Note that the ß of HW depends in part on the correlation between grades and HW (as in simple regression), but it also depends on the correlations between grades and ParEd, and between HW and ParEd. So the ß for HW is

ßHW = (rgrades*HW - rgrades*Pared × rPared*HW) / (1 - r²Pared*HW) = (.327 - .294 × .277) / (1 - .277²) = .246/.923 = .267 (within rounding error the same as the .266 SPSS gave us; a sketch of this calculation follows below).

REMEMBER TO TAKE OTHER VARIABLES INTO ACCOUNT / CONTROL FOR THEM / DON'T JUST INTERPRET THEM LIKE CORRELATIONS
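A minimal sketch of that calculation for both predictors, using only the three correlations from these cards:

    # Standardised coefficients for a two-predictor regression, from correlations alone
    r_y1, r_y2, r_12 = 0.327, 0.294, 0.277   # Grades-HW, Grades-ParEd, HW-ParEd

    beta_hw = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta_pared = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

    print(round(beta_hw, 3), round(beta_pared, 3))   # ~.266 and ~.220, matching SPSS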

22
Q

How to calculate R2

A
  1. R2 = SSRegression / SSTotal
  2. To calculate R2 using the ßs: R2y·12 = ß1 × ry1 + ß2 × ry2
  3. To calculate R2 from the correlations: R2 = (r²y1 + r²y2 - 2 × ry1 × ry2 × r12) / (1 - r²12)

Note again that R2 is reduced by a certain extent, and this reduction is related to the correlation between the two independent variables, r12 (see the sketch below).
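A minimal sketch checking methods 2 and 3 against each other with the chapter's correlations; both should land on the R2 of .152 reported earlier:

    # R2 for the two-predictor example, computed two ways
    r_y1, r_y2, r_12 = 0.327, 0.294, 0.277              # Grades-HW, Grades-ParEd, HW-ParEd

    beta1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)      # ß for HW
    beta2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)      # ß for ParEd

    r2_from_betas = beta1 * r_y1 + beta2 * r_y2
    r2_from_corrs = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

    print(round(r2_from_betas, 3), round(r2_from_corrs, 3))   # both ~.152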