Week 6 - Regression/ANCOVA Flashcards

1
Q

o Explain the key distinction between experimental designs and correlational designs 
(x2, x3)

A

Experimental:
o Determines causation through manipulation of IVs on randomly assigned Ps in controlled setting, assessing effect on DV
o BUT some factors are impossible or unethical to manipulate (e.g. personality, brain damage, long-term stress)
Correlational:
o Measures IVs (predictors) and assesses level of association with outcome / DV (criterion)
o Uses bivariate regression (1 predictor) or multiple regression (> 1 predictor)
• Using the terms predictor and criterion is controversial

2
Q

o Explain the distinction between design issues and statistical issues, and why the two 
should not be confused (x7)

A

Don’t confuse statistical (ANOVA vs regression) and design (correlational vs experimental) issues
o Experimental just means random assignment
Can use ANOVA with observed (not randomly assigned) groups, e.g. gender, hi/med/low
o Theory would be only justification for implying causality
In experimental, we get statistical reason as well as theoretical
o Regression can be used to examine experimental data too
It’s design, not stats method that tells us correlational/experimental

3
Q

o Define covariance (in words) (x2), and explain its main limitations 
(x1)

A

Average cross-product of the deviation scores
Sum of (X - meanX)(Y - meanY), divided by N - 1

Scale dependent, so no comparison with covariances on other scales
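
A minimal sketch of the definition above, assuming the usual sample formula (dividing by N - 1). The data and the variable names (`hours`, `minutes`, `score`) are hypothetical; they only illustrate that rescaling X changes the covariance even though the relationship is the same.

```python
def covariance(xs, ys):
    """Sample covariance: summed cross-products of deviation scores over N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical data: study time and test score for five participants.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
cov_hours = covariance(hours, score)

# The same study times in minutes: identical relationship, different scale.
minutes = [h * 60 for h in hours]
cov_minutes = covariance(minutes, score)

print(cov_hours, cov_minutes)
```

The covariance in minutes is 60 times the covariance in hours, which is why covariances on different scales cannot be compared directly.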

4
Q

o Define correlation in relation to covariance (in words) (x3) and explain how it addresses 
the limitations of covariance 
(x1)

A

Covariance, divided by the product of SDs of X and Y
Expresses relationship between two variables in standard deviations
r = -1 to 1
Standardised, so comparable
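
As a rough illustration of the answer above, the sketch below divides the covariance by the product of the two SDs. The data and helper names are hypothetical; the point is only that r is the same whether X is recorded in hours or minutes.

```python
import math

def sample_sd(values):
    """Sample standard deviation (N - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson r: covariance divided by the product of the SDs."""
    return covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))

# Hypothetical data, with X expressed on two different scales.
hours = [1, 2, 3, 4, 5]
minutes = [h * 60 for h in hours]
score = [2, 4, 5, 4, 5]

r_hours = correlation(hours, score)
r_minutes = correlation(minutes, score)
print(round(r_hours, 3), round(r_minutes, 3))
```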

5
Q

o List the various terms used to indicate a bivariate correlation 
(x2)

A

Pearson correlation

zero-order correlation

6
Q

o Define the coefficient of determination in the context of bivariate correlation (in 
words) (x1)
And identify the letter used to indicate it 


A

o Proportion of variance in one variable that is explained by the variance in another
r² (r squared)

7
Q

o Explain the relationship between the coefficient of determination and error/residual 
variance 
(x1)

A

Error (residual) variance, as a proportion of total variance = 1 - r²

8
Q

o Explain what question is being addressed when we test r for significance, (x1)
And which statistical test is used for this purpose ?

A

Is r large enough to conclude that there is a non-zero correlation in the population?
t-test (df = N - 2): t = systematic variance divided by error variance
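
A small sketch of this test, assuming the standard formula t = r√(N - 2) / √(1 - r²), which the card does not spell out. The values of `r` and `n` are hypothetical, and the result would still need to be compared with a critical t value for N - 2 degrees of freedom.

```python
import math

def t_for_r(r, n):
    """t statistic for H0: the population correlation is zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = math.sqrt(0.6)  # hypothetical sample correlation (r squared = .60)
n = 5               # hypothetical sample size

t_value = t_for_r(r, n)
df = n - 2
print(round(t_value, 3), df)
```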

9
Q

o Explain the difference between r and r-adjusted 
(x6)

A

r is sample statistic and is biased to the sample
o Like eta-squared
rho, ρ, is inferred population correlation coefficient – Estimated by the “adjusted r”, r-adj
o Like omega-squared
r-adj is always smaller than r (more conservative) as omega is to eta
Diff between r and r-adj gets smaller with increasing sample size
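
The shrinking gap can be sketched numerically. The formula below is an assumption (one commonly used adjustment for the bivariate case, not taken from the cards): adjusted r² = 1 - (1 - r²)(N - 1)/(N - 2). The sample sizes are hypothetical.

```python
import math

def adjusted_r(r, n):
    """Adjusted r for one predictor (assumed formula, see lead-in)."""
    adj_r_squared = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Small samples with weak r can give a negative value; floor it at zero.
    return math.sqrt(max(adj_r_squared, 0.0))

r = 0.5
r_adj_small = adjusted_r(r, 10)    # small sample: large correction
r_adj_large = adjusted_r(r, 1000)  # large sample: almost no correction
print(round(r_adj_small, 3), round(r_adj_large, 3))
```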

10
Q

o Explain the relationship between bivariate correlation and bivariate regression 
(x5)

A

Use correlation to estimate score on one variable (Y, criterion) on the basis of scores on another (X, predictor)
Regression of Y on X implies that X is the IV - counterintuitive so make sure to remember
Objective is to find best fitting line on the scatter plot
o This represents best linear model of the data

11
Q

o Explain what the various components of the bivariate regression equation represent (x8, for 4 components)


A

Ŷ = bX + a
o Equation says that our observed scores y are estimated by y-hat
Ŷ = predicted value of Y (dependent variable)
b = slope of regression line
o Change in Y associated with a 1-unit change in X
o Rise over run
X = value of predictor (independent variable)
a = intercept (value of Y when X = 0)

12
Q

o Explain the relationship between b and β (beta) 
(x5)

A

b = covariance over the variance of X, or r times SDy over SDx
*Unstandardised
If the data were standardised: Sx = Sy = 1
o COVXY = rXY = b
o b would become standardised regression coefficient, β (beta)
β = Z-score change in Y predicted from a 1 standard deviation increase in X
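
The relationships above can be checked on a small example. This is only a sketch with hypothetical data; the intercept formula a = meanY - b × meanX is the standard least-squares result and is not stated on the card.

```python
import math

# Hypothetical predictor (x) and criterion (y) scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
var_y = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

r = cov_xy / (sd_x * sd_y)

b = cov_xy / var_x         # unstandardised slope: covariance over variance of X
b_alt = r * sd_y / sd_x    # same slope, written as r times SDy over SDx
a = mean_y - b * mean_x    # intercept: predicted Y when X = 0
beta = b * sd_x / sd_y     # standardised slope; equals r with one predictor

print(round(b, 3), round(a, 3), round(beta, 3), round(r, 3))
```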

13
Q

o Explain what the least squares criterion is (in bivariate regression) 
(x4)

A

Regression lines are fitted to it:
o Such that Σ(Y - Ŷ)² is a minimum
o i.e., the sum of squared errors of prediction is a minimum
o ei = Yi – Ŷi = errors of prediction

14
Q

o Explain what the standard error of the estimate is (in words) (x3)

A

Average deviations from the regression line
Average error of prediction
The square root of: sum of (Y - Y-hat)²
divided by N - 2
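
A minimal sketch of that calculation, using hypothetical data and a hypothetical helper `fit_line` that returns the least-squares slope and intercept.

```python
import math

def fit_line(xs, ys):
    """Least-squares slope (b) and intercept (a) for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = cross / ss_x
    return b, mean_y - b * mean_x

# Hypothetical predictor and criterion scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

b, a = fit_line(x, y)
residuals = [yi - (b * xi + a) for xi, yi in zip(x, y)]  # Y - Y-hat
ss_residual = sum(e ** 2 for e in residuals)

# Standard error of the estimate: root of SSresidual over N - 2.
standard_error_estimate = math.sqrt(ss_residual / (n - 2))
print(round(standard_error_estimate, 3))
```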

15
Q

What does the standard error tell us (in 
bivariate regression) (x8)


A

As SD tells us proportion of scores under sections of normal curve, so does standard error of the estimate
o As long as assumptions of regression are met
Bigger rxy = smaller Sy.x
o A high correlation between X and Y…
• Reduces standard error of estimate
• Enhances accuracy of prediction
r² is overly liberal (inflated) with small samples
o Thus, we also find that Sy.x is underestimated for small samples

16
Q

o Explain what question is being addressed when we test the regression slope 
(i.e., b or β) for significance (x1), and which test is used for this purpose 
(x1)

A

Sig. regression coefficient = a slope that significantly differs from zero ( + or - )
t-test, with df = N -2

17
Q

o Explain what SSY, SSregression, and SSresidual represent 


A
SSy = total variability in Y (sum of the next two)
SSregression = variability explained by the prediction/relationship
SSresidual = variability due to error (unexplained)
18
Q

o Explain how the F ratio is calculated in regression (x3) and what question it is used to 
test 


A

MSregression, divided by MSresidual
Where:
MSregression = SSregression divided by df-regression (p = #predictors)
MSresidual = SSresidual divided by df-residual (N - p - 1)
Tests hypothesis that model accounts for significant variance in the DV (H0: R² = 0)
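
The partition and the F ratio can be sketched for the bivariate case (p = 1). The data are hypothetical; the code only illustrates that SSy splits into SSregression plus SSresidual and how the mean squares are formed.

```python
# Hypothetical predictor and criterion scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)
p = 1  # number of predictors

mean_x = sum(x) / n
mean_y = sum(y) / n
b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
     / sum((xi - mean_x) ** 2 for xi in x))
a = mean_y - b * mean_x
y_hat = [b * xi + a for xi in x]

ss_y = sum((yi - mean_y) ** 2 for yi in y)                     # total
ss_regression = sum((yh - mean_y) ** 2 for yh in y_hat)        # explained
ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error

df_regression = p
df_residual = n - p - 1

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
print(round(f_ratio, 3), df_regression, df_residual)
```

With a single predictor, this F equals the square of the t statistic for the slope.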

19
Q

o State what “ANCOVA” stands for 
(x1)

A

Analysis of covariance

20
Q

o Define covariate (in very simple terms) 
(x2)

A

A 2nd IV that you know will explain additional variance (based on previous research)
• Adding this control/concomitant variable changes the design of your study (now a 2-way factorial, not regression)

21
Q

o List the key similarity between ANCOVA and blocking 
(x1)

A

Both try to remove additional systematic variance from the error term, thereby increasing power

22
Q

o List the key difference between ANCOVA and blocking 
(x2, x2)

A

Blocking works at design level - including a known (categorical) DV correlate to reduce variance in DV
ANCOVA adjusts error term statistically with continuous variable - not matched to Ps at discrete levels

23
Q

o Identify the kinds of variables that can serve as covariates 
(x1, plus explain x1)

A

Continuous (rather than categories needed in blocking/ANOVA)

24
Q

o Explain how ANCOVA reduces error variance 
(x3)

A

By measuring another variable and estimating its parameters - partitioning out ‘unexplained’ variance in design
o If the variable affects the DV and is not part of statistical model for ANCOVA, it is in the unmeasured ‘error’…
If covariate unrelated you lose DF (lose power) without compensatory reduction in error

25
Q

o How does error reduction in ANCOVA increase statistical power for the test of the focal IV (i.e. the 
one we really care about in the study) 


A

Decrease in SSerror

So smaller denominator in F-test of significance

26
Q

o When ANCOVA adjusts the treatment means:


• explain why we would want to do this
 (x2)

A

Because you can have a confound if the covariate differs between levels of the focal IV
We care about treatment means at different levels of the focal IV, NOT the covariate

27
Q

o When ANCOVA adjusts the treatment means:


• explain what research question is being addressed (x2)

A

ANCOVA teases apart effects of covariate and the IV by asking:
“would focal IV have an effect on DV if all Ps were equivalent on the covariate?”

28
Q

o When ANCOVA adjusts the treatment means:


• explain how ANCOVA does this (x5)

A

Calculate overall covariate sample mean.
o Assume this is (unconfounded) population mean.
Thus, for sample, if group’s covariate mean is different to overall covariate mean, that is a confound.
Adjust group’s “expected” mean on DV to be what it would be if group’s covariate/overall mean were same, by using the regression line
Test group main effects using adjusted means
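
The steps above can be sketched as follows. This assumes the usual adjustment, adjusted mean = group DV mean - b × (group covariate mean - overall covariate mean), with b taken as the pooled within-group slope. The group names and scores are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical scores: the treatment group also happens to score higher
# on the covariate, which is a confound.
groups = {
    "control":   {"covariate": [1, 2, 3], "dv": [2, 3, 4]},
    "treatment": {"covariate": [3, 4, 5], "dv": [5, 6, 7]},
}

# Step 1: overall covariate mean across all participants.
all_covariate = [c for g in groups.values() for c in g["covariate"]]
overall_cov_mean = mean(all_covariate)

# Pooled within-group regression slope of the DV on the covariate.
cross_products = 0.0
ss_covariate = 0.0
for g in groups.values():
    cov_mean, dv_mean = mean(g["covariate"]), mean(g["dv"])
    cross_products += sum((c - cov_mean) * (d - dv_mean)
                          for c, d in zip(g["covariate"], g["dv"]))
    ss_covariate += sum((c - cov_mean) ** 2 for c in g["covariate"])
b_within = cross_products / ss_covariate

# Slide each group mean along the regression line to the overall covariate mean.
adjusted_means = {
    name: mean(g["dv"]) - b_within * (mean(g["covariate"]) - overall_cov_mean)
    for name, g in groups.items()
}
print(adjusted_means)
```

In this made-up example the raw group difference on the DV is 3 points, but only 1 point remains once the groups are equated on the covariate.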

29
Q

o In the F table for an ANCOVA:


• Explain what a significant effect of the covariate means

A

You’ve chosen a good one

30
Q

o In the F table for an ANCOVA:

• Explain what the likely implications are for a significant covariate on the test of the focal IV (i.e., the variable we care about) (x1)

A

Reduced error term and increased power

31
Q

o In the F table for an ANCOVA:


• Explain what a non-significant effect of the covariate means (x1)

A

It’s not related significantly to the DV, so is a rubbish covariate

32
Q

o In the F table for an ANCOVA:

• Explain what the implications of a non-significant effect of the covariate are for the test of the focal IV (i.e. the one we care about) (x1)

A

Won’t reduce error/increase chances of finding true treatment effect

33
Q

o Identify the assumptions of ANCOVA 
(x6, plus explain where necessary)

A

Regular ANOVA assumptions:
o Homogeneous variance
o Normal distribution
o Independence of errors
Plus:
o Covariate/DV relationship is linear (non-linear relationships degrade power)
o Covariate/DV is linear within each group
o DV/covariate relationship equal across treatment groups - homogeneity of regression slopes
• Ie, the lines are parallel - as treatment means adjusted based on average within-cell regression coefficient
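
A rough way to eyeball the homogeneity of regression slopes assumption is to fit the covariate/DV line separately in each group and compare the slopes. The sketch below uses hypothetical groups and scores; a formal check would test the covariate × focal IV interaction, which is not shown here.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs within one group."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / sum((x - mean_x) ** 2 for x in xs)

# Hypothetical groups: the first two have parallel lines, the third does not.
groups = {
    "control":     {"covariate": [1, 2, 3], "dv": [2, 3, 4]},
    "treatment_a": {"covariate": [3, 4, 5], "dv": [5, 6, 7]},
    "treatment_b": {"covariate": [2, 3, 4], "dv": [5, 7, 9]},
}

slopes = {name: slope(g["covariate"], g["dv"]) for name, g in groups.items()}
slope_range = max(slopes.values()) - min(slopes.values())
print(slopes, slope_range)
```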

34
Q

• What if there is an interaction between the covariate and the focal IV in ANCOVA?

A

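The homogeneity of regression slopes assumption is violated (the lines are not parallel), so adjusting the treatment means with a single average slope is not appropriate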

35
Q

In bivariate regression, beta equals… (x1)

A

The correlation coefficient

36
Q

If X is unknown, what is the best predictor of Y? (x1)

And error is? (x1)

A

Y-bar

Any deviation from the average

37
Q

If X is known, what is the best (conditional) predictor of Y? (x1)
And error is? (x1)

A

Y-hat
Deviations from the regression line -
The standard error of the estimate

38
Q

How do you calculate df for F-test in bivariate regression? (x3)

A
df-y = N - 1
df-regression = p (#predictors)
df-residual = N - p - 1
39
Q

How does ANCOVA adjust the treatment means? (think about the graph…) (x3)

A

Mean for each group slides along the regression line
Until it meets the overall covariate mean
Thus spreading the DV distributions/increasing the treatment effects (the adjustment can also shrink them, depending on the direction of the covariate difference)

40
Q

Describe the logic of ANCOVA (x6)

A

Adjusted treatment means assume that covariate means are same at each level of the focal IV
• Thus, diffs in adjusted treatment means attributed to focal IV only
Refines error term by subtracting predictable variation from covariate
o Larger adjustment when covariate-DV relationship is strong
Refines treatment effect to adjust for systematic group diffs on covariate that existed before experimental treatment

41
Q

What are two uses of ANCOVA? (x2)

A

To control unwanted variation that would otherwise inflate error with which we test models (classical usage)
To control for group diffs, esp. in analysis of clinical trials or other pre/post designs (controversial)

42
Q

What is the likely order of least to most significant results if testing same data with ANCOVA, ANOVA or blocking?
Why? (x1)

A

1-way ANOVA
Blocking
1-way ANCOVA

Because of decreasing error term in each test

43
Q

What are the advantages of blocking (x2) vs ANCOVA (x3)?

But restrictive … in ANCOVA? (x1)

A
•	Blocking
o	Conceptually simpler
o	Requires fewer assumptions
•	ANCOVA
o	Easier to administer
o	Can use continuous covariate
o	Removes effect from error term and DV
44
Q

In what 2 situations could you use ANCOVA?

A

o Covariate related to IV and DV (confound)

o Covariate related to DV only