Week 6 - Regression/ANCOVA Flashcards by Emma Richmond-Darvill

o Explain the key distinction between experimental designs and correlational designs  (x2, x3)

Experimental:
o Determines causation through manipulation of IVs on randomly assigned Ps in controlled setting, assessing effect on DV
o BUT some factors are impossible or unethical to manipulate (e.g. personality, brain damage, long-term stress)
Correlational:
o Measures IVs (predictors) and assesses level of association with outcome / DV (criterion)
o Uses bivariate regression (1 predictor) or multiple regression (> 1 predictor)
• Using the terms predictor and criterion is controversial

o Explain the distinction between design issues and statistical issues, and why the two  should not be confused (x7)

Don’t confuse statistical (ANOVA vs regression) and design (correlational vs experimental) issues
o Experimental just means random assignment
Can use ANOVA on observed (not randomly assigned) groups, e.g. gender, hi/med/low
o Theory would be only justification for implying causality
In experimental, we get statistical reason as well as theoretical
o Regression can be used to examine experimental data too
It’s design, not stats method that tells us correlational/experimental

o Define covariance (in words) (x2), and explain its main limitations  (x1)

Average cross-product of the deviation scores
Sum of: (X - meanX)(Y - meanY), divided by N - 1 (for a sample)

Scale dependent, so no comparison with covariances on other scales
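
A minimal sketch of the definition and its limitation, assuming the sample formula (dividing by N - 1). The variable names and the data are made up for illustration:

```python
def covariance(xs, ys):
    """Sample covariance: summed cross-products of deviations over N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)

# Hypothetical data: study time and test score for five people
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

cov_hours = covariance(hours, score)

# Same data, but study time expressed in minutes instead of hours
minutes = [h * 60 for h in hours]
cov_minutes = covariance(minutes, score)

print(cov_hours, cov_minutes)
```

The relationship is identical in both cases, yet the covariance is 60 times larger when time is in minutes, which is why covariances on different scales cannot be compared.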

o Define correlation in relation to covariance (in words) (x3) and explain how it addresses  the limitations of covariance  (x1)

Covariance, divided by the product of SDs of X and Y
Expresses relationship between two variables in standard deviations
r = -1 to 1
Standardised, so comparable
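
As a rough illustration (made-up data, sample formulas with N - 1 assumed), dividing the covariance by the product of the SDs gives a value that does not change when a variable is rescaled:

```python
import math

def correlation(xs, ys):
    """Pearson r: covariance divided by the product of the two SDs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)

# Hypothetical data: study time and test score
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
minutes = [h * 60 for h in hours]  # same variable on a different scale

r_hours = correlation(hours, score)
r_minutes = correlation(minutes, score)

print(r_hours, r_minutes)
```

Both calls give the same r (about 0.77), because the change of scale cancels out between the covariance and the SD.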

o List the various terms used to indicate a bivariate correlation  (x2)

Pearson correlation

zero-order correlation

o Define the coefficient of determination in the context of bivariate correlation (in  words) (x1)
And identify the letter used to indicate it  

o Proportion of variance in one variable that is explained by the variance in another
r²

o Explain the relationship between the coefficient of determination and error/residual  variance  (x1)

Error = 1 - r²

o Explain what question is being addressed when we test r for significance, (x1)
And which statistical test is used for this purpose ?

Is r large enough to conclude that there is a non-zero correlation in the population?
t-test (df = N - 2): t = r√(N - 2) / √(1 - r²), i.e. r relative to its standard error
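
A small sketch of the usual t statistic for a correlation, t = r√(N - 2) / √(1 - r²) with df = N - 2. The function name and the example values (r = 0.5, N = 27) are invented for illustration:

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: population correlation = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.5
n = 27
t_value = t_for_r(r, n)
df = n - 2

print(t_value, df)
```

The resulting t would then be compared against the critical t for df = N - 2 from a table or statistics package.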

o Explain the difference between r and r-adjusted  (x6)

r is sample statistic and is biased to the sample
o Like eta-squared
rho, ρ, is inferred population correlation coefficient – Estimated by the “adjusted r”, r-adj
o Like omega-squared
r-adj is always smaller than r (more conservative) as omega is to eta
Diff between r and r-adj gets smaller with increasing sample size
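
The card does not give a formula for the adjustment. One common version works on the squared scale: adjusted r² = 1 - (1 - r²)(N - 1) / (N - p - 1). The sketch below assumes that formula (the course may use a different one) and shows the gap shrinking as N grows:

```python
def adjusted_r_squared(r_squared, n, p=1):
    """Shrunken r-squared; p is the number of predictors (1 for bivariate)."""
    return 1 - (1 - r_squared) * (n - 1) / (n - p - 1)

r_squared = 0.25  # hypothetical sample value

small = adjusted_r_squared(r_squared, n=10)
large = adjusted_r_squared(r_squared, n=100)

print(small, large)
```

With N = 10 the adjusted value drops well below 0.25, while with N = 100 it stays close to it.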

o Explain the relationship between bivariate correlation and bivariate regression  (x5)

Use correlation to estimate score on one variable (Y, criterion) on the basis of scores on another (X, predictor)
Regression of Y on X implies that X is the IV - counterintuitive so make sure to remember
Objective is to find best fitting line on the scatter plot
o This represents best linear model of the data

o Explain what the various components of the bivariate regression equation represent (x8, for 4 components) 

Ŷ = bX + a
o Equation says that our observed scores y are estimated by y-hat
Ŷ = predicted value of Y (dependent variable)
b = slope of regression line
o Change in Y associated with a 1-unit change in X
o Rise over run
X = value of predictor (independent variable)
a = intercept (value of Y when X = 0)
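
To make the four components concrete, here is a sketch that fits the line to made-up data. It uses the standard least-squares results b = SPxy / SSx and a = meanY - b × meanX (the intercept formula is not stated on the card):

```python
def fit_line(xs, ys):
    """Return (b, a) for the least-squares line Y-hat = b*X + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp_xy / ss_x            # slope: change in Y per 1-unit change in X
    a = mean_y - b * mean_x     # intercept: predicted Y when X = 0
    return b, a

# Hypothetical predictor (X) and criterion (Y) scores
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b, a = fit_line(xs, ys)
y_hat_at_6 = b * 6 + a  # predicted Y for a new X value of 6

print(b, a, y_hat_at_6)
```

For this data the slope is 0.6 and the intercept about 2.2, so each extra unit of X adds 0.6 to the predicted Y.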

o Explain the relationship between b and β (beta)  (x5)

b = covariance over the variance of X, or r times SDy over SDx
*Unstandardised
If the data were standardised: Sx = Sy = 1
o COVXY = rXY = b
o b would become standardised regression coefficient, β (beta)
β = Z-score change in Y predicted from a 1 standard deviation increase in X
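
A quick check of this relationship on made-up data: the slope from z-scored variables (β) should match r, and r can be recovered from b by rescaling with the two SDs. Helper names are illustrative only:

```python
import math

def sd(values):
    """Sample standard deviation (N - 1 in the denominator)."""
    n = len(values)
    m = sum(values) / n
    return math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))

def slope(xs, ys):
    """Least-squares slope: covariance of X and Y over the variance of X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    return sp_xy / ss_x

def z_scores(values):
    """Standardise scores to mean 0 and SD 1."""
    m = sum(values) / len(values)
    s = sd(values)
    return [(v - m) / s for v in values]

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b = slope(xs, ys)                          # unstandardised slope
beta = slope(z_scores(xs), z_scores(ys))   # standardised slope
r = b * sd(xs) / sd(ys)                    # rearranged from b = r * SDy / SDx

print(b, beta, r)
```

Here b is 0.6 in raw units, while β and r agree at about 0.77.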

o Explain what the least squares criterion is (in bivariate regression)  (x4)

The regression line is fitted to the data:
o Such that Σ(Y - Ŷ)² is a minimum
o i.e., errors of prediction are a minimum
o ei = Yi – Ŷi = errors of prediction
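
The criterion can be illustrated by comparing the summed squared errors of the fitted line with those of a few other candidate lines. The data and the alternative lines below are arbitrary choices for illustration:

```python
def sum_squared_errors(b, a, xs, ys):
    """Sum of squared errors of prediction for the line Y-hat = b*X + a."""
    return sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares slope and intercept from the closed-form solution
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x

best_sse = sum_squared_errors(b, a, xs, ys)

# Arbitrary alternative lines: each should fit worse than the fitted one
alternatives = [(0.7, 2.2), (0.6, 2.0), (0.5, 2.5), (1.0, 1.0)]
other_sses = [sum_squared_errors(b_alt, a_alt, xs, ys) for b_alt, a_alt in alternatives]

print(best_sse, other_sses)
```

No other straight line gives a smaller sum of squared errors than the fitted one.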

o Explain what the standard error of the estimate is (in words) (x3)

Average deviations from the regression line
Average error of prediction
The square root of: Σ(Y - Ŷ)² divided by (N - 2)
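
A sketch of the calculation on made-up data, using the definition above (square root of the summed squared residuals over N - 2). The function name is illustrative:

```python
import math

def standard_error_of_estimate(xs, ys):
    """Square root of the sum of squared residuals divided by N - 2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x
    ss_residual = sum((y - (b * x + a)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(ss_residual / (n - 2))

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

s_yx = standard_error_of_estimate(xs, ys)
print(s_yx)
```

The result (about 0.89 here) is in the units of Y, so it reads as a typical size of prediction error.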

What does the standard error tell us (in  bivariate regression) (x8) 

As SD tells us proportion of scores under sections of normal curve, so does standard error of the estimate
o As long as assumptions of regression are met
Bigger rxy = smaller Sy.x
o A high correlation between X and Y…
• Reduces standard error of estimate
• Enhances accuracy of prediction
r² is overly liberal (inflated) with small samples
o Thus, we also find that Sy.x is underestimated for small samples

o Explain what question is being addressed when we test the regression slope (i.e., b or β) for significance (x1), and which test is used for this purpose (x1)

Sig. regression coefficient = a slope that significantly differs from zero ( + or - )
t-test, with df = N -2

o Explain what SSY, SSregression, and SSresidual represent  

SSy = total variance in y (sum of the next two...)
SSregression = variance due to the prediction/relationship
SSresidual = variance due to error

o Explain how the F ratio is calculated in regression (x3) and what question it is used to test

MSregression, divided by MSresidual
Where:
MSregression = SSregression divided by df-regression (p = #predictors)
MSresidual = SSresidual divided by df-residual (N - p - 1)
Tests hypothesis that model accounts for significant variance in the DV (H0: R² = 0)
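
The partition and the F ratio can be worked through on made-up data. This sketch assumes one predictor (p = 1), so df-regression = 1 and df-residual = N - 2:

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
p = 1  # number of predictors in bivariate regression
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Fit the least-squares line
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
    (x - mean_x) ** 2 for x in xs
)
a = mean_y - b * mean_x
predicted = [b * x + a for x in xs]

# Partition the total sum of squares
ss_y = sum((y - mean_y) ** 2 for y in ys)                       # total
ss_residual = sum((y - yh) ** 2 for y, yh in zip(ys, predicted))  # error
ss_regression = ss_y - ss_residual                               # explained

# Mean squares and F
df_regression = p
df_residual = n - p - 1
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual

r_squared = ss_regression / ss_y

print(ss_y, ss_regression, ss_residual, f_ratio, r_squared)
```

For this data SSY = 6 splits into 3.6 (regression) and 2.4 (residual), giving r² = 0.6 and F(1, 3) = 4.5.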

o State what “ANCOVA” stands for  (x1)

Analysis of covariance

o Define covariate (in very simple terms)  (x2)

A 2nd IV that you know will explain additional variance (based on previous research)
• Adding this control/concomitant variable changes the design of your study (now a 2-way factorial, not regression)

o List the key similarity between ANCOVA and blocking  (x1)

Both try to account for additional systematic variance from the error term (ie remove it), thereby increasing power

o List the key difference between ANCOVA and blocking  (x2, x2)

Blocking works at design level - including a known (categorical) DV correlate to reduce variance in DV
ANCOVA adjusts error term statistically with continuous variable - not matched to Ps at discrete levels

o Identify the kinds of variables that can serve as covariates  (x1, plus explain x1)

Continuous (rather than categories needed in blocking/ANOVA)

o Explain how ANCOVA reduces error variance  (x3)

By measuring another variable and estimating its parameters - partitioning out ‘unexplained’ variance in design
o If the variable affects the DV and is not part of statistical model for ANCOVA, it is in the unmeasured ‘error’…
If covariate unrelated you lose DF (lose power) without compensatory reduction in error

o How does error reduction in ANCOVA increase statistical power for the test of the focal IV (i.e. the  one we really care about in the study)  

Decrease in SSerror | So smaller denominator in F-test of significance

o When ANCOVA adjusts the treatment means:  | • explain why we would want to do this  (x2)

Because there can be a confound if the covariate differs between levels of the focal IV - we care about treatment means at the different focal levels, NOT the covariate

o When ANCOVA adjusts the treatment means:  | • explain what research question is being addressed (x2)

ANCOVA teases apart effects of covariate and the IV by asking: “would focal IV have an effect on DV if all Ps were equivalent on the covariate?”

o When ANCOVA adjusts the treatment means:  | • explain how ANCOVA does this (x5)

Calculate overall covariate sample mean
o Assume this is the (unconfounded) population mean
Thus, for the sample, if a group’s covariate mean is different to the overall covariate mean, that is a confound
Adjust the group’s “expected” mean on the DV to be what it would be if the group’s covariate mean and the overall mean were the same, by using the regression line
Test group main effects using adjusted means
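
The adjustment can be sketched with the usual formula: adjusted mean = group DV mean - b_within × (group covariate mean - overall covariate mean), where b_within is the pooled within-group slope. The group names and data are invented, and the formula is an assumption about what the course intends:

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical two-group study: covariate and DV scores per group
groups = {
    "control":   {"cov": [1, 2, 3], "dv": [2, 3, 4]},
    "treatment": {"cov": [3, 4, 5], "dv": [5, 6, 7]},
}

# Overall covariate mean across all participants
grand_cov_mean = mean([x for g in groups.values() for x in g["cov"]])

# Pooled within-group regression slope of DV on covariate
sp_within = 0.0
ss_within = 0.0
for g in groups.values():
    mx, my = mean(g["cov"]), mean(g["dv"])
    sp_within += sum((x - mx) * (y - my) for x, y in zip(g["cov"], g["dv"]))
    ss_within += sum((x - mx) ** 2 for x in g["cov"])
b_within = sp_within / ss_within

raw_means = {name: mean(g["dv"]) for name, g in groups.items()}

# Slide each group mean along the regression line to the overall covariate mean
adjusted_means = {
    name: mean(g["dv"]) - b_within * (mean(g["cov"]) - grand_cov_mean)
    for name, g in groups.items()
}

print(raw_means, adjusted_means)
```

In this made-up data the groups differ on the covariate, so part of the raw difference (3 vs 6) reflects the covariate; after adjustment the means are 4 and 5. With other data the adjustment could widen the gap instead.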

o In the F table for an ANCOVA:  | • Explain what a significant effect of the covariate means

You've chosen a good one

o In the F table for an ANCOVA:  • Explain what the likely implications are for a significant covariate on the test of the focal IV (i.e., the variable we care about) (x1)

Reduced error term and increased power

o In the F table for an ANCOVA:  | • Explain what a non-significant effect of the covariate means (x1)

It's not related significantly to the DV, so is a rubbish covariate

o In the F table for an ANCOVA:  • Explain the implications of a non-significant effect of the covariate means are for the test of the focal IV (i.e. the one we care about) (x1)

Won't reduce error/increase chances of finding true treatment effect

o Identify the assumptions of ANCOVA  (x6, plus explain where necessary)

Regular ANOVA assumptions:
o Homogeneous variance
o Normal distribution
o Independence of errors
Plus:
o Covariate/DV relationship is linear (non-linear relationships degrade power)
o Covariate/DV is linear within each group
o DV/covariate relationship equal across treatment groups - homogeneity of regression slopes
• Ie, the lines are parallel - as treatment means adjusted based on average within-cell regression coefficient

• What if there is an interaction between the covariate and the focal IV in ANCOVA?

In bivariate regression, beta equals... (x1)

The correlation coefficient

If X is unknown, what is the best predictor of Y? (x1) | And error is? (x1)

Y-bar | Any deviation from the average

If X is known, what is the best (conditional) predictor of Y? (x1) And error is? (x1)

Y-hat | Deviations from the regression line - the standard error of the estimate

How do you calculate df for F-test in bivariate regression? (x3)

```
df-y = N - 1
df-regression = p (#predictors)
df-residual = N - p - 1
```

How does ANCOVA adjust the treatment means? (think about the graph...) (x3)

Mean for each group slides along the regression line
Until it meets the overall covariate mean
Thus spreading the DV distributions/increasing the treatment effects

Describe the logic of ANCOVA (x6)

Adjusted treatment means assume that covariate means are the same at each level of the focal IV
• Thus, diffs in adjusted treatment means attributed to focal IV only
Refines error term by subtracting predictable variation from covariate
o Larger adjustment when covariate-DV relationship is strong
Refines treatment effect to adjust for systematic group diffs on covariate that existed before experimental treatment

What are two uses of ANCOVA? (x2)

To control unwanted variation that would otherwise inflate error with which we test models (classical usage)
To control for group diffs, esp. in analysis of clinical trials or other pre/post designs (controversial)

What is the likely order of least to most significant results if testing same data with ANCOVA, ANOVA or blocking? Why? (x1)

1-way ANOVA
Blocking
1-way ANCOVA
Because of decreasing error term in each test

What are the advantages of blocking (x2) vs ANCOVA (x3)? | But restrictive ... in ANCOVA? (x1)

```
• Blocking
o Conceptually simpler
o Requires fewer assumptions
• ANCOVA
o Easier to administer
o Can use continuous covariate
o Removes effect from error term and DV
```

In what 2 situations could you use ANCOVA?

o Covariate related to IV and DV (confound) | o Covariate related to DV only

(44 cards)