Week 6 - Regression/ANCOVA Flashcards
o Explain the key distinction between experimental designs and correlational designs (x2, x3)
Experimental:
o Determines causation through manipulation of IVs on randomly assigned Ps in controlled setting, assessing effect on DV
o BUT some factors are impossible or unethical to manipulate (e.g. personality, brain damage, long-term stress)
Correlational:
o Measures IVs (predictors) and assesses level of association with outcome / DV (criterion)
o Uses bivariate regression (1 predictor) or multiple regression (> 1 predictor)
• Using the terms predictor and criterion is controversial
o Explain the distinction between design issues and statistical issues, and why the two should not be confused (x7)
Don’t confuse statistical (ANOVA vs regression) and design (correlational vs experimental) issues
o Experimental just means random assignment
ANOVA can also be used with observed (not randomly assigned) groups, e.g. gender, or high/medium/low groups
o Theory would be only justification for implying causality
In experimental, we get statistical reason as well as theoretical
o Regression can be used to examine experimental data too
It’s the design, not the statistical method, that tells us whether a study is correlational or experimental
o Define covariance (in words) (x2), and explain its main limitations (x1)
Average cross-product of the deviation scores
Sum of (X - meanX)(Y - meanY), divided by N - 1
Scale dependent, so it cannot be compared with covariances measured on other scales
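A minimal sketch of the covariance calculation, assuming the sample formula (dividing by N - 1) and a small made-up data set; the function name `covariance` is illustrative only. It also shows the scale-dependence problem: multiplying X by 100 multiplies the covariance by 100.

```python
def covariance(x, y):
    """Sample covariance: sum of cross-products of deviation scores over N - 1."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    return sum(cross_products) / (n - 1)

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = covariance(x, y)            # 1.5

# Re-expressing X on another scale (e.g. metres to centimetres)
x_rescaled = [xi * 100 for xi in x]
cov_rescaled = covariance(x_rescaled, y)   # 150.0: same relationship, different number
```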
o Define correlation in relation to covariance (in words) (x3) and explain how it addresses the limitations of covariance (x1)
Covariance, divided by the product of SDs of X and Y
Expresses the relationship between two variables in standard deviation units
r ranges from -1 to +1
Standardised, so comparable
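As a rough illustration of why the correlation is comparable across scales, the sketch below divides the covariance by the product of the two SDs, using made-up data and illustrative helper names. Rescaling X leaves r unchanged.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance (N - 1 in the denominator)."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

def sd(values):
    """Sample standard deviation (N - 1 in the denominator)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(x, y):
    """Pearson r: covariance divided by the product of the SDs."""
    return covariance(x, y) / (sd(x) * sd(y))

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = correlation(x, y)                                  # about 0.775
r_rescaled = correlation([xi * 100 for xi in x], y)    # same value as r
```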
o List the various terms used to indicate a bivariate correlation (x2)
Pearson correlation
zero-order correlation
o Define the coefficient of determination in the context of bivariate correlation (in words) (x1)
And identify the letter used to indicate it
o Proportion of variance in one variable that is explained by the variance in another
r² (r squared)
o Explain the relationship between the coefficient of determination and error/residual variance (x1)
Error (residual) variance = 1 - r², i.e. the proportion of variance not explained
o Explain what question is being addressed when we test r for significance (x1)
And which statistical test is used for this purpose?
Is r large enough to conclude that there is a non-zero correlation in the population?
t-test, with df = N - 2: t = r√(N - 2) / √(1 - r²), which weighs systematic variance against error variance
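A small sketch of the significance test for r, assuming the standard t formula with df = N - 2. The sample values and the function name `t_for_r` are made up for illustration, and the critical value is taken from a t table.

```python
import math

def t_for_r(r, n):
    """t statistic for H0: population correlation is zero (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = math.sqrt(0.6)   # illustrative sample correlation, about 0.775
n = 5                # illustrative sample size

t = t_for_r(r, n)    # about 2.12
df = n - 2

critical_t = 3.182   # two-tailed, alpha = .05, df = 3 (from a t table)
significant = abs(t) > critical_t   # False: r is large but N is tiny
```

The example shows that even a large r can fail to reach significance when the sample is very small.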
o Explain the difference between r and r-adjusted (x6)
r is a sample statistic and is a biased estimate of the population value
o Like eta-squared
rho (ρ) is the inferred population correlation coefficient, estimated by the “adjusted r”, r-adj
o Like omega-squared
r-adj is always smaller than r (more conservative), just as omega-squared is smaller than eta-squared
Difference between r and r-adj gets smaller with increasing sample size
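The cards do not give a formula for r-adj. The sketch below assumes one commonly taught version, r-adj = √[1 - (1 - r²)(N - 1)/(N - 2)], to show how the gap between r and r-adj shrinks as N grows. The function name and the values of r and N are illustrative.

```python
import math

def adjusted_r(r, n):
    """Adjusted r under the assumed formula; keeps the sign of r."""
    inside = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # With very small r and N the term can go negative; treat that as zero
    return math.copysign(math.sqrt(max(inside, 0.0)), r)

r = 0.5  # illustrative sample correlation
gaps = {}
for n in (10, 30, 100):
    gaps[n] = r - adjusted_r(r, n)
    print(n, round(adjusted_r(r, n), 3), round(gaps[n], 3))
```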
o Explain the relationship between bivariate correlation and bivariate regression (x5)
Use correlation to estimate score on one variable (Y, criterion) on the basis of scores on another (X, predictor)
Regression of Y on X implies that X is the IV - counterintuitive so make sure to remember
Objective is to find best fitting line on the scatter plot
o This represents best linear model of the data
o Explain what the various components of the bivariate regression equation represent (x8, for 4 components)
Ŷ = bX + a
o Equation says that our observed scores y are estimated by y-hat
Ŷ = predicted value of Y (dependent variable)
b = slope of regression line
o Change in Y associated with a 1-unit change in X
o Rise over run
X = value of predictor (independent variable)
a = intercept (value of Y when X = 0)
o Explain the relationship between b and β (beta) (x5)
b = covariance over the variance of X, or r times SDy over SDx
*Unstandardised
If the data were standardised: Sx = Sy = 1
o COVXY = rXY = b
o b would become standardised regression coefficient, β (beta)
β = Z-score change in Y predicted from a 1 standard deviation increase in X
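The relationship between b and β can be checked numerically. This is a sketch with made-up data and illustrative helper names: the slope is computed once on the raw scores (b) and once on z-scores (β), and β is then converted back to b using the two SDs.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sd(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def slope(x, y):
    """Slope of Y on X: covariance of X and Y over the variance of X."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    return cov / sd(x) ** 2

def z_scores(values):
    m, s = mean(values), sd(values)
    return [(v - m) / s for v in values]

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b = slope(x, y)                           # unstandardised slope, about 0.6
beta = slope(z_scores(x), z_scores(y))    # standardised slope, about 0.775 (equals r)
b_from_beta = beta * sd(y) / sd(x)        # back to b: r times SDy over SDx
```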
o Explain what the least squares criterion is (in bivariate regression) (x4)
The regression line is fitted to the data:
o Such that Σ(Y - Ŷ)² is a minimum
o i.e., squared errors of prediction are a minimum
o ei = Yi - Ŷi = errors of prediction
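A minimal sketch of the least squares idea, using made-up data and illustrative function names: the fitted slope and intercept give a smaller sum of squared errors than nearby alternative lines.

```python
def fit_line(x, y):
    """Least-squares slope b and intercept a for Y-hat = bX + a."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return b, a

def sum_squared_errors(x, y, b, a):
    """Sum of squared errors of prediction for the line Y-hat = bX + a."""
    return sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, a = fit_line(x, y)                                   # about 0.6 and 2.2
sse_best = sum_squared_errors(x, y, b, a)               # about 2.4
sse_steeper = sum_squared_errors(x, y, b + 0.1, a)      # larger than sse_best
sse_lower = sum_squared_errors(x, y, b, a - 0.2)        # larger than sse_best
```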
o Explain what the standard error of the estimate is (in words) (x3)
Average deviations from the regression line
Average error of prediction
Sy.x = √[Σ(Y - Ŷ)² / (N - 2)], i.e. the square root of the sum of squared errors of prediction divided by N - 2
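The standard error of the estimate can be sketched as below, using made-up data and illustrative function names: fit the line, sum the squared errors of prediction, divide by N - 2, and take the square root.

```python
import math

def fit_line(x, y):
    """Least-squares slope b and intercept a for Y-hat = bX + a."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    return b, mean_y - b * mean_x

def standard_error_of_estimate(x, y):
    """Square root of the sum of squared residuals over N - 2."""
    b, a = fit_line(x, y)
    ss_residual = sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(ss_residual / (len(x) - 2))

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

s_yx = standard_error_of_estimate(x, y)   # about 0.894
```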
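o Explain what the standard error of the estimate tells us (in bivariate regression) (x8)
Just as the SD tells us the proportion of scores under sections of the normal curve, the standard error of the estimate tells us the proportion of observed scores falling within given distances of the regression line (e.g. about 68% within ±1 Sy.x)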
o As long as assumptions of regression are met
Bigger rxy = smaller Sy.x
o A high correlation between X and Y…
• Reduces standard error of estimate
• Enhances accuracy of prediction
r² is overly liberal (inflated) with small samples
o Thus, we also find that Sy.x is underestimated for small samples
o Explain what question is being addressed when we test the regression slope (i.e., b or β) for significance (x1), and which test is used for this purpose (x1)
Sig. regression coefficient = a slope that significantly differs from zero (positive or negative)
t-test, with df = N - 2
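A sketch of the slope test, assuming the usual bivariate formulas: the standard error of b is the standard error of the estimate divided by the square root of the sum of squares of X, and t = b / SE(b). The data and function name are made up for illustration.

```python
import math

def slope_t_test(x, y):
    """Return (b, standard error of b, t, df) for the slope of Y on X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / ss_x
    a = mean_y - b * mean_x
    ss_residual = sum((yi - (b * xi + a)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(ss_residual / (n - 2))   # standard error of the estimate
    se_b = s_yx / math.sqrt(ss_x)             # standard error of the slope
    return b, se_b, b / se_b, n - 2

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b, se_b, t, df = slope_t_test(x, y)   # t is about 2.12 with df = 3
```

With a single predictor, this t equals the t obtained when testing r for the same data.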
o Explain what SSY, SSregression, and SSresidual represent
SSy = total variance in y (sum of the next two)
SSregression = variance due to the prediction/relationship
SSresidual = variance due to error
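The partition of the sums of squares can be verified on a small made-up data set. This is only a sketch with illustrative names; it also shows that SSregression / SSy reproduces r².

```python
def sums_of_squares(x, y):
    """Return (SS_Y, SS_regression, SS_residual) for the regression of Y on X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    predicted = [b * xi + a for xi in x]
    ss_y = sum((yi - mean_y) ** 2 for yi in y)                        # total
    ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)       # explained
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted)) # error
    return ss_y, ss_regression, ss_residual

# Illustrative data (made up for this example)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

ss_y, ss_regression, ss_residual = sums_of_squares(x, y)   # about 6, 3.6, 2.4
r_squared = ss_regression / ss_y                            # about 0.6
```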