Research Methods III Flashcards
Covariance
- Reflects the degree to which 2 variables vary together.
- Relationship between two continuous variables in original RAW (unstandardized) scale.
- Scale dependent.
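A minimal numerical sketch (hypothetical data; numpy assumed) of computing a sample covariance and of its scale dependence:

```python
import numpy as np

# Hypothetical data: hours studied (X) and exam score (Y).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([55.0, 60.0, 70.0, 75.0, 90.0])

# Sample covariance: sum of the products of deviations, divided by n - 1.
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
print(cov_xy)                   # positive: X and Y vary together

# Scale dependence: rescaling X (hours -> minutes) rescales the covariance.
print(np.cov(x * 60, y)[0, 1])  # 60x the value above
```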
Sum of Squares
Compute the deviations of X (or Y) from the mean, square them, and sum (Σ); SS cannot be a negative value.
Sum of Products
Compute the product of the X and Y deviations for each observation and sum (Σ); SP can be negative.
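A sketch of both computations on the same hypothetical data (numpy assumed):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([55.0, 60.0, 70.0, 75.0, 90.0])

dx, dy = x - x.mean(), y - y.mean()  # deviations from each mean

ss_x = np.sum(dx ** 2)    # Sum of Squares: squared deviations, never negative
ss_y = np.sum(dy ** 2)
sp_xy = np.sum(dx * dy)   # Sum of Products: negative if X and Y move oppositely
print(ss_x, ss_y, sp_xy)
```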
Correlation
Standardized (z-score) measure of linear relationship between 2 continuous variables.
- Standardized.
- Z-score.
- Scale invariant.
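A sketch showing r as the covariance of z-scores, and its scale invariance (same hypothetical data; numpy assumed):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([55.0, 60.0, 70.0, 75.0, 90.0])

# r is the covariance of the z-scores (ddof=1 throughout).
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
r = np.sum(zx * zy) / (len(x) - 1)

print(r, np.corrcoef(x, y)[0, 1])    # same value
print(np.corrcoef(x * 60, y)[0, 1])  # scale invariant: unchanged by rescaling X
```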
Fisher z-test
Tests whether 2 independent sample correlations differ.
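A sketch of the test using the standard r-to-z transformation, with hypothetical correlations and sample sizes (numpy and scipy assumed):

```python
import numpy as np
from scipy import stats

def fisher_z_test(r1, n1, r2, n2):
    """Compare two independent sample correlations via Fisher's r-to-z."""
    z1, z2 = np.arctanh(r1), np.arctanh(r2)    # Fisher transformation
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # SE of the difference
    z = (z1 - z2) / se
    return z, 2 * stats.norm.sf(abs(z))        # z and two-tailed p

# Hypothetical example: r = .50 (n = 100) vs r = .30 (n = 120).
print(fisher_z_test(0.50, 100, 0.30, 120))
```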
Effect size of r
- Pearson correlation.
- Provides a measure of effect size because it is based on standardized scores.
- ±.10 = small effect, ±.30 = medium effect, ±.50 = large effect.
Regression
The statistical technique that produces the best-fitting straight line for predicting Y from X.
Regression equation
Yi = b0 + b1Xi + Ei
Yi
Dependent or outcome variable, criterion variable
Xi
Independent variable, predictor variable
b1
Regression coefficient for the predictor.
Gradient (slope).
b0
y intercept
value of y when x = 0
Ei
the errors in prediction based on the regression
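Putting the pieces of the regression equation together on simulated data (hypothetical values; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100)                              # Xi: predictor
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=100)   # true b0 = 2.0, b1 = 1.5

b1, b0 = np.polyfit(x, y, deg=1)   # fitted slope and y-intercept
resid = y - (b0 + b1 * x)          # Ei: errors in prediction
print(b0, b1)                      # close to 2.0 and 1.5
```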
Assumptions of Regression:
Linearity
Based on linear correlations, assumes linear bivariate relationship between each x and y, and also between y and predicted y.
Assumptions of regression:
Normality
Normally distributed, both univariate and multivariate distributions of residuals.
Y scores are independent and normally distributed (Shapiro-Wilk)
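A sketch of the Shapiro-Wilk check applied to residuals (simulated residuals; scipy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
resid = rng.normal(size=100)   # residuals from some fitted model (simulated)

w, p = stats.shapiro(resid)    # Shapiro-Wilk test of normality
print(w, p)                    # p > .05: no evidence against normality
```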
Assumptions of regression:
Independence of scores
Independence of Y (outcome: DV) scores.
Assumptions of regression:
Independence of errors
Errors (residuals) from observations should not be correlated with each other (Durbin-Watson test)
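A sketch of the Durbin-Watson statistic computed directly from its formula (numpy assumed):

```python
import numpy as np

def durbin_watson(resid):
    """DW ~ 2 means uncorrelated residuals; < 2 suggests positive
    autocorrelation, > 2 negative autocorrelation."""
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

rng = np.random.default_rng(2)
print(durbin_watson(rng.normal(size=200)))             # near 2: independent
print(durbin_watson(np.cumsum(rng.normal(size=200))))  # far below 2: correlated
```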
Assumptions of regression: Minimal multicollinearity
Predictors (IVs) should not be highly correlated with each other.
No higher than r = .80 for predictors.
Want Variance Inflation Factor (VIF) to be less than 10.
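A sketch computing VIF from its definition, VIF_j = 1 / (1 − R²_j), where R²_j comes from regressing predictor j on the other predictors (hypothetical correlated predictors; numpy assumed):

```python
import numpy as np

def vif(X, j):
    """VIF for column j of the predictor matrix X."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])   # intercept + other IVs
    coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ coef
    r2 = 1 - np.sum(resid ** 2) / np.sum((X[:, j] - X[:, j].mean()) ** 2)
    return 1 / (1 - r2)

rng = np.random.default_rng(3)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.3, size=200)  # nearly redundant with x1
X = np.column_stack([x1, x2])
print(vif(X, 0), vif(X, 1))                # large (> 10) flags multicollinearity
```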
Assumptions of regression:
Homoscedasticity
The variance of the residuals is uniform across all values of Y (test with Levene’s test); can be assumed when the sample size is large, but not when it is small.
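One rough way to apply Levene's test to residuals is to split them at the median predicted value and compare variances (a sketch, not the only approach; scipy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
y_hat = rng.normal(size=200)   # predicted values (simulated)
resid = rng.normal(size=200)   # residuals (simulated, homoscedastic)

low = resid[y_hat < np.median(y_hat)]
high = resid[y_hat >= np.median(y_hat)]
stat, p = stats.levene(low, high)  # Levene's test for equal variances
print(stat, p)                     # p > .05: homoscedasticity not rejected
```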
Ordinary Least Squares regression (OLS)
- Yields values for b-weights (regression coefficients) and the y-intercept that will result in the sum of the squared residuals being at the minimum (smallest).
Best fitting line = smallest total error.
Resulting regression line = least-square error solution.
The b-weights and y-intercept are the values for which SS residuals is at its minimum.
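The closed-form OLS solution for one predictor, tying back to the SS and SP cards above (a sketch; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)

dx, dy = x - x.mean(), y - y.mean()
b1 = np.sum(dx * dy) / np.sum(dx ** 2)  # slope = SP_xy / SS_x
b0 = y.mean() - b1 * x.mean()           # y-intercept

resid = y - (b0 + b1 * x)
print(b0, b1, np.sum(resid ** 2))  # no other line yields a smaller SS residual
```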
Partially standardized regression coefficient
Regression coefficient predicting Y from X.
Only standardized on x, not y.
The Regression Model & Sum of Squares
Squaring each of the deviations and summing across observations yields SS for each source of variability of y.
ANOVA to test the Regression Model:
Regression
Variability in Y that can be explained by the predictor(s) – represents the component of Y that is shared with X1.
ANOVA to test the Regression Model:
Residual
Variability in Y that cannot be explained by the predictor – simply what is ‘left over’ after accounting for X.
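A sketch verifying the partition SS total = SS regression + SS residual on simulated data (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

ss_total = np.sum((y - y.mean()) ** 2)
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # variability explained by the predictor
ss_resid = np.sum((y - y_hat) ** 2)       # what is 'left over'
print(ss_total, ss_reg + ss_resid)        # equal: the SS partition
```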
t-test to test the Regression coefficient
b-coefficients
unstandardized (raw) regression coefficients
t-test to test the Regression coefficient:
β (beta): standardized (z-score) regression coefficients
One standard deviation increase in X results in an expected change of beta standard deviation units in Y.
Increase in X, change in Y
Suppression
A second predictor variable (X2) that is unrelated to Y (dependent variable) raises the amount of variance explained by the first predictor by eliminating certain irrelevant aspects of the first predictor (X1).
X2 suppresses some of the “error” or “irrelevant” variance in X1.
Classical suppression
r = 0, but beta ≠ 0 (the suppressor is uncorrelated with Y yet receives a nonzero beta).
Surprising Suppression
beta > r
Surprising because the beta values for both X1 and X2 exceed their respective correlations with Y.
Partial Correlation
Correlation between one DV and one IV (Y and X1) with one or more other variables (e.g., X2, X3) partialed from BOTH the DV and the first IV.
Semi-partial Correlation
The part correlation has X2 partialed out of predictor X1 ONLY. It is the correlation of Y with the part of X1 that is independent of X2.
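A sketch contrasting the two, computing each from residualized variables (simulated data; numpy assumed):

```python
import numpy as np

def residualize(v, w):
    """Residuals of v after regressing out w (with intercept)."""
    b1 = np.sum((w - w.mean()) * (v - v.mean())) / np.sum((w - w.mean()) ** 2)
    return v - (v.mean() + b1 * (w - w.mean()))

rng = np.random.default_rng(7)
x2 = rng.normal(size=300)
x1 = 0.6 * x2 + rng.normal(size=300)
y = 0.5 * x1 + 0.5 * x2 + rng.normal(size=300)

x1_res = residualize(x1, x2)
# Partial: X2 removed from BOTH Y and X1.
print(np.corrcoef(residualize(y, x2), x1_res)[0, 1])
# Semi-partial (part): X2 removed from X1 ONLY.
print(np.corrcoef(y, x1_res)[0, 1])
```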
Coding
Regression framework is flexible.
Categorical or nominal independent variables can be used in Multiple Regression/Correlations.
Vectors
g − 1 (one vector fewer than the number of categories).
The vectors represent the df of the IV.
All coding systems come up with the same correlations, BUT they produce different regression equations.
Dummy Coding
0s and 1s.
Representation of a variable consisting of g categories by creating g − 1 variables (vectors); each of the g − 1 categories is coded 1 on its own vector while the remaining categories are coded 0.
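A sketch with g = 3 groups, so g − 1 = 2 dummy vectors (made-up scores; numpy assumed):

```python
import numpy as np

group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])       # group 0 = reference
y = np.array([4., 5., 6., 7., 8., 9., 1., 2., 3.])  # group means: 5, 8, 2

d1 = (group == 1).astype(float)  # coded 1 only for group 1
d2 = (group == 2).astype(float)  # coded 1 only for group 2

A = np.column_stack([np.ones(len(y)), d1, d2])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
print(b)  # [5, 3, -3]: reference-group mean, then differences from it
```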
Regression = …
ANOVA
Contrast Coding
Identification of specific comparisons of interest & assigning values that enable the treatments to be directly compared.
In a two group design, one group is assigned +1 and the other group is assigned -1.
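A sketch of the two-group +1/−1 case (made-up scores; numpy assumed). Note that with ±1 codes the b-weight is half the difference between the two group means:

```python
import numpy as np

c = np.array([1., 1., 1., -1., -1., -1.])  # +1 / -1 contrast codes
y = np.array([10., 12., 14., 4., 6., 8.])  # group means: 12 vs 6

A = np.column_stack([np.ones(len(y)), c])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
print(b)  # b0 = 9 (mean of the group means), b1 = 3 (half of 12 - 6)
```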
Effects Coding
A method of coding categorical variables in which each group is compared to the weighted or unweighted mean of all the groups.
Effects Coding:
Unweighted effects coding
Used when you want to compare the mean of a particular group with the grand mean, regardless of the proportions in the population or in the sample.
Effects coding:
The group coded -1 is known as…
The reference group.
Effects coding:
The group coded 1 may be referred to as…
A coded group.
b0 = ?
Grand mean across all groups.
b1 = ?
Difference between the group mean and the grand mean.
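A sketch of unweighted effects coding with three equal-sized groups, confirming b0 = grand mean and each b = group mean − grand mean (made-up scores; numpy assumed):

```python
import numpy as np

group = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])       # group 2 = reference (-1)
y = np.array([4., 5., 6., 7., 8., 9., 1., 2., 3.])  # group means: 5, 8, 2

e1 = np.where(group == 0, 1., np.where(group == 2, -1., 0.))
e2 = np.where(group == 1, 1., np.where(group == 2, -1., 0.))

A = np.column_stack([np.ones(len(y)), e1, e2])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
print(b)  # b0 = 5 (grand mean), b1 = 0 (5 - 5), b2 = 3 (8 - 5)
```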
Dummy Coding b-weight
Indicate differences between conditions.
Contrast Coding b-weight
Used to calculate difference between conditions compared.
Effect Coding b-weight
Indicates the difference between the group mean (Ȳa) and the grand mean (Ȳt).
Dummy Coding y-intercept
Y-bar when all X’s are 0; the mean of the reference group
Contrast Coding y-intercept
Y-bar; mean of all the groups
Effect Coding y-intercept
Y-bar T; mean of all the groups
Ordinal regression
When the range is sufficient (i.e., approximately > 6 levels), treat as continuous; if not, treat as nominal.
Only needs 1 df if treated as continuous.
The df is larger (k − 1) if treated as nominal, which can reduce power.
Power of regression predictor type from highest to lowest:
Continuous, ordinal, nominal
Interaction
The effect of 1 IV on DV changes based on level of another IV.
If each factor has 2 levels (or 2 groups), only one vector per factor is needed to differentiate its groups in Multiple Regression (MRC).
How to code for interaction
The code for the interaction is simply the multiplication of Vector A and Vector B.
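A sketch of a balanced 2 × 2 design: one ±1 vector per factor, and their elementwise product as the interaction vector (made-up scores; numpy assumed):

```python
import numpy as np

a = np.array([1., 1., -1., -1., 1., 1., -1., -1.])   # factor A codes
b_ = np.array([1., -1., 1., -1., 1., -1., 1., -1.])  # factor B codes
ab = a * b_                                          # interaction = A x B product

y = np.array([10., 6., 7., 5., 12., 4., 9., 3.])
X = np.column_stack([np.ones(len(y)), a, b_, ab])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)  # the last b-weight carries the A x B interaction
```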
Correlation when there is no IV effect
The correlation is 0; it differs from 0 only when there is an IV effect, because then the means of the two treatment groups differ.
R-squared for overall regression
Sum the R-squareds for the main effect of A, the main effect of B, and the interaction (the coding vectors are orthogonal in a balanced design, so the R-squareds add).
R-squared is analogous to Sum of Squares
SS is used for ANOVA; R-squared plays the analogous role in MR, partitioning the variability in Y.
Is it a different or same result in MR (multiple regression) as in ANOVA?
Same!
Multicollinearity problem if you don’t center
Centering removes the shared variance between the interaction (A × B) term and the independent variables that make it up.
Correlations for the variables X & Z with Y do not change when you center.
The individual predictors’ correlations with the interaction…
…change to 0 when you center.
Centering corrects for multicollinearity and…
…avoids accounting for parts of Y more than once.
By centering, the regression coefficients of the individual predictors become…
…the main effects of the predictors.
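A sketch showing the predictor-by-interaction correlation collapsing toward 0 once the predictors are centered (simulated, non-centered predictors; numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.normal(loc=5.0, size=500)  # predictor with a nonzero mean
z = rng.normal(loc=3.0, size=500)

print(np.corrcoef(x, x * z)[0, 1])     # uncentered: X correlates with X*Z

xc, zc = x - x.mean(), z - z.mean()    # center first, then form the product
print(np.corrcoef(xc, xc * zc)[0, 1])  # near 0 after centering
```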
In regression equations without interaction terms…
The y-intercepts are different but the b-weights are exactly the same.
In regression equations with interaction terms…
The y-intercepts and b-weights are different.
Interpreting the plots for possible interactions between 2 continuous variables
If the lines are not parallel (e.g., they cross), there is an interaction.
MR Formula:
Y = b0 + b1X1 + b2X2 + b3(X1 × X2) + e
b1
Regression coefficient specific to when the value of predictor X2 = 0
b2
Regression coefficient specific to when the value of predictor X1 = 0
b3
The interaction’s predictive effect.
Describes how b1 and b2 change as a function of X2 and X1.
The value of b1 changes by b3 units….
For every one unit increase in X2.
The value of b2 changes by b3 units…
For every one unit increase in X1.
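The arithmetic of those two cards as a tiny sketch, using hypothetical coefficient values:

```python
# Simple-slope arithmetic for Y = b0 + b1*X1 + b2*X2 + b3*(X1*X2) + e.
b0, b1, b2, b3 = 2.0, 1.5, 0.8, 0.5  # hypothetical coefficients

def slope_of_x1(x2):
    """Slope of Y on X1 at a fixed value of X2."""
    return b1 + b3 * x2

print(slope_of_x1(0.0))  # 1.5: b1 is the X1 slope when X2 = 0
print(slope_of_x1(1.0))  # 2.0: a one-unit increase in X2 shifts it by b3
```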
Interpreting the coefficient for the interaction term (b3):
The tobs statistic for b3 provides…
A NHT (null hypothesis test) of the X1 × X2 interaction (H0: b3 = 0).
Interpreting the coefficient for the interaction term (b3):
If p(tobs) < .05, we…
Reject H0 & conclude the magnitude of b1 depends on the level of X2 and that b2 depends on X1.
Interpreting the coefficient for the interaction term (b3):
If p(tobs) > .05, we…
Fail to reject H0 & conclude the magnitudes of b1 & b2 are constant across all values of X2 & X1 – the X1 × X2 term is usually dropped to improve precision.