Module 6 Flashcards
Dummy coding
The most straightforward way to include an independent variable with J categories in a linear model is to use dummy coding
Dummy coding - an independent variable with J levels is broken down into J-1 separate binary dummy variables, each of which is coded to equal either 0 or 1
Reference category
One of the J levels of the IV is chosen as the reference category - the reference category is assigned a value of 0 on every dummy variable
The control condition is usually the reference category
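A minimal sketch of dummy coding in Python, assuming a five-level factor with the counting condition as the reference category. The helper name `dummy_code` and the level names `cond4` and `cond5` are hypothetical placeholders, not names from the course material.

```python
def dummy_code(levels, reference):
    """Map each level to its J-1 dummy values (0 or 1)."""
    # One dummy variable per non-reference level.
    others = [lvl for lvl in levels if lvl != reference]
    # A level scores 1 only on its own dummy; the reference scores 0 on all.
    return {lvl: [1 if lvl == other else 0 for other in others] for lvl in levels}

# "cond4" and "cond5" are placeholder names (assumption).
levels = ["counting", "rhyming", "adjective", "cond4", "cond5"]
codes = dummy_code(levels, reference="counting")

for lvl, row in codes.items():
    print(lvl, row)
```

The reference category ends up with 0 on all four dummies, and every other level has a single 1 on its own dummy.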
what do we want our statistical model to do?
Represent how variation in the dependent variable is a function of the 5-category independent variable
Slope parameter of dummy variable
Each dummy variable has its own slope parameter - the difference between the population mean of that dummy's category and the population mean of the reference category (other category minus reference)
B1d1 + B2d2 + B3d3
B0 (intercept) with dummy coding
Population mean of the reference category
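To illustrate how B0 and the slopes relate to group means, here is a small sketch with made-up scores (the group names and numbers are hypothetical). It assumes a model with a single dummy-coded categorical predictor, in which case the least-squares estimates reduce to the sample means computed here.

```python
from statistics import mean

# Hypothetical scores for three conditions (not data from the course).
scores = {
    "control": [4, 6, 8],
    "treat_a": [7, 9, 11],
    "treat_b": [5, 6, 7],
}
reference = "control"

# Intercept estimate: sample mean of the reference category.
b0 = mean(scores[reference])

# Slope estimates: each category's mean minus the reference mean.
slopes = {g: mean(ys) - b0 for g, ys in scores.items() if g != reference}

print("b0 =", b0)
print("slopes =", slopes)
```

A slope of 0 means that category's sample mean equals the reference category's sample mean.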
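What does the error term represent in a dummy-coded model?
The deviation of a participant's score from the population mean of their category - it is needed because not every participant in a category (e.g., the reference category) has the same value as that category's population mean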
Confidence intervals around each dummy variable's slope represent what?
The slope coefficient estimate of the d1 variable is a point estimate of the difference between the population mean of the rhyming condition and the population mean of the counting condition, so the interval [-2.90, 2.70] captures the population mean difference with 95% confidence
How to calculate estimated error term
subtract predicted mean from actual value
= score - mean of that group
What does i index
participant ID
How to calculate SS residual
get residual for each participant (score - predicted), square it and add them all up
How to calculate SS model
For each participant, take (predicted mean of their group - sample mean of the dependent variable, i.e., the grand mean), square it, and sum across all participants
What is SS model
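Variability in the dependent variable that is explained by the model (variation of the predicted group means around the grand mean)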
R^2
= eta squared = 1 - (SSresidual/SStotal) = SSmodel/SStotal
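The sums of squares and R^2 described in the cards above can be sketched as follows. The scores and group names are made up for illustration (an assumption, not course data); the predicted value for each participant is their group's sample mean.

```python
from statistics import mean

# Hypothetical scores for three conditions.
scores = {
    "control": [4, 6, 8],
    "treat_a": [7, 9, 11],
    "treat_b": [5, 6, 7],
}

all_scores = [y for ys in scores.values() for y in ys]
grand_mean = mean(all_scores)
group_means = {g: mean(ys) for g, ys in scores.items()}

# SS residual: (score - predicted group mean), squared, summed over participants.
ss_residual = sum((y - group_means[g]) ** 2 for g, ys in scores.items() for y in ys)

# SS model: (predicted group mean - grand mean), squared, once per participant.
ss_model = sum((group_means[g] - grand_mean) ** 2 for g, ys in scores.items() for _ in ys)

# SS total: (score - grand mean), squared, summed over participants.
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)

r_squared = ss_model / ss_total
print(ss_residual, ss_model, ss_total, r_squared)
```

SS model and SS residual add up to SS total, which is why the two formulas for R^2 agree.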
If null is true what should the linear slope parameters be equal to?
Each other and 0
B1=B2=B3=0
MS model
SS Model/df model
MS Residual
SS residual/df residual
F statistic
MS model/MS residual
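A short sketch of the mean squares and F statistic, again using made-up scores (hypothetical). It assumes the usual one-way degrees of freedom: df model = J - 1 and df residual = N - J, where J is the number of groups and N the total number of participants.

```python
from statistics import mean

# Hypothetical scores for three conditions.
scores = {
    "control": [4, 6, 8],
    "treat_a": [7, 9, 11],
    "treat_b": [5, 6, 7],
}

all_scores = [y for ys in scores.values() for y in ys]
grand_mean = mean(all_scores)
group_means = {g: mean(ys) for g, ys in scores.items()}

ss_model = sum(len(ys) * (group_means[g] - grand_mean) ** 2 for g, ys in scores.items())
ss_residual = sum((y - group_means[g]) ** 2 for g, ys in scores.items() for y in ys)

n_groups = len(scores)           # J
n_total = len(all_scores)        # N
df_model = n_groups - 1          # J - 1
df_residual = n_total - n_groups # N - J

ms_model = ss_model / df_model
ms_residual = ss_residual / df_residual
f_stat = ms_model / ms_residual

print(f"F({df_model}, {df_residual}) = {f_stat:.2f}")
```

The p-value would then come from the F distribution with these two degrees of freedom, which this sketch does not compute.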
APA style one way anova
“The overall proportion of variance explained by the linear model, R² = .45, was significant, F (4, 45) = 9.09, p < .001, indicating that the number of words recalled significantly varied across the five conditions representing different levels of depth of processing.”
Significant result on ANOVA says what?
Only indicates that at least one population mean is likely to be unequal to the other population means; it does not say which one.
Planned comparisons
t-tests are valid if a researcher has explicitly and transparently planned at the beginning of the study to compare the mean of the reference category with the means of the other categories.
Planned comparisons reporting
“Because the dummy variables in the linear model were defined a priori, the corresponding t-tests represent planned comparisons. The rhyming mean (M = 6.90) did not significantly differ from the counting mean (M = 7.00), t (45) = 0.07, p = .94. But the adjective mean (M = 11.00) was significantly different from the counting mean, t (45) = 2.88, p = .006.”
Etc. for the t-tests for the remaining dummy variables
ANOVA model assumptions
- Independent observations
- Normally distributed errors
- Homogeneity of variance
How to assess assumption of normally distributed errors
examine the residuals from the estimated model
- strong kurtosis would violate this assumption
How to evaluate the homogeneity of variance assumption
Look at sample standard deviations
No group's standard deviation should be about twice as large as another's
But equal sample sizes may protect against unequal SDs
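The standard deviation check above can be sketched as a simple ratio of the largest to the smallest sample SD. The scores are made up (hypothetical), and the cutoff of 2 is the rough rule of thumb from the card, not a formal test.

```python
from statistics import stdev

# Hypothetical scores; the last group is deliberately more spread out.
scores = {
    "control": [4, 6, 8],
    "treat_a": [7, 9, 11],
    "treat_b": [1, 6, 11],
}

# Sample standard deviation (n - 1 denominator) for each group.
sds = {g: stdev(ys) for g, ys in scores.items()}

# Rule of thumb: be cautious if the largest SD is about twice the smallest.
sd_ratio = max(sds.values()) / min(sds.values())
flagged = sd_ratio >= 2

# Equal group sizes may offer some protection against unequal SDs.
equal_n = len({len(ys) for ys in scores.values()}) == 1

print(sds, round(sd_ratio, 2), flagged, equal_n)
```

Here the ratio exceeds 2, so the spread would be worth a closer look, although the equal group sizes soften the concern.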