Test 2 Modules Flashcards
Repeated Measures ANOVA
Comparing more than two conditions when the IV is manipulated within subjects.
F Ratio: MSbetween / MSwithin
If MSbetween is larger than MSwithin, the F ratio should be large enough to reject the null hypothesis.
If the sphericity test is violated (significant), then we use a Greenhouse-Geisser correction.
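As a sketch of what the correction does (the epsilon value below is hypothetical; the software estimates it from the data), Greenhouse-Geisser scales both degrees of freedom down before the p-value is looked up:

```python
# Sketch of a Greenhouse-Geisser correction: the epsilon estimate
# (hypothetical here) shrinks both df terms of the F-test.
def gg_corrected_dfs(df_between, df_residual, epsilon):
    """Scale both degrees of freedom by the Greenhouse-Geisser epsilon."""
    return epsilon * df_between, epsilon * df_residual

# e.g. k = 3 conditions, n = 12 participants, hypothetical epsilon = 0.75
print(gg_corrected_dfs(2, 22, 0.75))  # -> (1.5, 16.5)
```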
If the standard error bars overlap substantially in an estimated marginal means plot, this indicates that…
Substantially overlapping standard error bars indicate that the difference between the two group means is not likely to be statistically significant.
Mean Square and Sums of Squares from the output
Mean Square Within = sums of squares within / df within = 32.28/22 = 1.47
Mean Square Between = sums of squares between / df between = 22.39/2 = 11.19
F Ratio = MS Between / MS Within = 11.19/1.47 ≈ 7.63 (computed from the unrounded mean squares)
The residual is the MSwithin: the sums of squares for the extraneous variability.
11 df for the between-subjects effect because N = 12 and N – 1 = 11 (one group of participants in a within-subjects design)
Despite the variability amongst participants, people who tended to do well in one condition tended to do well in the others (there is still an effect of condition even though some people have better memory than others).
Because we are doing a repeated-measures ANOVA and not a one-way ANOVA, we can partition out the variability in the data due to subject differences (rather than sampling, etc.) and discard it, because we know where it comes from. This means the residual will be smaller, so we are more likely to get a bigger F and detect smaller differences between conditions!
SSwithin/residual = SStotal – SSbetween – SSsubjects (the between-subjects effect)
This means that MSresidual = SSwithin / df = a small number (i.e., 1.47)
A more powerful design: it controls for extraneous variables by discarding the overall between-subjects variability we do not care about, and lets us focus on the variability between conditions explained by our IV.
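A minimal sketch of this partitioning, using small hypothetical scores (4 participants × 3 conditions), shows how removing the between-subjects sum of squares shrinks the error term compared with a one-way ANOVA:

```python
# Hypothetical scores: 4 participants, each measured in 3 conditions.
scores = {
    "auditory": [5, 7, 6, 8],
    "visual":   [6, 8, 7, 9],
    "combined": [8, 10, 9, 10],
}
k = len(scores)                  # number of conditions
n = len(scores["auditory"])      # number of participants
all_obs = [x for cond in scores.values() for x in cond]
grand = sum(all_obs) / len(all_obs)

ss_total = sum((x - grand) ** 2 for x in all_obs)
ss_between = n * sum((sum(c) / n - grand) ** 2 for c in scores.values())

# Each participant's mean across conditions captures stable subject differences.
subj_means = [sum(c[j] for c in scores.values()) / k for j in range(n)]
ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)

ss_within_oneway = ss_total - ss_between        # one-way ANOVA error term
ss_residual = ss_within_oneway - ss_subjects    # repeated-measures error term
# The residual is much smaller once subject variability is removed.
print(round(ss_within_oneway, 2), round(ss_residual, 2))
```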
Degrees of Freedom:
o 3 conditions (k)
o 12 participants (n)
o 36 observations (k × n = 3 × 12 = 36)
o 35 degrees of freedom total (observations – 1)
o 2 for my IV (# of conditions – 1 = 3 – 1 = 2)
o 11 for my between-subjects effect (n – 1 = 12 – 1 = 11)
o 22 left over for my residual (total df – IV df – BS df = 35 – 2 – 11 = 22)
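The degrees-of-freedom bookkeeping above can be wrapped in a small helper (k and n as defined in the notes):

```python
# Degrees-of-freedom partition for a one-factor repeated-measures design.
def rm_anova_dfs(k, n):
    """k = number of conditions, n = number of participants."""
    total = k * n - 1          # observations - 1
    iv = k - 1                 # conditions - 1
    subjects = n - 1           # participants - 1
    residual = total - iv - subjects
    return {"total": total, "iv": iv, "subjects": subjects, "residual": residual}

print(rm_anova_dfs(3, 12))
# -> {'total': 35, 'iv': 2, 'subjects': 11, 'residual': 22}
```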
This ONLY tells me that there is a significant difference between our conditions; it doesn't tell me which conditions differ (which have a higher/lower mean than the others).
o Solution: run a post hoc test, e.g., Tukey (we have no real hypothesis about the direction of the difference, and we need to correct for the multiple comparisons we are making; 3 groups).
o We are looking at the Tukey-corrected p-value!
o We can see that only the auditory and visual conditions differ from the combined condition.
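The notes use the software's Tukey-corrected p-values; as a hand-computable illustration of the same idea (penalising for the number of comparisons), here is the simpler Bonferroni adjustment applied to hypothetical uncorrected p-values:

```python
# With 3 groups there are 3 pairwise comparisons, so uncorrected p-values
# inflate the false-positive rate. Bonferroni (the simplest correction;
# Tukey is more sophisticated) multiplies each p by the number of tests.
def bonferroni(p_values):
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

raw = [0.004, 0.012, 0.41]   # hypothetical uncorrected pairwise p-values
print([round(p, 3) for p in bonferroni(raw)])  # -> [0.012, 0.036, 1.0]
```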
The reason we do post hocs (tick all that apply):
> because we do not have a specific hypothesis on how the means differ
> to determine which means are significantly different without inflating false-positive rates
to determine which means are significantly different if the F is significant
to determine which means are significantly different if the F is non-significant
> because we do not have a specific hypothesis on how the means differ
> to determine which means are significantly different without inflating false-positive rates
to determine which means are significantly different if the F is significant
Statistically, a within-subjects design is more powerful (gives a bigger F) than a between-subjects design because…
MS residual is bigger
MS residual is smaller
MS between is bigger
MS between is smaller
MS residual is smaller
Three effects/patterns being tested in factorial ANOVAs
Main effect (of the first IV)
Main effect (of the second IV)
Interaction
*All with their own F-ratio, p-value, and effect size, but (in a 2×2 design) the same df
*All independent of one another (i.e., any combination is possible)
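A sketch with hypothetical 2×2 cell means shows that the two main effects and the interaction are separate quantities, which is why any combination of them can turn out significant:

```python
# Hypothetical cell means for a 2x2 (drug x therapy) design.
means = {("drug", "cbt"): 4, ("drug", "waitlist"): 10,
         ("placebo", "cbt"): 5, ("placebo", "waitlist"): 14}

# Main effect of drug: drug marginal mean minus placebo marginal mean.
main_drug = ((means[("drug", "cbt")] + means[("drug", "waitlist")]) / 2
             - (means[("placebo", "cbt")] + means[("placebo", "waitlist")]) / 2)

# Main effect of therapy: CBT marginal mean minus waitlist marginal mean.
main_therapy = ((means[("drug", "cbt")] + means[("placebo", "cbt")]) / 2
                - (means[("drug", "waitlist")] + means[("placebo", "waitlist")]) / 2)

# Interaction: does the drug effect differ across therapy levels?
interaction = ((means[("drug", "cbt")] - means[("placebo", "cbt")])
               - (means[("drug", "waitlist")] - means[("placebo", "waitlist")]))

print(main_drug, main_therapy, interaction)  # -> -2.5 -7.5 3
```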
ANOVA output with p-values tells us if there is a statistically significant difference but not between which groups. We need…
Post hocs or contrasts.
Do we look at all of them?
No. We look at our hypothesis and our graph to see which groups to compare.
Post hocs help us describe…
interactions
Do we report main effects that are qualified by significant interactions?
No. It would be misleading to report it as significant when we know the effect is only true some of the time (i.e., its effect depends on the level of the second IV).
We still report it in our write-up, BUT we say the main effect of drug was qualified by a significant interaction between drug and therapy…
To interpret the interaction we split one IV into its 2 levels and compare the effect of the other variable at each level. If I choose to split therapy, I would examine the effect of drug in the CBT condition and in the waitlist condition. If I did this, which two rows of the post hoc table would I want to look at? (select 2)
> Waitlist Placebo vs Waitlist Prozac
CBT Placebo vs CBT Prozac
CBT Prozac vs Waitlist Prozac
CBT Placebo vs Waitlist Placebo
> Waitlist Placebo vs Waitlist Prozac
CBT Placebo vs CBT Prozac
Hint: if I split "therapy" into CBT and Waitlist, then the post hocs would be (CBT: both drugs; Waitlist: both drugs). If I split "drug" into Placebo and Prozac, then we would look at (Placebo: both therapies; Prozac: both therapies).
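Following the hint, the rows to inspect are exactly the pairs that hold the split variable constant. A small sketch of that selection:

```python
# Splitting "therapy" means holding therapy constant in each comparison,
# so the relevant post hoc rows vary only the drug factor.
from itertools import combinations

cells = ["CBT Placebo", "CBT Prozac", "Waitlist Placebo", "Waitlist Prozac"]

def same_therapy(a, b):
    # First word of each cell label is the therapy level.
    return a.split()[0] == b.split()[0]

simple_effect_rows = [(a, b) for a, b in combinations(cells, 2)
                      if same_therapy(a, b)]
print(simple_effect_rows)
# -> [('CBT Placebo', 'CBT Prozac'), ('Waitlist Placebo', 'Waitlist Prozac')]
```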
Problems with group designs which make small n designs better:
• So far we have looked at comparing group
means.
• Looking at this sample data we may conclude that our intervention was successful, because we've seen a reduction in scores over time.
• However, what tends to happen in group designs is that the effects of an individual are hidden within group means.
• Whilst most participants seem to reduce in aggression over time, there are a couple who show either no change or an increase in aggression over time.
• At a group level this is not an issue (some show a strong effect, a weak effect, or no effect), but we may ask ourselves what we can do to make this intervention more effective for those people.
• We could look more closely at those it worked very well for: is this an individual effect, a third variable, or the effectiveness of the IV?
When do we use small-N designs?
§ To establish a causal effect of IV on DV
within a small number of participants
§ For example:
• Research question concerns a very small
sample (special needs children, clinical
populations, prison)
• Situations where we cannot recruit a sufficiently powered sample (hard-to-find or special populations; clinical, prison, children)
• When we expect substantial variability in individual responses (e.g., if you find two distinct subgroups within a single sample, where inferential statistics would average the means and wash out the effect, but the effect itself is worth studying).
Establishing causality (in a group study) vs. (in a small-N design)
Establishing causality (in a group study)
• If we see a change in means (two groups) with small variability, and we are confident that this change is caused by our IV, then we begin to infer a causal relationship.
Establishing causality (in a small-N design)
§ A small-N design establishes causal relationships by replicating the effect of the IV on the DV
§ Need to find evidence of three things:
• Consistent (systematic) change in the DV as the IV is manipulated, with little variability
• Direct replication of the IV's effect within the participant (not another subject factor, etc.; same person, same context, to test for consistency of the IV–DV effect within that context)
• Systematic replication of the IV's effect across participants or contexts (e.g., a different person in the same context)
*If we can replicate the IV–DV effect multiple times, either directly or systematically, then we can infer causation
Basic components of a small-N design
(A) Data Collection:
RQ: Does my intervention result in fewer
aggressive responses to the caretaker?
• Participants: a single participant exhibiting aggressive behaviour
• The goal is to test an intervention to see if we can decrease the child's aggressive behaviour
• DV: Counting the instances of verbal
aggression towards the caretaker during
playtime
• IV: My intervention is to verbally praise the
child when they are engaging in non-
aggressive interactions during playtime.
• Keep track of the child's progress with a graph (responses on the y-axis, trials on the x-axis)
o Begin with a baseline condition (behaviour without the intervention; multiple trials, to have "normal" behaviour to compare the intervention to) (A)
o Intervention phase (several observations/trials), which we compare to the baseline to test whether the target behaviour increases, decreases, or stays the same.
o A series of trials/observations which happen under the same conditions is called a "phase" (i.e., the baseline phase and the intervention phase). (B)
*Goal is to establish whether the intervention affects behaviour relative to baseline
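A minimal sketch of an A-B record for this example (the session counts below are hypothetical):

```python
# A-B small-N design: baseline phase (A), then intervention phase (B).
# Hypothetical counts of verbal aggression per playtime session.
phases = {
    "baseline":     [7, 8, 6, 7, 8],   # A: behaviour without intervention
    "intervention": [5, 4, 3, 3, 2],   # B: praise for non-aggressive play
}

def phase_mean(observations):
    return sum(observations) / len(observations)

baseline_mean = phase_mean(phases["baseline"])          # 7.2
intervention_mean = phase_mean(phases["intervention"])  # 3.4
# Compare phase means to see whether the target behaviour dropped.
print(round(baseline_mean - intervention_mean, 1))  # -> 3.8 fewer per session
```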