[L5] Repeated Measures Experiments Flashcards
Two Types of Experimental Designs
Repeated Measures Design
Independent Groups Design
comparing the scores of
individuals in one condition against their scores in
another condition.
Repeated Measures Design
comparing the scores of
one group of people taking one condition against the
scores of a different group of people in the other
condition.
Independent Groups Design –
Other Names:
Within-subjects studies / designs
Related groups / design
Cross-over studies / design
change within a
group of individuals, rather than between two groups.
Within-subjects studies / designs –
people undergoing two
different treatments are closely matched, so that the two
groups are not independent, rather they are related.
Related groups / design –
*Strictly speaking, a ___ design is a type
of related design; however, it is very rare to encounter a
study where both groups are ____.
repeated measures; sufficiently closely
matched
term used mainly in
medical research rather than in psychology.
People cross-over from one group to the other group.
Cross-over studies / design
Common Mistakes:
Correlational and Repeated Measures
Designs
__ – when we want to see if people who
were high scorers on one test are also high scorers on the
second test. We are not interested in whether the scores
overall have gone up or down.
Correlational
– if people, on average, score
higher on one occasion than on the other.
Repeated Measures
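The difference between the two questions can be sketched with made-up numbers. The scores and the helper name `pearson_r` below are hypothetical, chosen only to show that a correlation and a mean change answer different questions.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation between two lists of paired scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mean_change(x, y):
    """Average of (second score - first score) across people."""
    return statistics.mean(b - a for a, b in zip(x, y))

# Hypothetical scores for five people on the first test
test1 = [10, 12, 14, 16, 18]

# Everyone improves by 5: same rank order, scores go up overall
test2_shifted = [score + 5 for score in test1]

# Rank order is reversed: high scorers become low scorers, no overall change
test2_reversed = test1[::-1]

print(pearson_r(test1, test2_shifted), mean_change(test1, test2_shifted))
print(pearson_r(test1, test2_reversed), mean_change(test1, test2_reversed))
```

In the first case the correlational question gets a strong answer and so does the repeated measures question; in the second the scores are strongly (negatively) related but the average has not moved.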
Advantages of Repeated Measures Design
- There is no need for many participants.
- Each person acts as their own (perfectly) matched
control group.
Disadvantages of Repeated Measures Design
- Practice effects
- Sensitization –
- Carry-over effects
participants get better at a task over
time. (solutions: counterbalancing and practice items)
Practice effects –
– participants may perceive that a
dependency exists between two measures, and
deliberately keep their answers similar when we are
looking for change. Alternatively, because the
participants perceive that the researcher is looking for
change, they might change their answers.
Sensitization
occurs when something about the
previous condition is “carried over” into the next
condition.
Carry-over effects –
Statistical Tests for Repeated Measures Designs
- The Repeated Measures t-test –
- The Wilcoxon test
- The Sign Test
parametric test for
continuous data.
The Repeated Measures t-test –
non-parametric test for ordinal
data
The Wilcoxon test –
non-parametric test for categorical
data
The Sign Test –
the most powerful, and most likely to spot
significant differences in data. However, it cannot be used
with all repeated measures data; the data must
satisfy some conditions before this test can be used.
t-test -
deals with all data that can be ordered
(ordinal data).
Wilcoxon test –
– only deals with data in the form of categories
(nominal data). Easy to understand and calculate.
Sign test
Downside of _____ – only deals with crude categories, rather
than the rich data of ranks or intervals.
Sign test
The Repeated Measures t-test
To use this test, we need to make 2 assumptions about our
data:
1. The data are measured on a continuous (interval)
level.
2. The differences between the two scores are normally
distributed.
It makes no assumption about the distribution of the
scores, only about the ___ between the scores.
differences
It is possible to have variables which have highly ___, but which have normally
distributed differences.
non-normal distributions
It makes no assumption about the ___ of the
variables.
variances
Given a sufficiently large sample, the repeated measures
t-test is ____ of both these
assumptions.
robust against violations
Above sample sizes of approximately ___, the test
becomes very robust to violations of distributional
assumptions.
50
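A minimal sketch of why only the differences matter: the t statistic is the mean difference divided by its standard error, so the raw scores never enter the calculation directly. The data and the function name `paired_t` are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a repeated measures t-test on paired scores."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)   # sample SD of the differences
    se_d = sd_d / math.sqrt(n)       # standard error of the mean difference
    return mean_d / se_d, n - 1

# Hypothetical scores for five people measured twice
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 14]
t, df = paired_t(before, after)

# Give each person a very different baseline; the differences are untouched,
# so t is unchanged even though the raw scores are now wildly non-normal.
shifts = [100, 0, 50, 0, 1000]
before_shifted = [b + s for b, s in zip(before, shifts)]
after_shifted = [a + s for a, s in zip(after, shifts)]
t_shifted, _ = paired_t(before_shifted, after_shifted)

print(round(t, 3), df, t == t_shifted)
```

With only five people this example is far below the sample size at which the test becomes robust; it is meant to show the arithmetic, not a realistic study.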
- Used when data do not satisfy the assumptions of the
repeated measures t-test.
The Wilcoxon Test
When is Wilcoxon used
* The differences are not normally distributed
* The measures are ordinal.
* Non-parametric test
– makes inferences about population
parameters.
Parametric test
use ranks.
Non-parametric tests
If the data were not measured on an ___ scale, we convert
them (___).
ordinal; continuous to ranks
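Converting continuous scores to ranks can be sketched as follows. The function name `average_ranks` is hypothetical, and giving tied values the average of the ranks they occupy is a common convention assumed here, not something the cards specify.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of equal values starting at position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + 1 + j + 1) / 2   # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

scores = [3.2, 1.5, 3.2, 9.8]
print(average_ranks(scores))  # [2.5, 1.0, 2.5, 4.0]
```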
developed two statistical tests namely
the Wilcoxon-rank sum and the Wilcoxon signed ranks
test.
Frank Wilcoxon –
Frank Wilcoxon – developed two statistical tests namely
the ___ and the
___
Wilcoxon-rank sum; Wilcoxon signed ranks test.
Rank-sum – equivalent to ___, which is
easier to calculate
Mann-Whitney test
Thus when we refer to a Wilcoxon test, we refer to the
____ test only.
Signed ranks
- Formula used which converts T value into a value of z.
Normal Approximation
The normal approximation can be used as long as the sample size is __
above 10
Used during times when table can’t be used, for example
when the sample size is ___ than the values given in
the table
bigger
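The conversion from T to z can be sketched as below. The cards do not give the formula, so this uses the standard normal approximation for the signed ranks statistic (mean N(N+1)/4, variance N(N+1)(2N+1)/24) as an assumption. The data and function names are hypothetical, and the sketch assumes no one has the same score on both occasions.

```python
import math

def wilcoxon_T(before, after):
    """Return (T, n): the smaller sum of signed ranks and the number of pairs."""
    diffs = [a - b for b, a in zip(before, after)]
    if any(d == 0 for d in diffs):
        raise ValueError("this sketch assumes no one has the same score twice")
    ordered = sorted(abs(d) for d in diffs)

    def rank(size):
        # Average rank of an absolute difference among all absolute differences
        first = ordered.index(size) + 1
        return first + (ordered.count(size) - 1) / 2

    positive = sum(rank(abs(d)) for d in diffs if d > 0)
    negative = sum(rank(abs(d)) for d in diffs if d < 0)
    return min(positive, negative), len(diffs)

def z_from_T(T, n):
    """Normal approximation: how far T lies from its expected value under the null."""
    mean_T = n * (n + 1) / 4
    sd_T = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (T - mean_T) / sd_T

# Hypothetical scores for 12 people (above the minimum of 10 for the approximation)
before = [50] * 12
after = [51, 48, 53, 54, 55, 44, 57, 58, 59, 60, 61, 62]
T, n = wilcoxon_T(before, after)
z = z_from_T(T, n)
print(T, n, round(z, 2))
```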
Instead we could report the ___, and ____, for each group
medians, inter-quartile
ranges
used to correct for continuity.
The z distribution is continuous – that is, any value at
all is possible.
Continuity Correction –
- Used when we have nominal data with two categories,
and have repeated measures data.
Sign Test
Easiest statistical test.
Sign Test
How to test the statistical significance of a t-score (3 ways)
- Get the p-value of the t-score (the probability of getting
the score as a result of chance if the null hypothesis is true).
- Get the t-critical value and compare it with the t we got.
- Calculate the Confidence Intervals
For a result to be significant, the p-value should be low. It
should ____ we use for
significance testing (alpha levels could be 0.05, 0.01,
0.001, etc.; the choice depends on a researcher's tolerance for
error).
be equal to or less than the alpha level
The ____ is the t-score which has a p-value equal to
the alpha level we use. It is relative to the sample size, as
exemplified by the use of the concept of degrees of
freedom.
t-critical
For our t (the t-score we got) to be significant, it should
be ___ the t-critical value.
equal to or more than
The CI tells us the likely ___, that is, what we would expect to find
if the population were measured instead of the sample.
range of the score in the
population
In this test, the score we are referring to is the ___
(summation of difference scores computed from
the two groups/conditions)
difference score
We are calculating the CI because we do not usually
have means to measure the population, hence we will
really not know the _____. The most that
we can do is estimate the ___
TRUE DIFFERENCE
SCORE.
The CI provides us that estimate by giving us the likely
___ if the population is measured
range of values of the difference score
Why the middle 95% of cases? Because the scores
within this area, range, or interval are the scores
considered to have
____ compared to the other scores in the
distribution.
higher p-values or probabilities of occurring
Thus, if our result is statistically significant, it should be
contained within the CI. It follows the rule that, if the __ could
be true, it should have a high probability of occurring
alternative hypothesis
This is the opposite of what we assume in the first place: that
if the Null Hypothesis is true, our result should have a
___
low p-value or probability of occurring.
The Null Hypothesis then should not be contained within
the CI for our result to be ___
significant
If it happens that the Null Hypothesis is contained in the
CI, the result is ___
not significant.
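The CI check can be sketched as follows, assuming the usual 95% interval for the mean difference (mean plus or minus t-critical times the standard error). The difference scores are hypothetical, and the t-critical value 2.262 is the standard two-tailed table value for alpha = 0.05 with 9 degrees of freedom.

```python
import math
import statistics

T_CRIT_DF9 = 2.262  # two-tailed t-critical for alpha = 0.05, df = 9 (standard t table)

def ci_mean_difference(diffs, t_crit):
    """Return (lower, upper) bounds of the CI for the mean difference score."""
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))
    margin = t_crit * se_d
    return mean_d - margin, mean_d + margin

# Hypothetical difference scores (second condition minus first) for 10 people
diffs = [2, 3, 1, 1, 3, 2, 4, 0, 2, 2]
lower, upper = ci_mean_difference(diffs, T_CRIT_DF9)

# The null hypothesis says the true difference is 0
null_in_ci = lower <= 0 <= upper
print(round(lower, 2), round(upper, 2), null_in_ci)  # 1.17 2.83 False
```

Here zero lies outside the interval, so the result would be called significant at the 0.05 level.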
However, the Wilcoxon T distribution is not truly
continuous, because it can only change in __
steps
It is hard to see how we could have got a value of 28.413
from our data, because we added __.
ranks
In the continuity correction, we just add __ to the top (numerator) of the
equation:
-0.5
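One common way to write the corrected formula is sketched below. The cards only say that 0.5 is subtracted in the numerator; the rest of the expression, including the absolute value, is an assumption taken from the standard normal approximation for the signed ranks T, with N the number of pairs that are not ties.

```latex
z = \frac{\left| T - \dfrac{N(N+1)}{4} \right| - 0.5}{\sqrt{\dfrac{N(N+1)(2N+1)}{24}}}
```

Because the correction pulls the numerator towards zero, the corrected z is always slightly smaller in magnitude than the uncorrected one, which is where the loss of power comes from.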
If we employ a continuity correction, we
are sure that our __ rate is controlled at, or
below, 0.05.
type 1 error
If we don’t, then the type 1 error rate might be ___
above
0.05.
However, the price of using the continuity correction is
(always) a ___
slight loss of power.
Sheskin (2003) suggests that we should perhaps analyze
the data ___, once with and once without the
continuity correction.
* If it makes a difference, then we should collect more
data.
twice
The calculation of z assumes that
there are ___ – that is, no one had the
same score on both occasions. If this is not the case, we need a further correction.
no ties in the data
We would add the number of ties, called t, into the
___
equation
The test statistic from the sign test is called __, the smaller of the two counts (positive and negative differences).
S
The sign test uses N as the total number of people for
whom there was ___
not a tie.
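The sign test can be sketched in a few lines. The cards do not say how the p-value is obtained, so the exact two-tailed binomial calculation below is an assumption (it is the usual approach), and the data and the function name `sign_test` are hypothetical.

```python
from math import comb

def sign_test(before, after):
    """Return (S, N, p): smaller sign count, number of non-ties, two-tailed p."""
    positive = sum(1 for b, a in zip(before, after) if a > b)
    negative = sum(1 for b, a in zip(before, after) if a < b)
    n = positive + negative          # ties (no change) are left out of N
    s = min(positive, negative)
    # Probability of S or fewer of the rarer sign if + and - are equally likely,
    # doubled for a two-tailed test and capped at 1
    p = min(1.0, 2 * sum(comb(n, k) for k in range(s + 1)) / 2 ** n)
    return s, n, p

# Hypothetical nominal data for 10 people: 0 = "no", 1 = "yes" on each occasion
before = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
after = [1, 1, 1, 0, 1, 1, 1, 1, 1, 0]
s, n, p = sign_test(before, after)
print(s, n, p)  # 1 8 0.0703125
```

Two people gave the same answer both times, so N is 8 rather than 10.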