quiz 5: reliability part one Flashcards
why is reliability important
- Ensures we are measuring something meaningful
- Random measurement error (noise) is always present
what is error
- Inevitable
- Occurs when measurement of a construct is confounded by factors that are not relevant to the construct we want to measure
- Motivation or lack of effort when measuring intellectual ability
- Reading comprehension when measuring quantitative ability
what is systematic error
- Systematic Error = a “mistake” that can be corrected and eliminated
- Mistakenly keyed correct answers
- Lack of familiarity with scoring criteria
what is random error
- Impossible to eliminate
- Inherent in any measurement attempt
- “Random” means it is NOT correlated with the true scores
- Therefore, it is impossible to measure (or eliminate)
what is observed score
- Observed (obtained, measured) score = the score we obtain whenever we administer a test
- The person’s measured standing on the trait we are interested in measuring
what is true score
- what the person’s score would be if there were no measurement error
- The person’s actual standing on the trait or the actual “amount” of the trait they possess.
are observed scores and true scores the same
Because error is inevitable, the observed score might not be (usually isn’t) the same as the true score
what is classic reliability theory
- If Xo is the observed score
- Xt is the true score, and
- e is the amount of error, then
- Xo = Xt + e
what are the implications of error (4 of them)
1) Error cannot be measured
- Error scores are RANDOM
- Uncorrelated with true scores (error is part of the observed score, so it cannot be uncorrelated with observed scores)
- Error affects a person’s observed score in ways that are independent of his/her true score
2) Error tends to cancel out across respondents
- Inflates the scores of some and deflates the scores of others
- Average effect of error across respondents is zero, i.e., the mean of e is 0
- This means that the mean of the observed scores is equal to the mean of the true scores: mean(Xo) = mean(Xt)
3) Since error is random, i.e., uncorrelated with the true score, the following also applies:
- s_o^2 = s_t^2 + s_e^2 (variance of the observed score = variance of the true score + error variance; error variance can never be zero)
- NOTE: Variance of X+Y = Variance of X plus Variance of Y plus 2 x Covariance of X and Y. If X and Y are not correlated, then Cov(X,Y) = 0.
4) Variance of the observed scores is always larger than the variance of the true scores
- Error makes people look more different from one another than they actually are
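The four implications above can be checked with a small simulation. This is only a sketch: the population values (true-score mean 100 and SD 15, error SD 5), the sample size, and the seed are assumptions chosen for illustration, not values from the course.

```python
import random

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def variance(values):
    """Population variance of a list of numbers."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Hypothetical test-takers: true scores plus independent random error.
rng = random.Random(1)  # fixed seed so the run is repeatable
n = 20000
true_scores = [rng.gauss(100.0, 15.0) for _ in range(n)]
errors = [rng.gauss(0.0, 5.0) for _ in range(n)]
observed = [t + e for t, e in zip(true_scores, errors)]  # Xo = Xt + e

print("mean error:", round(mean(errors), 3))
print("mean observed vs. mean true:",
      round(mean(observed), 2), round(mean(true_scores), 2))
print("var observed:", round(variance(observed), 1))
print("var true + var error:",
      round(variance(true_scores) + variance(errors), 1))
```

In a finite sample the two variance figures agree only approximately, because the sample covariance between true scores and error is close to, but not exactly, zero.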
how to get reliability equation
on paper
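A sketch of the derivation the card leaves on paper, assuming it starts from the variance decomposition given earlier in this set. The symbol r_xx for reliability is an assumed notation, not taken from the source.

```latex
\begin{align*}
s_o^2 &= s_t^2 + s_e^2
  && \text{observed variance = true variance + error variance} \\
1 &= \frac{s_t^2}{s_o^2} + \frac{s_e^2}{s_o^2}
  && \text{divide both sides by } s_o^2 \\
r_{xx} &= \frac{s_t^2}{s_o^2} = 1 - \frac{s_e^2}{s_o^2}
  && \text{reliability = proportion of observed variance that is true variance}
\end{align*}
```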
what does reliability in words mean
- If the reliability of a test is .95, then 95% of the variability in obtained scores is due to actual (real, true) differences among test-takers on the trait (construct) being measured
- 95% of the differences we see among the observed scores of the test takers can be attributed to differences in their true levels on the trait being measured
explain error variance
In 1 = s_t^2/s_o^2 + s_e^2/s_o^2, the term on the right side of the + sign (s_e^2/s_o^2) tells us the proportion of variance in observed scores that is due to error variance
- If the reliability = .95, then 5% of the variance in observed scores is due to error variance
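A minimal numeric sketch of the same idea, assuming the true and error variances were somehow known (in practice they are not, which is why reliability can only be estimated). The function name and the variance values are hypothetical.

```python
def variance_shares(true_variance, error_variance):
    """Return (reliability, error share) of the observed-score variance."""
    observed_variance = true_variance + error_variance
    reliability = true_variance / observed_variance
    error_share = error_variance / observed_variance
    return reliability, error_share

# Hypothetical variances chosen so that reliability comes out to .95.
reliability, error_share = variance_shares(true_variance=95.0, error_variance=5.0)
print(f"reliability = {reliability:.2f}, error share = {error_share:.2f}")
```

The two shares always sum to 1, which is why a reliability of .95 implies that 5% of observed variance is error.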
what part of the equation is error variance
on paper
implications of reliability
- Reliability is a theoretical property of a test and cannot be computed directly
- It can only be estimated from real data
- There is no single method that provides completely accurate estimates of reliability under all conditions
- We can never calculate “the” reliability of a test
- Instead, we can estimate how different sources of error affect a test’s reliability
typical sources of measurement error
- Time Sampling (when the test is given)
- Time of the day a person takes a test
- Item Sampling (which items were selected)
- Internal Consistency (whether the items are all measuring the trait of interest)
- Inter-rater Differences (whether different raters assign the same score)
what is time sampling
- Measurements of a (presumably stable) trait can (and often do) change from day to day
- Time Sampling Error refers to error associated with “when” the test is given
what is test/retest method
- The same exact test is re-administered to the same exact group of examinees at a later date
- A correlation coefficient is calculated on the two sets of scores
- This correlation coefficient is the Test-Retest Reliability Coefficient
- Also known as the test’s stability
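The test-retest coefficient is an ordinary Pearson correlation between the two administrations. A minimal sketch, with made-up scores for six hypothetical examinees:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data: the same six examinees tested twice.
first_administration = [82, 75, 91, 68, 88, 79]
retest = [80, 78, 89, 70, 85, 81]

r_tt = pearson_r(first_administration, retest)
print(f"test-retest reliability estimate: {r_tt:.2f}")
```

With so few examinees the estimate would be very unstable; the data are only there to show the calculation.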
what are the factors affecting test-retest reliability estimates
- Length of test-retest interval
- Practice Effect
explain test-retest interval
- If the interval is too short, the test/retest method will overestimate the test’s reliability, because similar conditions affect both test and retest and examinees remember their original responses
- If the interval is too long, the test/retest method will underestimate the test’s reliability, because of real changes in the trait being measured
what is the best test-retest interval
- There is no single best answer
- Any given TRT coefficient is just an estimate of the test’s “true” TRT reliability
- Reported TRT coefficients should be evaluated in light of
- The TRT interval
- The sample composition
how does sample composition affect reliability estimates
Larger, heterogeneous samples -> more accurate reliability estimates
Smaller and/or homogeneous samples -> less accurate reliability estimates (restriction of range)
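Restriction of range can be illustrated with a simulation. This is a sketch under assumed values: true scores with mean 100 and SD 15, error SD 5 on each administration, and a "homogeneous" subsample defined as examinees who scored 115 or higher the first time. None of these numbers come from the course.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical examinees: one true score, two error-laden administrations.
rng = random.Random(7)  # fixed seed so the run is repeatable
pairs = []
for _ in range(5000):
    true_score = rng.gauss(100.0, 15.0)
    test = true_score + rng.gauss(0.0, 5.0)
    retest = true_score + rng.gauss(0.0, 5.0)
    pairs.append((test, retest))

# Heterogeneous sample: everyone.
full_r = pearson_r([t for t, _ in pairs], [r for _, r in pairs])

# Homogeneous sample: only high scorers on the first administration.
restricted = [(t, r) for t, r in pairs if t >= 115.0]
restricted_r = pearson_r([t for t, _ in restricted], [r for _, r in restricted])

print(f"full sample (n={len(pairs)}): r = {full_r:.2f}")
print(f"restricted sample (n={len(restricted)}): r = {restricted_r:.2f}")
```

The same test, with the same amount of error, yields a lower coefficient in the restricted group because there is less true-score variance for the correlation to pick up.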
explain practice effect
- Second administration is not equivalent to first administration due to previous exposure to the items
- If practice improves retest scores unevenly across examinees (a uniform gain for everyone would leave the correlation unchanged), then practice will DECREASE the correlation between test and retest and lead to an under-estimate of test-retest reliability and an over-estimate of time sampling error
predicting retest score variable meanings
- zR is the predicted retest score (expressed as a z-score)
- zo is the original score (expressed as a z-score)
- rtt is the test-retest reliability coefficient
- Then: zR = rtt x zo
explain regression towards the mean
zR = rtt x zo
- Because the TRT coefficient (rtt) will always be less than +1.00 (because there will always be some time sampling error), retest scores (zR) are predicted to be closer to the mean of the distribution than were the original scores (zo)
- This is known as:
REGRESSION TOWARDS THE MEAN
examples of predicting the retest score from original score
zR = rtt x zo
- If zo = +1.00 and rtt = .90 then zR = +.90
- If zo = -1.00 and rtt = .90 then zR = -.90
- If zo = +2.00 and rtt = .90 then zR = +1.80
- Scores above the mean are predicted to decrease on retest and scores below the mean are predicted to increase on retest (and scores at the mean are predicted not to change)
- The farther from the mean an original score is, the greater the difference between the original and predicted retest raw scores
- With rtt = .90, a score that is 1 sd above the mean is predicted to change by 0.1 sd on retest, but a score that is 2 sd above the mean is predicted to change by 0.2 sd on retest
Example: M = 80, SD = 20, rtt = .90
- If raw score on the first testing was 100 (1 SD above the mean, z = +1.00) then the predicted score on retesting would be 0.9 sd above the mean or 98 (2 points lower/closer to mean than original score)
- If raw score on the first testing was 120 (2 SD above the mean, z = +2.00) then the predicted score on retesting would be 1.8 sd above the mean or 116 (4 points lower/closer to mean than original score)
- Same principle applies to scores below the mean
- If raw score on the first testing was 60 (1 SD below the mean, z = -1.00) then the predicted score on retesting would be 0.9 sd below the mean or 62 (2 points higher/closer to mean than original score)
- If raw score on the first testing was 40 (2 SD below the mean, z = -2.00) then the predicted score on retesting would be 1.8 sd below the mean or 44 (4 points higher/closer to mean than original score)
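The worked examples above follow three steps: convert the raw score to a z-score, multiply by rtt, and convert back to a raw score. A minimal sketch, where the function name is hypothetical and the values M = 80, SD = 20, rtt = .90 are taken from the example:

```python
def predict_retest_raw(raw_score, mean, sd, r_tt):
    """Predict a retest raw score from an original raw score."""
    z_original = (raw_score - mean) / sd   # zo
    z_retest = r_tt * z_original           # zR = rtt x zo
    return mean + z_retest * sd            # back to the raw-score scale

for raw in (100, 120, 60, 40):
    predicted = predict_retest_raw(raw, mean=80, sd=20, r_tt=0.90)
    print(f"original {raw} -> predicted retest {predicted:.0f}")
```

Running it should reproduce the predicted retest scores listed above (98, 116, 62, and 44).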
the lower the TRT reliability …?
- The lower the TRT reliability, the greater the regression towards the mean on retest
- The larger the time sampling error (lower TRT reliability), the greater the difference between original and predicted retest scores
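Since zR = rtt x zo, the predicted change on the raw-score scale works out to -(1 - rtt) x (raw score - mean), so it grows as rtt drops. A short sketch of that relationship; the function name and the rtt values .90, .70, and .50 are illustrative assumptions, with M = 80 borrowed from the earlier example:

```python
def predicted_change(raw_score, mean, r_tt):
    """Predicted retest score minus original score, in raw-score units."""
    return -(1.0 - r_tt) * (raw_score - mean)

# Same original score (100, with M = 80) under three hypothetical reliabilities.
for r_tt in (0.90, 0.70, 0.50):
    change = predicted_change(raw_score=100, mean=80, r_tt=r_tt)
    print(f"rtt = {r_tt:.2f}: predicted change = {change:+.1f} points")
```

A negative change means a score above the mean is predicted to drop toward it; for scores below the mean the sign flips.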
regression towards the mean summary
- Retest scores are predicted to be closer to the mean than original scores
- Scores above the mean are predicted to be lower (closer to the mean) on retest
- Scores below the mean are predicted to be higher (closer to the mean) on retest
- Scores farther from the mean are predicted to show larger test-retest changes than scores closer to the mean
- The lower the test-retest reliability, the greater the regression towards the mean
- The lower the TRT, the greater the difference between predicted retest scores and original scores