Test construction (Reliability) Flashcards
From the perspective of ____ test theory, variability in test scores reflects two factors: true differences between examinees on the attribute measured by the test and differences due to _____
classical; measurement (random) error
Reliability is a measure of the amount of variability in obtained test scores that is due to _____ variability
true score
A test’s reliability is commonly estimated by calculating a reliability coefficient, which is a type of _____ coefficient
correlation
The reliability coefficient ranges in value from _______ and is interpreted directly as a measure of _______ variability
0 to +1.0; true score
If a test has a reliability coefficient of .91, this means that ___% of variability in obtained test scores is due to _____ variability, while the remaining 9% reflects _______
91; true score; measurement error
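The cards above can be summarized as a variance decomposition. This is a sketch in standard classical test theory notation; the symbols X (obtained score), T (true score), E (error), and r_xx (reliability coefficient) are assumptions, since the cards do not name them, and the decomposition assumes error is uncorrelated with true score.

```latex
X = T + E, \qquad
\sigma_X^2 = \sigma_T^2 + \sigma_E^2, \qquad
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2}
% With r_{xx} = .91: \sigma_T^2 / \sigma_X^2 = .91 and \sigma_E^2 / \sigma_X^2 = 1 - .91 = .09
```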
Test-retest reliability is assessed by administering a test to the same group of examinees at two different _______ and then _____ the two sets of scores
times; correlating
The test-retest reliability coefficient is also known as the coefficient of _______
stability
An alternate forms reliability coefficient is calculated by administering two _____ of a test to the same group of examinees and correlating the two sets of scores
equivalent forms
The alternate forms reliability coefficient is also referred to as the coefficient of ______
equivalence (or equivalence and stability when there is a long period of time between administration of the two forms)
A ______ reliability coefficient is calculated by splitting the test in half and correlating examinees’ scores on the two halves
split-half
Because the size of a reliability coefficient is affected by test length, the split-half method tends to _____ a test’s true reliability
underestimate
The ______ formula is often used in conjunction with split-half reliability to obtain an estimate of the full-length test’s reliability
Spearman-Brown
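A minimal sketch of the split-half procedure with the Spearman-Brown correction, assuming an odd/even item split and the two-half form of the formula, r_full = 2 * r_half / (1 + r_half). The function names and the item scores are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    # Odd/even split: items 1, 3, 5, ... versus items 2, 4, 6, ...
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    # Step the half-length correlation up to full test length.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full

# Hypothetical data: one row per examinee, one column per item (1 = correct).
item_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(item_scores)
print(f"half-test r = {r_half:.3f}, corrected = {r_full:.3f}")
```

For a positive half-test correlation, the corrected value is always higher than the raw one, which reflects the card above: each half is only half as long as the full test.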
Coefficient ______, another method used to assess internal consistency reliability, indicates the average inter-item consistency rather than the consistency between two halves of the test
alpha
The Kuder-Richardson Formula 20 can be used as a substitute for coefficient alpha when test items are scored ______
dichotomously
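To illustrate why KR-20 can stand in for coefficient alpha, the sketch below computes both on the same made-up 0/1 item scores. It assumes the usual textbook formulas and population variances for both statistics; the function names are hypothetical.

```python
from statistics import pvariance

def coefficient_alpha(item_scores):
    """Coefficient alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = len(item_scores[0])
    item_vars = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def kr20(item_scores):
    """KR-20: k/(k-1) * (1 - sum of p*q / total variance), for 0/1 items only."""
    n = len(item_scores)
    k = len(item_scores[0])
    # p = proportion answering each item correctly, q = 1 - p
    p = [sum(row[i] for row in item_scores) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum_pq / total_var)

# Hypothetical data: one row per examinee, one column per item (1 = correct).
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = coefficient_alpha(item_scores)
kr = kr20(item_scores)
print(alpha, kr)  # the two values agree up to floating-point rounding
```

The two agree because the population variance of a 0/1 item equals p * q.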
Split-half reliability, coefficient alpha, and KR-20 are not appropriate for speed tests because they tend to _____ the reliability of these tests
overestimate
Inter-rater reliability should be assessed whenever a test is ______ scored
subjectively
The scores assigned by different raters can be used to calculate a ______ coefficient, for example, the _______ statistic, which can be used when ratings represent a nominal or ordinal scale of measurement.
correlation (reliability); kappa
Alternatively, percent agreement between raters can be calculated. A problem with this approach is that the resulting index of reliability can be artificially inflated by the effects of ______
chance agreement
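The contrast between percent agreement and kappa can be sketched as follows. This assumes Cohen's kappa for two raters and nominal categories, kappa = (p_o - p_e) / (1 - p_e); the cards do not say which variant of kappa is meant, and the ratings are made up.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of cases on which the two raters gave the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(a)
    p_o = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_e = sum(count_a[c] * count_b[c] for c in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of ten cases by two raters.
rater_a = ["y", "y", "y", "y", "y", "y", "y", "y", "n", "n"]
rater_b = ["y", "y", "y", "y", "y", "y", "y", "n", "y", "n"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"percent agreement = {agreement:.3f}, kappa = {kappa:.3f}")
```

Here the raters agree on 80% of cases, but because both use "y" most of the time, much of that agreement would be expected by chance, and kappa comes out far lower.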
The magnitude of a reliability coefficient is affected by several factors. In general, the longer a test, the ______ its reliability coefficient
larger
The _____ formula is used to estimate the effects of lengthening or ______ a test on its reliability coefficient.
Spearman-Brown; shortening
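A short sketch of the general Spearman-Brown formula, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which test length changes. The function name and the starting reliability of .80 are made up for illustration.

```python
def spearman_brown(r, n):
    """Estimated reliability after changing test length by a factor of n.

    r: current reliability coefficient
    n: new length / current length (2 = doubled, 0.5 = halved)
    """
    return n * r / (1 + (n - 1) * r)

doubled = spearman_brown(0.80, 2)    # test made twice as long
halved = spearman_brown(0.80, 0.5)   # test cut to half its length
print(f"doubled: {doubled:.3f}, halved: {halved:.3f}")
```

The estimate assumes the added items are comparable to the existing ones in content and quality.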
If the new items do not represent the same content domain as the original items or are more susceptible to measurement error, this formula is likely to _____ the effects of lengthening the test
overestimate
Like other correlation coefficients, the reliability coefficient is affected by the range of scores: The greater the range, the _______ the reliability coefficient
larger
To maximize a test’s reliability coefficient, the tryout sample should include people who are _____ with regard to the attribute(s) measured by the test
heterogeneous
A reliability coefficient is also affected by the probability that an examinee can select the correct answer to a test question simply by guessing. The easier it is to guess the correct answer, the ______ the reliability coefficient
smaller
While the reliability coefficient is useful for assessing the amount of variability in test scores that is due to ____ variability for a group of examinees, it does not directly indicate how much we can expect an individual examinee’s obtained score to reflect his or her true score. The standard error of _______ is useful for this purpose.
true score; measurement
The standard error of measurement is calculated by multiplying the standard deviation of test scores by the ________ of one minus the reliability coefficient
square root
If a test’s standard deviation is 10 and its reliability coefficient is .91, the standard error of measurement is equal to ______
3.0
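The arithmetic behind this answer, SEM = SD * sqrt(1 - reliability), as a minimal sketch. The function name is hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = standard deviation times the square root of (1 - reliability)."""
    return sd * sqrt(1 - reliability)

# Values from the card: SD = 10, reliability = .91, so 10 * sqrt(.09) = 10 * .3
sem = standard_error_of_measurement(10, 0.91)
print(round(sem, 2))
```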
The standard error of measurement is used to construct a ____ interval around an examinee’s obtained (“measured”) score
confidence
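A sketch of how such an interval might be built: obtained score plus or minus z times the SEM, where z comes from the normal distribution. The obtained score of 100 and the SEM of 3.0 are made-up inputs, and the function name is hypothetical.

```python
from statistics import NormalDist

def confidence_interval(obtained, sem, level=0.95):
    """Interval around an obtained score: obtained +/- z * SEM."""
    # z is the two-tailed critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return obtained - z * sem, obtained + z * sem

low, high = confidence_interval(100, 3.0)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

In practice the multipliers are often rounded: about 1 SEM for a 68% interval, 2 for 95%, and 3 for 99%.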
In terms of magnitude, the standard error of the difference between two scores is always _____ than the SEM of either score because it reflects measurement error from both test scores
larger
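A sketch of why this holds, assuming the standard formula for the standard error of the difference, sqrt(SEM1^2 + SEM2^2), which treats the two scores' measurement errors as independent. The function name and SEM values are made up.

```python
from math import sqrt

def standard_error_of_difference(sem_1, sem_2):
    """Combine the measurement error of two scores into one standard error."""
    return sqrt(sem_1 ** 2 + sem_2 ** 2)

# Hypothetical SEMs for two test scores.
sed = standard_error_of_difference(3.0, 4.0)
print(sed)  # 5.0, larger than either 3.0 or 4.0
```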