Test construction (Reliability) Flashcards

1
Q

From the perspective of ____ test theory, variability in test scores reflects two factors: true differences between examinees on the attribute measured by the test and differences due to _____

A

classical; measurement (random) error

2
Q

Reliability is a measure of the amount of variability in obtained test scores that is due to _____ variability

A

true score

3
Q

A test’s reliability is commonly estimated by calculating a reliability coefficient, which is a type of _____ coefficient

A

correlation

4
Q

The reliability coefficient ranges in value from _______ and is interpreted directly as a measure of _______ variability

A

0 to +1.0; true score

5
Q

If a test has a reliability coefficient of .91, this means that ___% of variability in obtained test scores is due to _____ variability, while the remaining 9% reflects _______

A

91; true score; measurement error

6
Q

Test-retest reliability is assessed by administering a test to the same group of examinees at two different _______ and then _____ the two sets of scores

A

times; correlating

7
Q

The test-retest reliability coefficient is also known as the coefficient of _______

A

stability

8
Q

An alternate forms reliability coefficient is calculated by administering two _____ of a test to the same group of examinees and correlating the two sets of scores

A

equivalent forms

9
Q

The alternate forms reliability coefficient is also referred to as the coefficient of ______

A

equivalence (or equivalence and stability when a long period of time separates the administrations of the two forms)

10
Q

A ______ reliability coefficient is calculated by splitting the test in half and correlating examinees’ scores on the two halves

A

split-half

11
Q

Because the size of a reliability coefficient is affected by test length, the split-half method tends to _____ a test’s true reliability

A

underestimate

12
Q

The ______ formula is often used in conjunction with split-half reliability to estimate a test’s true reliability

A

Spearman-Brown
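
The split-half procedure and its Spearman-Brown correction can be sketched in a few lines of Python. This is only an illustration: the half-test scores are made-up values, and the function names `pearson_r` and `spearman_brown_split_half` are hypothetical helpers, not part of any library. The correction shown is the usual two-halves form, r_full = 2 * r_half / (1 + r_half).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def spearman_brown_split_half(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)

# Hypothetical scores of five examinees on the odd and even halves of a test.
odd_half = [10, 12, 15, 18, 20]
even_half = [11, 11, 16, 17, 21]

r_half = pearson_r(odd_half, even_half)      # split-half correlation
r_full = spearman_brown_split_half(r_half)   # corrected full-length estimate
print(f"half-test r = {r_half:.3f}, corrected r = {r_full:.3f}")
```

Whenever the half-test correlation is between 0 and 1, the corrected value is larger, which is why the uncorrected split-half coefficient underestimates reliability.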

13
Q

Coefficient ______, another method used to assess internal consistency reliability, indicates the average inter-item consistency rather than the consistency between two halves of the test

A

alpha

14
Q

The Kuder-Richardson Formula 20 can be used as a substitute for coefficient alpha when test items are scored ______

A

dichotomously
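
As a rough illustration of why KR-20 can stand in for coefficient alpha with dichotomous items, the sketch below computes both on the same made-up right/wrong data. The function names and the data are hypothetical. It assumes population variances are used throughout, in which case each item variance equals p * (1 - p) and the two coefficients coincide.

```python
from statistics import pvariance

def coefficient_alpha(item_scores):
    """Coefficient alpha from a list of examinee rows (one score per item)."""
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    item_vars = [pvariance([row[j] for row in item_scores]) for j in range(k)]
    return k / (k - 1) * (1 - sum(item_vars) / pvariance(totals))

def kr20(item_scores):
    """KR-20 for items scored 0 (wrong) or 1 (right)."""
    n = len(item_scores)
    k = len(item_scores[0])
    totals = [sum(row) for row in item_scores]
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n  # proportion passing item j
        sum_pq += p * (1 - p)
    return k / (k - 1) * (1 - sum_pq / pvariance(totals))

# Hypothetical data: five examinees, four dichotomously scored items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha_value = coefficient_alpha(responses)
kr20_value = kr20(responses)
print(round(alpha_value, 3), round(kr20_value, 3))
```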

15
Q

Split-half reliability, coefficient alpha, and KR-20 are not appropriate for speed tests because they tend to _____ the reliability of these tests

A

overestimate

16
Q

Inter-rater reliability should be assessed whenever a test is ______ scored

A

subjectively

17
Q

The scores assigned by different raters can be used to calculate a ______ coefficient, for example, the _______ statistic, which can be used when ratings represent a nominal or ordinal scale of measurement.

A

correlation (reliability); kappa

18
Q

Alternatively, percent agreement between raters can be calculated. A problem with this approach is that the resulting index of reliability can be artificially inflated by the effects of ______

A

chance agreement
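
The effect of chance agreement can be seen by comparing percent agreement with a chance-corrected index on the same ratings. The sketch below assumes Cohen's kappa for two raters as the form of the kappa statistic; the ratings are invented and the function names are hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Proportion of cases on which the two raters assign the same category."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical nominal ratings of ten cases by two raters.
rater_a = ["yes"] * 8 + ["no", "no"]
rater_b = ["yes"] * 7 + ["no", "yes", "no"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.3f}")
```

Because both raters say "yes" most of the time, much of their 80% agreement would be expected by chance, and kappa comes out far lower than the raw agreement.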

19
Q

The magnitude of a reliability coefficient is affected by several factors. In general, the longer a test, the ______ its reliability coefficient

A

larger

20
Q

The _____ formula is used to estimate the effects of lengthening or ______ a test on its reliability coefficient.

A

Spearman-Brown; shortening
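
A minimal sketch of the general Spearman-Brown formula follows. The function name `spearman_brown` and the parameter `length_factor` (new test length divided by original length) are illustrative choices, and the starting reliability of .80 is an arbitrary example value.

```python
def spearman_brown(reliability, length_factor):
    """Estimate reliability after changing test length by `length_factor`.

    length_factor > 1 lengthens the test; length_factor < 1 shortens it.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

original = 0.80                           # hypothetical current reliability
doubled = spearman_brown(original, 2.0)   # twice as many items
halved = spearman_brown(original, 0.5)    # half as many items
print(f"doubled: {doubled:.3f}, halved: {halved:.3f}")
```

The estimate assumes the added items are comparable to the original ones in content and quality.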

21
Q

If the new items do not represent the same content domain as the original items or are more susceptible to measurement error, this formula is likely to _____ the effects of lengthening the test

A

overestimate

22
Q

Like other correlation coefficients, the reliability coefficient is affected by the range of scores: The greater the range, the _______ the reliability coefficient

A

larger

23
Q

To maximize a test’s reliability coefficient, the tryout sample should include people who are _____ with regard to the attributes measured by the test

A

heterogeneous

24
Q

A reliability coefficient is also affected by the probability that an examinee can select the correct answer to a test question simply by guessing. The easier it is to guess the correct answer, the ______ the reliability coefficient

A

smaller

25
Q

While the reliability coefficient is useful for assessing the amount of variability in test scores that is due to ____ variability for a group of examinees, it does not directly indicate how much we can expect an individual examinee’s obtained score to reflect his or her true score. The standard error of _______ is useful for this purpose.

A

true score; measurement

26
Q

The standard error of measurement is calculated by multiplying the standard deviation of test scores by the ________ of one minus the reliability coefficient

A

square root

27
Q

If a test’s standard deviation is 10 and its reliability coefficient is .91, the standard error of measurement is equal to ______

A

3.0
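
The arithmetic behind this answer can be checked with a short sketch. The function name is a hypothetical helper; the formula is the one stated on the previous card (standard deviation times the square root of one minus the reliability coefficient).

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1 - reliability)

# Values from the card: SD = 10, reliability = .91.
sem = standard_error_of_measurement(10, 0.91)
print(round(sem, 2))  # 10 * sqrt(.09) = 10 * .3
```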

28
Q

The standard error of measurement is used to construct a ____ interval around an examinee’s obtained (“measured”) score

A

confidence
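
As a sketch of how such an interval might be built, the code below adds and subtracts a multiple of the SEM from the obtained score. The obtained score of 100 is a made-up value, the SEM of 3 is carried over from the earlier card, and the multipliers 1.0 (about 68%) and 1.96 (about 95%) assume normally distributed measurement error.

```python
def confidence_interval(obtained_score, sem, z=1.96):
    """Return (lower, upper) bounds: obtained score plus or minus z * SEM."""
    margin = z * sem
    return obtained_score - margin, obtained_score + margin

# Hypothetical examinee with an obtained score of 100 on a test with SEM = 3.
ci_68 = confidence_interval(100, 3, z=1.0)
ci_95 = confidence_interval(100, 3, z=1.96)
print(ci_68, ci_95)
```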

29
Q

In terms of magnitude, the standard error of the difference between two scores is always _____ than the SEM of either score because it reflects measurement error from both test scores

A

larger
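
A brief sketch shows why the standard error of the difference exceeds either SEM. It assumes the commonly used formula, the square root of the sum of the two squared SEMs; the SEM values of 3 and 4 are invented for illustration.

```python
from math import sqrt

def standard_error_of_difference(sem_a, sem_b):
    """Combine the measurement error of two scores into one standard error."""
    return sqrt(sem_a ** 2 + sem_b ** 2)

# Hypothetical SEMs for two tests.
se_diff = standard_error_of_difference(3.0, 4.0)
print(se_diff)
```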