4 Reliability MC Flashcards
1 According to classical test score theory, what happens to the true score variance as error
in a measure increases?
A it increases
B it decreases
C it remains constant
D classical test score theory makes no statement on this point
B it decreases
2 According to classical test score theory, a test score is made up of
A true score variance and nonsystematic variance
B observed score variance and true score variance
C observed score variance and error variance
D observed score variance and systematic variance
A true score variance and nonsystematic variance
3 The wording of several items on a psychological test makes it more likely that test
takers will endorse the ‘Yes’ rather than the ‘No’ option. This is best described as
A systematic variance in the test
B unsystematic variance in the test
C clever item writing
D a problem for the test taker
A systematic variance in the test
4 Systematic error in a test exerts what kind of effect on test scores?
A random
B consistent
C unknowable
D inconsistent
B consistent
5 Another way of talking about the reliability of a test for a particular purpose is to talk
about its
A dependability
B validity
C utility
D discriminability
A dependability
6 The proportion of observed score variance attributable to random error is known as
A the reliability coefficient
B the coefficient of nondetermination
C the error coefficient
D one minus the reliability coefficient
D one minus the reliability coefficient
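The relation behind this card can be sketched in a few lines, assuming the classical decomposition of observed variance into true and error variance. The variance figures and the function name `error_proportion` are hypothetical, chosen only for illustration.

```python
def error_proportion(reliability: float) -> float:
    """Proportion of observed-score variance attributable to random error."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return 1.0 - reliability


# Hypothetical variance components (illustrative values, not from any test).
true_var = 80.0
error_var = 20.0
observed_var = true_var + error_var      # observed = true + error

reliability = true_var / observed_var    # share of variance that is true score
error_share = error_proportion(reliability)
```

With these numbers the reliability is 0.8, so roughly 0.2 of the observed variance is error, which matches `error_var / observed_var`.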
7 Test-retest reliability is sometimes referred to as
A stability
B consistency
C long-term reliability
D concurrent reliability
A stability
8 The domain sampling model proposes that
A items in a test are a random sample from a population of possible items
B the only items possible have been used in the test
C items have been sampled without replacement
D the majority of items have the same content
A items in a test are a random sample from a population of possible items
9 The domain sampling model as originally conceived could not deal well with
A split half reliability
B internal consistency reliability
C equivalent forms reliability
D test-retest reliability
D test-retest reliability
10 Which of the following procedures does not yield an estimate of the reliability of a test?
A correlating the total of all even-numbered items with the total of all odd-numbered
items
B correlating the total of items in the first half of the test with the total of items in
the second half of the test
C correlating each item with the total score on the test
D finding the average of the correlation of each item with every other item
C correlating each item with the total score on the test
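Option A, the odd-even split, can be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the data layout (one list of item scores per person) and the function names are assumptions, and the response data are made up.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def odd_even_split_half(scores):
    """Correlate each person's odd-item total with their even-item total."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return pearson(odd_totals, even_totals)


# Hypothetical responses: 5 people by 4 dichotomous items.
responses = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
]
half_test_r = odd_even_split_half(responses)
```

The value returned is the correlation between two half-length tests, so in practice it is usually stepped up to full length with the Spearman-Brown formula.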
11 Estimating test reliability by correlating scores from two administrations of the test 6
months apart assumes
A the trait being measured changes over time
B the trait being measured is essentially episodic in character
C the trait being measured does not change over time
D there is a systematic practice effect on the test
C the trait being measured does not change over time
12 The reliability of expert judgment can be estimated by
A correlating the judgments made by a panel of experts over a number of instances
of judgment making
B counting the frequency of instances in which a panel of experts disagree
C finding the proportion of instances in which a panel of experts is undecided
D averaging the number of decisions a panel of experts gets wrong
A correlating the judgments made by a panel of experts over a number of instances
of judgment making
13 Inter-rater reliability
A overcomes the problems of test reliability
B is a special case of test reliability
C cannot be estimated statistically
D uses the same formula as that used for equivalent forms reliability
B is a special case of test reliability
14 The concept of ‘domain sampling’ in the psychometric theory of reliability refers to
A sampling persons from the population with whom a test may be used
B sampling items from the population of possible items that could be used in a test
C sampling tests from the population of tests available to measure a construct
D sampling methods from the population that could be used to construct a test
B sampling items from the population of possible items that could be used in a test
15 The standard error of measurement of a raw score
A increases directly as the reliability increases
B decreases directly as the reliability increases
C increases proportionately as the reliability increases
D decreases proportionately as the reliability increases
B decreases directly as the reliability increases
16 In making judgments about the precision of a score on a test we need to know
A the reliability of the test for the purpose for which we are using it
B the standard deviation of scores on the test
C the mean and standard deviation of scores on the test
D the reliability of the test for the purposes for which we are using it and the
standard deviation of scores on the test
D the reliability of the test for the purposes for which we are using it and the standard deviation of scores on the test
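The reason both quantities are needed shows up in the usual formula for the standard error of measurement, SEM = SD × sqrt(1 − reliability). The sketch below assumes that formula. The function names, the scale values (SD of 15, reliability of 0.91) and the z value of 1.96 are illustrative assumptions.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


def score_band(observed: float, sem: float, z: float = 1.96):
    """Approximate band around an observed score (z = 1.96 for about 95%)."""
    return observed - z * sem, observed + z * sem


# Hypothetical scale: standard deviation 15, reliability 0.91.
sem = standard_error_of_measurement(15.0, 0.91)
low, high = score_band(100.0, sem)
```

Here the SEM comes out at about 4.5 score points. With perfect reliability it would fall to 0, and with zero reliability it would equal the standard deviation.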
17 Equivalent forms of a test are usually developed
A when the test is first developed
B when the test’s reliability is first questioned
C when the test is first readministered
D when the test is being revised
A when the test is first developed
18 The Spearman-Brown prophecy formula is so called because it purports to indicate
A what the reliability of the test would be if certain changes were made to it
B what the individual’s true score on the test is
C what an individual’s score on the test will be at some future time
D what the person’s true score would be if the test were lengthened
A what the reliability of the test would be if certain changes were made to it
19 The Spearman-Brown prophecy formula requires
A the reliability of the current test
B the number of items in the current test
C both A and B
D neither A nor B
C both A and B
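A minimal sketch of the Spearman-Brown calculation, assuming the standard form r_new = n·r / (1 + (n − 1)·r), where n is the ratio of the new test length to the current length. The function name and the example figures are hypothetical.

```python
def spearman_brown(reliability: float, current_items: int, new_items: int) -> float:
    """Predicted reliability after changing the number of comparable items."""
    if current_items <= 0 or new_items <= 0:
        raise ValueError("item counts must be positive")
    n = new_items / current_items  # length ratio: new length / current length
    return n * reliability / (1.0 + (n - 1.0) * reliability)


# Hypothetical case: a 10-item test with reliability 0.60 is doubled to 20 items.
predicted = spearman_brown(0.60, 10, 20)
```

Doubling the test in this example gives a predicted reliability of about 0.75. The calculation uses both inputs named in the card: the current reliability and the current number of items, the latter to form the length ratio.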
20 The internal consistency of a test would be high if
A it included items that related to different aspects of the construct to be measured
B it included items that related to different constructs
C each item was drawn from a different item domain
D all the items were the same
D all the items were the same
21 A high coefficient alpha indicates that
A the test has high generalisability
B scores on the test are stable
C the test has high internal consistency
D the test has only one factor
C the test has high internal consistency
22 Reliability of a test
A can change if the range of scores on the test is smaller relative to the original
sample of scores
B is an unchanging property of a test
C changes from one administration of a test to another
D will differ depending on the mean score of the sample or the test
A can change if the range of scores on the test is smaller relative to the original
sample of scores
23 Coefficient alpha can be calculated
A only for tests with dichotomously scored items
B only for tests with items that have three or more categories
C only for tests that use a Yes/No or True/False format
D for all objectively scored tests
D for all objectively scored tests
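Coefficient alpha depends only on item variances and the variance of total scores, which is why it applies to dichotomous and multi-category items alike. The sketch below assumes the usual formula alpha = k/(k − 1) × (1 − sum of item variances / variance of totals). The function name and both data sets are made up for illustration.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of per-person lists of item scores."""
    k = len(scores[0])                      # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)


# Hypothetical dichotomous (0/1) responses: 4 people by 3 items.
dichotomous = [[1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
alpha_dichotomous = coefficient_alpha(dichotomous)

# Hypothetical rating-scale responses in which every item behaves identically.
identical_items = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha_identical = coefficient_alpha(identical_items)
```

The second data set gives an alpha of 1, the limiting case in which all the items are the same.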
24 Generalisability theory requires that we know
A the reliability of the test
B the standard error of the test
C how the test is to be used
D the mean score on the test
C how the test is to be used
25 Expectations about what constitutes a satisfactory degree of reliability
A depend on the purpose for which the test is being used
B have been determined by consensus
C seldom depart from the agreed value of 0.9
D depend on the magnitude of the standard error of measurement
A depend on the purpose for which the test is being used
26 In general the best reliabilities have been obtained with psychological tests in the
A cognitive domain
B personality domain
C motivation domain
D projective domain
A cognitive domain
27 The correlation between scores on two variables varies
A directly with the product of their reliabilities
B directly with the square root of the product of their reliabilities
C inversely with the sum of their reliabilities
D inversely with the square root of the lower of the two reliabilities
B directly with the square root of the product of their reliabilities
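The attenuation relation from classical test theory makes the link between reliability and correlation concrete: the observed correlation equals the correlation between true scores multiplied by the square root of the product of the two reliabilities. The sketch below assumes that relation. The function names and the reliabilities of 0.81 and 0.64 are illustrative.

```python
from math import sqrt


def max_observed_correlation(rel_x: float, rel_y: float) -> float:
    """Ceiling on the observed correlation between two measures."""
    return sqrt(rel_x * rel_y)


def attenuated_correlation(true_r: float, rel_x: float, rel_y: float) -> float:
    """Observed correlation implied by a given true-score correlation."""
    return true_r * sqrt(rel_x * rel_y)


# Hypothetical reliabilities for two measures.
ceiling = max_observed_correlation(0.81, 0.64)
observed = attenuated_correlation(0.50, 0.81, 0.64)
```

With these values the ceiling is about 0.72, and a true-score correlation of 0.50 would appear as roughly 0.36, so poor reliability in either measure can hold down the observed correlation.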
28 Two variables may not correlate highly
A because of the poor reliability of one or both of them
B because their standard errors of measurement are skewed in opposite directions
C because their reliabilities are unknown
D because similar items have been used in assessing both variables
A because of the poor reliability of one or both of them
29 Reliability is
A relevant when considering the score a person obtains on a test or other assessment
device
B relevant only when psychological test results are being considered but not when
expert judgements are employed
C irrelevant for most practical decision making with psychological tests
D relevant for tests of intelligence only
A relevant when considering the score a person obtains on a test or other assessment
device
30 Reliability of an assessment device can be improved within limits by
A increasing its length (e.g. using more items)
B decreasing the time taken to administer it
C supplementing it with the judgment of the assessor
D replacing it with the judgment of the assessor
A increasing its length (e.g. using more items)