Test Construction Flashcards

1
Q

When item response theory has been used as the basis for test construction, an examinee’s score on the test provides information about his/her:
Select one:

A.
future status on an external criterion

B.
status on a latent trait or ability

C.
performance relative to other examinees

D.
performance relative to a prespecified standard

A

The correct answer is B.

Item response theory differs from classical test theory in several ways, including the interpretation of an examinee’s scores. One of the primary characteristics of item response theory is that it is based on the assumption that “the performance of an examinee on a test item can be explained (or predicted) by a set of factors called ‘traits,’ ‘latent traits,’ or ‘abilities.’” Source: D. H. Henard, Item response theory, in L. G. Grimm and P. R. Yarnold (eds.), Reading and understanding more multivariate statistics, Washington, DC, APA, 2000.

Answer A: Predictive validity is a type of criterion-related validity used when the purpose of testing is to predict future status on a criterion.

Answer C: Norm-referenced interpretation involves comparing an examinee’s performance to other examinees.

Answer D: Criterion-referenced interpretation provides information about an examinee’s score in relation to a prespecified standard.

2
Q

To construct the 68% confidence interval for an examinee’s obtained test score, you would need the examinee’s score and:
Select one:

A.
the test’s mean

B.
the standard deviation

C.
the standard error of measurement

D.
the standard error of estimate

A

The correct answer is C.

To construct a confidence interval around an obtained test score, you need the standard error of measurement (which is calculated from the test’s standard deviation and reliability coefficient). To construct a 68% confidence interval, you add and subtract one standard error of measurement to and from the examinee’s obtained test score.

Answer A: The mean is not needed to construct a confidence interval.

Answer B: The standard deviation is one of the elements needed to calculate the standard error of measurement but, by itself, cannot be used to construct a confidence interval.

Answer D: The standard error of estimate is used to construct a confidence interval around a predicted criterion score, not around an obtained test score.
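The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original card: the function names and the example values (an obtained score of 100, a standard deviation of 15, a reliability coefficient of .84) are assumptions chosen for the arithmetic. It uses the standard formula for the standard error of measurement, the standard deviation times the square root of one minus the reliability coefficient.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_interval(obtained_score, sd, reliability, n_sem=1.0):
    """Interval of +/- n_sem standard errors around the obtained score.

    n_sem=1.0 gives the 68% interval discussed in the card.
    """
    sem = standard_error_of_measurement(sd, reliability)
    return obtained_score - n_sem * sem, obtained_score + n_sem * sem


# Hypothetical example values (not from the card).
low, high = confidence_interval(100, sd=15, reliability=0.84)
print(f"68% confidence interval: {low:.1f} to {high:.1f}")
```

With these example values the standard error of measurement is 6, so the 68% interval runs from about 94 to about 106. Note that the test's mean never enters the calculation.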

3
Q

Assuming that the following scores are all from the same normal distribution of scores, which of the following lists the scores in order from lowest to highest?
Select one:

A.
Z-score of +1.0, percentile rank of 70, T-score of 80

B.
Z-score of +.75, percentile rank of 84, T-score of 65

C.
Z-score of +1.25, percentile rank of 95, T-score of 55

D.
Z-score of +.50, percentile rank of 98, T-score of 60

A

The correct answer is B.

For the exam, you want to be familiar with the relationship between z-scores, percentile ranks, and T-scores. Converting each score to standard deviation units identifies the correct answer to this question. A z-score of +.75 is three-quarters of a standard deviation above the mean; in a normal distribution, a percentile rank of 84 is approximately one standard deviation above the mean; and a T-score of 65 is 1.5 standard deviations above the mean (T-scores have a mean of 50 and a standard deviation of 10). Therefore, this answer lists the scores in order from lowest to highest.

Answer A: A z-score of +1.0 is one standard deviation above the mean, a percentile rank of 70 is less than one standard deviation above the mean, and a T-score of 80 is three standard deviations above the mean.

Answer C: A z-score of +1.25 is 1.25 standard deviations above the mean, a percentile rank of 95 is more than 1.5 standard deviations above the mean, and a T-score of 55 is one-half standard deviation above the mean.

Answer D: A z-score of +.50 is one-half standard deviation above the mean, a percentile rank of 98 is approximately two standard deviations above the mean, and a T-score of 60 is one standard deviation above the mean.
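The conversion to standard deviation units can be checked with a short Python sketch. It is offered as an illustration only: the helper names are hypothetical, and it assumes a normal distribution and the usual T-score scale (mean 50, standard deviation 10). Percentile ranks are converted with the standard library's inverse normal CDF.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def z_from_percentile(percentile_rank):
    """z-score at or below which the given percentage of scores fall."""
    return STD_NORMAL.inv_cdf(percentile_rank / 100.0)


def z_from_t(t_score):
    """T-scores have a mean of 50 and a standard deviation of 10."""
    return (t_score - 50.0) / 10.0


def is_lowest_to_highest(z, percentile_rank, t_score):
    """True if the three scores, expressed as z-scores, strictly increase."""
    values = [z, z_from_percentile(percentile_rank), z_from_t(t_score)]
    return all(a < b for a, b in zip(values, values[1:]))


# The four answer options from the card: (z-score, percentile rank, T-score).
options = {
    "A": (1.00, 70, 80),
    "B": (0.75, 84, 65),
    "C": (1.25, 95, 55),
    "D": (0.50, 98, 60),
}
for label, scores in options.items():
    print(label, is_lowest_to_highest(*scores))
```

Only option B prints True, which matches the explanation above.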

4
Q

The coefficient of stability is useful for:
Select one:

A.
assessing the reliability of a test that is administered on two different occasions to the same group of examinees.

B.
assessing the reliability of two versions of a test that are administered to the same group of examinees.

C.
evaluating the validity of a test across administrations of the test at two different times.

D.
evaluating the validity of a test across different groups of examinees.

A

The correct answer is A.

To answer this question, you need to know that the “coefficient of stability” is another name for the test-retest reliability coefficient. The coefficient of stability indicates the degree of consistency (reliability) of a test across time.

Answer B: Administering two versions of a test to the same group of examinees yields the alternate forms reliability coefficient, which is also called the coefficient of equivalence when the two forms are administered at the same time.

Answer C: This is incorrect, as a coefficient of stability is a measure of reliability, not validity.

Answer D: This is incorrect, as a coefficient of stability is a measure of reliability, not validity.
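As a rough illustration of the idea, a coefficient of stability can be computed as the Pearson correlation between scores from the two administrations. The sketch below is an assumption-laden example rather than anything from the card: the function name and the two lists of scores are made up for demonstration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for the same five examinees on two occasions.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

stability = pearson_r(time_1, time_2)
print(f"Coefficient of stability (test-retest r): {stability:.2f}")
```

A value close to 1.0 indicates that examinees keep roughly the same relative standing across the two administrations.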

5
Q

A factor matrix indicates that Test A has a factor loading of .40 on Factor I and a factor loading of .30 on Factor II. Assuming the factors are orthogonal, what is the communality for Test A?
Select one:

A.
0.1

B.
0.25

C.
0.45

D.
0.7

A

The correct answer is B.

When factors are orthogonal (uncorrelated), a test’s factor loadings can be squared and summed to calculate the communality (the amount of variability in test scores explained by the identified factors). Test A’s communality is equal to .40 squared (.16) plus .30 squared (.09), which is .25.

Answers A, C, and D: These responses are incorrect because Test A’s communality is equal to .25.
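The arithmetic above can be written as a one-line function. This is a small sketch with a hypothetical function name; it applies only when the factors are orthogonal, as the question states.

```python
def communality(loadings):
    """Sum of squared factor loadings (valid only for orthogonal factors)."""
    return sum(loading ** 2 for loading in loadings)


# Test A's loadings on Factor I and Factor II, from the question.
h2 = communality([0.40, 0.30])
print(round(h2, 2))  # 0.25
```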

6
Q

A predictor’s criterion-related validity coefficient is .70. This means that ____% of variability in criterion scores is explained by variability in predictor scores.
Select one:

A.
30

B.
49

C.
51

D.
70

A

The correct answer is B.

The criterion-related validity coefficient is interpreted like any other correlation coefficient for two variables – i.e., it is squared to obtain a measure of shared variability. A validity coefficient of .70 indicates that 49% (.70 squared) of variability is shared by the predictor and criterion – or, put another way, that 49% of variability in criterion scores is explained by variability in predictor scores.

Answers A, C, and D: These responses are incorrect because a criterion-related validity coefficient of .70 means that 49% of variability in criterion scores is explained by variability in predictor scores.
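The squaring step can be checked with a minimal sketch. The function name is a hypothetical choice for illustration.

```python
def percent_shared_variance(validity_coefficient):
    """Square a correlation coefficient and express it as a percentage."""
    return (validity_coefficient ** 2) * 100.0


explained = percent_shared_variance(0.70)
unexplained = 100.0 - explained
print(f"Explained: {explained:.0f}%, unexplained: {unexplained:.0f}%")
```

The remaining 51% (option C) is the variability in criterion scores that the predictor does not explain.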

7
Q

A predictor’s ___________ is calculated by dividing the number of true positives by the number of true positives plus false negatives.
Select one:

A.
positive predictive value

B.
negative predictive value

C.
sensitivity

D.
specificity

A

The correct answer is C.

The accuracy of a predictor can be described in terms of its sensitivity, specificity, positive predictive value, and negative predictive value. A predictor’s sensitivity refers to the proportion of individuals in the validation sample who have the characteristic measured by the predictor and were accurately identified by the predictor as having that characteristic. It provides an index of the predictor’s ability to identify true positives. Sensitivity is calculated by dividing the number of true positives by the number of true positives plus false negatives.

Answer A: A predictor’s positive predictive value indicates the probability that an individual identified as a positive is a true positive. It is calculated by dividing the number of true positives by the number of true and false positives.

Answer B: A predictor’s negative predictive value indicates the probability that an individual identified as a negative is a true negative. It is calculated by dividing the number of true negatives by the number of true and false negatives.

Answer D: A predictor’s specificity refers to the proportion of individuals in the validation sample without the characteristic measured by the predictor who were correctly identified by the predictor as not having that characteristic. It provides an index of the predictor’s ability to identify true negatives. Specificity is calculated by dividing the number of true negatives by the number of true negatives plus false positives.
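The four indices defined above can be computed side by side from the cells of a 2 x 2 classification table. The sketch below is illustrative only: the function name and the counts (40 true positives, 10 false positives, 45 true negatives, 5 false negatives) are made-up values, not data from the card.

```python
def accuracy_indices(tp, fp, tn, fn):
    """Return the four indices from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),                # true positives / all who have the characteristic
        "specificity": tn / (tn + fp),                # true negatives / all who lack the characteristic
        "positive_predictive_value": tp / (tp + fp),  # true positives / all identified as positive
        "negative_predictive_value": tn / (tn + fn),  # true negatives / all identified as negative
    }


# Hypothetical validation sample of 100 people.
indices = accuracy_indices(tp=40, fp=10, tn=45, fn=5)
for name, value in indices.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity divide by the number of people who actually have (or lack) the characteristic, while the two predictive values divide by the number of people the predictor labeled positive (or negative).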

8
Q

To assess the reliability of a characteristic that fluctuates in severity or intensity over time, you would be best advised to use which of the following?
Select one:

A.
Coefficient of equivalence

B.
Coefficient of stability

C.
Coefficient of determination

D.
Coefficient of internal consistency

A

The correct answer is D.

If the characteristic measured by a test fluctuates over time (i.e., is a “state”), it would not be appropriate to assess the test’s reliability using a method that requires administering the test (or alternative forms of the test) at different times. A coefficient of internal consistency would be appropriate for a test that measures a characteristic that fluctuates over time since it requires administering the test only once.

Answer A: A coefficient of equivalence is obtained when equivalent forms reliability is used. In most cases, it is necessary to administer the different forms at different times; therefore, this method would not be the best for a test that measures a characteristic that fluctuates over time.

Answer B: The coefficient of stability is obtained when using test-retest reliability, which would also be inappropriate for a measure of a characteristic that fluctuates over time.

Answer C: The coefficient of determination is not a measure of reliability.
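To make the single-administration point concrete, here is a sketch of one commonly used coefficient of internal consistency, Cronbach's alpha. The card does not name a specific coefficient, so the choice of alpha is an assumption, as are the function name and the response data, which are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha from one administration of a test.

    responses: one row per examinee, one item score per column.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])                      # number of items
    item_columns = list(zip(*responses))       # transpose rows into item columns
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum_item_variances / total_variance)


# Hypothetical data: five examinees answering four items on one occasion.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Because every score comes from the same sitting, fluctuations in the characteristic between testing occasions cannot lower the coefficient.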
