W3 - Chapter 8 - Test Development - DN Flashcards
1
Q
anchor protocol
A
- a test answer sheet
- developed by a test publisher
- to test the accuracy of examiners’ scoring
p.280
2
Q
biased test item
A
- an item that favours one group in relation to another
- when differences in group ability are controlled
p.271
3
Q
binary-choice item
A
- a multiple-choice item
- contains only two possible responses (e.g., true-false)
p.254
4
Q
categorical scaling
A
- system of scaling
- stimuli placed in one of two or more alternative categories that differ quantitatively with respect to some continuum
p.249
5
Q
categorical scoring
A
- a method of evaluation
- where test responses earn credit toward placement in a particular class/category
- sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category
- also called class scoring
- contrast with cumulative scoring & ipsative scoring
p.260
6
Q
ceiling effect
A
- diminished utility of a tool of assessment in distinguishing testtakers at the high end of the ability, trait, or other attribute being measured
p.259, 307
7
Q
class scoring
A
- a method of evaluation
- where test responses earn credit toward placement in a particular class/category
- sometimes testtakers must meet a set number of responses corresponding to a particular criterion to be placed in a specific category
- contrast with cumulative scoring & ipsative scoring
p.260
8
Q
comparative scaling
A
- in test development
- a method of developing ordinal scales
- through the use of a **sorting task**
- entails judging a stimulus in comparison with every other stimulus used on the test
p.249
9
Q
completion item
A
- requires an examinee to provide a word or phrase that completes a sentence
p.254
10
Q
computerized adaptive testing (CAT)
A
- an interactive, computer-administered testtaking process
- items are presented to the testtaker based in part on the testtaker's performance on previous items
p.15, 255-256
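The adaptive idea on this card can be illustrated with a toy sketch. This is not a real CAT algorithm (operational systems estimate ability with item response theory, which is omitted here); the function name and the simple "harder if correct, easier if wrong" rule are assumptions for illustration only.

```python
# Toy sketch of adaptive item selection: each next item depends on
# performance on the previous one. Illustrative only; real CAT uses
# IRT-based ability estimation, not a binary-search heuristic.

def adaptive_test(item_bank, answers_correctly, n_items=5):
    """item_bank: items sorted from easiest to hardest.
    answers_correctly: function(item) -> bool, simulating the testtaker.
    Returns the list of (item, correct) pairs administered."""
    lo, hi = 0, len(item_bank) - 1
    administered = []
    while len(administered) < n_items and lo <= hi:
        mid = (lo + hi) // 2          # begin at medium difficulty
        item = item_bank[mid]
        correct = answers_correctly(item)
        administered.append((item, correct))
        if correct:
            lo = mid + 1              # move toward harder items
        else:
            hi = mid - 1              # move toward easier items
    return administered
```

For example, simulating a testtaker who can answer any item of difficulty below 6 in a bank of difficulties 0-9, the test quickly narrows in on items near that testtaker's level.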
11
Q
co-norming
A
- the test norming process conducted on two or more tests
- using the same sample of testtakers
- when used to validate all of the tests being normed, this process may also be referred to as co-validation
p.138n4, 278
12
Q
constructed-response format
A
- a form of test item requiring a testtaker to construct or create a response
- as opposed to simply selecting a response
- contrast with selected-response format
p.252
13
Q
co-validation
A
- a test validation process conducted on two or more tests using the same sample of testtakers
- when conducted in conjunction with norming, the process may also be referred to as co-norming
p.278
14
Q
cross-validation
A
- a revalidation on a sample of testtakers
- other than the testtakers on whom test performance was originally found to be a valid predictor of some criterion
p.278
15
Q
essay item
A
- a test item that requires a testtaker to write a composition
- typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation
p.255
16
Q
expert panel
A
- in the test development process
- a group of people knowledgeable about the subject matter being tested and/or the population for whom the test is being designed
- they can provide input to improve the test's content, fairness, etc.
p.274-275
17
Q
floor effect
A
- a phenomenon arising from the diminished utility of a tool of assessment in distinguishing testtakers at the low end of the ability, trait, or other attribute being measured
p.256-259
18
Q
giveaway item
A
- a test item, usually near the beginning of a test of ability or achievement
- designed to be relatively easy
- usually for the purpose of building the testtaker's confidence or reducing test-related anxiety
p.263n4
19
Q
What three criteria must be met when correcting for the impact of guessing?
A
- must recognize that guesses are not normally totally random
- must deal with the problem of omitted items
- must account for the fact that some testtakers are luckier guessers than others
p.269-271
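The card lists the problems any correction for guessing must address; the classic correction formula itself (standard in psychometrics, though not printed on the card) is R - W/(k-1), where R is the number right, W the number wrong, and k the number of options per item. A minimal sketch, with the function name as an assumption:

```python
def corrected_score(rights, wrongs, options_per_item):
    """Classic correction-for-guessing formula: R - W/(k-1).

    Omitted items are neither rewarded nor penalized here, which is
    exactly one of the complications the card raises: how a correction
    should treat omissions is not settled by the formula itself.
    """
    return rights - wrongs / (options_per_item - 1)
```

For example, 40 right and 10 wrong on 5-option items yields 40 - 10/4 = 37.5, on the (contestable) assumption that every wrong answer was a purely random guess.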
20
Q
Guttman scale
A
- a scale whose items range sequentially from weaker to stronger expressions of the attitude or belief being measured
- constructed so that endorsement of a stronger item implies endorsement of all of the weaker items that precede it
- named after its developer, Louis Guttman
p.249
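The cumulative property on this card can be checked mechanically: under a perfect Guttman scale, a response pattern over items ordered weakest to strongest must be a run of endorsements followed by a run of non-endorsements. A small sketch (function name assumed for illustration):

```python
def is_guttman_consistent(responses):
    """Check a response pattern against the Guttman (scalogram) property.

    responses: list of 0/1 endorsements for items ordered from weakest
    to strongest expression of the attitude. A 1 appearing after any 0
    would mean a stronger item was endorsed without a weaker one,
    violating cumulativeness.
    """
    seen_zero = False
    for r in responses:
        if r == 0:
            seen_zero = True
        elif seen_zero:       # endorsement after a non-endorsement
            return False
    return True
```

So [1, 1, 1, 0, 0] is a perfect scale pattern, while [1, 0, 1, 0] is not.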
21
Q
ipsative scoring
A
- approach to scoring & interpretation
- responses & presumed strength of measured trait are interpreted relative to the measured strength of other traits for that testtaker
- contrast with class scoring & cumulative scoring
p.260
22
Q
item analysis
A
- general term used to describe various procedures
- usually statistical, designed to explore how individual items work compared to others in the test & in the context of the whole test
- e.g., to explore the level of difficulty of individual items on an achievement test
- e.g., to explore the reliability of a personality test
- contrast with qualitative item analysis
p.262-275
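Two of the standard statistics computed in an item analysis are the item-difficulty index (the proportion of testtakers answering the item correctly) and the item-discrimination index (the difference between the proportions of high and low scorers answering it correctly). A minimal sketch of both, with function names assumed:

```python
def item_difficulty(responses):
    """Item-difficulty index p: proportion of testtakers who answered
    the item correctly (responses are 0 = wrong, 1 = right)."""
    return sum(responses) / len(responses)

def item_discrimination(upper_group, lower_group):
    """Discrimination index d: proportion correct among high scorers
    minus proportion correct among low scorers. Positive d means the
    item separates stronger from weaker testtakers."""
    return item_difficulty(upper_group) - item_difficulty(lower_group)
```

For instance, an item answered correctly by 75% of the top group but only 25% of the bottom group has d = 0.5, suggesting it discriminates well.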
23
Q
item bank
A
- a collection of questions to be used in the construction of a test
p.255, 257-259, 282-284