Exam 3 Flashcards
computerized adaptive testing (CAT)
an interactive, computer-administered test-taking process wherein items presented to the testtaker are based in part on the testtaker’s performance on previous items
efficient in testing time and number of items
and reduces floor and ceiling effects
floor effect
diminished ability of a test to distinguish testtakers at the low end of the continuum (the test is too difficult)
ceiling effect
diminished ability of a test to distinguish testtakers at the high end of the continuum (the test is too easy)
cumulatively scored
assumption that the higher the score on the test, the higher the testtaker is on the ability, trait, or other characteristic that the test purports to measure; the total score falls along a continuum
class scoring
responses earn credit toward placement in a particular class or category with other testtakers whose pattern of responses is presumably similar in some way
ipsative scoring
comparing a testtaker’s score on one scale within a test to another scale within that same test
test tryout
the test should be tried out on people similar to the population it was designed for
5-10 respondents per item; the larger the sample, the higher the generalizability
administered in the same manner and with the same instructions as the final test
what is a good item?
reliable and valid
discriminates testtakers
item-difficulty index
the proportion of respondents answering an item correctly
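A minimal sketch of the calculation: the item-difficulty index (often written p) is the number of testtakers who answered the item correctly divided by the total number who responded, so a higher p means an easier item. The function name and the sample data are hypothetical.

```python
def item_difficulty(responses):
    """Proportion of respondents who answered one item correctly.

    `responses` holds 1 (correct) or 0 (incorrect) for each testtaker.
    """
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


# Hypothetical data: 10 testtakers, 7 of whom answered correctly.
scores = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
p = item_difficulty(scores)  # 7 correct out of 10
```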
item-endorsement index
the percentage of agreement as opposed to percentage correct
item-reliability index
indication of the internal consistency of the scale
factor analysis can also provide an indication of whether items that are supposed to be measuring the same thing load on a common factor (that is, how the items correlate with each other)
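A rough sketch, assuming the usual textbook formula in which the item-reliability index is the item-score standard deviation (s) multiplied by the correlation (r) between the item score and the total test score. The helper names and data are hypothetical, and the code does not handle an item that every testtaker answered the same way (the correlation is undefined in that case).

```python
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_reliability_index(item_scores, total_scores):
    """Item SD times the item-total correlation (assumed formula)."""
    s = statistics.pstdev(item_scores)  # population SD of the item scores
    r = pearson_r(item_scores, total_scores)
    return s * r


# Hypothetical data for 10 testtakers: one item (1/0) and total test scores.
item = [1, 1, 1, 0, 1, 0, 1, 0, 1, 1]
totals = [12, 30, 25, 8, 28, 15, 22, 10, 27, 18]
index = item_reliability_index(item, totals)
```

Because r cannot exceed 1, the index can never be larger than the item's standard deviation.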
item-validity index
allows test developers to evaluate the validity of items in relation to a criterion measure
item-discrimination index
indicates how adequately an item separates or discriminates between high scorers and low scorers
d-value
the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly
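A minimal sketch of computing d as the proportion correct in the high-scoring group minus the proportion correct in the low-scoring group. It assumes the groups are the top and bottom 27% of testtakers by total score, a common convention; the cutoff, function name, and data are assumptions for illustration.

```python
def d_value(item_scores, total_scores, fraction=0.27):
    """Item-discrimination index: p(upper group) - p(lower group)."""
    n = len(total_scores)
    if n == 0 or n != len(item_scores):
        raise ValueError("need matching, non-empty score lists")
    k = max(1, round(n * fraction))  # size of each extreme group
    # Indices of testtakers ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower


# Hypothetical data for 10 testtakers: one item (1/0) and total test scores.
item = [1, 1, 1, 0, 1, 0, 1, 0, 1, 1]
totals = [12, 30, 25, 8, 28, 15, 22, 10, 27, 18]
d = d_value(item, totals)
```

d ranges from -1 to +1. A positive d means more high scorers than low scorers got the item right; a negative d flags an item that low scorers answered correctly more often than high scorers.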
a parameter
relatedness (slope) of the item to the latent construct
discrimination
b parameter
point on the latent construct where the probability of endorsing the item equals 0.50 while controlling for mean differences along the continuum
difficulty
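To see how the a and b parameters work together, here is a sketch assuming the two-parameter logistic (2PL) model, one common IRT form; the cards do not say which model is meant, so treat that choice and the function name as assumptions. The probability of endorsing the item rises with the testtaker's level on the latent construct (theta), a sets how steep the rise is, and b is the point where the probability equals 0.50.

```python
import math


def prob_endorse(theta, a, b):
    """2PL item characteristic curve: P = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# At theta == b the probability is 0.50 regardless of a.
p_at_b = prob_endorse(theta=0.5, a=1.2, b=0.5)

# One unit above b, a steeper slope (larger a) gives a higher probability.
p_steep = prob_endorse(theta=1.5, a=2.0, b=0.5)
p_flat = prob_endorse(theta=1.5, a=0.5, b=0.5)
```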
item fairness
the degree to which a test item is biased
speed tests
the closer an item is to the end of the test, the more difficult it may appear to be
differential item functioning
an item functions differently in one group of testtakers than in another group known to have the same level of the underlying trait
qualitative methods
techniques of data generation and analysis that rely primarily on verbal rather than mathematical or statistical procedures
examples of qualitative methods
think aloud, expert panels, sensitivity review
revision in new test development
items are evaluated for their strengths and weaknesses
some items are replaced with items from the item pool
revised tests will be administered under standardized conditions to a second sample
once a test has been finalized, norms may be developed from the normative sample, and the test is then said to be standardized
cross-validation
revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion
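validity shrinkage: the decrease in item validities that occurs after cross-validation (a test appears most valid on the sample it was first developed with)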
co-validation
a test validation process conducted on two or more tests using the same sample of testtakers
economical and minimizes sampling error
applications of IRT in building and revising tests
evaluating existing tests for the purpose of mapping test revisions
determining measurement equivalence across testtaker populations
developing item banks
intelligence
a multifaceted capacity that is dynamic across the lifespan