TEST DEVELOPMENT Flashcards
TRUE OR FALSE. An expert panel may be used in the process of test development to provide ratings of item reliability
FALSE
TRUE OR FALSE. Item discrimination refers to the ability of a test item to identify those who score above the median versus below the median
FALSE.
The diminished ability of an assessment tool to distinguish testtakers at the low end of an ability or trait
Floor effect
In a matching item, the testtaker is presented with two columns: _____ on the left and _____ on the right
Premises; responses
The diminished ability of an assessment tool to distinguish testtakers at the high end of an ability or trait
Ceiling effect
Can be a true or false exam
Binary-choice items
Useful in measuring responses that require applications and original solutions
Essay
Testtaker responds with one of two responses
Binary-choice items
Has 3 elements (stem, correct option, distractors)
Multiple-choice
Consists of premises and responses
Matching
Probability of answering an item correctly by guessing is .5 (50%)
Binary-choice items
Sorting attitude as: acceptable, not acceptable
Categorical Scaling
Ranking the 20 behaviors provided from least acceptable (1) to most acceptable (20)
Comparative Scaling
Items are arranged from weaker to stronger expressions of the belief, attitude, or feeling being measured
Guttman Scale
Respondents who agree with stronger statements will agree with milder statements
Guttman Scale
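The Guttman property on the card above (agreeing with a stronger statement implies agreeing with every milder one) can be checked mechanically. The sketch below is only an illustration: the function name `is_guttman_consistent` and the sample patterns are hypothetical, and it assumes responses are already ordered from the mildest to the strongest statement.

```python
def is_guttman_consistent(responses):
    """Return True if a response pattern fits a perfect Guttman scale.

    `responses` is a list of booleans (True = agree), ordered from the
    mildest statement to the strongest. A consistent pattern is a run of
    agreements followed only by disagreements.
    """
    seen_disagree = False
    for agrees in responses:
        if not agrees:
            seen_disagree = True
        elif seen_disagree:
            # Agreed with a stronger statement after rejecting a milder one.
            return False
    return True


# Hypothetical response patterns for a four-statement scale.
print(is_guttman_consistent([True, True, False, False]))  # True
print(is_guttman_consistent([True, False, True, False]))  # False
```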
Describing one’s happiness on a scale of 1 to 10
Rating Scale
Compare test takers with each other
norm-referenced
addresses the issue of whether a testtaker meets the criteria
criterion-referenced
setting rules for assigning numbers
scaling
Measurement process that L. L. Thurstone helped pioneer
scaling
units that make up a test
test items
refers to all techniques used to assess the characteristics of test items and evaluate their quality
item analysis
relies on judgments from reviewers concerning the substantive and stylistic characteristics of items as well as their accuracy and fairness
qualitative item analysis
involves a variety of statistical procedures designed to ascertain the psychometric characteristics of items based on the responses obtained from the samples used in test development
quantitative item analysis
require the examinee to create responses within the structure provided by each item.
Constructed-Response Items
the incorrect alternatives in a multiple-choice item.
Distractors
the question part of a multiple-choice item.
Stem
You can assess higher-level thinking with ______
multiple choice questions.
a measurement format that requires learners to classify a series of examples using the same alternatives.
Matching
Content should be homogeneous (all material of the same type).
Matching
a measurement format that includes statements of varying complexity that learners have to judge as being correct or incorrect.
True-false format
a measurement format that includes a question or an incomplete statement that requires the learner to supply appropriate words, numbers, or symbols.
Completion format
It is very difficult to create ______ items where only one answer is correct.
Completion
a measurement format that requires students to make extended written responses to questions or problems.
Essay format
Scoring them is a challenge.
Essay format
is a scoring scale that describes the criteria for grading.
rubric
A form of assessment in which students demonstrate their knowledge and skill by carrying out an activity or producing a product.
Performance Assessment
a relatively large and easily accessible collection of test questions
Item Bank
interactive computer-administered test-taking process wherein items presented to the testtaker are based in part on the testtaker’s performance on previous items.
Computerized Adaptive Testing
ability of the computer to tailor the content of the test items on the basis of responses to previous items
Item branching
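As a rough illustration of item branching, the sketch below moves to a harder item after a correct response and to an easier one after an incorrect response. This staircase rule is a deliberate simplification and an assumption of this example; operational adaptive tests typically select items using item response theory models. The function name `next_item_index` and the five-item bank are hypothetical.

```python
def next_item_index(current, was_correct, n_items):
    """Pick the next item in a bank sorted from easiest (0) to hardest.

    Simplified staircase rule (an assumption for illustration): step up
    one item after a correct response, step down one after an incorrect
    response, staying within the bounds of the bank.
    """
    step = 1 if was_correct else -1
    return min(max(current + step, 0), n_items - 1)


# Hypothetical session: five items, starting at the middle item (index 2).
position = 2
path = []
for was_correct in [True, True, False]:
    position = next_item_index(position, was_correct, n_items=5)
    path.append(position)
print(path)  # [3, 4, 3]
```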
test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way.
Class scoring
comparing a testtaker’s score on one scale within a test to another scale within the same test
Ipsative scoring
should be conducted with people who are similar in critical respects to the people for whom the test was designed
Test Tryout
the percentage or proportion of test takers who correctly answer the item
Item Difficulty Index (Item Difficulty Level)
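The item difficulty index defined above is simply the number of correct responses divided by the number of testtakers. A minimal sketch, using a hypothetical function name `item_difficulty` and made-up scores (1 = correct, 0 = incorrect):

```python
def item_difficulty(item_scores):
    """Proportion of testtakers who answered the item correctly (p)."""
    if not item_scores:
        raise ValueError("item_scores must not be empty")
    n_correct = sum(1 for score in item_scores if score)
    return n_correct / len(item_scores)


# Hypothetical data: one entry per testtaker for a single item.
item_scores = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
print(item_difficulty(item_scores))  # 0.7
```

A higher value means more testtakers answered correctly, so the item is easier.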
Refers to how well an item can accurately discriminate between test takers who differ on the construct being measured
Item Discrimination
Item Discrimination Formula
D= pHIGH - pLOW
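The formula above can be sketched in code as follows. The card does not say how the high and low groups are formed; this example assumes the common convention of taking the top and bottom 27% of testtakers by total test score. The function name `item_discrimination` and all data are hypothetical.

```python
def item_discrimination(total_scores, item_scores, fraction=0.27):
    """Compute D = pHIGH - pLOW for one item.

    total_scores: total test score per testtaker.
    item_scores:  1 (correct) or 0 (incorrect) on the item, same order.
    fraction:     share of testtakers in each extreme group
                  (0.27 is an assumed convention, not from the card).
    """
    n = len(total_scores)
    if n == 0 or n != len(item_scores):
        raise ValueError("need two non-empty lists of equal length")
    k = max(1, round(n * fraction))
    # Indices of testtakers ordered from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    p_high = sum(item_scores[i] for i in high) / k
    p_low = sum(item_scores[i] for i in low) / k
    return p_high - p_low


# Hypothetical data for ten testtakers.
totals = [95, 90, 85, 80, 60, 55, 50, 40, 35, 30]
item = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
d = item_discrimination(totals, item)
print(round(d, 2))  # pHIGH = 1.0, pLOW = 1/3, so D is about 0.67
```

D ranges from -1 to +1; a negative value means low scorers were more likely than high scorers to answer the item correctly.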
items for which equally able persons from different cultural groups have different probabilities of success
item bias
assessing the quality of each alternative
analysis of item alternatives
graphic representation of item difficulty and discrimination
item-characteristic curve
the degree, if any, to which a test item is biased
item fairness