Chapter 8 Test Development Flashcards
Biased test item:
An item that favours one particular group of examinees over another when differences in group ability are controlled.
p.264
Anchor Protocol?
A test answer sheet developed by a test publisher to check the accuracy of examiners' scoring and to resolve scoring discrepancies.
How to detect a biased test item?
Methods of item analysis:
Item characteristic curves (ICCs): specific items are identified as biased if they exhibit differential item functioning.
For an unbiased item, the ICCs for the different groups should not be statistically different.
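A rough sketch of the idea, not a formal DIF procedure: match examinees on total score and compare the proportion answering the item correctly in each group at each score level. The function name, the record layout, and the sample data are assumptions made for illustration; a real analysis would fit the curves and test the difference statistically.

```python
from collections import defaultdict

def empirical_icc(records):
    """Proportion correct on one item at each total-score level, per group.

    records: iterable of (group, total_score, item_correct) tuples, where
    item_correct is 1 or 0. This layout is an illustrative assumption.
    """
    counts = defaultdict(lambda: [0, 0])  # (group, score) -> [n_correct, n_total]
    for group, score, correct in records:
        counts[(group, score)][0] += correct
        counts[(group, score)][1] += 1
    return {key: c / n for key, (c, n) in counts.items()}

# Hypothetical examinees, all with the same total score, split into groups A and B.
records = [
    ("A", 10, 1), ("A", 10, 1), ("A", 10, 0), ("A", 10, 1),
    ("B", 10, 0), ("B", 10, 0), ("B", 10, 1), ("B", 10, 0),
]
icc = empirical_icc(records)
gap = icc[("A", 10)] - icc[("B", 10)]  # a large gap at equal ability hints at bias
print(gap)
```

A gap near zero at every score level is what an unbiased item would be expected to show.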
What is the order of Test Development from conceptualization?
Test conceptualization, test construction, test tryout, analysis, revision, then back to test tryout again.
p.234
What is a good item on a norm referenced achievement test?
An item that high scorers on the test tend to answer correctly.
Low scorers on the test tend to answer that item incorrectly.
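This pattern can be put into a number with an item-discrimination index, d = (U - L) / n, sketched below. The function name, the sample data, and the 27% cut for the upper and lower groups are assumptions for illustration (27% is a common convention, not something stated on this card).

```python
def item_discrimination(scores, item_correct, fraction=0.27):
    """Item-discrimination index d = (U - L) / n.

    scores: total test score per examinee.
    item_correct: 1 or 0 per examinee for the item of interest.
    fraction: share of examinees placed in each extreme group (assumed 0.27).
    """
    n = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    lower, upper = order[:n], order[-n:]
    u = sum(item_correct[i] for i in upper)  # correct answers among high scorers
    l = sum(item_correct[i] for i in lower)  # correct answers among low scorers
    return (u - l) / n

# Hypothetical data for ten examinees.
scores = [95, 90, 88, 70, 65, 60, 40, 35, 30, 20]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d = item_discrimination(scores, item)
print(round(d, 2))
```

Values of d near +1 indicate a good item in this sense; values near 0 or below suggest the item does not separate high from low scorers.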
What pattern should occur on a criterion referenced test?
On a criterion-referenced test, the pattern of results may be the same as on a norm-referenced test:
high scorers get a particular item right whereas low scorers get it wrong.
p.235
Criterion-referenced test: difference …
Ideally, each item on a criterion-referenced test addresses whether the test taker has met a certain criterion, e.g. for a pilot.
Norm-referenced testing is insufficient when knowledge of mastery is needed.
p.236
Pilot work
Refers to the preliminary research surrounding the creation of a prototype of the test.
The test developer typically attempts to determine how best to measure a targeted construct.
What is scaling?
Scaling is the process of setting rules for assigning numbers in measurement.
A process by which a measuring device is designed and calibrated and by which numbers (scale values) are assigned to different amounts of the trait, attribute or characteristic being measured.
Stanine scale?
A scale to which raw scores are transformed so that they range from 1 to 9.
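A minimal sketch of one way to do the transformation, assuming roughly normal scores: stanines have a mean of 5 and a standard deviation of about 2, so each z-score is mapped to 2z + 5, rounded, and clipped to 1..9. The function name and sample scores are illustrative; published tests typically assign stanines from percentile bands instead.

```python
import math
from statistics import mean, pstdev

def to_stanines(raw_scores):
    """Map raw scores to stanines (1..9) with a linear z-score approximation.

    Assumes the scores are not all identical (the standard deviation must be > 0).
    """
    m, sd = mean(raw_scores), pstdev(raw_scores)
    stanines = []
    for x in raw_scores:
        z = (x - m) / sd
        value = math.floor(2 * z + 5.5)  # round half up to the nearest whole stanine
        stanines.append(min(9, max(1, value)))
    return stanines

print(to_stanines([40, 50, 60]))
```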
What is the MDBS?
The MDBS is an example of a rating scale.
Morally Debatable Behaviours Scale.
30 items, each rated on a 10-point scale from never justified to always justified.
p.239
What is a rating scale?
Rating scales are:
A grouping of words, statements or symbols on which judgements of the strength of a particular trait, attitude or emotion are indicated by the test taker.
Used to record judgements of oneself, others, experiences, or objects, and they can take several forms.
p.239
What is a summative scale?
A scale on which the final test score is obtained by summing the ratings across all the items.
p.240
What is the Likert Scale?
A summative scale used to scale attitudes.
Usually five alternative responses, sometimes seven.
Usually on an agree-disagree or approve-disapprove continuum.
Use of such scales results in ordinal-level data.
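Scoring a summative scale such as a Likert scale can be sketched as follows. The function name, the 1..5 coding of the response alternatives, and the sample ratings are assumptions for illustration.

```python
def likert_total(ratings, points=5):
    """Sum the ratings across all items; each rating must lie in 1..points."""
    for r in ratings:
        if not 1 <= r <= points:
            raise ValueError(f"rating {r} is outside 1..{points}")
    return sum(ratings)

# Hypothetical responses to five items (1 = strongly disagree, 5 = strongly agree).
total = likert_total([4, 5, 3, 2, 4])
print(total)
```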
Unidimensional rating scale?
Only one dimension is underlying the ratings.
Multidimensional rating scales.
More than one dimension is thought to guide the test taker’s responses.
This is the case when more than one dimension is tapped by an item.
p.241
Method of paired comparisons?
A scaling method that produces ordinal data.
Test-takers are presented with pairs of stimuli: two photos, two statements, two objects…
They must select one of the stimuli according to some rule.
p.241
An advantage is that it forces test takers to choose between items.
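As a sketch of how paired comparisons yield ordinal data: every pair of stimuli is presented once, the chosen stimulus earns one point, and the stimuli are then ordered by points. The function name, the `choices` lookup, and the sample judgements are hypothetical.

```python
from itertools import combinations

def rank_by_paired_comparisons(stimuli, choose):
    """Present every pair once and rank stimuli by how often each was chosen.

    choose(a, b) returns whichever of the two stimuli the test-taker selected.
    """
    wins = {s: 0 for s in stimuli}
    for a, b in combinations(stimuli, 2):
        wins[choose(a, b)] += 1
    ranking = sorted(stimuli, key=lambda s: wins[s], reverse=True)
    return ranking, wins

# Hypothetical recorded choices for three statements.
choices = {("A", "B"): "B", ("A", "C"): "A", ("B", "C"): "B"}
ranking, wins = rank_by_paired_comparisons(["A", "B", "C"], lambda a, b: choices[(a, b)])
print(ranking, wins)
```

The resulting order says only that one stimulus was preferred more often than another, not by how much, which is why the data are ordinal.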
Categorical scaling
Relies on sorting. Stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum, e.g. sorting the 30 MDBS-R cards into 3 piles: behaviours never justified, sometimes justified, always justified.
Guttman scale:
Scaling method that yields ordinal level measures.
Items on it range sequentially from weaker to stronger expressions of attitude, belief, or feeling being measured.
Its key feature is that all respondents who agree with the stronger statements will also agree with the milder statements.
Assessed by scalogram analysis.
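The defining feature can be checked for a single respondent with a short sketch. It assumes responses are coded 1 (agree) or 0 (disagree) and ordered from the mildest to the strongest statement; the function name is illustrative and this is not a full scalogram analysis.

```python
def is_guttman_consistent(responses):
    """True if agreement never reappears after the first disagreement.

    responses: 1/0 values ordered from the mildest to the strongest statement.
    """
    seen_disagree = False
    for r in responses:
        if r == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agrees with a stronger statement after rejecting a milder one.
            return False
    return True

print(is_guttman_consistent([1, 1, 1, 0, 0]))
print(is_guttman_consistent([1, 0, 1, 0]))
```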
Scalogram analysis.
An item analysis procedure and approach to test development that involves a graphic mapping of a test taker’s responses.
p.242.
Used with the Guttman scale.
Item pool
An item pool is the reservoir from which items will or will not be drawn for the final version of a test.
Item format
Variables such as the form, plan, structure, arrangement, and layout of individual test items are collectively referred to as item format.
Selected response format
Constructed response format.