Module 7: Test Development Flashcards
Test Development
an umbrella term for all that goes into the process of creating a test
Test Conceptualization
+ brainstorming of ideas about what kind of test a developer wants to publish
+ entails literature reviews and experimentation, as well as the creation, revision, and deletion of preliminary items
What kind of information is determined in test conceptualization?
- Construct
- Goal
- User
- Taker
- Administration
- Format
- Response
- Benefits
- Costs
- Interpretation
- Whether the test is norm-referenced or criterion-referenced
- How best to measure a targeted construct
Pilot work/study/research
preliminary research surrounding the creation of a prototype of the test
Test Construction
stage in the process that entails writing test items, revising them, formatting the test, and setting scoring rules
What kind of item should not be made?
It is not good to create an item that contains numerous ideas
Item Pool
reservoir or well from which the items will or will not be drawn for the final version of the test
Item Banks
relatively large and easily accessible collection of test questions
Computer Adaptive Testing
+ refers to an interactive, computer-administered test-taking process wherein items presented to the testtaker are based in part on the testtaker's performance on previous items
+ the test administered may be different for each testtaker, depending on the testtaker's performance on the items presented
What does Computer Adaptive Testing reduce?
Reduces floor and ceiling effects
Floor Effects
occurs when there is some lower limit on a survey or questionnaire and a large percentage of respondents score near this lower limit (testtakers have low scores)
Ceiling Effects
occurs when there is some upper limit on a survey or questionnaire and a large percentage of respondents score near this upper limit (testtakers have high scores)
Item Branching
ability of the computer to tailor the content and order of presentation of items on the basis of responses to previous items
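As an illustration of item branching, here is a minimal sketch (hypothetical item pool, difficulty values, and update rule; real adaptive tests use IRT-based estimation) of selecting each next item on the basis of the response to the previous one:

    # Hypothetical item-branching logic: present the remaining item whose
    # difficulty is closest to the current ability estimate.
    items = [
        {"id": 1, "difficulty": 0.2},
        {"id": 2, "difficulty": 0.5},
        {"id": 3, "difficulty": 0.8},
    ]

    def next_item(ability_estimate, administered):
        remaining = [i for i in items if i["id"] not in administered]
        return min(remaining, key=lambda i: abs(i["difficulty"] - ability_estimate))

    ability = 0.5              # start from an average estimate
    administered = set()
    for _ in range(2):
        item = next_item(ability, administered)
        administered.add(item["id"])
        correct = True         # stand-in for the testtaker's actual response
        ability += 0.1 if correct else -0.1   # crude illustrative update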
Item Format
form, plan, structure, arrangement, and layout of individual test items
What are the different kinds of item formats?
- Dichotomous format
- Polychotomous format
- Category format
Dichotomous Format
offers two alternatives for each item
Polychotomous Format
each item has more than two alternatives
Category Format
a format in which respondents rate a construct by placing it into one of a set of ordered categories (e.g., points on a 10-point scale)
Checklist
the subject receives a long list of adjectives and indicates whether each one is characteristic of himself or herself
Guttman Scale
items are arranged from weaker to stronger expressions of attitude, belief, or feeling
Selected-Response Format
requires testtakers to select a response from a set of alternative responses
What are the three elements of a multiple choice format?
- stem (question),
- a correct option, and
- several incorrect alternatives (distractors or foils)
Multiple Choice Format
Should have only one correct answer; alternatives should be grammatically parallel, of similar length, and fit grammatically with the stem; avoid ridiculous distractors; items should not be excessively long; use "all of the above" and "none of the above" sparingly (with four alternatives, random guessing yields about a 25% chance of a correct answer)
What are the different kinds of distractors?
- Effective distractors
- Ineffective distractors
- Cute distractors
Effective Distractors
a distractor chosen about equally by high- and low-performing groups; such distractors enhance the consistency of test results
Ineffective Distractors
may hurt the reliability of the test because they are time consuming to read and can limit the number of good items
Cute Distractors
less likely to be chosen; may affect the reliability of the test because testtakers may simply guess from the remaining options
Who are most likely to choose good distractors?
Good distractors are chosen more frequently by low scorers
Matching Item Format
Test taker is presented with two columns: Premises and Responses
Binary Choice Format
Usually takes the form of a sentence that requires the testtaker to indicate whether the statement is or is not a fact (random guessing yields a 50% chance of a correct answer)
Constructed-Response Format
requires testtakers to supply or to create the correct answer, not merely selecting it
Completion Item
requires the examinee to provide a word or phrase that completes a sentence
Short-Answer Format
Should be written clearly enough that the testtaker can respond succinctly with a short answer
Essay
allows creative integration and expression of the material
Scaling
process of setting rules for assigning numbers in measurement
Types of Selected-Response Format
- Multiple Choice
- Matching Items
- Binary Choice
Types of Constructed-Response Format
- Completion Item
- Short-Answer
- Essay
What are the primary scales of measurement?
- Nominal
- Ordinal
- Interval
- Ratio
Nominal
+ involve classification or categorization based on one or more distinguishing characteristics
+ label and categorize observations but do not make any quantitative distinctions between observations
Ordinal
+ rank ordering on some characteristic is also permissible
+ the median is the appropriate measure of central tendency
Interval
+ contains equal intervals between numbers but has no absolute zero point (a zero score does not mean a complete absence of the trait, and even negative values are interpretable)
Ratio
+ contains equal intervals and has a true zero point (a score of zero means none/null of the trait being measured)
+ easiest to manipulate statistically
What are the comparative scales of measurement?
- Paired Comparison
- Rank Order
- Constant Sum
- Q-Sort Technique
Paired Comparison
+ produces ordinal data by presenting respondents with pairs of stimuli that they are asked to compare
+ respondent is presented with two objects at a time and asked to select one object according to some criterion
Rank Order
respondents are presented with several items simultaneously and asked to rank them in order of priority
Constant Sum
respondents are asked to allocate a constant sum of units, such as points, among a set of stimulus objects with respect to some criterion
Q-Sort Technique
respondents sort objects based on similarity with respect to some criterion
Continuous Rating
rate the objects by placing a mark at the appropriate position on a continuous line that runs from one extreme of the criterion variable to the other
e.g., Rating Guardians of the Galaxy as the best Marvel Movie of Phase 4
Itemized Rating
having numbers or brief descriptions associated with each category
e.g., 1 if you like the item the most, 2 if you feel so-so about it, 3 if you hate it
Likert Scale
+ respondents indicate their own attitudes by checking how strongly they agree or disagree with carefully worded statements that range from very positive to very negative toward the attitudinal object
+ principle of measuring attitudes by asking people to respond to a series of statements about a topic, in terms of the extent to which they agree with them
Visual Analogue Scale
a 100-mm line that allows subjects to express the magnitude of an experience or belief
Semantic Differential Scale
derives the respondent's attitude toward the given object by asking him or her to select an appropriate position on a scale between two bipolar opposites
Stapel Scale
developed to measure the direction and intensity of an attitude simultaneously
Summative Scale
final score is obtained by summing the ratings across all the items
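A minimal sketch of summative (Likert-type) scoring, using made-up ratings on four 5-point items and one hypothetically reverse-scored (negatively worded) item:

    ratings = [4, 5, 2, 4]     # responses to four 5-point Likert items
    reverse_items = {2}        # index of a negatively worded item (hypothetical)
    total = sum((6 - r) if i in reverse_items else r
                for i, r in enumerate(ratings))
    print(total)               # 4 + 5 + (6 - 2) + 4 = 17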
Thurstone Scale
+ involves the collection of a variety of different statements about a phenomenon which are ranked by an expert panel in order to develop the questionnaire
+ allows multiple answers
Ipsative Scale
the respondent must choose between two or more equally socially acceptable options
Test Tryout
the test should be tried out on people who are similar in critical respects to the people for whom the test was designed
What is an informal rule of thumb for test tryouts?
An informal rule of thumb is to have no fewer than 5 subjects, and preferably as many as 10, for each item (the more, the better)
What happens if there is a risk of using few subjects in a test?
Using too few subjects risks the emergence of phantom factors (spurious factors that are merely artifacts of the small sample)
What kind of conditions should test tryouts be executed under?
Should be executed under conditions as identical as possible to the conditions under which the final test will be administered
What makes a good test item?
A good test item is one that is answered correctly by those who score high on the test as a whole
Empirical Criterion Keying
administering a large pool of test items to a sample of individuals who are known to differ on the construct being measured
Item Analysis
statistical procedures used to analyze and evaluate test items
Discriminability Analysis
employed to examine the correlation between each item and the total score of the test
Table of Specification
a blueprint of the test in terms of number of items per difficulty, topic importance, or taxonomy
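A hypothetical miniature example of such a blueprint (topics and item counts invented purely for illustration):

    Topic               Knowledge   Application   Total items
    Test construction       3            2             5
    Item analysis           4            4             8
    Test revision           2            1             3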
Guidelines for Item Writing
Define clearly what you want to measure, generate an item pool, avoid excessively long items, keep the level of reading difficulty appropriate for those who will complete the test, avoid double-barreled items, and consider mixing positively and negatively worded items
Double-Barreled Items
items that convey more than one idea at the same time
Item Difficulty
defined by the number of people who get a particular item correct
Item-Difficulty Index
calculated as the proportion of the total number of testtakers who answered the item correctly; the larger the proportion, the easier the item
Item-Endorsement Index
for personality testing; the percentage of individuals who endorse an item on a personality test
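A minimal sketch (made-up response data) of computing the index: score each response 1 for correct (or endorsed) and 0 otherwise, then take the proportion of 1s:

    responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # one item, ten testtakers
    p = sum(responses) / len(responses)          # item-difficulty (or endorsement) index
    print(p)                                     # 0.7 -> a relatively easy item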
What is the optimal average item difficulty?
The optimal average item difficulty is approximately 50%, with items on the test ranging in difficulty from about 30% to 80%
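A worked example of the commonly cited adjustment for guessing: for a four-option multiple-choice item, chance success is .25, so the optimal item difficulty is often taken as the midpoint between chance and 1.00, that is, (.25 + 1.00) / 2 = .625.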
What is the level of difficulty if the item difficulty range is 0.0-0.19?
Very difficult
What is the level of difficulty if the item difficulty range is 0.20-0.39?
Difficult
What is the level of difficulty if the item difficulty range is 0.40-0.60?
Average/moderately difficult
What is the level of difficulty if the item difficulty range is 0.61-0.79?
Easy
What is the level of difficulty if the item difficulty range is 0.80-1.0?
Very easy
Omnibus Spiral Format
items from a variety of ability areas are mixed together and arranged in order of increasing difficulty
Item-Reliability Index
provides an indication of the internal consistency of a test
What does it mean when the item-reliability index is high?
The higher the item-reliability index, the greater the test's internal consistency
Item-Validity Index
designed to provide an indication of the degree to which a test is measuring what it purports to measure
What does it mean when the item-validity index is high?
The higher the item-validity index, the greater the test's criterion-related validity
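One common formulation computes both indexes as the item-score standard deviation multiplied by a correlation: item score with total test score for the item-reliability index, and item score with criterion score for the item-validity index. A minimal sketch with made-up data (statistics.correlation requires Python 3.10+):

    import statistics

    item = [1, 0, 1, 1, 0, 1]                    # scores on one item
    total = [40, 22, 35, 38, 25, 30]             # total test scores
    criterion = [3.5, 2.0, 3.2, 3.8, 2.4, 2.9]   # scores on a criterion measure

    s_item = statistics.pstdev(item)             # item-score standard deviation
    item_reliability_index = s_item * statistics.correlation(item, total)
    item_validity_index = s_item * statistics.correlation(item, criterion)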
Item-Discrimination Index
measure of item discrimination; measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly
Extreme Group Method
compares people who have done well with those who have done poorly
Discrimination Index
the difference between these proportions (the proportion of the upper group answering the item correctly minus the proportion of the lower group)
Point-Biserial Method
the correlation between a dichotomous variable (the item score) and a continuous variable (the total test score)
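A minimal sketch (made-up data) of both approaches: the discrimination index d as the difference in proportions correct between upper and lower extreme groups, and the point-biserial correlation between a 0/1 item score and total test score (statistics.correlation requires Python 3.10+):

    import statistics

    # Extreme group method: d = p_upper - p_lower
    upper_correct, upper_n = 18, 20      # high scorers answering the item correctly
    lower_correct, lower_n = 7, 20       # low scorers answering the item correctly
    d = upper_correct / upper_n - lower_correct / lower_n   # 0.55 -> discriminates well

    # Point-biserial method: ordinary Pearson r with a dichotomous item score
    item = [1, 1, 0, 1, 0, 0, 1, 1]
    total = [42, 39, 21, 35, 25, 19, 40, 33]
    r_pb = statistics.correlation(item, total)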
What does it mean when the correlation of an item is 0.40 and above?
Very good item
What does it mean when the correlation of an item is 0.30-0.39?
Good item
What does it mean when the correlation of an item is 0.20-0.29?
Fair item
What does it mean when the correlation of an item is 0.09-0.19?
Poor item
Item-Characteristic Curve
graphic representation of item difficulty and discrimination
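One common way to express an item-characteristic curve mathematically (an assumption here, not necessarily the model this module has in mind) is the three-parameter logistic function, where b is item difficulty, a is discrimination, and c is the guessing floor:

    import math

    def icc(theta, a=1.0, b=0.0, c=0.0):
        # Probability of a correct response at ability level theta (3PL model)
        return c + (1 - c) / (1 + math.exp(-a * (theta - b)))

    for theta in (-2, -1, 0, 1, 2):
        print(theta, round(icc(theta, a=1.2, b=0.5, c=0.2), 2))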
Guessing
a problem in test development that has eluded any universally accepted solution
What will happen if an item analysis is done for a speed test?
Item analyses taken under speed conditions yield misleading or uninterpretable results
How should item analysis be handled if it is for a speed test?
Restrict item analysis on a speed test only to the items completed by the testtaker
What should the test developer do when they are analyzing a speed test?
Test developer ideally should administer the test to be item-analyzed with generous time limits to complete the test
What are the types of scoring models?
- Cumulative Model
- Class Scoring/Category Scoring
- Ipsative Scoring
Cumulative Model
the testtaker obtains a measure of the level of the trait; thus, higher scores suggest a higher level of the trait being measured
Class Scoring/Category Scoring
testtaker responses earn credit toward placement in a particular class or category with other testtakers whose pattern of responses is similar in some way
Ipsative Scoring
compares a testtaker's score on one scale within a test to another scale within that same test (the two scales typically measure unrelated constructs)
What is the process of test revision?
+ Characterize each item according to its strengths and weaknesses
+ As revision proceeds, the advantage of writing a large item pool becomes more apparent, because items that are removed must be replaced by items from the pool
+ Administer the revised test under standardized conditions to a second appropriate sample of examinees
Cross-Validation
revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion; often results in validity shrinkage
Validity Shrinkage
decrease in item validities that inevitably occurs after cross-validation
Co-validation
a test validation process conducted on two or more tests using the same sample of testtakers
Co-norming
the creation of norms or the revision of existing norms using the same sample of testtakers (co-validation carried out in conjunction with norming)
Anchor Protocol
a test protocol scored by a highly authoritative scorer that is designated as a model for scoring and as a mechanism for resolving scoring discrepancies
Scoring Drift
discrepancy between scoring in an anchor protocol and the scoring of another protocol
Differential Item Functioning
the phenomenon in which an item functions differently in one group of testtakers relative to another group known to have the same level of the underlying trait
DIF Analysis
test developers scrutinize item-response curves group by group, looking for DIF items
DIF Items
items that respondents from different groups, at the same level of the underlying trait, have different probabilities of endorsing as a function of their group membership
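A minimal sketch (made-up records) of one simple DIF screen: match testtakers on overall trait level (here, a total-score band) and compare each group's proportion endorsing the item within each band; large within-band gaps flag a potential DIF item:

    from collections import defaultdict

    records = [  # (group, total-score band, item endorsed) - made-up data
        ("A", "low", 1), ("A", "low", 0), ("B", "low", 0), ("B", "low", 0),
        ("A", "high", 1), ("A", "high", 1), ("B", "high", 1), ("B", "high", 0),
    ]

    counts = defaultdict(lambda: [0, 0])          # (group, band) -> [endorsed, n]
    for group, band, endorsed in records:
        counts[(group, band)][0] += endorsed
        counts[(group, band)][1] += 1

    for (group, band), (endorsed, n) in sorted(counts.items()):
        print(group, band, endorsed / n)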
Routing Test
subtest used to direct or route the testtaker to a suitable level of items
Item-Mapping Method
setting cut scores that entails a histographic representation of items and expert judgments regarding item effectiveness
Basal Level
the level at which the minimum criterion number of correct responses is obtained
Computer Assisted Psychological Assessment
+ standardized test administration is assured for testtakers, and variation is kept to a minimum
+ test content and length are tailored to the testtaker's ability