Test Development Flashcards

1
Q

It is the product of the thoughtful and sound application of established principles of test construction

A

Test development

2
Q

1st step of Test development

A

Test conceptualization (what, how, who, when, should?)

3
Q

Preliminary research surrounding the creation of a prototype of the test

A

Pilot study/research

4
Q

2nd step of test development

A

Test construction

5
Q

Process of setting rules for assigning numbers in measurement

A

scaling

6
Q

Credited for being at the forefront of efforts to develop methodologically sound scaling methods

A

L. L. Thurstone

7
Q

Type of scale that consists of a grouping of words, statements, or symbols on which judgments of the strength of a particular trait, attitude, or emotion are indicated by the testtaker

A

Rating scale

8
Q

A scale where the final score is obtained by summing the ratings across all items (e.g. Likert Scale)

A

Summative scale
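
The summing step on this card can be sketched in a few lines. This is a minimal illustration only: the function name and the five sample ratings are hypothetical, and a real scale may need extra steps (such as reverse-keyed items) that the card does not cover.

```python
def summative_score(ratings):
    """Return one respondent's total score by summing their item ratings."""
    return sum(ratings)


# Hypothetical respondent: five items rated on a 1-5 agreement scale.
ratings = [4, 5, 3, 4, 2]
total = summative_score(ratings)
print(total)  # 18
```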

9
Q

A scale where test takers are presented with pairs of stimuli which they are asked to compare

A

Method of paired comparison

10
Q

Entails sorting tasks and judgments of a stimulus in comparison with every other stimulus on the scale (e.g. sort items from most justifiable to least justifiable)

A

Comparative scaling (ordinal)

11
Q

Stimuli placed into one of two or more alternative categories that differ quantitatively with respect to some continuum

A

categorical scaling

12
Q

Respondents who agree with stronger statements of the attitude will also agree with the milder statements

A

Guttman scale (ordinal)
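
The ordering property on this card can be checked mechanically. The sketch below assumes responses are coded 1 (agree) and 0 (disagree) and that items are listed from mildest to strongest; the function name is hypothetical. A pattern fits the Guttman model when no agreement appears after a disagreement.

```python
def is_guttman_consistent(responses):
    """Check one response pattern against the Guttman ordering.

    responses: 1 (agree) or 0 (disagree), ordered from the mildest
    statement to the strongest. Agreeing with a stronger statement
    after disagreeing with a milder one breaks the pattern.
    """
    seen_disagree = False
    for r in responses:
        if r == 0:
            seen_disagree = True
        elif seen_disagree:
            return False
    return True


print(is_guttman_consistent([1, 1, 0, 0]))  # True
print(is_guttman_consistent([1, 0, 1, 0]))  # False
```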

13
Q

Item analysis procedure and approach to test development that involves a graphic mapping of a testtaker’s responses

A

Scalogram analysis

14
Q

Scaling method used to obtain data that are presumed to be interval in nature

A

Equal-appearing intervals (Thurstone)

15
Q

Reservoir or well from which items will or will not be drawn for the final version of the test

A

Item pool

16
Q

Parts of a multiple-choice item format question

A

stem (sentence)
correct option
distractors/foils

17
Q

Also called a short-answer item

A

Completion item

18
Q

Limitations of essay items

A

Focus on a limited area; subjectivity in scoring

19
Q

Relatively large and easily accessible collection of test questions

A

item bank

20
Q

Interactive, computer-administered test taking process wherein items presented to the testtaker are based in part on the testtaker’s performance on previous items

A

Computerized-adaptive testing (CAT)

21
Q

Ability of the computer to tailor the content and order of the presentation of test items on the basis of responses to previous items

A

Item branching
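
A toy version of branching can make the idea concrete. The rule below (move up one difficulty level after a correct answer, down one after an incorrect answer) is an assumption chosen for illustration, not a rule stated on the cards or used by any particular CAT system; the function names and level numbering are hypothetical.

```python
def next_level(current_level, was_correct, n_levels):
    """Pick the difficulty level of the next item (assumed up/down rule)."""
    step = 1 if was_correct else -1
    # Clamp so the level stays within 0 .. n_levels - 1.
    return max(0, min(n_levels - 1, current_level + step))


def levels_presented(outcomes, start_level, n_levels):
    """Return the difficulty level of each item actually presented."""
    level = start_level
    presented = []
    for was_correct in outcomes:
        presented.append(level)
        level = next_level(level, was_correct, n_levels)
    return presented


# Hypothetical testtaker: correct, correct, incorrect, correct.
print(levels_presented([True, True, False, True], start_level=2, n_levels=5))
# [2, 3, 4, 3]
```

Each item shown depends on the response to the previous one, which is the tailoring the card describes.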

22
Q

Most commonly used scoring model

A

Cumulative scoring

23
Q

A type of scoring used by some diagnostic systems wherein individuals must exhibit a certain number of symptoms to qualify for a specific diagnosis

A

Class/categorical scoring

24
Q

Compare testtaker’s score on one scale within a test to another scale within that same test

A

Ipsative scoring

25
Q

3rd step in test development

A

test tryout

26
Q

4th step in test development

A

Item analysis

27
Q

Items that spur motivation and positive testtaking attitude and lessen anxiety

A

Giveaway items

28
Q

Percentage of people who said yes to, agreed with, or endorsed the item (rather than the percentage who passed it)

A

Item endorsement index

29
Q

Range of the optimal item difficulty

A

0.3 to 0.8 (higher values indicate easier items)

30
Q

Formula for OID

A

(chance performance + 1.0) / 2, i.e. the midpoint between the chance success proportion and 1.0

31
Q

OID for true-false item

A

0.75 (chance=0.5)

32
Q

OID for multiple choice item 4 options

A

0.63 (chance=0.25)

33
Q

OID for multiple choice item 5 options

A

0.60 (chance=0.2)
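
The three OID values on the cards above can be reproduced by taking the midpoint between the chance success proportion and 1.0. A minimal sketch, assuming chance is 1 divided by the number of response options (the function name is hypothetical):

```python
def optimal_item_difficulty(n_options):
    """Midpoint between chance success and a perfect 1.0."""
    chance = 1 / n_options
    return (chance + 1.0) / 2


for n_options in (2, 4, 5):
    print(n_options, round(optimal_item_difficulty(n_options), 3))
# 2 0.75
# 4 0.625
# 5 0.6
```

The four-option value of 0.625 is the figure that the card rounds to 0.63.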

34
Q

Equal to the product of the item-score standard deviation and the correlation between the item score and the total test score

A

Item reliability index

35
Q

Item Analysis Technique for Questions with right/wrong answers

A

Item Difficulty

Item Discrimination

Distractor Analysis

36
Q

Item Analysis Techniques for either right/wrong answers or self-report scales

A

Item reliability index

Cronbach’s alpha

37
Q

Equal to the product of the item-score standard deviation and the correlation between the item score and the criterion score

A

Item validity index
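
The item reliability index and the item validity index share one shape: item-score SD times a correlation, with the total test score in one case and the criterion score in the other. The sketch below is an illustration under stated assumptions: it uses the population SD (dividing by n) and a Pearson correlation, and all function names and sample data are hypothetical.

```python
from math import sqrt


def pstdev(xs):
    """Population standard deviation (divides by n)."""
    mean = sum(xs) / len(xs)
    return sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    return cov / (pstdev(xs) * pstdev(ys))


def item_index(item_scores, other_scores):
    """Item SD times the item's correlation with another score.

    Pass total test scores for the item reliability index,
    or criterion scores for the item validity index.
    """
    return pstdev(item_scores) * pearson_r(item_scores, other_scores)


# Hypothetical data: four testtakers, one right/wrong item.
item = [1, 1, 0, 0]
total_scores = [10, 8, 5, 3]
print(item_index(item, total_scores))
```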

38
Q

How adequately an item separates or discriminates between high scorers and low scorers on the entire test

A

Item discrimination index

39
Q

What are the key properties of the Item-discrimination index?

A

Symbolised by d

  • Compares performance on a particular item by the high-ability group and the low-ability group (i.e., the top 27% and the bottom 27% of scorers on the whole test)
  • Items that discriminate well have a high positive value (to a maximum of +1)
  • A negative d value is a red flag: it means low scorers are doing better on that item than high scorers
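
The properties above follow from the usual definition d = (U - L) / n, where U and L are the numbers of upper-group and lower-group testtakers who answered the item correctly and n is the size of each group. A minimal sketch, assuming 0/1 item scores, a rounded 27% group size, and arbitrary handling of tied total scores (the function name and data are hypothetical):

```python
def discrimination_index(item_correct, total_scores, fraction=0.27):
    """Compute d = (U - L) / n for one right/wrong item.

    item_correct: 1 or 0 per testtaker for this item.
    total_scores: each testtaker's score on the whole test.
    """
    n_takers = len(total_scores)
    group_size = max(1, round(n_takers * fraction))
    # Indices of testtakers sorted from lowest to highest total score.
    order = sorted(range(n_takers), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    u = sum(item_correct[i] for i in upper)
    l = sum(item_correct[i] for i in lower)
    return (u - l) / group_size


# Hypothetical data: ten testtakers with total scores 1..10.
totals = list(range(1, 11))
item = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]
print(discrimination_index(item, totals))
```

Here the groups hold three testtakers each; all three upper scorers pass the item and one lower scorer does, so d is 2/3.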
40
Q

The quality of each alternative within a multiple-choice item can be readily assessed with reference to the comparative performance of upper and lower scorers

A

Analysis of item alternatives (the test developer can get an idea of the effectiveness of a distractor by means of a simple eyeball test)

41
Q

Graphic representation of item difficulty and item discrimination

A

Item characteristic curve (the steeper the slope, the greater the item discrimination)
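
The link between slope and discrimination can be seen numerically. The card does not name a curve family, so the sketch below assumes the two-parameter logistic form, one common choice, with a as the slope (discrimination) and b as the location (difficulty); the function name is hypothetical.

```python
from math import exp


def icc_2pl(theta, a, b):
    """Probability of a correct response at ability theta (assumed 2PL form)."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))


# Compare how much the probability rises across the same ability interval
# for a steep item (a = 2.0) and a flat item (a = 0.5), both with b = 0.
steep_rise = icc_2pl(0.5, 2.0, 0.0) - icc_2pl(-0.5, 2.0, 0.0)
flat_rise = icc_2pl(0.5, 0.5, 0.0) - icc_2pl(-0.5, 0.5, 0.0)
print(steep_rise > flat_rise)  # True
```

The steeper item separates testtakers just below and just above b more sharply, which is what greater discrimination means.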

42
Q

Test developer addresses the problem of guessing by including in the test manual…

A
  • explicit instructions regarding this point for the examiner to convey to the examinees (e.g., instructing examinees to answer only if certain)
  • specific instructions for scoring and interpreting omitted items
43
Q

Can be used to identify biased items

A

item characteristic curves

44
Q

Different shapes of item-characteristic curves for different groups when 2 groups do not differ in total test score

A

Differential item functioning

45
Q

Rely primarily on verbal rather than mathematical procedures to explore how individual test items work

A

Qualitative item analysis (through group discussion, interviews)

46
Q

Approach to cognitive assessment that entails having respondents verbalize thoughts as they occur

A

think aloud test administration (one-on-one basis)

47
Q

Conducted during the test development process in which items are examined for fairness to all prospective testtakers and for the presence of offensive language, stereotypes or situations

A

Sensitivity review

48
Q

last step in test development

A

test revision

49
Q

Test revision in the life cycle of an existing test

A

APA suggests that an existing test be kept in its present form as long as it remains useful, but that it should be revised when significant changes in the domain represented, or new conditions of test use and interpretation, make the test inappropriate for its intended use

50
Q

Revalidation of a test on a sample of testtakers other than those on whom test performance was originally found to be a valid predictor of some criterion

A

cross validation (key step in test development)

51
Q

Decrease in item validities that inevitably occurs after cross-validation of findings

A

Validity shrinkage (is expected and integral to test development process)

52
Q

Test validation conducted on 2 or more tests using the same sample of testtakers

A

co-validation (also referred to as co-norming)

53
Q

Examiners undergo training in test administration using the test manual

A

Quality assurance

54
Q

A test protocol scored by a highly authoritative scorer that is designed as a model for scoring and a mechanism for resolving scoring discrepancies; ensures consistency in scoring

A

anchor protocol

55
Q

A discrepancy between scoring in an anchor protocol and the scoring of another protocol

A

scoring drift

56
Q

Evaluate how well an individual item is working to measure different levels of the underlying construct

A

IRT information curves

57
Q

Item functions differently in one group of testtakers as compared to another group of testtakers known to have the same level of the underlying trait (by culture, gender, age)

A

Differential item functioning (DIF)

58
Q

Test developers scrutinize group-by-group item response curves looking for DIF items

A

DIF analysis

59
Q

Items that respondents from different groups at the same level of the underlying trait have different probabilities of endorsing as a function of their group membership

A

DIF items

60
Q

An advantage of the response format of the test

A

Great breadth (cover many topics)