Test Development Flashcards

(70 cards)

1
Q

PROCESS OF TEST DEVELOPMENT

A

● TEST CONCEPTUALIZATION
● TEST CONSTRUCTION
● TEST TRYOUT
● ITEM ANALYSIS
● TEST REVISION

2
Q

refers to the preliminary research surrounding the creation of a prototype of the test

A

PILOT WORK

3
Q

the process by which a measuring device is designed and calibrated and by which numbers (scale values) are assigned to different amounts of the trait, attribute, or characteristic being measured.

A

SCALING

4
Q

a grouping of words, statements, or symbols on which judgements of the strength of a particular trait, attitude, or emotion are indicated by the testtaker

A

Rating scale

5
Q

a type of rating scale wherein the final test score is obtained by summing the ratings across all items (e.g., Likert scale)

A

SUMMATIVE SCALE
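
The summing described on this card can be sketched in a few lines of Python. This is only an illustration: the function name `summative_score` and the sample ratings are assumptions, not part of the original card.

```python
def summative_score(ratings):
    """Return the final test score: the sum of the ratings across all items."""
    return sum(ratings)


# Hypothetical responses to five items, each rated on a 1-5 continuum
responses = [4, 5, 3, 4, 2]
total = summative_score(responses)
print(total)
```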

6
Q

Developed by Rensis Likert, it is a type of summative rating scale in which each item presents the testtaker with 5 alternative responses (sometimes 7), usually on an agree-disagree or approve-disapprove continuum.

A

LIKERT SCALE

7
Q

What is this scale?

Select the behavior that you think best describes you:
a. I enjoy spending time with others
b. I enjoy spending time alone

A

METHOD OF PAIRED COMPARISON

8
Q

a type of comparative scaling wherein the respondents are presented with several items simultaneously and asked to rank them in order of priority

A

RANK ORDER SCALE

9
Q

What is this type of scale?

CHARACTERISTICS

( ) friendly
( ) jolly
( ) reserved
( ) withdrawn
( ) shy
( ) cheerful
( ) uneasy
( ) hospitable
( ) talkative
( ) different

A

CHECKLIST

10
Q

stimuli are placed into one of two or more alternative categories that differ quantitatively with respect to some continuum
● sequence of numbers that identifies items as belonging to mutually exclusive categories.

A

CATEGORICAL SCALING

11
Q

combination of the checklist format and the category format; the subject is given statements and asked to sort them into 9 piles
● statements that are least descriptive of the person are placed on Pile 1 while those that are most descriptive are placed on Pile 9

A

Q SORT

12
Q

items on this scale range sequentially from weaker to stronger expressions of the attitude, belief, or feeling being measured.

A

GUTTMAN SCALE

13
Q

reservoir from which items will or will not be drawn for the final version of the test

A

ITEM POOL

14
Q

refers to the form, plan, structure, arrangement and layout of individual test items.

A

ITEM FORMAT

15
Q

TYPES OF ITEM FORMAT

A

SELECTED-RESPONSE FORMAT

CONSTRUCTED-RESPONSE FORMAT

16
Q

require testtakers to select a response from a set of alternative responses

A

SELECTED-RESPONSE FORMAT

17
Q

CONSTRUCTED-RESPONSE FORMAT

A

require testtakers to supply or to create the correct answer, not merely to select it

18
Q

presented with two columns: premises (left) and responses (right)
● task is to determine which response is best associated with which premise

A

MATCHING ITEM

19
Q

a multiple choice item that contains only two possible responses

A

BINARY CHOICE ITEM

20
Q

usually takes the form of a sentence that requires the testtaker to indicate whether the statement is or is not a fact

A

True-False item

21
Q

requires the examinee to provide a word or phrase that completes a sentence

A

COMPLETION ITEM

22
Q

A test item wherein the testtaker responds to the question by writing a composition which demonstrates recall of facts, understanding, analysis, and/or interpretation

A

ESSAY ITEM

23
Q

should be written clearly enough so that the testtaker can respond with a short answer

A

SHORT ANSWER ITEM

24
Q

the ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items

A

ITEM BRANCHING

25
also referred to as category scoring, wherein testtaker responses earn credit toward placement in a particular category with other testtakers whose pattern of responses is presumably similar in some way
CLASS SCORING
26
a descriptor used in psychology to indicate a specific type of measure in which respondents compare two or more desirable options and pick the one that is most preferred (sometimes called a “forced choice” scale)
IPSATIVE SCORING
27
Comparing a testtaker’s score on one scale within a test to another scale within the same test
IPSATIVE SCORING
28
having created a pool of items from which the final version of the test will be developed, the test developer will try out the test.
TEST TRYOUT
29
It serves as a prototype of the test.
TEST TRYOUT
30
should be executed under conditions as identical as possible to the conditions under which the standardized test will be administered; all instructions and everything from time limits allotted for completing the test to the atmosphere at the test site, should be as similar as possible.
TEST TRYOUT
31
a set of methods used to evaluate test items in order to come up with a cluster of valid and reliable test items
ITEM ANALYSIS
32
METHODS IN ITEM ANALYSIS
● ITEM-DIFFICULTY INDEX
● ITEM-DISCRIMINATION INDEX
● ITEM-RELIABILITY INDEX
● ITEM-VALIDITY INDEX
33
an index of an item’s difficulty is obtained by calculating the proportion of the total number of testtakers who answered the item correctly
ITEM DIFFICULTY INDEX
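
As a rough illustration of the proportion on this card, here is a minimal Python sketch. The function name `item_difficulty` and the scored responses are hypothetical; responses are assumed to be already scored as 1 (correct) or 0 (incorrect).

```python
def item_difficulty(item_responses):
    """Proportion of all testtakers who answered the item correctly.

    item_responses: one entry per testtaker, truthy if the answer was correct.
    """
    if not item_responses:
        raise ValueError("at least one testtaker is required")
    n_correct = sum(1 for r in item_responses if r)
    return n_correct / len(item_responses)


# Hypothetical data: 7 of 10 testtakers answered the item correctly
scored = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
p = item_difficulty(scored)
print(p)
```

A higher value means more testtakers answered correctly, so the item is easier.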
34
an item that might be inserted near the beginning of an achievement test to spur motivation and a positive testtaking attitude and to lessen test-related anxiety.
GIVEAWAY ITEM
35
for maximum discrimination among testtakers, the optimal average item difficulty is approximately 0.5, with individual items ranging in difficulty from about 0.3 to 0.8
36
ITEM-DIFFICULTY INDEX (INTERPRETATION) 0.86 and above
VERY EASY
37
ITEM-DIFFICULTY INDEX (INTERPRETATION) 0.71 - 0.85
EASY
38
ITEM-DIFFICULTY INDEX (INTERPRETATION) 0.40 - 0.70
DESIRABLE ITEM
39
ITEM-DIFFICULTY INDEX (INTERPRETATION) 0.15 - 0.39
DIFFICULT ITEM
40
ITEM-DIFFICULTY INDEX (INTERPRETATION) 0.14 and below
VERY DIFFICULT
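
The five interpretation bands for the item-difficulty index can be collected into one lookup. The sketch below is only a study aid: the function name `interpret_difficulty` is hypothetical, and rounding p to two decimals before comparing is an assumption made so that every value falls into one of the bands.

```python
def interpret_difficulty(p):
    """Map an item-difficulty index (0.0 to 1.0) to the labels on the cards."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    p = round(p, 2)  # assumption: the bands are stated to two decimals
    if p >= 0.86:
        return "VERY EASY"
    if p >= 0.71:
        return "EASY"
    if p >= 0.40:
        return "DESIRABLE ITEM"
    if p >= 0.15:
        return "DIFFICULT ITEM"
    return "VERY DIFFICULT"


for value in (0.90, 0.75, 0.55, 0.20, 0.10):
    print(value, interpret_difficulty(value))
```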
41
indicates how adequately an item separates or discriminates between high scorers and low scorers on an entire test.
ITEM DISCRIMINATION INDEX
42
symbolized by a lowercase italic “d” (d)
ITEM DISCRIMINATION INDEX
43
it compares people who have done well with those who have done poorly on a test.
● difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly
● Upper group = High Scorers
● Lower group = Low Scorers
EXTREME GROUP METHOD
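
A minimal Python sketch of the extreme group comparison, assuming the upper and lower groups have already been formed from total test scores and that responses are scored 1 (correct) or 0 (incorrect). It computes d as the proportion of the upper group answering the item correctly minus the proportion of the lower group answering it correctly. The function name and sample data are hypothetical.

```python
def item_discrimination(upper_group, lower_group):
    """d = proportion correct in the upper group minus proportion correct in the lower group."""
    if not upper_group or not lower_group:
        raise ValueError("both groups need at least one testtaker")
    p_upper = sum(1 for r in upper_group if r) / len(upper_group)
    p_lower = sum(1 for r in lower_group if r) / len(lower_group)
    return p_upper - p_lower


# Hypothetical scored responses to one item
upper = [1, 1, 1, 1, 0]  # high scorers: 4 of 5 correct
lower = [1, 0, 0, 1, 0]  # low scorers: 2 of 5 correct
d = item_discrimination(upper, lower)
print(round(d, 2))
```

A negative d would mean that low scorers answered the item correctly more often than high scorers did.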
44
ITEM DISCRIMINATION INDEX (INTERPRETATION) 0.40 and above
VERY GOOD ITEM
45
ITEM DISCRIMINATION INDEX (INTERPRETATION) 0.30-0.39
GOOD ITEM
46
ITEM DISCRIMINATION INDEX (INTERPRETATION) 0.20-0.29
MARGINAL
47
ITEM DISCRIMINATION INDEX (INTERPRETATION) 0.10-0.19
POOR ITEM
48
ITEM DISCRIMINATION INDEX (INTERPRETATION) 0 and below
Discard
49
0 and below
Discard
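
The interpretation bands for d can be sketched as a lookup in the same way. Note that the cards do not say how to label values between 0 and 0.10, so this hypothetical function returns None there rather than guessing. Rounding d to two decimals is also an assumption.

```python
def interpret_discrimination(d):
    """Map an item-discrimination index to the labels on the cards."""
    d = round(d, 2)  # assumption: the bands are stated to two decimals
    if d >= 0.40:
        return "VERY GOOD ITEM"
    if d >= 0.30:
        return "GOOD ITEM"
    if d >= 0.20:
        return "MARGINAL"
    if d >= 0.10:
        return "POOR ITEM"
    if d <= 0:
        return "DISCARD"
    return None  # between 0 and 0.10: not covered by the cards


for value in (0.45, 0.33, 0.25, 0.12, 0.0, -0.20):
    print(value, interpret_discrimination(value))
```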
50
p = Very Easy/ Easy/Very Difficult d = Discarded/Poor/Marginal
REJECT
51
p = Easy d= Good/ Very Good
Revise
52
p = Desirable d = Discarded/ Poor
REJECT
53
p = Desirable d = Marginal
REVISE
54
p = Desirable d = Good/ Very Good
ACCEPT
55
p = Difficult d = Marginal
REVISE
56
p = Difficult d = Good/ Very Good
ACCEPT
57
p = Very Difficult d = Good/ Very Good
REVISE
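
The accept/revise/reject cards can be gathered into a single decision lookup. The sketch below is illustrative only: the function name `item_decision` and the shortened labels ("DESIRABLE", "DISCARD", and so on) are assumptions, and combinations that the cards do not list return None.

```python
WEAK = {"DISCARD", "POOR", "MARGINAL"}
STRONG = {"GOOD", "VERY GOOD"}


def item_decision(p_label, d_label):
    """Combine the difficulty (p) and discrimination (d) labels into a decision."""
    p_label, d_label = p_label.upper(), d_label.upper()
    if p_label in {"VERY EASY", "EASY", "VERY DIFFICULT"} and d_label in WEAK:
        return "REJECT"
    if p_label in {"EASY", "VERY DIFFICULT"} and d_label in STRONG:
        return "REVISE"
    if p_label == "DESIRABLE":
        if d_label in {"DISCARD", "POOR"}:
            return "REJECT"
        if d_label == "MARGINAL":
            return "REVISE"
        if d_label in STRONG:
            return "ACCEPT"
    if p_label == "DIFFICULT":
        if d_label == "MARGINAL":
            return "REVISE"
        if d_label in STRONG:
            return "ACCEPT"
    return None  # combination not listed on the cards


print(item_decision("Desirable", "Very Good"))
print(item_decision("Easy", "Poor"))
```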
58
assessing the quality of each alternative within a multiple choice item by comparing the performance of upper and lower scorers
ANALYSIS OF ITEM ALTERNATIVES
59
by charting the numbers of testtakers in the U and L groups who chose each alternative, the test developer can get an idea of the effectiveness of a distractor by means of a simple _________
eyeball test
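
The charting step described here can be sketched with a simple tally. This is only an illustration: the function name `chart_alternatives`, the alternatives "a" to "d", and the choices listed are all made up for the example.

```python
from collections import Counter


def chart_alternatives(upper_choices, lower_choices, alternatives):
    """Count how many testtakers in the U and L groups chose each alternative."""
    upper = Counter(upper_choices)
    lower = Counter(lower_choices)
    return {alt: (upper[alt], lower[alt]) for alt in alternatives}


# Hypothetical choices on one multiple-choice item
upper_choices = ["a", "a", "b", "a", "c", "a"]
lower_choices = ["b", "c", "a", "b", "d", "b"]
chart = chart_alternatives(upper_choices, lower_choices, "abcd")

for alt, (u, l) in chart.items():
    print(f"{alt}: U={u} L={l}")
```

Reading the two counts side by side for each alternative is the comparison the card refers to.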
61
a graphic representation of item difficulty and discrimination
ITEM CHARACTERISTIC CURVE (ICC)
62
techniques of data generation and analysis that rely primarily on verbal rather than mathematical or statistical procedures
● Various nonstatistical procedures designed to explore how individual test items work
QUALITATIVE ITEM ANALYSIS
63
a qualitative research tool designed to shed light on the testtaker’s thought processes during the administration of a test
THINK ALOUD ADMINISTRATION
64
on a one-on-one basis, the examinee will be asked to take a test, thinking aloud as they respond to each item
THINK ALOUD ADMINISTRATION
65
study of the test items, typically conducted during the test development process
● Items are examined for fairness to all prospective testtakers and for the presence of offensive language, stereotypes, or situations.
SENSITIVITY REVIEW
66
Once a test is made available, subsequently, it undergoes refinement.
TEST REVISION
67
A type of revision is the development of ________ of the original tests
short forms
68
For a Test that has a low internal consistency, you can try ________
Factor Analysis
69
is helpful for increasing the reliability of a multivariate/heterogeneous test by identifying the underlying factors
FACTOR ANALYSIS
70
____________ requires weighing each item’s content validity, item-difficulty and -discrimination, inter-item correlation, and bias
Choosing the final items