Chapter 6: Writing and Evaluating Test Items Flashcards

1
Q

What are the stages of test development?

A

Conceptualization, Construction, Test Tryout, Item Analysis, Test Revision

2
Q

Define incremental validity

A

incremental validity: the extent to which a proposed test provides unique information about a construct relative to that which is offered by existing tests of the same construct (kidney test example)

3
Q

Advantages and Disadvantages of Polytomous Format

A

Advantages
Easy to administer
Probability of correct answer is lower than true/false (have to know more)

Disadvantages
Hard to write good distractors
Still based on recognition
Reliability: requires excellent distractors

4
Q

Advantages and Disadvantages of Dichotomous Format

A

Advantages
Simple
Easy to administer
Flexible
Absolute

Disadvantages
Encourage memorization
Truth is sometimes gray
50% probability of guessing correctly
Reliability: requires many items

5
Q

Define dichotomous and polytomous format. Common examples?

A

Dichotomous format: closed-ended items that offer only two possible answers (for example, true/false or yes/no).

Polytomous format: items that offer more than two response options (for example, multiple-choice). Items scored in multiple ordered categories are called polytomously scored items.

6
Q

Which types of questions are “selected-response format”?

A

Items where the test-taker picks from the answers provided: dichotomous (for example, true/false) or polytomous (multiple-choice)

7
Q

What are the two major formats of summative scales, as given in lecture? What type of data do they create?

A

Likert (ordinal data) and Category (On a scale of 1 to 10… that creates interval data)

8
Q

In creating a category format, the use of what will reduce error variance?

A

Anchors (what are the endpoints, the low and the high)

9
Q

When does the category format begin to reduce reliability?

A

When fewer than four categories are used

10
Q

What are the four questions that should be asked when generating a pool of candidate test items?

A

Cumulative scoring: item scores are summed into one total

Subscale scoring: the total test score is divided into groups of items that are summed individually (think ACT)

Class or category scoring: pass or fail, you have it or you don’t

Ipsative scoring: forced choice (you have two choices to pick from)

11
Q

Define item analysis. What two methods are closely associated with item analysis?

A

Item analysis: a process that examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole

Item difficulty: the higher the score on the difficulty index, the easier the item

Item discriminability: how well an item discriminates between high and low scorers

12
Q

Define item difficulty. What does the proportion of people getting the item correct indicate?

A

A form of item analysis used to assess how difficult items are

The proportion of people getting the item correct is the difficulty index: the higher the index, the easier the question is
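
The difficulty index can be sketched in a few lines of Python. This is only an illustration: the function name `item_difficulty` and the response data are hypothetical, and responses are assumed to be scored 1 (correct) or 0 (incorrect).

```python
def item_difficulty(responses):
    """Proportion of test-takers who answered the item correctly.

    `responses` is a list of 1 (correct) and 0 (incorrect) values.
    """
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


# Hypothetical scored responses from ten test-takers (made-up data).
easy_item = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
hard_item = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]

p_easy = item_difficulty(easy_item)  # 9 of 10 correct
p_hard = item_difficulty(hard_item)  # 2 of 10 correct
```

Here the easy item has the higher index (0.9 versus 0.2), which matches the rule that a higher difficulty index means an easier question.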

13
Q

Define item discriminability. What is good discrimination? What are two ways to test item discriminability?

A

How well an item performs in relation to some criterion. Good discrimination means that the item separates high scorers from low scorers.

Extreme group method and point biserial method

14
Q

Will guessing help you on an exam?

A

Depends on how the test is graded. If it is corrected for guessing, then it will do you no good
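
One common correction for guessing (an assumption here, since the card does not say which correction is meant) subtracts a fraction of the wrong answers from the right answers: corrected = R - W / (n - 1), where n is the number of answer choices per item. A minimal sketch, with the hypothetical function name `corrected_score`:

```python
def corrected_score(num_right, num_wrong, num_choices):
    """Score corrected for guessing: R - W / (n - 1).

    Omitted items count as neither right nor wrong.
    """
    if num_choices < 2:
        raise ValueError("an item needs at least two answer choices")
    return num_right - num_wrong / (num_choices - 1)


# Someone who guesses blindly on 100 four-choice items expects
# about 25 right and 75 wrong (made-up illustration).
blind_guesser = corrected_score(25, 75, 4)

# The same idea for 100 true/false items: about 50 right, 50 wrong.
blind_true_false = corrected_score(50, 50, 2)
```

In both cases the corrected score is 0, which is why blind guessing does no good under this kind of scoring.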

15
Q

Know and be able to identify examples of a double-barreled item.

A

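A double-barreled item asks about two things at once but allows only one answer (for example, “I enjoy reading and writing”). You can do this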

16
Q

Define item characteristic curve. Know what information the X and Y axes give as well as slope

A

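A graph of the proportion of test-takers answering an item correctly at each ability level. It shows whether the item discriminates between the top and bottom quartiles.

x = ability
y = proportion of test-takers getting the item correct
slope = discrimination (a steeper positive slope means the item discriminates better)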
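
The points of such a curve can be approximated from raw data. The sketch below is an illustration only: it assumes total test score stands in for ability, the function name `empirical_icc` is hypothetical, and the data are made up.

```python
def empirical_icc(item_scores, total_scores, num_groups=4):
    """Proportion correct on one item within each ability group.

    Test-takers are sorted by total score (used here as a stand-in for
    ability) and split into `num_groups` groups, lowest ability first.
    """
    n = len(item_scores)
    if n != len(total_scores):
        raise ValueError("item_scores and total_scores must match in length")
    size = n // num_groups
    if size == 0:
        raise ValueError("not enough test-takers for that many groups")
    order = sorted(range(n), key=lambda i: total_scores[i])
    curve = []
    for g in range(num_groups):
        start = g * size
        # The last group absorbs any leftover test-takers.
        end = n if g == num_groups - 1 else start + size
        group = order[start:end]
        curve.append(sum(item_scores[i] for i in group) / len(group))
    return curve


# Hypothetical data: eight test-takers, one item scored 0/1.
totals = [3, 4, 5, 6, 7, 8, 9, 10]
item = [0, 0, 0, 1, 0, 1, 1, 1]

curve = empirical_icc(item, totals)  # y values, lowest ability group first
```

For this made-up item the proportions are 0.0, 0.5, 0.5, and 1.0 from the lowest to the highest group, so the curve rises with ability.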

17
Q

When shown an item characteristic curve, be able to determine good or poor discrimination

A

Positive slope (good discrimination) and negative slope (bad discrimination)

18
Q

What is systematic error variance called? Is it good or bad and why?

A

Statistical bias. It is bad because it pushes scores consistently in one direction, so the error does not average out

19
Q

Know ceiling effects, floor effects, and indiscriminate items.

A

Ceiling effect: everyone gets it right
Floor effect: everyone gets it wrong
Indiscriminate: does not discriminate between bad and good test-takers

20
Q

Ecological validity (definition and example)

A

a measure of how test performance predicts behaviors in real-world settings (driving with a steering wheel vs a mouse)

21
Q

Extreme group method

A

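Calculate the proportion of people in each group (top and bottom quartiles or thirds) who got the item correct.

Calculate the difference between the two groups. The higher the discrimination index, the better the item separates high scorers from low scorers.

D(i) = P(t) - P(b), where P(t) is the proportion correct in the top group and P(b) is the proportion correct in the bottom group
High (near .3) - good discrimination
Close to 0 - no discrimination
Negative - reverse discrimination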
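
The calculation D(i) = P(t) - P(b) can be sketched as follows. The function name `discrimination_index`, the default of top and bottom quartiles, and the data are all assumptions for illustration. Ties in total score are broken by list order.

```python
def discrimination_index(item_scores, total_scores, fraction=0.25):
    """Extreme group method: D = P(top) - P(bottom).

    `fraction` is the share of test-takers placed in each extreme group
    (0.25 for quartiles, about 0.33 for thirds).
    """
    n = len(total_scores)
    if n != len(item_scores):
        raise ValueError("item_scores and total_scores must match in length")
    k = max(1, int(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom, top = order[:k], order[-k:]
    p_top = sum(item_scores[i] for i in top) / k
    p_bottom = sum(item_scores[i] for i in bottom) / k
    return p_top - p_bottom


# Hypothetical data: eight test-takers, two items scored 0/1.
totals = [10, 9, 8, 7, 6, 5, 4, 3]
good_item = [1, 1, 1, 0, 1, 0, 0, 0]      # high scorers get it right
reversed_item = [0, 0, 1, 1, 0, 1, 1, 1]  # low scorers get it right

d_good = discrimination_index(good_item, totals)
d_reversed = discrimination_index(reversed_item, totals)
```

With this made-up data the first item gives D = 1.0 (both top scorers correct, both bottom scorers wrong) and the second gives D = -1.0, an example of reverse discrimination.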

22
Q

Point Biserial Method

A

Correlation between item performance and total test performance
Needs lots of items
Closer to 1 = better item
Negative or low = poor item
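
A point biserial correlation is the ordinary Pearson correlation computed with one variable scored 0/1. The sketch below assumes that definition. The function name `point_biserial` and the data are hypothetical, and the item is left in the total score for simplicity.

```python
from math import sqrt


def point_biserial(item_scores, total_scores):
    """Pearson correlation between a 0/1 item and the total test score."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 entries")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_scores)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    if var_x == 0 or var_y == 0:
        # Everyone got the item right (or wrong), or all totals are equal.
        raise ValueError("correlation is undefined without variance")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: eight test-takers, two items scored 0/1.
totals = [10, 9, 8, 7, 6, 5, 4, 3]
good_item = [1, 1, 1, 1, 0, 0, 0, 0]  # passed only by the high scorers
poor_item = [0, 0, 0, 0, 1, 1, 1, 1]  # passed only by the low scorers

r_good = point_biserial(good_item, totals)
r_poor = point_biserial(poor_item, totals)
```

For this made-up data `r_good` is about 0.87 and `r_poor` is about -0.87, so the first item would count as the better one.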