Chapter 6: Writing and Evaluating Test Items Flashcards

Question 1

Q

What are the stages of test development?

Answer

A

Conceptualization, Construction, Test Tryout, Item Analysis, Test Revision

Question 2

Q

Define incremental validity

Answer

A

incremental validity: the extent to which a proposed test provides unique information about a construct relative to that which is offered by existing tests of the same construct (kidney test example)

Question 3

Q

Advantages and Disadvantages of Polytomous Format

Answer

A

Advantages
Easy to administer
Probability of guessing the correct answer is lower than with true/false (have to know more)

Disadvantages
Hard to write good distractors
Still based on recognition
Reliability - requires excellent distractors

Question 4

Q

Advantages and Disadvantages of Dichotomous Format

Answer

A

Advantages
Simple
Easy to administer
Flexible
Absolute

Disadvantages
Encourage memorization
Truth is sometimes gray
50% probability of guessing correctly
Reliability: requires many items

Question 5

Q

Define dichotomous and polytomous format. Common examples?

Answer

A

Dichotomous format: closed-ended items that offer only two possible answers (e.g., true/false, yes/no)

Polytomous format: items that offer more than two alternatives (e.g., multiple choice). Items scored in multiple ordered categories are referred to as polytomously scored items.

Question 6

Q

Which types of questions are “selected-response format”?

Answer

A

Items where the test taker selects from given alternatives, whether dichotomous (e.g., true/false) or polytomous (e.g., multiple choice)

Question 7

Q

What are the two major formats of summative scales, as given in lecture? What type of data do they create?

Answer

A

Likert (ordinal data, often treated as interval) and Category ("On a scale of 1 to 10…", which creates interval data)

Question 8

Q

In creating a category format, the use of what will reduce error variance?

Answer

A

Anchors (what are the endpoints, the low and the high)

Question 9

Q

When does the category format begin to reduce reliability?

Answer

A

When fewer than four categories are used

Question 10

Q

What are the four questions that should be asked when generating a pool of candidate test items?

Answer

A

Cumulative scoring: summing the item scores up

Subscale scoring: the total test score is divided into groups of items that are summed individually (think ACT)

Class or category scoring: pass or fail, you have it or you don’t

Ipsative scoring: forced choice (you have two choices to pick from)
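
A minimal sketch of the first two scoring models on this card, cumulative and subscale scoring. The item names, scores, and subscale groupings below are made up for illustration only.

```python
# Hypothetical item scores for one test taker (1 = correct, 0 = incorrect).
item_scores = {"q1": 1, "q2": 0, "q3": 1, "q4": 1, "q5": 0, "q6": 1}

# Hypothetical grouping of items into subscales (think ACT sections).
subscales = {"math": ["q1", "q2", "q3"], "reading": ["q4", "q5", "q6"]}

# Cumulative scoring: sum every item score.
cumulative = sum(item_scores.values())

# Subscale scoring: sum the items within each group separately.
subscale_scores = {
    name: sum(item_scores[q] for q in items)
    for name, items in subscales.items()
}

print(cumulative)        # 4
print(subscale_scores)   # {'math': 2, 'reading': 2}
```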

Question 11

Q

Define item analysis. What two methods are closely associated with item analysis?

Answer

A

A process that examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole

Item difficulty: the higher the score on the difficulty index, the easier the item

Item discriminability: how well the item discriminates between high and low scorers

Question 12

Q

Define item difficulty. What does the proportion of people getting the item correct indicate?

Answer

A

A form of item analysis used to assess how difficult items are

The proportion of people getting the item correct is the difficulty index; the higher the index, the easier the question is
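
A minimal sketch of the difficulty index as the proportion of test takers who answer an item correctly. The response data are invented for illustration.

```python
def item_difficulty(responses):
    """Proportion of test takers answering correctly (1 = correct, 0 = incorrect)."""
    return sum(responses) / len(responses)

easy_item = [1, 1, 1, 1, 0, 1, 1, 1, 1, 1]  # 9 of 10 correct
hard_item = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]  # 2 of 10 correct

p_easy = item_difficulty(easy_item)
p_hard = item_difficulty(hard_item)
print(p_easy, p_hard)  # 0.9 0.2
```

The easy item has the higher index, which is why a high "difficulty" score means an easy question.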

Question 13

Q

Define item discriminability. What is good discrimination? What are two ways to test item discriminability?

Answer

A

How well an item performs in relation to some criterion. Good (high) discriminability means the item separates high scorers from low scorers: people who do well on the test overall tend to get the item right.

Extreme group method and point biserial method

Question 14

Q

Will guessing help you on an exam?

Answer

A

Depends on how the test is graded. If it is corrected for guessing, then it will do you no good
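
A sketch of why random guessing gains nothing under a correction for guessing. It assumes the common correction formula, corrected = right - wrong / (choices - 1); the card does not say which formula the course uses.

```python
def corrected_score(right, wrong, n_choices):
    """Correction for guessing (assumed formula): R - W / (k - 1)."""
    return right - wrong / (n_choices - 1)

# Pure guessing on 100 four-choice items: on average 25 right, 75 wrong.
expected_from_guessing = corrected_score(right=25, wrong=75, n_choices=4)
print(expected_from_guessing)  # 0.0
```

Without the correction, the same guesser would expect 25 points, so guessing only helps when the test is scored by number right.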

Question 15

Q

Know and be able to identify examples of a double-barreled item.

Answer

A

You can do this

Question 16

Q

Define item characteristic curve. Know what information the X and Y axes give as well as slope

Answer

A

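A graph that shows whether the test item discriminates between test takers of different ability (e.g., the top and bottom quartiles)

x = ability (e.g., total test score)
y = proportion of test-takers getting the item correct
slope = discrimination (the steeper the positive slope, the better the item discriminates)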
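
A rough sketch of an empirical item characteristic curve: group test takers by ability level (x) and compute the proportion correct in each group (y). The ability levels and responses are invented, and real curves are usually built from total scores or a fitted model.

```python
from collections import defaultdict

def empirical_icc(ability_levels, item_correct):
    """Return {ability level: proportion correct}, ordered by ability level."""
    counts = defaultdict(lambda: [0, 0])  # level -> [number correct, number of people]
    for level, correct in zip(ability_levels, item_correct):
        counts[level][0] += correct
        counts[level][1] += 1
    return {level: c / n for level, (c, n) in sorted(counts.items())}

levels  = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]   # 1 = low ability, 3 = high
correct = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]   # 1 = item answered correctly

curve = empirical_icc(levels, correct)
print(curve)  # {1: 0.25, 2: 0.5, 3: 1.0}
```

The proportion correct rises with ability here, so the curve has a positive slope.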

Question 17

Q

When shown an item characteristic curve, be able to determine good or poor discrimination

Answer

A

Positive slope (good discrimination) and flat or negative slope (poor discrimination)

Question 18

Q

What is systematic error variance called? Is it good or bad and why?

Answer

A

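Statistical bias. It is bad because it pushes scores consistently in one direction, so it does not cancel out the way random error does.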

Question 19

Q

Know ceiling effects, floor effects, and indiscriminate items.

Answer

A

Ceiling effect: everyone gets it right
Floor effect: everyone gets it wrong
Indiscriminate: does not discriminate between bad and good test-takers

Question 20

Q

Ecological validity (definition and example)

Answer

A

a measure of how test performance predicts behaviors in real-world settings (driving with a steering wheel vs a mouse)

Question 21

Q

Extreme group method

Answer

A

Calculate the proportion of people in each group (top and bottom quartiles or thirds of total scores) who get the item correct

Calculate the difference between the two groups. The larger the difference, the more the item discriminates.

High (near .3 or above) - good discrimination
Close to 0 - no discrimination
Negative - reverse discrimination
D(i) = P(t) - P(b), where P(t) and P(b) are the proportions correct in the top and bottom groups
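
A minimal sketch of D(i) = P(t) - P(b) using top and bottom quartiles. The scores are invented, and the `fraction` parameter is a hypothetical knob for switching between quartiles and thirds.

```python
def discrimination_index(total_scores, item_correct, fraction=0.25):
    """D = proportion correct in the top group minus proportion correct in the bottom group."""
    n = len(total_scores)
    k = max(1, int(n * fraction))            # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    bottom, top = order[:k], order[-k:]      # lowest and highest total scorers
    p_top = sum(item_correct[i] for i in top) / k
    p_bottom = sum(item_correct[i] for i in bottom) / k
    return p_top - p_bottom

totals = [10, 12, 15, 18, 20, 22, 25, 28]    # total test scores
item   = [0, 1, 0, 1, 0, 1, 1, 1]            # 1 = item answered correctly

d = discrimination_index(totals, item)
print(d)  # 0.5
```

Both top scorers got the item right and one of the two bottom scorers did, so D = 1.0 - 0.5.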

Question 22

Q

Point Biserial Method

Answer

A

Correlation between item performance and total test performance
Needs lots of items
Closer to 1 = better item
Negative or low = poor item
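
A sketch of the point biserial as a plain Pearson correlation between a 0/1 item score and the total test score. The data are invented, and the total here includes the item itself, which is one reason the method works better with many items.

```python
from math import sqrt

def point_biserial(item_correct, total_scores):
    """Pearson correlation between a dichotomous (0/1) item and total scores."""
    n = len(item_correct)
    mean_x = sum(item_correct) / n
    mean_y = sum(total_scores) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(item_correct, total_scores))
    var_x = sum((x - mean_x) ** 2 for x in item_correct)
    var_y = sum((y - mean_y) ** 2 for y in total_scores)
    return cov / sqrt(var_x * var_y)

totals    = [10, 12, 15, 18, 20, 22, 25, 28]
good_item = [0, 0, 0, 1, 0, 1, 1, 1]   # mostly high scorers get it right
poor_item = [1, 1, 1, 0, 1, 0, 0, 0]   # mostly low scorers get it right

r_good = point_biserial(good_item, totals)
r_poor = point_biserial(poor_item, totals)
print(round(r_good, 2), round(r_poor, 2))
```

The first item correlates positively with total score and the second negatively, matching the "closer to 1 = better item" rule on the card.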

(22 cards)