Study Guide: Test Construction Flashcards

1
Q

6 Steps of Test Construction

A
  • Define Test’s Purpose
  • Preliminary Design Issues
  • Item Preparation
  • Item Analysis
  • Standardization and Ancillary Research
  • Preparation of Final Materials and Publication
2
Q

Test Purpose

A
  • What will be measured?
  • Who is the target audience, and does the construct match the group?

3
Q

Preliminary Design Issues definition

A

Anything that introduces error. The design must strike a balance between efficiency and accuracy, and must also cover the construct's breadth and depth.

4
Q

Examples of Preliminary Design Issues

A
  • Mode of administration
  • Length: longer is more reliable (up to ~15 min)
  • Item format (T/F, multiple choice, essay)
  • Number of scores (e.g., depression is multi-faceted and may require multiple scales)
  • Training
  • Background research
5
Q

What is the most important Preliminary Design Issue?

A

Background research!

6
Q

4 parts of Item Preparation

A
  • Stimulus
  • Response
  • Conditions governing responses
  • Scoring procedures
7
Q

Stimulus

A

The question itself is the stimulus.

*We are trying to provoke a specific response, correlated with the construct, that is driven ONLY by the stimulus

8
Q

Response

A

The behavior you are looking for that is correlated with the construct

9
Q

Conditions governing responses

A

What are your rules? Is there a time limit? Are they able to ask questions?

10
Q

Scoring procedures

A

The formula or rubric used to compute final scores

*Make sure each facet of the construct is represented and weighted appropriately

11
Q

Types of Test Items

A
  • Selected-Response Items
  • Constructed-Response Items

12
Q

Selected-Response Items

A

Items where you know all possible responses in advance

*T/F, multiple choice, Likert scale, etc.

13
Q

Constructed-Response Items

A

Responses are unknown or more open-ended

*Essays, oral responses, performance assessment

14
Q

Benefits of Selected-Response Items

A
  • One clear answer
  • Scoring reliability and efficiency

15
Q

Benefits of Constructed-Response Items

A
  • No single agreed-upon answer is required
  • Behavior (Bx) can give further context
  • Goes deeper into the construct
16
Q

Item Analysis

A
  • Item Tryout
  • Statistical Analysis
  • Item Selection
17
Q

Item Tryout

A

aka Pilot Test

  • Get subjects similar to the target population (they cannot be the same people used in the actual study)
  • Try out 2–3x the items you think you will need.
18
Q

Statistical Analysis

A
  • Difficulty
  • Discrimination
  • Distractor Analysis
19
Q

Item Difficulty

A

% of subjects taking the test who answered correctly

20
Q

Difficulty formula

A

p = (# of people who answered correctly) / (total # of test takers)
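A minimal Python sketch of the difficulty formula above; the response data here are hypothetical, with 1 = correct and 0 = incorrect:

```python
# Item difficulty p for a single item, assuming 0/1-scored responses.
responses = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # hypothetical answers from 10 test takers

p = sum(responses) / len(responses)  # proportion who answered correctly
print(p)  # 0.7
```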

21
Q

What shows good variability for Difficulty?

A

A p value near .5

22
Q

Difficulty considerations

A
  • Behavioral measure
  • Characteristic of the item and the sample
  • Extreme p values restrict variability
  • More comparative than a ‘cut-off’
23
Q

Why is Difficulty a behavioral measure

A

It taps into individual differences in holding the construct

24
Q

Item Discrimination

A
  • Assumes that a single item and the test as a whole measure the same thing (comparing each item to the other items within the test)
  • Looks at how well any single item discerns who does/does not have the trait
  • You want a high rate!
25
Q

2 Indices of Discrimination

A
  • Index D
  • Discrimination Coefficients

26
Q

Index D(iscrimination) formula

A
  • Score each person and rank them by total score
  • Take the top and bottom 27%

D = (# correct in upper group − # correct in lower group) / (# of people in the larger group)
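A sketch of the Index D calculation in Python. The scores and item responses are hypothetical; here the upper and lower groups come out the same size, so the divisor is simply that shared group size:

```python
# Index D: compare item success in the top vs bottom 27% of test takers.
def index_d(item_correct, total_scores, frac=0.27):
    """item_correct[i] is 1 if person i answered the item right;
    total_scores[i] is that person's overall test score."""
    n = len(total_scores)
    k = max(1, round(n * frac))                      # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = order[:k]                                # lowest-scoring k people
    upper = order[-k:]                               # highest-scoring k people
    correct_upper = sum(item_correct[i] for i in upper)
    correct_lower = sum(item_correct[i] for i in lower)
    return (correct_upper - correct_lower) / k       # equal-sized groups here

# Hypothetical data: high scorers mostly get the item right, low scorers don't.
scores = [95, 90, 88, 80, 75, 60, 55, 50, 45, 40]
item   = [1,  1,  1,  1,  0,  1,  0,  0,  0,  0]
print(index_d(item, scores))  # 1.0 — the item discriminates perfectly here
```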

27
Q

Why do we generally focus on just the high/low 27%?

A

For normally distributed scores, 27% is the optimal trade-off (Kelley, 1939): the groups are as extreme (different) as possible while still being large enough to give stable estimates.

28
Q

Ranges of D(iscrimination)

A

.40 and up = good
.30 to .39 = okay
.20 to .29 = marginal
.19 and below = poor
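The rule-of-thumb ranges above can be written as a small helper; the function name is just an illustration:

```python
# Map a D value to the rule-of-thumb quality labels.
def d_quality(d):
    if d >= 0.40:
        return "good"
    if d >= 0.30:
        return "okay"
    if d >= 0.20:
        return "marginal"
    return "poor"

print(d_quality(0.45), d_quality(0.25))  # good marginal
```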

29
Q

Distractor Guidelines

A
  • Plausible
  • Parallel in structure and grammar
  • Keep everything short
  • Mutually exclusive
  • Alternate placement
  • Limit ‘all of the above’ / ‘none of the above’ options
30
Q

D values for a distractor

A
  • You want a low, preferably negative, D (meaning more of the low group chose it)
  • Zero: it might not be an equally plausible alternative
  • Be cautious of large positive D values as well
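A sketch of distractor analysis with hypothetical multiple-choice data: a D value is computed per option, dividing by the size of the larger group as in the Index D formula. The keyed answer should come out positive; good distractors come out low or negative:

```python
# Per-option D: how much more often the upper group chose each option.
def distractor_d(choices_upper, choices_lower, options="ABCD"):
    k = max(len(choices_upper), len(choices_lower))  # larger group size
    return {opt: (choices_upper.count(opt) - choices_lower.count(opt)) / k
            for opt in options}

upper = list("AAAAB")  # upper group's answers; "A" is keyed correct
lower = list("ABCCD")  # lower group spreads across the distractors
print(distractor_d(upper, lower))
# {'A': 0.6, 'B': 0.0, 'C': -0.4, 'D': -0.2}
```

Here "A" (the key) has a healthy positive D, "C" and "D" behave like good distractors, and "B" at zero would warrant a closer look.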
31
Q

Why do we want consistency between distractors?

A

Reducing randomness in responses helps obtain a truer measure of the construct.

32
Q

Standardization and Ancillary Research

A
  • Norming
  • Reliability Studies
  • Equating Programs
33
Q

Test Norming

A

Two steps:

  • Define target population
  • Select sample
34
Q

Sampling Methods

A

  • Probability
  • Non-Probability

35
Q

Probability Sampling Methods

A

Every member of the population has a known, non-zero chance of being selected

  • Random
  • Systematic
  • Stratified
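The three probability methods above can be sketched on a toy population; the population, strata, and sample sizes here are hypothetical:

```python
# Sketch of random, systematic, and stratified sampling on a toy population.
import random

population = list(range(100))  # 100 hypothetical members
random.seed(0)                 # for reproducibility of the sketch

# Simple random: every member has an equal chance of selection.
simple = random.sample(population, 10)

# Systematic: random start, then every k-th member.
k = len(population) // 10
start = random.randrange(k)
systematic = population[start::k]

# Stratified: sample within each stratum (here, even vs odd as toy strata).
strata = {"even": [m for m in population if m % 2 == 0],
          "odd":  [m for m in population if m % 2 == 1]}
stratified = [m for group in strata.values() for m in random.sample(group, 5)]

print(len(simple), len(systematic), len(stratified))  # 10 10 10
```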
36
Q

Non-Probability Sampling Methods

A

Selection chances are unknown, and some members of the population have a zero chance of being selected

  • Convenience Sampling
  • Judgement
  • Quota
  • Snowball
37
Q

What is more important: original conceptualization or the technical/statistical work?

A

Original concept!

38
Q

What should you be thinking about, even at the original design stage?

A

Final Score Reports!

39
Q

Does the norming group need to be large?

A

Not if it is properly selected!