Psychometrics - Test Items Flashcards

1
Q

Where does the choice of format come from?

A

Objectives and purposes of the test

2
Q

Eight tips for item writing

A
  1. Define clearly what you want to measure
  2. Generate an item pool
  3. Avoid long items
  4. Keep the reading difficulty appropriate
  5. Use clear and concise wording
  6. Mix positively and negatively worded items
  7. Keep items culturally neutral
  8. Make the content relevant to the purpose
3
Q

Five categories of item format

A
  1. Dichotomous
  2. Polytomous
  3. Likert
  4. Category
  5. Checklists + Q-sorts
4
Q

Examples of dichotomous format

A

True-False questions and Yes-No questions

5
Q

Pros and cons of Dichotomous format

A

+ Easy to administer and score
+ Participants can’t opt for neutral
- Less reliable, due to a narrower range of scores
- Encourages memorization in a test setting
- Doesn’t account for the fact that the truth is often in shades of grey, not black and white

6
Q

Example of polytomous format

A

MCQs

7
Q

Tips for writing distractors

A

Use a minimum of three distractors (so four options in total): fewer is an issue because it leaves too few options, and more is an issue because viable distractors are difficult to write. Distractors need to be as plausible as the correct answer; avoid "cute" distractors.

8
Q

Pros of polytomous format

A

+ Easy to administer and score
+ Requires absolute judgement
+ More reliable than dichotomous, because there is less chance of guessing correctly

9
Q

Why do we do correction for guessing?

A

With dichotomous and polytomous formats, it’s easy for people to guess the right answer, which doesn’t indicate good performance, just luck.

10
Q

Formula for correction for guessing

A

Corrected score = Right - (Wrong / (number of alternatives - 1))
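
The correction can be sketched in Python (function and variable names are illustrative, not from the source):

```python
def corrected_score(right: int, wrong: int, n_alternatives: int) -> float:
    """Subtract the expected number of lucky guesses from the raw score.

    Each wrong answer is evidence of guessing: with n alternatives, a
    pure guesser gets one item right for every (n - 1) items wrong, so
    wrong / (n - 1) estimates the right answers obtained by luck.
    """
    return right - wrong / (n_alternatives - 1)

# 30 right and 12 wrong on 4-option MCQs: 30 - 12/3 = 26.0
print(corrected_score(30, 12, 4))
```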

11
Q

Debate around the neutral option in Likert format

A

It allows people to opt for an indecisive middle option instead of committing to an answer.

12
Q

When do we use Likert?

A

When measuring the indicator of the degree of agreement, used in things like attitude or personality tests.

13
Q

Problems with category format

A
  • Tendency to spread responses across all categories
  • Susceptible to the context
  • Element of randomness
14
Q

When do we use the category format?

A
  • When people are highly involved with a subject
  • When wanting to measure the amount of something

15
Q

Checklists vs Q-sorts

A

Checklist: you check the things on a list that apply to you

Q-sort: place statements into the piles that best describe them

16
Q

5 steps of item analysis

A
  1. Item difficulty
  2. Item discriminability
  3. Item characteristic curves
  4. Item response theory
  5. Criterion referenced tests
17
Q

What is item difficulty?

A

Proportion of people who get the particular item right, so the higher the value, the easier the item.
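
As a minimal sketch (the 0/1 scoring vector is an illustrative assumption):

```python
def item_difficulty(item_scores) -> float:
    """Proportion of test takers who answered this item correctly.

    item_scores is a sequence of 0/1 values, one per test taker.
    Despite the name, a HIGHER value means an EASIER item.
    """
    return sum(item_scores) / len(item_scores)

# 7 of 10 test takers got this item right:
print(item_difficulty([1, 1, 1, 0, 1, 0, 1, 1, 0, 1]))  # 0.7
```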

18
Q

What is the optimum difficulty level?

A

Between 0.3 and 0.7

19
Q

How do we calculate optimum difficulty?

A

Halfway between 100% of test takers getting the item right and the success rate expected from guessing alone: take half of (1 - chance) and add it to the chance rate.

20
Q

Example of OLD equation in a 4 option MCQ

A

4 options = 0.25 chance of guessing right

  1. (1-0.25)/2 = 0.375
  2. 0.25 + 0.375 = 0.625

0.625 is our ODL
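
The two steps above can be sketched as one function (a toy illustration, not a standard library routine):

```python
def optimum_difficulty(n_alternatives: int) -> float:
    """Chance rate plus half the distance from chance up to 1.0."""
    chance = 1 / n_alternatives
    return chance + (1 - chance) / 2

print(optimum_difficulty(4))  # 0.625, as in the worked example
print(optimum_difficulty(2))  # 0.75 for true-false items
```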

21
Q

When do we need to make exceptions to the ODL?

A

More difficult: selection processes
Easier: special education
Others: boosting morale etc.

22
Q

What does item discriminability tell us?

A

Have those who have done well on the particular item done well on the test?

23
Q

What does the extreme groups method do?

A

Proportion of students in the upper extreme who got it right - the proportion of students in the lower extreme who got it right
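
A sketch of the extreme-groups method, assuming the groups are the top and bottom thirds by total score (the group fraction is an assumption; cutoffs around 25-33% are a common convention, not given by the source):

```python
def discrimination_index(item_scores, total_scores, fraction=1/3):
    """Extreme-groups discrimination: P(upper got it right) - P(lower got it right).

    item_scores: 0/1 scores on one item, one per test taker.
    total_scores: matching total test scores used to form the groups.
    """
    n = len(total_scores)
    k = max(1, int(n * fraction))
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower

# The two highest scorers got the item right, the two lowest did not:
print(discrimination_index([1, 1, 1, 0, 0, 1], [90, 85, 60, 40, 30, 95]))
```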

24
Q

What does the point-biserial method do?

A

It gives us the item-total correlation; if there is a correlation between how well people did on the item and how well they did on the test as a whole

25
Q

Formula for point-biserial method

A

Correlation = ((Yn - Y) / Sy) * SQRT(Px / (1 - Px))

Yn = mean test score for those who got the item right
Y = mean test score of all test takers
Sy = standard deviation of all test takers' scores
Px = proportion of test takers who got the item correct

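A direct translation of the formula (using the population standard deviation, an assumption on my part; with it, the result equals the ordinary Pearson correlation between the 0/1 item and the totals):

```python
from math import sqrt

def point_biserial(item_scores, total_scores):
    """Item-total correlation: ((Yn - Y) / Sy) * sqrt(Px / (1 - Px))."""
    n = len(total_scores)
    y_mean = sum(total_scores) / n
    # Population standard deviation of all test takers' totals.
    s_y = sqrt(sum((y - y_mean) ** 2 for y in total_scores) / n)
    p_x = sum(item_scores) / n
    # Mean total score of those who got the item right.
    y_right = sum(t for s, t in zip(item_scores, total_scores) if s) / sum(item_scores)
    return ((y_right - y_mean) / s_y) * sqrt(p_x / (1 - p_x))

print(point_biserial([1, 1, 0, 0], [4, 3, 2, 1]))  # about 0.894
```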
26
Q

What can we do with the point-biserial correlation?

A

Use it to weed out bad questions after the pilot study; include items with higher correlation and exclude the lower.

27
Q

What is plotted on the axes of the item characteristic curve?

A

X - total test score

Y - proportion getting the item correct

28
Q

How does item response theory work?

A

A computer selects items for the test taker depending on their performance on the previous items, with the outcome defined by the difficulty level of the items answered correctly.
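
A toy staircase sketch of the adaptive idea (real IRT uses calibrated item parameters and maximum-likelihood ability estimates; this only illustrates stepping difficulty up after a correct answer and down after a wrong one):

```python
def adaptive_test(answer, levels=10, start=5, n_items=8):
    """Toy adaptive procedure over difficulty levels 1..levels.

    `answer(difficulty)` returns True/False for an item at that
    difficulty; the level reached after n_items is a crude ability
    estimate.
    """
    level = start
    for _ in range(n_items):
        if answer(level):
            level = min(levels, level + 1)
        else:
            level = max(1, level - 1)
    return level

# A test taker who answers correctly whenever difficulty <= 7
# ends up oscillating around level 7:
print(adaptive_test(lambda d: d <= 7))
```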

29
Q

Pros of the item response theory

A

+ Tests based on IRT are great for computer administration
+ Quicker
+ Morale is maintained
+ Reduces cheating

30
Q

Three kinds of measurement precision

A
  1. Peaked conventional: best for measuring the middle bracket, where average scores sit, it’s not good at measuring top and bottom achievers
  2. Rectangular conventional: equal number of items assessing all ability levels, but relatively low precision across the board
  3. Adaptive: test focuses on the range that challenges each individual test taker, making for overall high precision
31
Q

What is a criterion-referenced test designed to do?

A

Compare test performance with some objectively defined criterion; such tests are developed from specific learning outcomes.

32
Q

How does one evaluate a criterion reference test?

A

Assess two groups, one given the learning unit and the other not. Collect their scores and put them on a graph.

33
Q

Limitations of the criterion referenced test

A
  • Tells you what you got wrong, but not why
  • Emphasis on the ranking of students, not identifying what gaps in their knowledge exist
  • It increases the risk of “teaching to the test”