Examples Flashcards
Max score 1
Chance 0.5
#cpl 37
AvgSc 0.46
p 0.46
p(corr) -0.08
Rit 0.35
Rir 0.31
Good question!
The p-value is low, but the Rit is very good.
Max score 1
Chance 0.5
#cpl 34
AvgSc 0.79
p 0.79
p(corr) 0.59
Rit 0.17
Rir 0.14
Questionable.
The p-value is close to optimal, but the Rit is low. Because the p-value is acceptable, the question does not need to be removed, but consider revising it.
Max score 1
Chance 0.5
#cpl 38
AvgSc 0.47
p 0.47
p(corr) -0.05
Rit 0.26
Rir 0.22
Questionable.
The p-value is low and the Rit is a bit low. Consider revising.
Max score 2
Chance 0.33
#cpl 48
AvgSc 1.78
p 0.89
p(corr) 0.84
Rit 0.16
Rir 0.13
Distractor A: 0/48
Distractor B: 47/48
Distractor C: 11/48
Bad question?
The p-value is close to the maximum and the question barely discriminates (Rit close to 0).
However, it is not necessary to remove it, as long as there are not too many very easy questions. Consider checking the content and the distractors (especially A, which no candidate chose).
Max score 2
Chance 0.33
#cpl 48
AvgSc 1.52
p 0.76
p(corr) 0.64
Rit 0.44
Rir 0.39
Distractor A: 5/48
Distractor B: 41/48
Distractor C: 13/48
Good question!
The p-value is perhaps a touch high for a 3-option question, but the Rit value is very good. The distractors are quite well balanced.
Max score 4
Chance 0.5
#cpl 48
AvgSc 3.83
p 0.96
p(corr) 0.92
Rit 0.15
Rir 0.08
Bad question?
The p-value is very high (= easy item) and the Rit value is low. It’s not necessary to take it out unless there are many other very easy questions.
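The p and p(corr) values on the cards above can be reproduced from the other numbers. The sketch below assumes that p is the average score divided by the maximum score, and that p(corr) is the usual correction for guessing, (p - chance) / (1 - chance). Neither formula is stated on the cards; both are inferred here because they match the card values to within rounding. The function names are illustrative only.

```python
def p_value(avg_score: float, max_score: float) -> float:
    """Item difficulty: average score as a proportion of the maximum score."""
    return avg_score / max_score


def corrected_p(p: float, chance: float) -> float:
    """p-value corrected for guessing (assumed formula).

    Rescales p so that the chance level maps to 0 and a perfect score to 1.
    A negative result means the group scored below chance level.
    """
    return (p - chance) / (1.0 - chance)


# (max score, chance, average score) copied from some of the cards above
cards = [(1, 0.5, 0.46), (2, 0.33, 1.78), (2, 0.33, 1.52)]
for max_score, chance, avg_score in cards:
    p = p_value(avg_score, max_score)
    print(f"p = {p:.2f}, p(corr) = {corrected_p(p, chance):.2f}")
```

Small differences from the card values (for example 0.58 versus 0.59) are to be expected, since the cards show p rounded to two decimals.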
A test has 20 candidates and 60 multiple-choice questions. Which values are reliable and which are unreliable?
With 20 candidates, the p-value is less reliable than with a large group.
With 60 questions, it makes more sense to use Rir than Rit.
With 60 questions, Cronbach’s alpha or KR-20 is fairly reliable.
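To make the difference between Rit and Rir concrete, here is a minimal sketch. It assumes the standard definitions: Rit is the correlation between the item score and the total score, and Rir is the correlation between the item score and the total score without that item (the rest score). The score matrix is made up for illustration and the helper names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def rit_rir(scores, item):
    """Return (Rit, Rir) for one item.

    scores: one row per candidate, one column per item.
    """
    item_scores = [row[item] for row in scores]
    totals = [sum(row) for row in scores]
    # Rest score: total score with the item's own contribution removed
    rest = [t - s for t, s in zip(totals, item_scores)]
    return pearson(item_scores, totals), pearson(item_scores, rest)


# Made-up data: 5 candidates, 4 dichotomous items
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
rit, rir = rit_rir(scores, item=0)
print(f"Rit = {rit:.2f}, Rir = {rir:.2f}")
```

Because the item is part of its own total score, Rit comes out higher than Rir here, just as it does on every card above.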
What do you think of this exam?
Number of questions: 40
Number of candidates: 10
Average score: 25.2/40 (63%)
Highest obtained score: 34
Lowest obtained score: 20
Average p/p’: 0.63
Average p(corr): 0.51
Reliability Alpha/KR-20: 0.72
- With 10 candidates, the p-value may be less reliable
- With 40 questions, Cronbach’s alpha is on the lower end of reliable
- The lowest score is 50%, so I’d look at how many of the individual items have a high p-value
- The average corrected p-value seems on the low side, but there are only 10 candidates
- The alpha value is just about sufficient, but since there are only 40 questions, it may be less reliable
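For reference, the reliability figure on this card could be computed along the following lines. This is a minimal sketch of the textbook KR-20 formula for items scored 0 or 1. It assumes the population variance of the total scores (some tools use the sample variance instead, which gives a slightly different value), and the score matrix is made up.

```python
def kr20(scores):
    """KR-20 reliability for dichotomous (0/1) items.

    scores: one row per candidate, one column per item.
    """
    n = len(scores)       # number of candidates
    k = len(scores[0])    # number of items
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    # Population variance of the total scores (assumption, see lead-in)
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum of p * (1 - p) over items, where p is the proportion correct
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Made-up data: 5 candidates, 4 items
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
print(f"KR-20 = {kr20(scores):.2f}")
```

With so few candidates and items the result is very unstable, which is the same caution the card raises for 10 candidates.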