How to create a test Flashcards
Statement 1: The item discrimination index for reliability tells you the extent to which people are responding to that item in the same way as they are responding to the other items in the scale.
Statement 2: If the item discrimination index for reliability is -1 (minus one) then it means that the item in question cannot distinguish between high and low scorers IN ANY WAY.
(a) Both statements are true.
(b) Statement 1 true; Statement 2 false.
(c) Statement 1 false; Statement 2 true.
(d) Both statements are false.
The answer was b. See Lecture 5. Statement 1 is true. The item discrimination index for reliability does indeed show you how people's responses to that item correlate with their responses to the other items in the scale. Statement 2 is false. If the item discrimination index for reliability is 0 (ZERO), it means that the item in question cannot distinguish between high and low scorers. An item discrimination index of -1 (MINUS ONE) means the item is discriminating, just in the opposite direction to the one you would predict (it probably means it was a reverse-scored item and you forgot to recode it).
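To illustrate that recoding point, here is a minimal Python sketch (the 1-5 Likert scale and the responses are hypothetical, not from the lecture) of how a reverse-scored item would be recoded before item analysis:

```python
# Hypothetical illustration: recoding a reverse-scored item on a 1-5 Likert scale.
# Left unrecoded, such an item correlates negatively with the rest of the scale
# and can produce an item discrimination index close to -1.

def recode_reverse_item(score, scale_min=1, scale_max=5):
    """Flip a reverse-scored response so high values mean the same as on the other items."""
    return scale_min + scale_max - score

raw_responses = [1, 2, 5, 4, 1]   # responses to the reverse-scored item
recoded = [recode_reverse_item(r) for r in raw_responses]
print(recoded)                    # [5, 4, 1, 2, 5]
```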
Statement 1: On a norm-referenced aptitude test, the ‘optimal difficulty’ of an item is defined as a point halfway between chance and everyone getting the answer WRONG.
Statement 2: The item discrimination index for validity tells us the extent to which the item contributes to the scale’s correlation with a relevant criterion measure.
(a) Both statements are true.
(b) Statement 1 true; Statement 2 false.
(c) Statement 1 false; Statement 2 true.
(d) Both statements are false.
The answer was c. See Lecture 5. Statement 1 is false. The optimal difficulty of an item on a norm-referenced aptitude test is actually halfway between chance and everyone getting the answer CORRECT (not wrong). That's why the optimal difficulty calculation requires you to add the chance of guessing to 100% before dividing by 2 to find the halfway point. Statement 2 is true – the higher the item discrimination index for validity, the more that item contributes to the scale's relationship with the external criterion measure in question.
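A minimal Python sketch of that halfway-point calculation (the function name is mine, not from the lecture):

```python
def optimal_difficulty(n_options):
    """Optimal item difficulty for a norm-referenced test: halfway between
    chance (1 / number of options) and everyone answering correctly (1.0)."""
    chance = 1.0 / n_options
    return (chance + 1.0) / 2

print(optimal_difficulty(4))  # 0.625 -> about .63 for a four-option item
print(optimal_difficulty(2))  # 0.75  -> the two-option case asked about below
```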
The following frequency data refers to a question on a four option multiple-choice examination; the options are denoted (i), (ii), (iii), (iv) below. Students are divided into an Upper Group (top third of class based on overall exam score) and a Lower Group (bottom third of class based on overall exam score).
Upper Group: (i) 12, (ii) 35, (iii) 0, (iv) 3.
Lower Group: (i) 23, (ii) 21, (iii) 0, (iv) 6.
For example, this data indicates that 12 people from the Upper Group chose Option (i). What is the item discrimination index for this item if the correct answer is Option (ii)?
(a) .28.
(b) .60.
(c) .23.
(d) .14.
The answer was a. The calculation works out as follows:
U (number in the Upper Group choosing the correct option) = 35
L (number in the Lower Group choosing the correct option) = 21
nU (total in the Upper Group) = 50 (summing across the top row: 12 + 35 + 0 + 3)
nL (total in the Lower Group) = 50 (summing across the bottom row: 23 + 21 + 0 + 6)
Using the item discrimination index formula from Lecture 5, your calculation should therefore look like this: d = (35/50) - (21/50) = .70 - .42 = .28
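The same calculation as a minimal Python sketch (the function name and the list layout are mine):

```python
def discrimination_index(upper_correct, n_upper, lower_correct, n_lower):
    """Item discrimination index d: proportion correct in the Upper Group
    minus proportion correct in the Lower Group."""
    return upper_correct / n_upper - lower_correct / n_lower

# Frequencies for this item, in option order (i)-(iv); Option (ii) is correct.
upper = [12, 35, 0, 3]
lower = [23, 21, 0, 6]

d = discrimination_index(upper[1], sum(upper), lower[1], sum(lower))
print(round(d, 2))  # 0.28
```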
Statement 1: For speed tests, it is not appropriate to use Item Discrimination Indices but Item Difficulty Indices can be used.
Statement 2: For power tests, it is appropriate to calculate both Item Discrimination and Item Difficulty Indices.
(a) Both statements are true.
(b) Statement 1 true; Statement 2 false.
(c) Statement 1 false; Statement 2 true.
(d) Both statements are false.
The answer was c. Statement 1 is false. You can't use Item Difficulty Indices on speed tests because whether people get an item correct depends on where it falls in the order of items (and whether they reach it in time), not on how difficult it is – all of the items on a speed test are very easy, there are just a lot of them to complete. The first part of the statement is correct – Item Discrimination Indices would indeed be inappropriate for a speed test (this is the distractor) – but the part about Item Difficulty Indices is false, and hence the statement as a whole is false. Statement 2 is true. You can typically use both Item Discrimination and Item Difficulty Indices on power tests, because it is the difficulty of the questions that does the job of separating high scorers from low scorers (rather than how quickly people can answer the items).
What is the optimal item difficulty index for two option multiple-choice questions in a norm-referenced achievement test?
(a) .50.
(b) .75.
(c) .60.
(d) .63.
The answer was b. See Lecture 5. Chance for a two-option multiple-choice item is 50%. (50% + 100%)/2 = 75%, i.e. an optimal item difficulty index of .75.
The following frequency data refers to a question on a four option multiple-choice examination; the options are denoted (i), (ii), (iii), (iv) below. Students are divided into an Upper Group (top third of class based on overall exam score) and a Lower Group (bottom third of class based on overall exam score).
Upper Group: (i) 16, (ii) 6, (iii) 5, (iv) 7.
Lower Group: (i) 3, (ii) 23, (iii) 4, (iv) 4.
For example, this data indicates that 16 people from the Upper Group chose Option (i). Which option describes this data the best, assuming the examiner has designated Option (i) to be the correct answer?
(a) It is an easy item, but not problematic.
(b) It is a difficult item, but not problematic.
(c) Option (ii) may be inappropriately misleading.
(d) There are grounds to suspect an error in scoring.
The answer was b. It is a difficult item (well under the optimal proportion of students chose the correct option overall), but it is not problematic because, in the Upper Group, a greater proportion of people are choosing the right answer than in the Lower Group. The fact that the Lower Group are going for a different option is fine – they are probably being appropriately misled as a result of not knowing the material as well. That is, option c is not the correct answer because only people in the Lower Group were fooled by Option (ii) (i.e. people you'd expect to be fooled). It would only be problematic if lots of people in the Upper Group were fooled by it (i.e. people you wouldn't expect to be fooled).
An examination has an average item difficulty index of .95. Which of the following is most likely given this information:
(a) The examination is very difficult.
(b) The examination can discriminate very well between high and low scorers.
(c) The examination is very easy.
(d) The examination cannot discriminate very well between high and low scorers.
The answer was c. The item difficulty index tells you what percentage of test-takers got a question correct. This means, on average, 95% of test-takers were getting the questions correct and hence the examination must have been very easy. To gauge the ability of the exam to discriminate between high and low scorers, we instead need the item discrimination index.
The item discrimination index, where the Upper and Lower Groups are defined by overall test score, can tell us how that single item contributes to a test’s:
(a) Criterion validity.
(b) Internal consistency.
(c) Alternate forms reliability.
(d) Content validity.
The answer was b. See Lecture 5 - the question is describing the Item Discrimination Index for reliability - which tells you the extent to which the item is yielding a similar response to the other items in the scale.
The following frequency data refers to a question on a four option multiple-choice examination; the options are denoted (i), (ii), (iii), (iv) below. Students are divided into an Upper Group (top third of class based on overall exam score) and a Lower Group (bottom third of class based on overall exam score).
Upper Group: (i) 34, (ii) 89, (iii) 21, (iv) 3.
Lower Group: (i) 67, (ii) 43, (iii) 7, (iv) 29.
For example, this data indicates that 34 people from the Upper Group chose Option (i). What is the item discrimination index for this item if the correct answer is Option (ii)?
(a) .46.
(b) .32.
(c) .52.
(d) .16.
The answer was b. See Lecture 5.
n in Upper Group = 147 (34+89+21+3)
n in Lower Group = 146 (67+43+7+29)
U = 89
L = 43
Applying the Item Discrimination Index formula: (89/147) - (43/146) = .61 - .29 = .32
Note that if you got .31 instead (or similar), this is just down to a slightly different rounding strategy (e.g. rounding only at the end rather than rounding each proportion first) – it is still correct, and none of the distractors were anywhere near this value.
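Repeating the arithmetic in a short Python sketch makes the rounding point concrete (the list layout is mine):

```python
# Frequencies for this item, in option order (i)-(iv); Option (ii) is correct.
upper = [34, 89, 21, 3]
lower = [67, 43, 7, 29]

d = upper[1] / sum(upper) - lower[1] / sum(lower)
print(round(d, 2))                                    # 0.31 -> rounding only at the end
print(round(round(89/147, 2) - round(43/146, 2), 2))  # 0.32 -> rounding each proportion first
```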
The following frequency data refers to a question on a four option multiple-choice examination; the options are denoted (i), (ii), (iii), (iv) below. Students are divided into an Upper Group (top third of class based on overall exam score) and a Lower Group (bottom third of class based on overall exam score). Option (ii) is designated as the correct answer.
Upper Group: (i) 7, (ii) 12, (iii) 0, (iv) 31.
Lower Group: (i) 9, (ii) 35, (iii) 0, (iv) 6.
For example, this data indicates that 12 people from the Upper Group chose Option (ii).
Statement 1: This item appears to have a redundant distractor.
Statement 2: There are grounds for suspecting that this item might contain a scoring error or be worded in a misleading way.
(a) Both statements are true.
(b) Statement 1 true; Statement 2 false.
(c) Statement 1 false; Statement 2 true.
(d) Both statements are false.
The answer was a. Statement 1 is true. Nobody in either the Upper or Lower Group chose option (iii), suggesting that it was obviously incorrect and might be a redundant distractor (in that it failed to distract anyone). Statement 2 is true. Most people in the Upper Group chose option (iv) rather than the option designated as “correct” (i.e. option ii), despite most people in the Lower Group choosing option (ii). This raises the suspicion that there might be a scoring error or some problem with the wording of the question, and hence that it should be double-checked. This issue would also be flagged by the fact that the item yields a negative item discrimination index.
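As a quick check of that last point, the same arithmetic as in the earlier sketches gives a clearly negative index:

```python
# Frequencies for this item, in option order (i)-(iv); Option (ii) is designated correct.
upper = [7, 12, 0, 31]
lower = [9, 35, 0, 6]

d = upper[1] / sum(upper) - lower[1] / sum(lower)
print(round(d, 2))  # -0.46 -> negative, so the item should be double-checked
```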