Item analysis Flashcards
What makes a good item?
1) item difficulty
2) item discriminability
3) item characteristic curve
4) item response theory (IRT)
5) criterion-referenced test
What is meant by item difficulty / facility index?
The proportion of people who get a particular item correct. A higher p-value means an easier item
How do we calculate item difficulty?
The number of people who got the item correct divided by the total number of people who attempted it
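A minimal Python sketch of this calculation; the item_responses list is made-up example data:

```python
# Made-up 0/1 scores for one item across 10 test takers
item_responses = [1, 1, 0, 1, 1, 0, 1, 1, 1, 0]

# Item difficulty (p-value): proportion of test takers who got the item correct
p = sum(item_responses) / len(item_responses)
print(p)  # 0.7 -> a fairly easy item, within the 0.30-0.70 optimum band
```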
What is the optimum difficulty level?
Between 0.30 and 0.70
What is meant by item discriminability and how do we measure it?
Whether those who did well on an item also did well on the overall test
We measure it with:
1) the discrimination index (di), where higher values mean better discriminability
2) the point-biserial method (item-total correlation)
What is the extreme groups method?
The proportion of people in the upper quartile who got an item correct minus the proportion of people in the lower quartile who got the item correct
Or
The difference in item difficulty between the top and bottom 25% of test takers (in terms of overall marks)
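A rough Python sketch of the extreme groups calculation, using made-up (item score, total score) pairs:

```python
# Made-up data: each pair is (score on this item, total test score)
data = [(1, 48), (1, 45), (0, 22), (1, 40), (0, 18),
        (1, 44), (0, 25), (1, 39), (0, 20), (1, 47),
        (0, 15), (1, 42)]

# Sort by total score, then take the bottom and top quarters
data.sort(key=lambda pair: pair[1])
quarter = len(data) // 4
bottom, top = data[:quarter], data[-quarter:]

# Discrimination index: p(upper quartile) - p(lower quartile)
p_top = sum(item for item, _ in top) / len(top)
p_bottom = sum(item for item, _ in bottom) / len(bottom)
print(p_top - p_bottom)  # 1.0 here: only the top quartile got the item right
```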
If the point-biserial method (item-total correlation) is negative, what does this mean?
That test takers who failed that specific item tended to do well on the overall test, and those who got it right tended to do poorly
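A minimal sketch of the point-biserial (item-total correlation) on made-up data; because the item score is dichotomous (0/1), the point-biserial is just the Pearson correlation between item score and total score (statistics.correlation requires Python 3.10+):

```python
from statistics import correlation  # available in Python 3.10+

# Made-up scores: 0/1 on one item, and total test scores for the same people
item = [1, 1, 0, 1, 0, 1, 0, 1]
total = [48, 45, 22, 40, 18, 44, 25, 39]

# Point-biserial = Pearson correlation when one variable is dichotomous
r_pb = correlation(item, total)
print(round(r_pb, 2))  # positive here: people who got the item right scored higher overall
```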
What is meant by an Item Characteristic Curve (ICC)?
It is the relationship between performance on an item and performance on the overall test
It is a graphical display of item functioning, with the x-axis being the total test score and the y-axis being the proportion getting the item correct
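A sketch of how an empirical ICC could be tabulated from made-up data: group test takers by total-score band and compute the proportion getting the item correct in each band:

```python
from collections import defaultdict

# Made-up (item score, total test score) pairs
data = [(0, 12), (0, 18), (1, 21), (0, 24), (1, 30),
        (1, 33), (0, 35), (1, 41), (1, 44), (1, 48)]

# Group total scores into bands of 10 and compute the proportion correct in each band
bands = defaultdict(list)
for item, total in data:
    bands[(total // 10) * 10].append(item)

for band in sorted(bands):
    proportion = sum(bands[band]) / len(bands[band])
    print(f"total {band}-{band + 9}: proportion correct = {proportion:.2f}")
# Plotting these points (x = total-score band, y = proportion correct) gives the ICC
```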
What is Item Response Theory (IRT)?
It is a different model of psychological testing that makes extensive use of item analysis. A computer presents items, each with a particular difficulty level. If you answer correctly, the next item given is of increased difficulty, and vice versa. The computer gives you what it thinks you can handle, so the test is 'tailored' to each individual
What defines a test taker's score under IRT?
The level of difficulty of the items answered correctly, determined through adaptive computer-based testing
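A toy sketch of the adaptive idea behind computer-based IRT testing; the starting difficulty, step size, and difficulty bounds here are invented purely for illustration:

```python
def run_adaptive_test(answers_correct, start=0.5, step=0.2):
    """Move up in difficulty after a correct answer and down after an incorrect one."""
    difficulty = start
    administered = []
    for correct in answers_correct:  # in a real test, responses arrive one at a time
        administered.append((round(difficulty, 1), correct))
        difficulty = min(0.9, difficulty + step) if correct else max(0.1, difficulty - step)
    return administered, round(difficulty, 1)

history, final_level = run_adaptive_test([True, True, False, True])
print(history)      # difficulty climbs after correct answers and drops after the miss
print(final_level)  # a rough estimate of the level the test taker can handle
```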
What are the advantages of IRT ?
1) Tests based on IRT can easily be adapted for computer administration
2) Quicker tests
3) The test taker's morale is not worn down, because items stay close to their ability level
4) Reduces chances of cheating
What are criterion-referenced tests?
Tests that compare performance with some objectively defined criterion. They are developed based on learning outcomes, i.e., what should the test taker be able to do?
What are the limitations of criterion-referenced tests?
1) they tell you that you got something wrong, but not why
2) the emphasis is on ranking students rather than identifying gaps in knowledge
How do we calculate optimum difficulty?
The point halfway between 100% and the chance of getting the item correct by guessing, i.e., (1.00 + chance) / 2. For example, for a true/false item chance is 0.50, so the optimum difficulty is 0.75
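A small sketch of this calculation; the chance levels shown are standard examples (four-option multiple choice and true/false):

```python
def optimum_difficulty(chance):
    """Halfway between the chance level and 1.00 (100%)."""
    return (1.0 + chance) / 2

print(optimum_difficulty(0.25))  # four-option multiple choice: 0.625
print(optimum_difficulty(0.50))  # true/false item: 0.75
```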
What exceptions are there to the optimum difficulty level (ODL)?
- at times we need more difficult items, e.g., a selection process
- at times we need easier items, e.g., special education
- at times we need to consider other factors, e.g., boosting morale