7 Test Theory Flashcards
In Classical Test Theory, what is the definition of item difficulty (p-correct)?
The proportion of people passing or ‘endorsing’ the item. The easier the item is to endorse, the higher the p value.
CTT: What is the p-correct in the following example?
If I had two apples and Mary gave me three more apples, how many apples would I now have?
A) one apple n=30
B) five apples n=150
C) about 1kg of apples n=10
D) too many apples n=10
150 of the 200 respondents chose the correct option (B), so p = 150/200 = .75
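The arithmetic on this card can be sketched in a few lines of Python. The dictionary of response counts simply restates the apple item; the function name `p_correct` is made up for illustration.

```python
# Response counts copied from the apple item; "B" (five apples) is the keyed answer.
counts = {"A": 30, "B": 150, "C": 10, "D": 10}

def p_correct(counts, key):
    """Proportion of all respondents who chose the keyed (correct) option."""
    total = sum(counts.values())  # 200 respondents in this example
    return counts[key] / total

p = p_correct(counts, "B")
print(p)  # 0.75
```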
In what way is item difficulty sample dependent in CTT?
The difficulty of an item depends on who is trying to solve it, e.g., 3-year-olds vs. undergraduates.
What is the ‘best’ item difficulty level in CTT? And why?
.5, i.e., half the people get it correct
Why? This provides the maximal variation in test scores, and the best possibility of discrimination between participants.
What is wrong, from an item difficulty perspective, with items that everyone gets right or wrong?
Such items do not discriminate between individuals.
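One way to see both points above: the variance of a right/wrong item is p(1 - p), which peaks at p = .5 and falls to zero when everyone passes or everyone fails. The sketch below only illustrates that formula; the list of p values is arbitrary.

```python
def item_variance(p):
    """Variance of a dichotomous (0/1) item with proportion correct p."""
    return p * (1 - p)

# Arbitrary difficulty levels chosen for illustration.
ps = [0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0]
variances = {p: item_variance(p) for p in ps}

# The difficulty level with the largest item variance.
best = max(variances, key=variances.get)
print(best, variances[best])
```

Items with p = 0 or p = 1 have zero variance, so they cannot separate one test-taker from another.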
In CTT, p-correct is a feature of the ________, NOT an _______ characteristic of the item.
In CTT, p-correct is a feature of the sample, NOT an absolute characteristic of the item.
In CTT, what is the Extreme Groups Approach to item discrimination?
Split test-takers into an upper and a lower group by total score (e.g., top/bottom 25%)
D = P-upper – P-lower, where each P is the proportion of that group passing the item
Maximum discrimination is 1.
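A minimal sketch of the extreme groups calculation, assuming 0/1 item scores and one total score per person. The function name, the `fraction` parameter, and the eight-person data set are all invented for illustration.

```python
def discrimination_index(item_scores, total_scores, fraction=0.25):
    """D = proportion passing in the upper group minus proportion passing in the lower group."""
    n = len(total_scores)
    k = max(1, int(n * fraction))  # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])  # person indices, low to high total
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k
    p_lower = sum(item_scores[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data: eight people, their total scores, and their result on one item.
totals = [1, 2, 3, 4, 5, 6, 7, 8]
item = [0, 0, 0, 1, 0, 1, 1, 1]

d = discrimination_index(item, totals)
print(d)  # 1.0
```

Here both people in the top quarter pass the item and both in the bottom quarter fail it, so D reaches its maximum of 1.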
What is item-total correlation and what does it tell us?
The correlation between the score on one item and the total score on all OTHER items.
It tells us the extent to which the single item is measuring the same thing as the total test.
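As a rough sketch, the item-total correlation described above can be computed by removing the item from each person's total before correlating. The response matrix is hypothetical, and `pearson` is a plain hand-written Pearson correlation rather than a library call.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def item_rest_correlation(responses, item):
    """Correlate one item with the sum of all OTHER items (the item itself is excluded)."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)

# Hypothetical 0/1 responses: rows are people, columns are items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]

r = item_rest_correlation(responses, item=0)
print(round(r, 3))
```

Excluding the item from the total keeps the item from correlating with itself, which would otherwise inflate the value.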
What are inter-item correlations for?
They tell us whether items that should be measuring the same thing are related to each other.
What are four problems with classical test theory?
- Reliabilities and validities are derived from the total score
- Item selection is not fundamental
- Statistics are sample-dependent
- There is, in reality, no conceptual link between difficulty and discrimination (but CTT assumes there is)
In Item Response Theory, what forms the basis of estimating the trait level?
The person’s response pattern to a particular set of items, e.g., what level of extraversion is most likely to have led to this pattern of responses?
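The "most likely trait level" idea can be sketched as a maximum-likelihood search. This assumes a one-parameter logistic model and made-up item difficulties; neither is specified on the card, and a simple grid search stands in for the estimation routines real IRT software uses.

```python
from math import exp, log

def p_endorse(theta, b):
    """Assumed one-parameter logistic model: probability of endorsing an item of difficulty b."""
    return 1.0 / (1.0 + exp(-(theta - b)))

def log_likelihood(theta, responses, difficulties):
    """Log-likelihood of a 0/1 response pattern at trait level theta."""
    ll = 0.0
    for u, b in zip(responses, difficulties):
        p = p_endorse(theta, b)
        ll += log(p) if u == 1 else log(1.0 - p)
    return ll

def estimate_theta(responses, difficulties):
    """Grid search from -4 to 4 for the trait level that makes the pattern most likely."""
    grid = [i / 100 for i in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, responses, difficulties))

# Hypothetical item difficulties (easy, medium, hard).
difficulties = [-1.0, 0.0, 1.0]

theta_high = estimate_theta([1, 1, 0], difficulties)  # endorsed two of three items
theta_low = estimate_theta([1, 0, 0], difficulties)   # endorsed one of three items
print(theta_high, theta_low)
```

A pattern with every item endorsed (or none) has no finite maximum, so the search would simply stop at the edge of the grid.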
How does IRT scale the difficulty of an item?
As how much ability you need to get the item correct.
What do probability curves in IRT model?
The probability of getting an item correct given ability (the latent trait level, for which the number of items correct is only a rough proxy).
Are probability curves in IRT linear? Do they cross?
No, they are curvilinear (S-shaped). No, they don’t cross, provided the items share the same discrimination (as in the one-parameter model).
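A short sketch of the curve shape, assuming the standard logistic form with a difficulty parameter `b` and a discrimination parameter `a`. The parameter values are hypothetical. It checks that equal steps in ability do not give equal steps in probability, that two curves with the same discrimination never cross, and that curves with different discriminations can cross.

```python
from math import exp

def icc(theta, b, a=1.0):
    """Assumed logistic item characteristic curve: P(correct | theta) for difficulty b, discrimination a."""
    return 1.0 / (1.0 + exp(-a * (theta - b)))

thetas = [-3, -2, -1, 0, 1, 2, 3]
easy = [icc(t, b=-1.0) for t in thetas]  # easier item
hard = [icc(t, b=1.0) for t in thetas]   # harder item, same discrimination

# Not linear: a one-unit step in theta matters more near the item's difficulty than in the tail.
step_middle = icc(0.5, 0.0) - icc(-0.5, 0.0)
step_tail = icc(3.0, 0.0) - icc(2.0, 0.0)

# Same discrimination: the easier item is more likely to be passed at every ability level.
never_cross = all(e > h for e, h in zip(easy, hard))

# Different discriminations: the difference changes sign, so the two curves cross.
steep_minus_flat = [icc(t, 0.0, a=2.0) - icc(t, 0.0, a=0.5) for t in (-1, 1)]

print(never_cross, step_middle > step_tail, steep_minus_flat)
```

Under this form, an item's difficulty is the ability level at which the probability of a correct response is .5.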
What does the IRT model describe?
How changes in level of a trait relate to changes in probability of a certain response.