Lecture 4 Modern Test theory Flashcards
From classical test theory to modern test theory
What are classical test theory advantages?
- Allows for calculating reliability
- Intuitive and easy to apply
- It’s in SPSS and it’s easy to do in Excel
- No large sample sizes/many items needed
What are the disadvantages of classical test theory?
- Focus on the test, not on the items
- Test properties depend on the population (e.g. reliability and difficulty of a test should be generalisable to different populations, but in classical test theory they change with the sample)
- Person properties depend on the test (i.e. sum score is higher if the test is easy and lower if the test is difficult)
Modern test theory addresses these disadvantages
What is Modern Test Theory? What assumption does it make?
Specify a measurement model in which we mathematically link the item scores to the construct (= latent variable/latent trait/factor)
- The idea of reflective measurement - the construct affects the item scores
Assumption:
- Unidimensionality = you only measure 1 construct
Book uses trait level, Dylan uses latent variable, but it’s the same thing
What is Item Response Theory?
Specific form of modern test theory where there is a specific mathematical link between the latent variable and the item
- Individual’s response to a particular test item is influenced by qualities of the individual (trait level) and by qualities of the item (difficulty level)
What does each variable mean on the graph demonstrating item response theory?
Picture 1
X-axis = the latent variable (trait level → level of the relevant psychological construct)
- Each subject has a position on the latent variable
- 0 on the x-axis is the average of the latent variable (a person at this level has a 50% chance of correctly answering an item whose difficulty is also 0)
Y-axis = probability of a correct response (0 to 1, 1 meaning the person is certain to answer correctly)
P(Xis = 1|…) → the probability that a correct response will be made by a particular individual when answering a particular item
What is the name of the function? Why is it helpful?
The graph is a Logistic (s-shaped) function
- Runs from 0 to 1, which is exactly what we need because we are modelling the probability of a correct response; modelling a probability instead of a fixed outcome is how measurement error is accounted for
- This is displayed with the item characteristic curve
What is an item characteristic curve?
A graphical display linking respondents’ trait levels to the probability of correctly answering an item
- There is a curve like this for every item
How does the position of the curve change with different difficulty?
All the way to the left
- very easy item because even a person who is below average on the latent variable has a probability of answering correctly very close to 1
All the way to the right
- very difficult item because even a person who is above average on the latent variable has a probability of answering correctly close to 0
The position of the curve on the latent variable (x-axis) depends on how difficult the item is
What is the Rasch model? What is its function formula and what do the variables mean?
Picture 2
Function formula:
𝑃(𝑋𝑖𝑠 = 1|𝜃𝑠, 𝛽𝑖) =(𝑒^(𝜃𝑠−𝛽𝑖))/(1 + 𝑒^(𝜃𝑠−𝛽𝑖))
- P(𝑋𝑖𝑠 = 1|𝜃𝑠, 𝛽𝑖) → the probability that subject s will respond correctly to item i
- The probability of a correct response only depends on the latent variable and item difficulty
What do the variables in Rasch’s model function mean?
- 𝑋𝑖𝑠 = 1 → ‘correct’ (1) response (X) to the item (i) by a subject (s)
- 𝛽 → the difficulty of an item, can be any positive or negative number (or zero)
- The larger 𝛽 value, the more difficult the item is
- 𝜃 → latent variable (tells us how well a certain subject scores on the variable)
- e ≈ 2.718 (base of the natural logarithm)
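A minimal Python sketch of the Rasch formula above, added for illustration. The function name `rasch_probability` and the example values of 𝜃 and 𝛽 are assumptions made for this sketch; they are not taken from the lecture or from Picture 2.

```python
import math


def rasch_probability(theta: float, beta: float) -> float:
    """Probability of a correct response under the Rasch model.

    theta: the subject's position on the latent variable
    beta:  the difficulty of the item
    """
    # Algebraically the same as e^(theta - beta) / (1 + e^(theta - beta)).
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# When the trait level equals the item difficulty, the probability is 0.5.
p_equal = rasch_probability(theta=0.0, beta=0.0)

# Same average person (theta = 0), one easy item and one difficult item.
p_easy = rasch_probability(theta=0.0, beta=-2.0)
p_hard = rasch_probability(theta=0.0, beta=2.0)

print(p_equal, round(p_easy, 3), round(p_hard, 3))  # 0.5 0.881 0.119
```

Only the difference 𝜃 − 𝛽 matters: moving 𝛽 up shifts the whole curve to the right without changing its shape.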
What is the Two Parameter logistic model (2PL)? How does it differ from Rasch Model?
Similar to the Rasch model (s-shaped function, 𝛽 parameter)
Formula: 𝑃(𝑋𝑖𝑠 = 1|𝜃𝑠, 𝛽𝑖, 𝛼𝑖) = (𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖)))/(1 + 𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖)))
The probability of a respondent answering an item correctly is conditional on the respondent’s trait level (latent variable), the item difficulty and the item’s discrimination
Now we have the α𝑖 parameter = item discrimination
What is item discrimination?
The steepness of the ICC indicates the item’s ability to discriminate between individuals with different trait levels
- It indicates the relevance of the item to the latent variable being measured by the test
How can the item discrimination value be interpreted? Is it mostly positive or negative?
- Mostly a positive number, but it can be negative for a contra-indicative item
- The larger the number, the better the item can detect differences = strong consistency between the item and the underlying latent variable
↪ The steeper the curve (larger α𝑖), the more two subjects differ in their probability of a correct response, even though they might be very close on the latent variable (picture 3)
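The effect of discrimination can be sketched in Python as follows. The helper names (`two_pl_probability`, `probability_gap`) and all numbers are hypothetical illustration values, not read from picture 3.

```python
import math


def two_pl_probability(theta: float, beta: float, alpha: float) -> float:
    """Probability of a correct response under the 2PL model."""
    # Same logistic form as the Rasch model, with (theta - beta) scaled by alpha.
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))


def probability_gap(alpha: float, theta_low: float = -0.25,
                    theta_high: float = 0.25, beta: float = 0.0) -> float:
    """Difference in success probability between two subjects who are close on theta."""
    return (two_pl_probability(theta_high, beta, alpha)
            - two_pl_probability(theta_low, beta, alpha))


# Two subjects only 0.5 apart on the latent variable, same item difficulty.
gap_flat = probability_gap(alpha=0.5)   # weakly discriminating item
gap_steep = probability_gap(alpha=3.0)  # strongly discriminating item

print(round(gap_flat, 3), round(gap_steep, 3))
```

With α = 1 the function reduces to the Rasch formula, and a negative α makes the probability fall as 𝜃 rises, which is the contra-indicative case.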
What is the Three Parameter logistic model?
Picture 4
𝑐𝑖 = guessing value → lower-bound probability of a correct answer purely on the basis of chance
𝑃(𝑋𝑖𝑠 = 1|𝜃𝑠, 𝛽𝑖, 𝛼𝑖, 𝑐𝑖) = 𝑐𝑖 + (1 − 𝑐𝑖) * (𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖)))/(1 + 𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖)))
So the curve doesn’t start at 0 but at 𝑐𝑖, because people are assumed to guess on the more difficult items
What does the guessing value depend on?
Depends on the number of response options available (4 options → guessing will produce a correct answer 25% of the time so c = 0.25)
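A short Python sketch of the 3PL formula with the 4-option example (c = 0.25). The function name and the 𝜃, 𝛽 and α values are assumptions chosen only for illustration.

```python
import math


def three_pl_probability(theta: float, beta: float, alpha: float, c: float) -> float:
    """Probability of a correct response under the 3PL model."""
    logistic = 1.0 / (1.0 + math.exp(-alpha * (theta - beta)))
    # c is the lower bound; the logistic part only covers the remaining (1 - c).
    return c + (1.0 - c) * logistic


# Four response options: pure guessing is correct 1 time out of 4.
c_four_options = 1.0 / 4

# A subject very far below the item difficulty still scores about c.
p_very_low = three_pl_probability(theta=-10.0, beta=0.0, alpha=1.0, c=c_four_options)

# A subject exactly at the item difficulty scores c + (1 - c) * 0.5.
p_at_difficulty = three_pl_probability(theta=0.0, beta=0.0, alpha=1.0, c=c_four_options)

print(round(p_very_low, 3), p_at_difficulty)  # 0.25 0.625
```

Note that with guessing, the probability at 𝜃 = 𝛽 is no longer 0.5 but halfway between c and 1.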
What is the Graded Response model (GRM)?
Picture 5
A model for Likert-scale items (the other models are for binary items)
What is the formula function of GRM and how does it differ from 2PL?
- Separate item characteristic curve for each response option → 𝑃(𝑋𝑖𝑠 > 𝑗|𝜃𝑠, 𝛽𝑖𝑗, 𝛼𝑖) = (𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖𝑗)))/(1 + 𝑒^(𝛼𝑖(𝜃𝑠−𝛽𝑖𝑗)))
- The function formula is the same as for 2PL but now the item difficulty (𝛽) is specific to each response option
What does the graph of GRM show? Use Dom as an example
Picture 5
- The characteristic curve of each response option is positioned on the latent variable based on its difficulty
- The model shows the probability of each response option for each person (position on the latent variable axis)
- The smiley face (let’s call him Dom) has a probability of ~ 0 to choose option 2, ~ 0.2 to choose option 5, and ~ 0.5 to choose option 4…
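A hedged Python sketch of how per-option probabilities like Dom's can be computed from the GRM formula. It assumes the usual graded-response construction, which the notes do not spell out: the probability of choosing exactly option j is the difference between two adjacent cumulative curves, P(X > j−1) − P(X > j), with P(X > 0) = 1 and P(X > highest option) = 0. The thresholds, α and Dom's 𝜃 below are hypothetical values, not read from Picture 5.

```python
import math


def cumulative_probability(theta: float, beta_j: float, alpha: float) -> float:
    """P(X > j): probability of responding above option j (2PL-shaped curve)."""
    return 1.0 / (1.0 + math.exp(-alpha * (theta - beta_j)))


def category_probabilities(theta: float, thresholds: list[float], alpha: float) -> list[float]:
    """Probability of each response option (1 .. len(thresholds) + 1)."""
    # Pad with P(X > 0) = 1 and P(X > highest option) = 0.
    cumulative = ([1.0]
                  + [cumulative_probability(theta, b, alpha) for b in thresholds]
                  + [0.0])
    # Option j gets the drop between neighbouring cumulative curves.
    return [cumulative[j] - cumulative[j + 1] for j in range(len(cumulative) - 1)]


# Hypothetical 5-point Likert item: four ordered thresholds, one discrimination.
thresholds = [-2.0, -1.0, 0.5, 1.5]
dom_probs = category_probabilities(theta=1.0, thresholds=thresholds, alpha=1.5)

for option, p in enumerate(dom_probs, start=1):
    print(f"option {option}: {p:.3f}")
```

As long as the thresholds are in increasing order, every option probability is non-negative and the five probabilities add up to 1.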