Lecture 2: Item response theory Flashcards
Describe a typical measurement model
It describes the relationship between a construct and other variables, as measured by a test (Bp). For person p, the latent variable θp is measured by three items Xp1, Xp2 and Xp3, each of which carries its own error term (Ep1, Ep2 and Ep3).
Distinguish between the four types of latent variable models by how they’re used
Item response theory: continuous latent variable, categorical observed data
Factor analysis: continuous latent variable, continuous observed data
Latent class analysis: categorical latent variable, categorical observed data
Latent profile analysis: categorical latent variable, continuous observed data
What could scoring on a unidimensional IRT model look like?
The scoring could be:
correct (1); incorrect (0)
yes (1); no (0)
agree (1); disagree (0)
i.e. unidimensional, dichotomous (two-category) data
Describe a unidimensional IRT model with regard to its function, its expected value and when it is suitable.
The measurement model is the expectation of the item score as a function of the latent variable:
E(Xpi | θp) = P(Xpi = 1|θp)
We know that Xpi is categorical and θp is continuous. By definition the expected value of a dichotomous variable (variable that only has 0 and 1 in it) is the same as the probability of scoring a 1 on that variable.
Therefore the focus is on finding the probability of scoring 1 on an item as a function of the latent trait. That probability follows an S-shaped curve (see doc): if you are high on the latent variable (e.g. verbal ability), your probability of scoring 1 (representing the ‘correct’ answer) is close to 1. Since it is a unidimensional model, it assumes that only one latent variable (e.g. verbal ability) is being measured.
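The equality E(X) = P(X = 1) for a 0/1 variable can be checked on a small sample. This is a minimal sketch; the ten scores below are made up for illustration and do not come from the lecture.

```python
# Hypothetical item scores for ten people (1 = correct, 0 = incorrect).
scores = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

# Sample analogue of the expected value E(X): the mean of the scores.
mean_score = sum(scores) / len(scores)

# Sample analogue of P(X = 1): the proportion of people who scored 1.
prop_ones = scores.count(1) / len(scores)

# For a variable that only takes the values 0 and 1, the two are identical.
print(mean_score, prop_ones)
```

Because every score is either 0 or 1, summing the scores simply counts the 1s, which is why the mean and the proportion coincide.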
Is it common to use unidimensional IRT models?
Yes. Multidimensional IRT models do exist, but they are uncommon, because IRT is typically used for very strict tests that need to measure one thing, e.g. academic tests. They are not discussed in this lecture; they are discussed further in the FA model lectures.
Aside from correct/incorrect exams, when else are IRT models useful?
In measuring a trait or disorder (e.g. depression) or an opinion (e.g. on the death penalty).
Since IRT is based on an S-shaped relation between Xpi and θp, we need an S shaped function.
Name the two most popular options for this function.
Normal ogive function (cumulative normal):
f(x) = Φ(x) = (1 / sqrt(2π)) ∫_{−∞}^{x} e^(−h^2 / 2) dh
Logistic function:
f(x) = e^x / (1 + e^x)
f(x) = 1 / (1 + e^(−x))
Don’t have to know models by heart
What does the normal ogive function do? Why is it suitable for IRT?
f(x) = Φ(x) = (1 / sqrt(2π)) ∫_{−∞}^{x} e^(−h^2 / 2) dh
It is the cumulative normal distribution: it gives you the probability of finding a score of x or smaller (e.g. the probability you look up in a normal probability table).
If you plot it, with x on the x axis and the result of the function on the y axis, you get the S-shaped curve. This is logical because it is a cumulative probability: it rises from 0 towards 1 as x increases and never leaves that range. That makes it well suited to IRT, where the quantity being modelled is itself a probability.
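The S shape can be seen numerically by evaluating the cumulative normal at a few points. This is a minimal sketch; it computes Φ(x) through the standard identity Φ(x) = 0.5 · (1 + erf(x / sqrt(2))) using Python's math.erf, and the function name normal_ogive is chosen here for illustration.

```python
import math

def normal_ogive(x):
    """Cumulative standard normal, Phi(x), computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Evaluate the curve from far left to far right of the x axis.
for x in (-3, -1, 0, 1, 3):
    print(x, round(normal_ogive(x), 3))
```

The printed values start near 0, pass through 0.5 at x = 0 and level off near 1, which is the S-shaped pattern described in the card.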
Explain the logistic function
f(x) = e^x / (1 + e^x)
f(x) = 1 / (1 + e^(−x))
e denotes a number (around 2.72), so this is just an exponential function. e^x is also written as exp(x).
When do you use
f(x) = e^x / (1 + e^x)
and when do you use
f(x) = 1 / (1 + e^(−x))?
You can use either; they are the same function. Dividing the numerator and the denominator of the first form by e^x gives the second form. Some texts use the first notation, some the second.
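That the two notations give the same values can be confirmed numerically. This is a minimal sketch; the function names logistic_a and logistic_b and the chosen x values are illustrative only.

```python
import math

def logistic_a(x):
    """First notation: e^x / (1 + e^x)."""
    return math.exp(x) / (1.0 + math.exp(x))

def logistic_b(x):
    """Second notation: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

# Largest absolute difference between the two forms over a few x values.
max_gap = max(abs(logistic_a(x) - logistic_b(x)) for x in (-5, -1, 0, 0.5, 2, 5))
print(max_gap)
```

Any difference that shows up is floating-point rounding, not a difference between the functions.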
Traditionally, which function was used, and when did the other function come into play?
Traditionally IRT models have been developed using a normal ogive function. Later the logistic function was used as an approximation as it was easier to compute with.
How was the logistic function adapted to make it closer to the normal ogive function?
If D ≈ 1.7 (commonly given as 1.702) then:
f(x) = e^(Dx) / (1 + e^(Dx))
is very close to the normal ogive, because D makes the curve steeper (see doc)
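How close the scaled logistic gets to the normal ogive can be checked on a grid of x values. This is a minimal sketch; it assumes D = 1.702, a commonly quoted value of the scaling constant (the card gives D only approximately), and the function names are illustrative.

```python
import math

def normal_ogive(x):
    """Cumulative standard normal via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def logistic(x, D=1.0):
    """Logistic function with scaling constant D (D = 1 is the plain logistic)."""
    return 1.0 / (1.0 + math.exp(-D * x))

# Grid from -6 to 6 in steps of 0.01.
grid = [i / 100.0 for i in range(-600, 601)]

# Largest vertical distance to the normal ogive, without and with scaling.
gap_plain = max(abs(logistic(x) - normal_ogive(x)) for x in grid)
gap_scaled = max(abs(logistic(x, D=1.702) - normal_ogive(x)) for x in grid)

print(round(gap_plain, 4), round(gap_scaled, 4))
```

The unscaled logistic is noticeably flatter than the ogive, while the scaled version stays within about one percentage point of it everywhere on the grid.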
Why is the logistic function sometimes preferred to the normal ogive function?
Because the normal ogive function is quite complicated (it contains an integral), whereas the logistic function is a simpler expression with fewer terms. This makes it easier to compute and easier to differentiate, which is done a lot in modelling.
How useful is the addition of D to the logistic function?
It is not very useful; it is usually just used by people who want to stick to the old ogive framework. It doesn’t matter at all: D only rescales the parameters, so the model is conceptually the same and your results will be the same. (Explained just in case you see it in some papers.)
What was the first IRT model ever developed? What is it known as now?
The original ‘Rasch model’:
P(Xpi = 1 | θp) = e^(θp − bi) / (1 + e^(θp − bi))
Also known as the one-parameter model, here written with a slope a shared by all items:
P(Xpi = 1 | θp) = e^(a(θp − bi)) / (1 + e^(a(θp − bi)))
Explain the one parameter model
P(Xpi = 1 | θp) = e^(a(θp − bi)) / (1 + e^(a(θp − bi)))
e^(a(θp − bi)) / (1 + e^(a(θp − bi))) is the logistic function, evaluated at x = a(θp − bi)
Item parameters:
b tunes the location of the S-shaped curve on the latent trait dimension (x axis): the curve shifts to the left or right if you change b. b therefore denotes item difficulty. If the curve lies further to the right, the item is more difficult, because even people with a high level of the latent trait have a relatively low probability of answering it ‘correctly’.
a denotes the slope (steepness) of the curve and is the same for all items. The original Rasch model does not include it, which is equivalent to fixing a = 1.
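The effect of b can be illustrated by evaluating the model for one person on an easy and a hard item. This is a minimal sketch; the function name icc_1pl and the difficulty values −1 and 1 are hypothetical choices, not values from the lecture.

```python
import math

def icc_1pl(theta, b, a=1.0):
    """P(X = 1 | theta) under the one-parameter logistic model.

    theta: person's latent trait score
    b:     item difficulty (location of the curve on the x axis)
    a:     common slope; a = 1 gives the original Rasch form
    """
    z = a * (theta - b)
    return 1.0 / (1.0 + math.exp(-z))

# One person with an average trait level (theta = 0), two hypothetical items.
p_easy = icc_1pl(theta=0.0, b=-1.0)  # curve shifted to the left
p_hard = icc_1pl(theta=0.0, b=1.0)   # curve shifted to the right

print(round(p_easy, 3), round(p_hard, 3))
```

The same person has a probability above 0.5 on the easy item and below 0.5 on the hard item, which is what shifting the curve to the right means in practice.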
How is the score on the latent trait denoted on these graphs?
θ is on the x axis, scaled so that its mean is 0 and its SD is 1
How can you measure probability using these graphs?
Draw a vertical line up from the θ score to the curve, then a horizontal line across to the y axis: e.g. for bi = 1 and a latent θ score of 1, the probability of answering the item correctly is 0.5. In general, bi is the value of θ at which the probability of a correct answer is 0.5; this also holds in a two-parameter model.
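The reading-off example (bi = 1, θ = 1, probability 0.5) can be reproduced by plugging the numbers into the logistic model. This is a minimal sketch; the function name icc and the slope values tried are illustrative assumptions.

```python
import math

def icc(theta, b, a=1.0):
    """P(X = 1 | theta) for a logistic item with difficulty b and slope a."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# When theta equals b the exponent is 0, so the probability is 0.5
# whatever the slope is.
probs_at_b = [icc(theta=1.0, b=1.0, a=a) for a in (0.5, 1.0, 2.0)]
print(probs_at_b)

# Moving theta above or below b moves the probability above or below 0.5.
p_above = icc(theta=2.0, b=1.0)
p_below = icc(theta=0.0, b=1.0)
print(round(p_above, 3), round(p_below, 3))
```

This is why b can be read directly from the graph: it is the θ value under the point where the curve crosses 0.5.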
What does it mean to say that a and bi are fixed effects while 𝜃 denotes random effect?
θ is a random effect: it has a distribution (usually assumed to be normal), because the persons are a sample from the population. a and bi are parameters that you estimate in your model but that typically do not have a distribution: b is about items, and items are considered fixed. In the one-parameter model a does not differ across items; it is also fixed.
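The distinction can be made concrete with a small simulation in which θ is drawn from a distribution while the item parameters are plain constants. This is a minimal sketch under stated assumptions: the difficulties, the slope, the sample size and the seed are all hypothetical, and a standard normal distribution for θ is assumed.

```python
import math
import random

random.seed(0)  # fixed seed so the simulation is repeatable

# Fixed effects: hypothetical item difficulties and a common slope.
b = [-1.0, 0.0, 1.0]
a = 1.0

# Random effect: each person's theta is a draw from N(0, 1).
n_persons = 5000
thetas = [random.gauss(0.0, 1.0) for _ in range(n_persons)]

def p_correct(theta, b_i, a=1.0):
    """P(X = 1 | theta) under the one-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b_i)))

# Simulate a 0/1 response for every person on every item.
responses = [
    [1 if random.random() < p_correct(t, b_i, a) else 0 for b_i in b]
    for t in thetas
]

# Proportion of 1s per item; more difficult items should show fewer 1s.
prop_correct = [
    sum(row[i] for row in responses) / n_persons for i in range(len(b))
]
print([round(p, 3) for p in prop_correct])
```

Rerunning with a different seed gives different persons (different θ draws) but the same three items, which is the sense in which items are fixed and persons are random.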
What name is given to the S-shaped curves that the normal ogive function and the logistic function are attempting to mimic?
Item Characteristic Curve (‘ICC’)