Lecture 8 Flashcards
What are the three main things Item Response Theory addresses?
- test bias
- adaptive testing
- item selection
What is a key skill in professional psychology?
measurement
What is Classical Test Theory often known as?
the theory of total scores
What is the problem with CTT?
we can only observe the test score; we cannot directly see the underlying construct (depression, for example)
What is central to IRT?
the relationship between the item and the overall construct being assessed (the thing that we cannot really see). It assumes that there is a relationship between responses to items and the underlying/latent dimension assessed by the scale.
In CTT the estimates of the test and item parameters are dependent on?
the sample from which they were calculated. So what is the relevance to a clinical sample when the test has been derived from undergraduates?
What is an advantage of CTT?
scoring is usually simpler
What does an item characteristic curve do?
describes the relationship between the probability of a correct response on a true/false item and a person's level on the underlying dimension
What are the numbers on the y axis (probability of responding)?
0 (very unlikely) to 1 (certain)
What are the numbers on the x axis (underlying/latent dimension)?
anything: we make it up. There are no units for latent dimensions. Usually scaled to a mean of 0 and an SD of 1.
IRT parameters are sample invariant, what does this mean?
they do not depend on the sample from which they are drawn, e.g. we can still get useful information clinically even if the test has only been tested on undergraduates
IRT assumes uni….., what does this mean?
unidimensionality: the scale assesses only a single construct.
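IRT assumes local… what does this mean?
local independence: the items assess the same thing but are not the same item; there is just enough difference between them to be useful. Once the underlying dimension is taken into account, responses to different items should be unrelated.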
In the Item characteristic curve, what is the slope an estimate of?
discrimination
In the item characteristic curve, the position of the curve along the x axis is an estimate of?
the difficulty or threshold
In the ICC, if the curve is more to the left or more to the right, what does this mean?
more to the right = harder (higher difficulty/threshold); more to the left = easier (lower difficulty/threshold). Position reflects difficulty only; discrimination is given by the slope.
In the ICC, a steeper slope means?
more discriminating
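A minimal sketch of the slope and position cards above, assuming the common two-parameter logistic form of the ICC (the cards do not name a specific function). The item values `a` (discrimination) and `b` (difficulty) are made up for illustration.

```python
import math

def icc_2pl(theta, a, b):
    """Probability of a correct/endorsing response at latent level theta.

    a = discrimination (slope), b = difficulty (position on the x axis).
    Assumes a two-parameter logistic curve.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical items: one easy with a shallow slope, one hard with a steep slope.
easy_flat = {"a": 0.8, "b": -1.0}
hard_steep = {"a": 2.0, "b": 1.0}

# Print the two curves at a few points on the latent scale (mean 0, SD 1).
for theta in (-2, -1, 0, 1, 2):
    p_easy = icc_2pl(theta, **easy_flat)
    p_hard = icc_2pl(theta, **hard_steep)
    print(f"theta={theta:+d}  easy/flat={p_easy:.2f}  hard/steep={p_hard:.2f}")
```

Under this form the probability is exactly 0.5 where theta equals `b`, so moving `b` shifts the curve left or right, while a larger `a` makes the curve rise more sharply around that point.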
What is a possible third parameter for the ICC (after discrimination + difficulty)?
pseudo-guessing: estimates the probability of a correct response for people with very low levels of the underlying dimension
What did Georg Rasch propose was the best way to make a questionnaire?
have items that have similar discrimination that differ in difficulty
What does the ICC for multiple-choice items plot?
a separate curve for each response option
What is non-parametric IRT?
you do not approach the data with assumptions about the shape the curves will take; you take the data for what it is
Discuss the National Survey of Mental Health and Wellbeing
- questioned 7746 people who reported having 12 or more drinks in the past 12 months
- respondents were asked about alcohol dependence and abuse
- 1.9% met criteria for alcohol abuse
- 4.9% met criteria for alcohol dependence
What information do we get on the alcohol survey if we use Classical Test Theory?
- Cronbach's alpha (a measure of internal consistency)
- note that the relationship between an item and the total score is expressed by a single number (the item-total correlation, ITC)
- the closer the ITC gets to 1 the stronger the relationship between the item and the total score
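A rough sketch of how the two CTT statistics above could be computed, using only the Python standard library. The response matrix is made up for illustration (it is not the survey data), and the function names are hypothetical.

```python
from statistics import mean, variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def item_total_correlations(rows):
    """Correlation of each item with the total score (one number per item)."""
    totals = [sum(r) for r in rows]
    k = len(rows[0])
    return [pearson([r[i] for r in rows], totals) for i in range(k)]

# Made-up yes/no (1/0) answers: 6 respondents, 4 items.
rows = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(rows)
itcs = item_total_correlations(rows)
print("alpha:", round(alpha, 3))
print("ITCs:", [round(r, 3) for r in itcs])
```

Note that here the total includes the item itself; a common variant correlates each item with the total of the remaining items instead.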
What information do we get on the alcohol survey if we use Item Response Theory?
- larger, cutdown & tolerance are the most helpful items for deciding whether the person has a problem or not
- shows that legal problems are not predictive of mental illness
Can you use just a subset of the items (tolerance, withdrawal, larger, cutdown) to give adequate information?
yes
Discuss the effect of the number of parameters estimated
1- Rasch model. Assumes all items have the same slope (discrimination) and differ only in difficulty (threshold).
2- includes both difficulty and discrimination
3- adds a parameter for pseudo-guessing
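As a hedged illustration, the three models are often written in logistic form as below. The symbols are assumptions not used on the cards: \theta is the person's level on the latent dimension, b_i the difficulty, a_i the discrimination and c_i the pseudo-guessing parameter of item i.

```latex
% 1 parameter (Rasch): items differ only in difficulty b_i
P_i(\theta) = \frac{1}{1 + e^{-(\theta - b_i)}}

% 2 parameters: each item also has its own discrimination a_i
P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}

% 3 parameters: pseudo-guessing c_i sets a lower floor on the curve
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

Setting c_i = 0 in the third form gives the second, and setting a_i = 1 in the second gives the first.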
What is item bias?
looking inside a questionnaire to see whether each question is as easy/fair as it could be for everyone, e.g. do individuals with the same level of depression respond differently to questions about depression?
What are 3 ways to fix item bias?
- come up with different scoring methods
- reword items
- decide that it isn’t a good item and not use it anymore
What is field testing?
IRT requires larger samples, but the randomness of the sample is less important because IRT is sample invariant
What can the selection of test items be on the basis of?
achieving the desired test information function
What is item banking and adaptive testing?
once the IRT parameters are known from a large sample, it is possible to choose items that provide the best estimate of a person's level on the latent dimension, e.g. maybe we don't have to ask everyone the same questions
Where do you start when trying to formulate adaptive testing?
the mean (an item with high discrimination and average difficulty)
What does the pyramidal testing model show?
how an individual might pass through a certain number of items (predicts what answers they might choose)
What is CAT logic?
- begin with an initial score estimate > select & present the optimal scale item > score the response > re-estimate the health score and confidence interval
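The loop above can be sketched roughly as follows. Everything specific here is an assumption for illustration: two-parameter logistic items, "optimal" meaning the item with the most information at the current estimate, re-estimation by a posterior mean over a grid with a standard normal prior, and a stopping rule (item limit or target standard error) that the card does not mention.

```python
import math

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def information(theta, a, b):
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

GRID = [g / 10.0 for g in range(-40, 41)]  # latent scale from -4 to +4

def estimate(responses):
    """Posterior mean and SD of theta given ((a, b), answer) pairs."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # standard normal prior (unnormalised)
        for (a, b), answer in responses:
            p = p_2pl(t, a, b)
            w *= p if answer == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    est = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - est) ** 2 * w for t, w in zip(GRID, weights)) / total
    return est, math.sqrt(var)

def run_cat(bank, answer_fn, max_items=5, se_target=0.4):
    """bank: list of (a, b) items; answer_fn(item) returns 1 or 0."""
    theta, se = 0.0, 1.0  # initial score estimate: start at the mean
    remaining = list(bank)
    responses = []
    while remaining and len(responses) < max_items and se > se_target:
        # Select and present the most informative remaining item.
        item = max(remaining, key=lambda ab: information(theta, *ab))
        remaining.remove(item)
        # Score the response, then re-estimate the score and its spread.
        responses.append((item, answer_fn(item)))
        theta, se = estimate(responses)
    ci = (theta - 1.96 * se, theta + 1.96 * se)
    return theta, ci, [item for item, _ in responses]

# Hypothetical item bank and a simulated respondent who endorses every item.
bank = [(1.5, -1.0), (1.8, 0.0), (1.2, 1.0), (2.0, 0.5), (1.0, 2.0)]
theta, ci, administered = run_cat(bank, lambda item: 1)
print(round(theta, 2), [round(x, 2) for x in ci], administered)
```

Because the starting estimate is the mean, the first item chosen is the one that is most informative at 0, which matches the idea of starting with a highly discriminating item of average difficulty.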