IRT Flashcards by Laura Humez

IRT’s desirable objectives (2)

(1) Administer SHORTER measures
(2) Compare scores across: DIFF measures of the SAME constructs in DISTINCT groups

Why is there a problem in administering shorter measures according to CTT?

Problem bc of the relationship between the LENGTH of a test & its RELIABILITY
-> Shorter tests don’t have as high reliability as longer ones
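
One standard way to quantify the length-reliability link in CTT is the Spearman-Brown prophecy formula. The cards do not name it, so treat this as an illustrative sketch; the 0.80 starting reliability and the halving of the test are made-up numbers.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability when test length is multiplied by length_factor.

    length_factor = 0.5 means the test is cut in half; 2.0 means doubled.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical example: a test with reliability 0.80 is cut to half its length.
full = 0.80
halved = spearman_brown(full, 0.5)
doubled = spearman_brown(full, 2.0)

print(f"original: {full:.2f}, halved: {halved:.2f}, doubled: {doubled:.2f}")
```

Under this formula the shortened form is predicted to be less reliable than the original, which is the CTT problem the card describes.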

Limitations of CTT (3)

(1) Adding/deleting items changes true score (because the true score is TEST-DEPENDENT, so comparison not possible across diff test forms)
(2) True score is interpretable ONLY in reference to NORM sample’s distribution of scores: SAMPLE-DEPENDENT
(3) Reliability of true score is function of the items used: CTT assumes all items are EQUALLY reliable, measure the SAME RANGE of scores, and that reliability is CONSTANT across scores

What’s the problem with CTT assumption that “Reliability of true score is function of the items used”?

In practice, some items are better than some others

Item Response Theory (IRT) Assumptions (4)

(1) True score defined on the LATENT trait dimension rather than observed score
(2) Knowing the **PROPERTIES OF THE ITEMS** a person endorses tells us the TRAIT LEVEL the person possesses
(3) Properties of an item do NOT change if we were to administer the item using different samples
(4) True score of the person does NOT change regardless of which sets of items we administer.

In IRT, we place both _______ and _______ on the same scale to be able to compare those two.

item characteristics; person characteristics

IRT is a family of mathematical models that describe the probability of a given response to an item as a function of _______________ and ____________. It models the _______________.

certain item characteristics; respondent true score; likelihood of you endorsing an item

IRT: What’s the chance you’re gonna answer YES to an item assessing HIGH attachment levels?

KNOWING what’s the level of attachment of an ITEM
+
Underlying level of INDIVIDUAL attachment
= likelihood of you saying yes.

Item Response Function

Representation of the probability of item endorsement across the range of true scores
=> models the likelihood of item endorsement across the entire range of underlying traits

IRT: TRUE SCORE =

PROB OF ENDORSING ITEMS WITH SPECIFIC CHARACTERISTICS given the trait level set.

Item Characteristic Curve (ICC)

Function that models the likelihood of endorsement => plot of the Item response function

Item Response Function

Probability that a person with a given ability level will answer CORRECTLY.
=> EQUATION that relates true score (theta) defined in latent dimension to the probability of endorsing an item.
=> DIFF CURVES FOR DIFF ITEMS!!!
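
As a concrete sketch of such an equation, the snippet below assumes the two-parameter logistic (2PL) model, which is one common choice but is not named on these cards. The parameters `a` (discrimination) and `b` (difficulty) and the two example items are hypothetical.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Two hypothetical items as (a, b) pairs: same discrimination, different difficulty.
items = {"easy": (1.0, -1.0), "hard": (1.0, 1.0)}
thetas = [-3.0, -1.0, 0.0, 1.0, 3.0]

# One curve per item: a list of endorsement probabilities across theta.
curves = {
    name: [irf_2pl(t, a, b) for t in thetas]
    for name, (a, b) in items.items()
}

for name, probs in curves.items():
    print(name, [round(p, 2) for p in probs])
```

Each item gets its own curve: at every theta in the list, the "easy" item is more likely to be endorsed than the "hard" one.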

Variables in Item Response Function

Y = Probability of item endorsement (“yes”), which reflects HOW MUCH OF THE TRAIT YOU POSSESS
X = Theta (latent trait) - e.g. entire range of math level
Theta is a CONTINUUM (from -infinity to +infinity)

Theta def + values

Entire range of latent trait.
=> CONTINUUM (from -infinity to +infinity)
=> Negative values = LOW levels
=> Positive values = HIGH levels

How does a typical ICC look for items that are dichotomous (yes-no)?

S shape

What are item characteristics?

Item DIFFICULTY & Item DISCRIMINATION

What’s the “nature” of the ICC function?

MONOTONIC: Probability of item endorsement increases as theta increases.

ICC: In the middle of the curve, ____ changes in theta correspond with ___ changes in probability

small; large

ICC limited by 0 and 1, why?

Bounds of probability: a probability cannot go below ZERO or above ONE. The curve approaches those two values at the extremes of theta but never reaches them.

Item Difficulty def

b
The point in theta (X axis) where probability of endorsing an item is 50%.
=> To find it, start by locating 0.5 on the Y axis
=> Then find the level of theta (X) that corresponds to it: that theta is the item difficulty
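
The "start at 0.5 on Y, read off X" procedure can be sketched numerically. This assumes a 2PL curve, which the cards do not specify, and a hypothetical item with a = 1.2 and b = 0.7; the search relies only on the curve being monotonic in theta.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def theta_at_probability(target: float, a: float, b: float,
                         lo: float = -6.0, hi: float = 6.0,
                         tol: float = 1e-8) -> float:
    """Bisection search for the theta where the curve crosses `target`.

    Assumes a > 0 (increasing curve) and that the crossing lies in [lo, hi].
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if irf_2pl(mid, a, b) < target:
            lo = mid  # curve is still below the target: move right
        else:
            hi = mid  # curve is at or above the target: move left
    return (lo + hi) / 2.0


# Hypothetical item: reading the curve at P = 0.5 should recover b.
recovered_b = theta_at_probability(0.5, a=1.2, b=0.7)
print(round(recovered_b, 4))
```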

Item difficulty typically range between ______

-2 and +2
(-/+ 2 = Arbitrary z-score)

Item difficulty:
=> NEGATIVE difficulties = _____
=> POSITIVE difficulties = ______

Items are “EASIER”, more frequently endorsed (doesn’t take much of the trait level to endorse);
Items are more “DIFFICULT”, less frequently endorsed

Item difficulty: What does it mean if Theta > b

Items more likely to be endorsed
=> When theta level is HIGHER than difficulty of the item

Item difficulty: What does it mean if Theta < b

Items less likely to be endorsed
=> When level of underlying trait LOWER than item difficulty

Theta = b

= 50%; item difficulty

Item Discrimination

**a**
Reflects the slope at the STEEPEST point of the curve, i.e., at theta = b, where the probability of endorsement = 50%.
-> Point in the curve where the increases in Y are the highest.
To find it: find the theta that corresponds to the item difficulty -> this is the point where the slope is the most elevated
=> The steeper the line, the closer it is to VERTICAL.
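
A small numerical check of where the curve is steepest, assuming a 2PL curve (not named on the cards) and a hypothetical item with a = 1.5 and b = 0. Under that particular model the slope at theta = b works out to a/4, so `a` is proportional to the steepest slope rather than equal to it.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def slope(theta: float, a: float, b: float, h: float = 1e-5) -> float:
    """Central-difference estimate of the curve's slope at theta."""
    return (irf_2pl(theta + h, a, b) - irf_2pl(theta - h, a, b)) / (2 * h)


a, b = 1.5, 0.0                    # hypothetical item parameters
slope_at_b = slope(b, a, b)        # slope where P = 0.5
slope_away = slope(b + 2.0, a, b)  # slope two theta units above b

print(f"slope at b: {slope_at_b:.4f}, slope at b + 2: {slope_away:.4f}")
```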

Item Discrimination is related to ______

Item difficulty

Item Discrimination tells us ________

at which levels of theta the item is most likely to differentiate best
=> Discriminates levels of theta

Discrimination typically ranges between _____

.5 and 1.5

[Item discrimination] What does it mean when...
=> Steeper slopes
=> Smaller slopes

Highly discriminating items; Poorly discriminating items

Items would be most effective in measuring the underlying trait at the level that corresponds with _______.

item difficulty → Hard questions are more effective at measuring high levels of the trait.

Item Information Curve indicates _________

How well an item is working for EACH LEVEL of the trait.
= How well an item differentiates among respondents who are at different levels of the latent variable
= Depends on item difficulty + item discrimination

Item parameters determine the amount of information at what range of the latent trait. (1) What are the parameters? (2) What info do they give?

Item difficulty → Location on the latent trait where information is MAXIMIZED
Item discrimination → HOW MUCH INFO an item provides
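
Both roles can be seen in a short sketch. It assumes the 2PL model, under which item information is a² · P · (1 − P); neither the model nor this formula appears on the cards, and the item parameters are hypothetical.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta: float, a: float, b: float) -> float:
    """Item information under the assumed 2PL model: a^2 * P * (1 - P)."""
    p = irf_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)


grid = [x / 10.0 for x in range(-40, 41)]  # theta from -4.0 to 4.0

# Hypothetical items sharing difficulty b = 1.0 but differing in discrimination.
b = 1.0
info_high = [item_information(t, 1.5, b) for t in grid]
info_low = [item_information(t, 0.5, b) for t in grid]

peak_theta = grid[info_high.index(max(info_high))]
print(f"peak location: {peak_theta}, "
      f"peak height (a=1.5): {max(info_high):.3f}, "
      f"peak height (a=0.5): {max(info_low):.3f}")
```

The peak sits at theta = b for both items, while the more discriminating item has the taller curve.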

Test Information Curve indicates _______

How much information does the TEST provide? How well does the test measure with precision?
=> The relative precision of the test in measuring diff levels of theta.

When talking about Test Information Curve (TIC), we're talking about Validity or Reliability? Why?

We're talking about RELIABILITY (NOT VALIDITY)
Bc it focuses on how precisely a test measures the latent trait ACROSS DIFF LEVELS OF THAT TRAIT.
=> THE HIGHER THE CURVE, THE BETTER YOUR ASSESSMENT OF THE TRAIT (mountain)

The height of the TIC is (inversely) proportional to the ____________

Standard error of measurement (SEM)
=> TIC and SEM are inversely related.
-> Reliability = how much test scores are free of measurement error

SEM is ____ in regions of latent trait continuum where test information is the _____.

lowest; highest

In IRT, SEM is different for different latent trait values; how is that different from CTT?

CTT: 1 reliability score for the entire set of items
IRT: 1 item = 1 reliability coefficient; measurement error is NOT equal across the entire range of theta

How does IRT Help us Improve Psychological Tests? (4)

(1) IDENTIFY item characteristics (i.e., difficulty, discrimination)
(2) CHOOSE items with higher discrimination covering the entire range of the latent continuum
(3) INCREASE RELIABILITY with fewer items
(4) COMPARE items across DIFF MEASURES of SAME CONSTRUCT + Compare group differences

IRT Applications (2)

(1) Improving existing measures
(2) Detecting differential item functioning

Differential Item Functioning (DIF) examines ______

Whether scales and items function differently across different discrete groups.
-> Occurs when groups (such as those defined by gender, ethnicity, age, or education) have different probabilities of endorsing a given item (controlling for overall score)

Differential Item Functioning (DIF) occurs when _________________

individuals from diff groups who have EQUAL levels of the UNDERLYING TRAIT have diff probabilities of endorsing or agreeing with an item.

DIF analysis helps determine if items are ____ by _____________.

fair; examining group differences in responses while controlling for the trait level
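
To make "controlling for the trait level" concrete, the sketch below holds theta fixed and lets an item's difficulty differ between two groups. It assumes a 2PL curve and made-up group-specific difficulties; it illustrates what DIF looks like and is not a DIF detection procedure.

```python
import math


def irf_2pl(theta: float, a: float, b: float) -> float:
    """Probability of endorsing an item under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


theta = 0.5  # the SAME trait level for both respondents
a = 1.0      # hypothetical discrimination, shared by both groups

# Hypothetical item showing DIF: it is harder for the focal group.
b_reference, b_focal = 0.0, 0.8
p_reference = irf_2pl(theta, a, b_reference)
p_focal = irf_2pl(theta, a, b_focal)
dif_gap = p_reference - p_focal

# Hypothetical item without DIF: same difficulty in both groups.
no_dif_gap = irf_2pl(theta, a, 0.3) - irf_2pl(theta, a, 0.3)

print(f"with DIF:    {p_reference:.2f} vs {p_focal:.2f} (gap {dif_gap:.2f})")
print(f"without DIF: gap {no_dif_gap:.2f}")
```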