Applied Reliability Flashcards
Why does reliability matter?
It has implications for applied behavioral science, for behavioral research, and for test construction and refinement
How does reliability relate to applied behavioral science?
Psychological tests can inform important decisions. Reliability reflects the precision with which an individual’s test score represents his or her true score. If reliability is low, a person’s test score may be a poor reflection of his or her true psychological attribute.
What are the two pieces of information about an individual’s test score?
- Point estimate (the best guess of the individual’s true score)
- Confidence interval (a range of scores in which the individual’s true score likely falls; reflects the precision of the estimate; higher reliability = narrower interval)
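Worked sketch: the cards do not give formulas for these two quantities, so the ones below are the standard classical test theory versions and should be read as assumptions. The point estimate regresses the observed score toward the group mean, and the interval is built from the standard error of measurement, SD * sqrt(1 - reliability). Some textbooks center the interval on the observed score instead. The function name and example numbers are hypothetical.

```python
import math


def true_score_interval(observed, mean, sd, reliability, z=1.96):
    """Return (point_estimate, lower, upper) for one person's true score.

    Assumes classical test theory; z=1.96 gives an approximate 95% interval.
    """
    # Point estimate: observed score pulled toward the mean as reliability drops.
    point = mean + reliability * (observed - mean)
    # Standard error of measurement: smaller when reliability is higher.
    sem = sd * math.sqrt(1.0 - reliability)
    return point, point - z * sem, point + z * sem


# Hypothetical example: same observed score, two different reliabilities.
for rel in (0.90, 0.60):
    point, lower, upper = true_score_interval(130, mean=100, sd=15, reliability=rel)
    print(f"reliability={rel:.2f}  estimate={point:.1f}  CI=[{lower:.1f}, {upper:.1f}]")
```

The interval printed for the more reliable test is the narrower of the two.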
Why should test users care about reliability?
Reliability has implications for the precision of estimates of individual scores. Therefore: (1) test users should (attempt to) use highly reliable tests; (2) test users and test takers should pay attention to reliability when interpreting scores.
What is the goal of behavioral research?
detect and understand associations between variables (e.g., therapist mindfulness and therapy outcomes); with at least one variable in a study being a psychological attribute of some kind
Does the correlation between two sets of test scores reflect the true association between psychological attributes?
Not necessarily. Test scores contain measurement error, so you never truly know the true association.
What are the four points about the relationship between reliability and behavioral research?
- There’s a precise link between true correlations, reliability, and observed correlations
- Measurement error attenuates observed correlations
- These points apply to all effect sizes, not just correlation
- Reliability affects the likelihood of obtaining statistically significant results
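Worked sketch of the first two points: the cards do not state the "precise link", so the formula below is the standard classical test theory attenuation formula (observed r = true r * sqrt(reliability of X * reliability of Y)), included here as an assumption. Function names and numbers are hypothetical.

```python
import math


def observed_correlation(true_r, rel_x, rel_y):
    """Correlation expected between observed scores, given the true correlation."""
    # Measurement error shrinks the correlation by sqrt(rel_x * rel_y).
    return true_r * math.sqrt(rel_x * rel_y)


def disattenuated_correlation(observed_r, rel_x, rel_y):
    """Estimate of the true correlation, corrected for unreliability."""
    return observed_r / math.sqrt(rel_x * rel_y)


# Hypothetical example: true correlation .50, reliabilities .80 and .70.
r_obs = observed_correlation(0.50, 0.80, 0.70)
print(round(r_obs, 3))  # smaller than 0.50
print(round(disattenuated_correlation(r_obs, 0.80, 0.70), 3))  # recovers 0.5
```

With perfect reliability in both measures the observed and true correlations coincide; any unreliability makes the observed value smaller in magnitude.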
Why should researchers care about reliability? What should they do with regard to reliability?
Reliability determines how closely researchers’ observed effects represent true psychological effects; therefore, researchers should
- Try to use highly reliable measures
- Estimate and consider reliability when interpreting their (or others’) results
How can you enhance reliability? What should you consider when selecting items?
Enhance reliability by using more items (test length) and by identifying/selecting items that are consistent with each other (internal consistency). Issues to consider when selecting items: (1) item discrimination, (2) item variability, (3) item mean (item difficulty).
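Worked sketch of the two levers (test length and internal consistency): the cards give no formula, so this uses the standardized alpha formula, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the average inter-item correlation. Treat the formula choice, the function name, and the numbers as assumptions for illustration.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized coefficient alpha from test length and average inter-item r."""
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1 + (k - 1) * r)


# Hypothetical baseline: 10 items with an average inter-item correlation of .20.
print(round(standardized_alpha(10, 0.20), 3))
# Lever 1: double the test length, same item quality.
print(round(standardized_alpha(20, 0.20), 3))
# Lever 2: same length, more consistent items.
print(round(standardized_alpha(10, 0.30), 3))
```

Both changes raise the estimate relative to the baseline.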
What are item discrimination and consistency? How are they identified?
Generate a set of items (or use existing test items); collect responses to the items; select items that are consistent with the rest of the test in terms of discriminating between high and low scorers. Consistency is identified via: (1) inter-item correlations, (2) item-total correlations, (3) item discrimination index (for binary items), (4) alpha if item deleted.
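Worked sketch of two of these statistics, the item-total correlation and alpha if item deleted. It assumes the "corrected" item-total correlation (each item against the sum of the remaining items) and Cronbach's alpha as the reliability estimate; the cards do not specify either choice. The response data are made up, with the last item deliberately inconsistent with the others.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / (ssx * ssy) ** 0.5


def cronbach_alpha(rows):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(rows[0])
    item_vars = [pvariance([r[j] for r in rows]) for j in range(k)]
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_analysis(rows):
    """Per item: corrected item-total r and alpha with that item removed."""
    report = []
    for j in range(len(rows[0])):
        item = [r[j] for r in rows]
        rest = [[v for i, v in enumerate(r) if i != j] for r in rows]
        report.append({
            "item": j,
            "item_rest_r": pearson(item, [sum(r) for r in rest]),
            "alpha_if_deleted": cronbach_alpha(rest),
        })
    return report


# Hypothetical 1-5 ratings: 6 respondents x 4 items.
responses = [
    [5, 4, 5, 2],
    [4, 4, 4, 5],
    [3, 3, 4, 1],
    [2, 2, 3, 4],
    [1, 2, 1, 3],
    [4, 5, 4, 2],
]

print("alpha (all items):", round(cronbach_alpha(responses), 3))
for row in item_analysis(responses):
    print(row["item"], round(row["item_rest_r"], 3), round(row["alpha_if_deleted"], 3))
```

In this made-up data the last item correlates negatively with the rest of the test, and alpha rises when it is dropped, which is the pattern that flags an item for removal.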
How might item variability impact the goal of psychological measurement?
Recall that the purpose of measurement is to detect psychological differences. Items with no or limited variability (in terms of responses by test takers) may be poor at detecting differences. Additionally, correlations among items depend upon variability: an item without variability cannot correlate with other items (co-variability depends on variability).
What do ceiling or floor effects relate to?
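A very high item mean indicates a ceiling effect; a very low item mean indicates a floor effect.
An item’s mean is tied to its variability: when the mean sits near the top or bottom of the response scale, responses have little room to vary.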
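Worked sketch of the link between mean and variability, assuming binary items scored 0 (incorrect) or 1 (correct). For such items the mean is the proportion correct p and the variance is p * (1 - p), so variance is largest at p = .50 and vanishes as p approaches 0 or 1. The function name and data are hypothetical.

```python
def binary_item_stats(responses):
    """Return (mean, variance) for a list of 0/1 item responses."""
    p = sum(responses) / len(responses)  # item mean = proportion correct
    return p, p * (1 - p)                # variance of a 0/1 item


# Hypothetical items answered by 20 test takers.
moderate = [1] * 10 + [0] * 10   # half answer correctly
near_ceiling = [1] * 19 + [0]    # almost everyone answers correctly
at_ceiling = [1] * 20            # everyone answers correctly

for name, item in [("moderate", moderate), ("near ceiling", near_ceiling), ("ceiling", at_ceiling)]:
    p, var = binary_item_stats(item)
    print(f"{name}: mean={p:.2f} variance={var:.4f}")
```

The item at the ceiling has zero variance, so it cannot correlate with any other item.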
What is the discrimination index? How do you calculate it?
Compare how the highest-scoring and lowest-scoring test takers perform on the item.
DI = (proportion correct in the high-scoring group) - (proportion correct in the low-scoring group)
Equivalently, DI = (#correct high / N) - (#correct low / N) = (#correct high - #correct low) / N, where N is the number of people in each of the two groups
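Worked sketch of the calculation. The cards do not say how large the high and low groups should be; the default of 27% below is a common convention and is an assumption here, as are the function name and the example data.

```python
def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    """DI for one binary item: (#correct high - #correct low) / N per group.

    item_correct: 0/1 per person for this item.
    total_scores: each person's total test score, in the same order.
    """
    n_people = len(total_scores)
    group_size = max(1, round(n_people * group_fraction))
    # Rank people by total score, lowest first.
    order = sorted(range(n_people), key=lambda i: total_scores[i])
    low, high = order[:group_size], order[-group_size:]
    correct_high = sum(item_correct[i] for i in high)
    correct_low = sum(item_correct[i] for i in low)
    return (correct_high - correct_low) / group_size


# Hypothetical data for 10 test takers; with group_fraction=0.3 each group has 3 people.
totals = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
item = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
print(round(discrimination_index(item, totals, group_fraction=0.3), 3))
```

Here all 3 people in the high group and 1 of 3 in the low group answer correctly, so DI = (3 - 1) / 3.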
What is a good value for the discrimination and difficulty indices?
Aim for values between 0.4 and 0.7