Psychometrics - Reliability + Coefficient Alpha Flashcards
What is reliability?
The desired consistency or reproducibility of test scores
4 assumptions of classical test theory
- Each person has a true score that we could obtain if it weren’t for measurement error
- There is measurement error, but it’s random
- The true score of a person doesn’t change upon repeating tests, even though the observed score does
- The distribution of random errors will be the same for all people
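In symbols, these assumptions are usually written as the decomposition below. The notation is an assumption of this sketch and does not come from the cards: X is the observed score, T the true score, E the random error, and r_XX the reliability coefficient.

```latex
\begin{align}
X &= T + E, \qquad \mathbb{E}[E] = 0, \qquad \operatorname{Cov}(T, E) = 0 \\
\sigma_X^2 &= \sigma_T^2 + \sigma_E^2 \\
r_{XX} &= \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
\end{align}
```

Read this way, reliability is the share of observed-score variance that is due to true-score variance.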
Domain Sampling Model
The idea that we can’t construct a test that asks all possible questions within the domain being tested, so we have to select only certain ones. Using fewer items introduces error
Aim of reliability analysis
Establish how much error is made by using the score from the shorter test as an estimate of one’s true ability. Error comes from multiple sources, and different ways of measuring reliability are sensitive to different sources of measurement error
Four types of reliability
- Test-retest
- Parallel forms
- Internal consistency
- Inter-rater reliability
What kind of error is the test-retest method designed for?
Time sampling
How does test-retest work?
You give someone the same test at two different points in time and assess how much their performance differs between the first and second administration
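A minimal sketch of this, assuming the comparison is done with a Pearson correlation between the two administrations (a common choice, though the card does not name a statistic). The scores and the names `pearson_r`, `time_1` and `time_2` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five people at two points in time.
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 13, 14, 19, 21]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

A value close to 1 means people keep roughly the same rank order across the two administrations.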
Problems with test-retest reliability
Practice effect, testing effects, maturation, history
Also not ideal when you want to assess something that is expected to change over time
What source of error is parallel forms reliability designed to account for?
Item sampling
How does parallel forms reliability work?
You compose two different forms of the same test and get participants to do both
Problems with parallel forms reliability
It is hard to give both forms without introducing time-related error. You need a bigger item pool to build two forms. Testing effects can carry over from one form to the other.
What error does internal consistency reliability account for?
Error within one test administered on one occasion, that is, inconsistency among the test’s own items
What does internal consistency measure and what three methods are used?
Whether the different items within a test all measure the same thing to the same extent
- Split half reliability
- Coefficient alpha
- KR-20
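As a rough illustration of coefficient alpha, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical, and sample variances are an assumption of this sketch.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of rows (one row of item scores per person)."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 people")
    # Variance of each item (column) across people.
    item_vars = [variance(col) for col in zip(*scores)]
    # Variance of each person's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: five people answering three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Alpha rises when the items vary together, because the variance of the total scores then grows relative to the sum of the item variances.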
When do we use KR-20?
It’s used to estimate internal consistency for measures with dichotomously scored items (e.g. right/wrong), where it is the equivalent of coefficient alpha
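A small sketch of the KR-20 calculation, assuming items are scored 1 (correct) or 0 (incorrect) and using the standard formula KR-20 = k/(k-1) * (1 - sum of p*q / variance of total scores), where p is the proportion passing an item and q = 1 - p. The function name `kr20` and the data are hypothetical. Population variance is used for the totals so that it matches p*q, which is the population variance of a 0/1 item.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a list of rows of 0/1 item scores (one row per person)."""
    n_people = len(responses)
    k = len(responses[0])
    if k < 2 or n_people < 2:
        raise ValueError("need at least 2 items and 2 people")
    # p = proportion of people scoring 1 on each item; q = 1 - p.
    p = [sum(col) / n_people for col in zip(*responses)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum_pq / total_var)


# Hypothetical right/wrong answers: five people, four items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

kr20_value = kr20(answers)
print(round(kr20_value, 2))
```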
How does split half reliability work?
A test is split in half, each half is scored separately, and the two sets of half-test scores are correlated to see if the test is consistent.
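One way to sketch this, assuming an odd/even split of the items and a Pearson correlation between the half scores. The final Spearman-Brown step, r_full = 2r / (1 + r), is a commonly used correction for the halved test length. It is not mentioned on the card and is included here as an assumption. All names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half(scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    # Score the odd-numbered items (1st, 3rd, ...) and even-numbered items separately.
    odd_half = [sum(row[0::2]) for row in scores]
    even_half = [sum(row[1::2]) for row in scores]
    r_half = pearson_r(odd_half, even_half)
    # Spearman-Brown step: estimate reliability at full test length.
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full


# Hypothetical right/wrong answers: five people, four items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

r_half, r_full = split_half(answers)
print(round(r_half, 3), round(r_full, 3))
```

The result depends on how the items are divided, so a first-half/second-half split of the same data can give a different estimate.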