Week 3 - Reliability and Validity Flashcards
Classical Test Theory
Test scores are the result of
- Factors that contribute to consistency
- Factors that contribute to inconsistency (characteristics of test takers; things that have nothing to do with the attribute, such as situation or environment)
X = T + e
where X = obtained score, T = true score, e = error of measurement
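A minimal Python sketch of the model, with hypothetical numbers: observed scores are simulated as true score plus random error, and reliability falls out as the share of observed-score variance that is true-score variance.

```python
import random
import statistics

random.seed(1)

# Hypothetical true scores and random errors for 1000 test takers
true_scores = [random.gauss(50, 10) for _ in range(1000)]   # T
errors = [random.gauss(0, 5) for _ in range(1000)]          # e
observed = [t + e for t, e in zip(true_scores, errors)]     # X = T + e

# Reliability = true-score variance / observed-score variance
reliability = statistics.variance(true_scores) / statistics.variance(observed)
print(f"estimated reliability: {reliability:.2f}")  # ~ 100 / (100 + 25) = 0.80
```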
Sources of Error
- Item selection
- Test administration
- Test scoring
- Systematic measurement error
Domain-sampling model
a way of thinking that treats the test as a representative sample from a large domain of possible items that could be included on the test
- considers the problem of using only a sample of items to represent a construct
- as the test gets longer, it should represent the construct better, increasing reliability
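The Spearman-Brown prophecy formula makes the "longer test, higher reliability" claim concrete; a minimal sketch with hypothetical values:

```python
def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability when a test is lengthened by factor k,
    given its current reliability r."""
    return (k * r) / (1 + (k - 1) * r)

# Hypothetical: a test with reliability .70, doubled in length
print(f"{spearman_brown(0.70, 2):.2f}")  # ~0.82
```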
Inter-rater reliability
the extent to which different raters agree in their assessments
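For two raters assigning categorical ratings, one standard index is Cohen's kappa (agreement corrected for chance); a sketch with hypothetical ratings:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical yes/no judgements from two raters on eight cases
a = ["yes", "yes", "no", "yes", "no", "no", "yes", "no"]
b = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(f"kappa = {cohens_kappa(a, b):.2f}")  # 0.50
```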
Method variance
the variability among scores that arises from the form of the test, as distinct from its content - i.e. from the method of administering the test
Reliability
the degree to which a test gives the same result each time it is used to measure the same thing
Stability over time
the extent to which test scores remain stable when a test is administered on more than one occasion
Internal consistency
the extent to which a psychological test is homogeneous or heterogeneous
Social desirability bias
a form of method variance that arises when people respond to questions in a way that places them in a favourable light
Test-Retest Stability
The same test administered to the same group twice at different points in time
- scores may not be identical due to practice effects, maturation, treatment effects or changes in setting
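Test-retest reliability is usually estimated as the Pearson correlation between the two administrations; a sketch with hypothetical scores (statistics.correlation needs Python 3.10+):

```python
import statistics

# Hypothetical scores for the same group at time 1 and time 2
time1 = [12, 18, 25, 31, 22, 15, 28, 19]
time2 = [14, 17, 27, 30, 20, 16, 29, 21]

# Test-retest reliability = correlation between administrations
print(f"r = {statistics.correlation(time1, time2):.2f}")
```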
Parallel or alternate forms of reliability
Two forms of the same test are developed, with different items selected according to the same rules
Parallel - same distribution of scores (means and variances equal)
Alternate - different distribution of scores (means and variances may not be equal)
Both are matched for content and difficulty
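A sketch with hypothetical scores: check whether the two forms have (approximately) equal means and variances, then correlate them to estimate reliability:

```python
import statistics

# Hypothetical scores on two forms of the same test
form_a = [20, 25, 30, 22, 27, 24, 29, 21]
form_b = [21, 24, 31, 23, 26, 25, 28, 22]

# Parallel forms require (approximately) equal means and variances
print(statistics.mean(form_a), statistics.mean(form_b))
print(statistics.variance(form_a), statistics.variance(form_b))

# Alternate-forms reliability = correlation between the forms
print(f"r = {statistics.correlation(form_a, form_b):.2f}")
```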
Split-half method
Test is divided into halves that are compared
- useful in overcoming logistical difficulties of test-retest reliability
- the correlation between half-tests underestimates full-test reliability, so it is usually corrected with the Spearman-Brown formula
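A sketch with hypothetical item scores: correlate odd- and even-item halves, then apply the Spearman-Brown correction (doubling, k = 2) to estimate full-test reliability:

```python
import statistics

# Hypothetical item scores (rows = people, columns = 6 items)
items = [
    [3, 4, 3, 4, 2, 3],
    [1, 2, 2, 1, 1, 2],
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 3, 2, 2],
    [5, 4, 5, 5, 4, 5],
]

# Each person's score on the odd-item half and the even-item half
odd = [sum(row[0::2]) for row in items]
even = [sum(row[1::2]) for row in items]

# Correlate the halves, then correct to full length
r_half = statistics.correlation(odd, even)
r_full = (2 * r_half) / (1 + r_half)
print(f"half-test r = {r_half:.2f}, corrected r = {r_full:.2f}")
```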
Measuring Internal consistency
Cronbach’s Alpha
Cronbach’s alpha - a generalised reliability coefficient for scoring systems that are graded (e.g. agree-disagree)
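A sketch computing alpha directly from its definition (k/(k-1) × (1 - sum of item variances / total-score variance)), using a hypothetical people-by-items matrix of graded responses:

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha from a people-by-items score matrix."""
    k = len(scores[0])                                        # number of items
    item_vars = [statistics.variance(col) for col in zip(*scores)]
    total_var = statistics.variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical 1-5 agree-disagree responses (rows = people, columns = items)
responses = [
    [4, 5, 4, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")  # ~0.96
```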
Acceptable levels of reliability
- .70-.80 acceptable or good
- greater than .91 may indicate redundancy
Standard error of measurement (SEM)
allows estimation of the precision of an individual test score
- the larger the SEM, the less certain we are that the test score represents the true score
Reliability coefficient (r) - an index of the proportion of observed-score variance that is true-score variance
SEM = SD × √(1 - r), where SD is the standard deviation of the test scores
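A sketch with hypothetical values (an IQ-style test with SD = 15): compute the SEM and a 95% confidence band around an obtained score:

```python
import math

# Hypothetical: test with SD = 15 and reliability r = .90
sd, r = 15, 0.90
sem = sd * math.sqrt(1 - r)  # SEM = SD × √(1 - r) ≈ 4.74

# 95% confidence interval around an obtained score of 110
score = 110
low, high = score - 1.96 * sem, score + 1.96 * sem
print(f"SEM = {sem:.2f}, 95% CI: {low:.1f} to {high:.1f}")
```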