Item Analysis and Test Reliability Flashcards
Three factors that affect the reliability coefficient
- Content homogeneity
- Range of scores
- Guessing (true/false tests tend to have lower reliability because items are easier to guess)
What are the four main methods of assessing a test’s reliability?
- Test-retest
- Alternate Forms
- Internal Consistency
- Inter-rater
_____ is due to random factors that affect the test performance of examinees in unpredictable ways, such as distractions during testing, ambiguously worded test items, and examinee fatigue.
Measurement error
_____ is the result of actual differences among examinees with regard to whatever the test is measuring. It’s assumed to be consistent, which means that an examinee’s true score will be the same regardless of which form of the test he or she takes or who scores the test.
True score variability
_____ provides information about the consistency of scores over time.
Test-retest reliability
_____ provides information about the consistency of scores over different forms of the test and, when the second form is administered at a later time, the consistency of scores over time.
Alternate forms reliability
_____ provides information about the consistency of scores over different test items and is useful for tests designed to measure a single content domain or aspect of behavior. It's not useful for speed tests because it tends to overestimate their reliability.
Internal consistency reliability
What are three methods of evaluating a test’s internal consistency reliability?
- Coefficient alpha (Cronbach’s alpha)
- Kuder-Richardson 20 (KR-20)
- Split-half reliability
This method of evaluating internal consistency reliability involves administering the test to a sample of examinees and calculating the average inter-item consistency.
Coefficient alpha (Cronbach’s Alpha)
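The alpha calculation can be sketched in a few lines of Python. The formula is alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the dataset and function name below are made up for illustration.

```python
from statistics import variance

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)."""
    k = len(scores[0])                 # number of items
    items = list(zip(*scores))         # transpose: one tuple per item
    item_vars = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_vars / total_var)

# Rows are examinees, columns are item scores (e.g., Likert ratings).
data = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(data), 3))
```

Sample variance is used here; some texts use population variance, which changes the result slightly.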
This method of evaluating internal consistency reliability is an alternative to coefficient alpha that can be used when test items are dichotomously scored (e.g., as correct or incorrect).
Kuder-Richardson 20 (KR-20)
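For dichotomous items, KR-20 replaces the item variances in alpha with p*q, where p is the proportion answering the item correctly. A minimal sketch, with made-up 0/1 data:

```python
from statistics import variance

def kr20(scores):
    """KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores)."""
    k = len(scores[0])
    n = len(scores)
    items = list(zip(*scores))
    # p = proportion correct per item; q = 1 - p
    pq = sum((sum(item) / n) * (1 - sum(item) / n) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq / total_var)

# Rows are examinees; 1 = correct, 0 = incorrect.
data = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(data), 3))
```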
This method of evaluating internal consistency reliability involves administering the test to a sample of examinees, splitting the test in half (often in terms of even- and odd-numbered items), and correlating the scores on the two halves.
Split-half reliability
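The odd/even split described above can be sketched as follows; the item scores and helper names are illustrative.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def split_half_r(scores):
    """Correlate totals on odd-numbered items with totals on even-numbered items."""
    odd = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    return pearson_r(odd, even)

data = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
]
print(round(split_half_r(data), 3))
```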
Because split-half reliability coefficients tend to underestimate a test's reliability (each half is a shorter test than the full version), they are usually corrected with the _____, which is used to determine the effect of lengthening or shortening a test on its reliability coefficient.
Spearman-Brown prophecy formula
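The correction itself is a one-line formula: the predicted reliability when a test is lengthened by a factor of n is n*r / (1 + (n-1)*r). With n = 2, it corrects a half-test correlation to full-test length:

```python
def spearman_brown(r, n=2):
    """Predicted reliability when the test is lengthened by a factor of n."""
    return n * r / (1 + (n - 1) * r)

print(spearman_brown(0.6))  # half-test r of .60 corrects to .75
```

Values of n below 1 predict the effect of shortening the test instead.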
_____ provides information on the consistency of scores over different raters and is important for tests that are subjectively scored.
Inter-rater reliability
What are two methods of evaluating inter-rater reliability?
- Cohen’s kappa coefficient - used to assess the consistency of ratings assigned by two raters when ratings represent a nominal scale
- Kendall’s coefficient of concordance - used to assess the consistency of ratings assigned by three or more raters when ratings represent ranks
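Cohen's kappa compares observed agreement (p_o) with the agreement expected by chance (p_e): kappa = (p_o - p_e) / (1 - p_e). A minimal sketch with made-up nominal ratings from two raters:

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """kappa = (p_o - p_e) / (1 - p_e) for two raters' nominal ratings."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n   # observed agreement
    c1, c2 = Counter(rater1), Counter(rater2)
    # chance agreement: product of each rater's marginal proportions per category
    p_e = sum((c1[c] / n) * (c2[c] / n) for c in set(c1) | set(c2))
    return (p_o - p_e) / (1 - p_e)

r1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
r2 = ["yes", "no", "no", "no", "yes", "no", "yes", "yes"]
print(cohens_kappa(r1, r2))
```

Note that kappa can be well below the raw percent agreement when chance agreement is high, which is exactly why it is preferred over simple agreement counts.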
This occurs when two or more raters communicate with each other while assigning ratings, which results in increased consistency (but often decreased accuracy) in ratings and an overestimate of inter-rater reliability.
Consensual observer drift