Test Construction Flashcards
T-score distributions have a mean of _____ and a standard deviation of _____.
Mean of 50
SD of 10
Ex. A T-score of 62 on the MMPI is 12 points above the mean (50), so it is 1.2 standard deviations above the mean.
(10 x 1.2 = 12)
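The arithmetic in this example can be sketched in a few lines of Python. The function names t_to_z and z_to_t are illustrative only and are not part of the flashcards; the defaults assume the T-score mean of 50 and SD of 10 stated above.

```python
def t_to_z(t_score, mean=50.0, sd=10.0):
    """Convert a T-score to a z-score (standard deviations from the mean)."""
    return (t_score - mean) / sd


def z_to_t(z_score, mean=50.0, sd=10.0):
    """Convert a z-score back to a T-score."""
    return mean + sd * z_score


# The MMPI example from the card: 62 is 12 points, or 1.2 SDs, above 50.
print(t_to_z(62))   # 1.2
print(z_to_t(1.2))  # 62.0
```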
Reliability coefficient range
0.0 - 1.0
- 0 = completely unreliable
- 1 = perfectly reliable
How do you interpret a reliability coefficient?
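Directly (it is not squared)
Ex. A coefficient of .80 means that 80% of the variability in scores reflects true score differences.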
Kuder-Richardson Formula (KR-20)
Used when test items are dichotomously scored
Right/wrong, yes/no
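A minimal sketch of how KR-20 might be computed for 0/1 item data, using the usual form KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores). The function name kr20 and the sample data are hypothetical, and the sketch assumes the population variance (dividing by n) of the total scores; some textbooks divide by n - 1 instead.

```python
def kr20(responses):
    """KR-20 for dichotomous items.

    responses: one row per examinee, each row a list of 0/1 item scores.
    Assumption: total-score variance is the population variance (divide by n).
    """
    n = len(responses)      # number of examinees
    k = len(responses[0])   # number of items

    # p = proportion of examinees answering each item correctly; q = 1 - p
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)

    # Variance of examinees' total scores
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: 5 examinees, 4 right/wrong items
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(scores), 2))  # 0.8
```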
What method for establishing reliability is considered to be the best?
Alternate forms
Alternate forms reliability
Administering two equivalent forms of a test to the same group of examinees and obtaining a correlation between the two sets of scores
Why do some experts consider an alternate forms coefficient to be superior?
For the coefficient to be high, scores must be consistent across both time and different content.
A kappa coefficient is used to evaluate what?
Inter-rater reliability
A kappa coefficient in the low .90s indicates what?
High reliability
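As a rough illustration, Cohen's kappa for two raters can be sketched as kappa = (observed agreement - chance agreement) / (1 - chance agreement). The function name cohens_kappa and the ratings below are made up for the example; the sketch assumes two raters assigning each case to one nominal category.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same cases."""
    n = len(rater_a)

    # Proportion of cases on which the raters actually agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Agreement expected by chance, from each rater's category proportions
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)

    return (observed - expected) / (1 - expected)


# Hypothetical ratings: the raters disagree on 1 of 10 cases
ratings_a = ["yes"] * 5 + ["no"] * 5
ratings_b = ["yes"] * 4 + ["no"] * 6
print(round(cohens_kappa(ratings_a, ratings_b), 2))  # 0.8
```

Note that the raters agree on 90% of cases here, yet kappa is lower (.80) because it removes the agreement expected by chance.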
Speed vs. power test
Speed test measures an examinee’s response rate
Power test measures the level of difficulty a person can reach (items usually arranged in increasing difficulty)
Maximum vs. typical performance
Maximum: what a person is capable of achieving (e.g., the Woodcock-Johnson, WJ)
Typical: what an examinee usually does (personality test)
What is a source of measurement error for the test-retest coefficient?
Time sampling
If there have been changes in exam conditions from one administration to the next that impact different examinees in different ways, what has occurred?
Time sampling
Which type of coefficient tends to be lower despite being preferred: alternate forms or test-retest?
Alternate forms
Internal consistency
Obtaining correlations among individual items in a test
3 methods for determining internal consistency
- Split-half reliability
- Spearman-Brown formula
- Kuder-Richardson Formula (KR-20)
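The first two methods are commonly used together: the test is split in half, the halves are correlated, and the Spearman-Brown formula, 2r / (1 + r), estimates the reliability of the full-length test. A minimal sketch follows, assuming an odd/even item split; the function names and sample data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Odd/even split-half reliability with the Spearman-Brown correction.

    Assumption: items are split by position (1st, 3rd, ... vs. 2nd, 4th, ...).
    """
    odd_half = [sum(row[0::2]) for row in responses]
    even_half = [sum(row[1::2]) for row in responses]
    return spearman_brown(pearson(odd_half, even_half))


# Hypothetical data: 5 examinees, 4 items scored 0/1
item_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(spearman_brown(0.6), 2))                  # 0.75
print(round(split_half_reliability(item_scores), 2))  # 0.88
```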
T/F: The standard error of measurement indicates how much error an individual test score can be expected to have.
True
The standard error of measurement is used to construct what?
A confidence interval
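A short sketch of how the interval is typically built, using the standard formula SEM = SD * sqrt(1 - reliability) and a band of the obtained score plus or minus z * SEM. The function names and the example values (SD of 15, reliability of .91) are assumptions for illustration, not figures from these cards.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * sqrt(1.0 - reliability)


def confidence_interval(score, sd, reliability, z=1.96):
    """Confidence band around an obtained score (z=1.96 gives about 95%)."""
    sem = standard_error_of_measurement(sd, reliability)
    return (score - z * sem, score + z * sem)


# Hypothetical test: SD = 15, reliability = .91, obtained score = 100
sem = standard_error_of_measurement(15, 0.91)
low, high = confidence_interval(100, 15, 0.91)
print(round(sem, 2))                  # 4.5
print(round(low, 2), round(high, 2))  # 91.18 108.82
```

The higher the reliability coefficient, the smaller the SEM and the narrower the interval.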
When a test measures the knowledge of the content domain it was designed to measure, we can say the test has what?
Content validity
We would say that the SAT has what type of validity if it can accurately predict an examinee’s performance in college?
Criterion-related validity
A test has what type of validity if it can accurately measure a theoretical, unobservable construct or trait?
Construct validity
Convergent/divergent validity and factor analysis are associated with which type of validity?
Construct validity
Convergent vs. Divergent Validity
Convergent: the test has a high correlation with another test that measures the SAME construct
Divergent: the test has a low correlation with a test that measures a different construct
Does a more homogeneous (similar) group of examinees tend to produce an increase or a decrease in reliability?
Decrease
Reliability will increase with heterogeneity
A multitrait-multimethod matrix assesses what?
Convergent and divergent validity
Divergent validity may also be called what?
Discriminant Validity