Test Construction Flashcards
Validity
meaningfulness, usefulness, accuracy
either tells us how well the test is measuring what it’s supposed to be measuring (typically content or construct validity) or how well the test can be used to infer criterion performance (criterion-related validity)
content validity
no numerical validity coefficient
how adequately a test samples a particular content area
quantified by asking a panel of experts to rate each item as essential, useful but not essential, or not necessary
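One common way to summarize such expert ratings for a single item is Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating the item essential and N is the panel size. The card above does not name this formula, so treat the sketch below as an assumption about how the ratings might be summarized; the function name, rating labels, and panel data are hypothetical.

```python
def content_validity_ratio(ratings):
    """Return Lawshe's CVR for one item: (n_e - N/2) / (N/2).

    `ratings` is a list of expert judgments for a single item.
    Only the label "essential" counts toward n_e (assumed labeling).
    """
    n = len(ratings)
    if n == 0:
        raise ValueError("at least one expert rating is required")
    n_essential = sum(1 for r in ratings if r == "essential")
    half = n / 2
    return (n_essential - half) / half


# Made-up panel of 10 experts rating one item
panel = ["essential"] * 8 + ["useful but not essential", "not necessary"]
cvr = content_validity_ratio(panel)
print(cvr)
```

CVR ranges from -1 (no expert rates the item essential) to +1 (every expert does), and equals 0 when exactly half of the panel rates it essential.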
criterion-related validity
looks at how adequately a test score can be used to infer, predict, or estimate a criterion outcome (e.g., how well SAT scores predict college GPA)
calculated using Pearson r to correlate test scores (predictor scores) with criterion scores (outcome scores)
validities as low as .20 considered acceptable
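The Pearson r calculation described in this card can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the function name and the paired scores are made up, and a real validation study would use an actual sample of predictor and criterion scores.

```python
import math


def pearson_r(predictor, criterion):
    """Pearson correlation between paired predictor and criterion scores."""
    if len(predictor) != len(criterion) or len(predictor) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, criterion))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    syy = sum((y - mean_y) ** 2 for y in criterion)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: admission test scores (predictor) and college GPA (criterion)
test_scores = [1000, 1100, 1200, 1300, 1400]
gpas = [2.8, 3.1, 2.9, 3.4, 3.6]
validity = pearson_r(test_scores, gpas)
print(round(validity, 2))
```

The resulting r is the criterion-related validity coefficient for this sample; its sign and size indicate how well the predictor scores track the criterion scores.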
2 types of criterion-related validity
concurrent validity and predictive validity
concurrent validity
predictor and criterion are measured and correlated at about the same time
predictive validity
delay between measurement of predictor and the criterion