Chapter 5: Validity Flashcards
Validity
Appropriateness and accuracy of the interpretation of test scores
Can’t be established by the test alone; requires a comparison basis (external evidence such as a criterion)
Construct underrepresentation
Test doesn’t measure important aspects of the specified construct
Similar to content sampling error
Construct-irrelevant variance
Test measures features that are unrelated to the specified construct
External threats to validity
Examinee characteristics (e.g., test anxiety, which hinders the examinee’s performance)
Deviation from standard test administration and scoring
Instruction and coaching
Standardization sample isn’t representative of population taking test
3 types of validity
Content validity
Criterion-related validity
Construct validity
Content validity
Degree to which the items on the test are representative of the behavior the test was designed to sample
How content validity is determined
Expert judges systematically review the test content
Evaluate item relevance and content coverage
Criterion-related validity
Degree to which the test is effective in estimating performance on an outcome measure
Criterion
Comparison basis for a test
Predictive validity
Test is administered first; the criterion is measured after a time interval (the test predicts future performance)
Example: ACT and college performance
Concurrent validity
Test and criterion are measured at same time
Example: language test and GPA
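In practice, criterion-related validity is summarized by a validity coefficient: the correlation between test scores and the criterion measure. A minimal sketch in Python, using hypothetical ACT scores and later college GPAs (made-up numbers, not real data):
```python
# Validity coefficient = correlation between predictor test and criterion.
import numpy as np

act_scores = np.array([21, 28, 24, 30, 19, 26, 23, 32, 27, 22])              # predictor test
college_gpa = np.array([2.8, 3.6, 3.1, 3.8, 2.5, 3.3, 3.0, 3.9, 3.4, 2.9])   # criterion

validity_coefficient = np.corrcoef(act_scores, college_gpa)[0, 1]
print(f"Validity coefficient r = {validity_coefficient:.2f}")
```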
Considerations in test-criterion studies
Selecting an appropriate criterion
Criterion contamination (knowledge of test scores influences how the criterion is measured)
Decision-theory models (account for base rates, selection ratios, and the costs of false positives and false negatives)
Validity generalization (whether validity evidence from one setting generalizes to new situations and populations)
Sensitive test
Identifies nearly everyone who has the condition of interest (few false negatives), but produces many false positives
Specific test
Flags only those who truly have the condition (few false positives), but misses some cases (many false negatives)
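The sensitivity/specificity trade-off can be made concrete with a small decision table. A minimal sketch in Python, assuming hypothetical true statuses and test decisions:
```python
# Sensitivity and specificity from a 2x2 table of test decisions vs. true status.
import numpy as np

true_status = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])   # 1 = has the condition
test_result = np.array([1, 1, 1, 0, 1, 1, 0, 0, 0, 0])   # 1 = test flags the condition

tp = np.sum((test_result == 1) & (true_status == 1))
fn = np.sum((test_result == 0) & (true_status == 1))
fp = np.sum((test_result == 1) & (true_status == 0))
tn = np.sum((test_result == 0) & (true_status == 0))

sensitivity = tp / (tp + fn)   # proportion of true cases the test catches
specificity = tn / (tn + fp)   # proportion of non-cases the test correctly clears
print(f"Sensitivity = {sensitivity:.2f}, Specificity = {specificity:.2f}")
```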
Construct validity
Degree to which test measures what it is designed to measure
Determining convergent validity
Correlate test scores with tests of same or similar construct
Determining discriminant validity
Correlate test scores with tests of dissimilar construct
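A minimal sketch of convergent and discriminant checks in Python, assuming hypothetical scores for a new anxiety scale, an established anxiety measure (similar construct), and a reading test (dissimilar construct); all data are simulated:
```python
# Convergent correlation should be high; discriminant correlation should be near zero.
import numpy as np

rng = np.random.default_rng(1)
n = 80
new_anxiety_scale = rng.normal(size=n)
established_anxiety = 0.8 * new_anxiety_scale + rng.normal(scale=0.6, size=n)  # similar construct
reading_ability = rng.normal(size=n)                                           # dissimilar construct

convergent_r = np.corrcoef(new_anxiety_scale, established_anxiety)[0, 1]   # expect high
discriminant_r = np.corrcoef(new_anxiety_scale, reading_ability)[0, 1]     # expect near zero
print(f"Convergent r = {convergent_r:.2f}, Discriminant r = {discriminant_r:.2f}")
```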
Multitrait-multimethod approach to determining construct validity
Measure the same constructs with multiple methods to check for convergence, and measure different constructs to check for divergence
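A minimal sketch of an MTMM-style correlation matrix in Python, assuming simulated scores for two traits (anxiety, depression) each measured by two methods (self-report, clinician rating); all names and values are hypothetical:
```python
# MTMM matrix: same-trait/different-method correlations should be high (convergence);
# different-trait correlations should be low (divergence).
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 100
anxiety = rng.normal(size=n)
depression = rng.normal(size=n)

data = pd.DataFrame({
    "anxiety_self": anxiety + rng.normal(scale=0.5, size=n),
    "anxiety_clin": anxiety + rng.normal(scale=0.5, size=n),
    "depress_self": depression + rng.normal(scale=0.5, size=n),
    "depress_clin": depression + rng.normal(scale=0.5, size=n),
})

mtmm = data.corr().round(2)
print(mtmm)
```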
Contrasted group study approach to determining construct validity
Administer the test to two groups already known to differ on the construct and check whether their scores differ in the expected direction
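A minimal sketch of a contrasted-groups comparison in Python, assuming hypothetical depression-scale scores for a clinical and a non-clinical group:
```python
# A large difference in the expected direction supports construct validity.
import numpy as np
from scipy import stats

clinical = np.array([24, 31, 28, 35, 22, 29, 33, 27])
nonclinical = np.array([12, 9, 15, 11, 14, 10, 13, 8])

t, p = stats.ttest_ind(clinical, nonclinical)
print(f"t = {t:.2f}, p = {p:.4f}")
```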
Factor analysis
Used to determine if test is measuring factors related to the given construct
Each variable receives factor loadings (interpreted like correlation coefficients); ideally a variable loads highly on only one factor
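A minimal sketch of factor analysis in Python using scikit-learn's FactorAnalysis on simulated item scores; this only illustrates the loading-matrix idea (dedicated psychometric packages are typically used for rotated, standardized loadings):
```python
# Six hypothetical items: three driven by a verbal factor, three by a spatial factor.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)
n = 200
verbal = rng.normal(size=n)
spatial = rng.normal(size=n)
items = np.column_stack([
    verbal + rng.normal(scale=0.5, size=n),
    verbal + rng.normal(scale=0.5, size=n),
    verbal + rng.normal(scale=0.5, size=n),
    spatial + rng.normal(scale=0.5, size=n),
    spatial + rng.normal(scale=0.5, size=n),
    spatial + rng.normal(scale=0.5, size=n),
])

fa = FactorAnalysis(n_components=2).fit(items)
# Unrotated loadings: each item should load highly on only one factor.
print(np.round(fa.components_.T, 2))
```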
Evidence based on response processes
Is the manner of responses consistent with the construct being assessed?
Evidence based on consequences of testing
If the test is thought to result in benefits, are those benefits being achieved?
Incremental validity
Determines whether the test improves prediction of the criterion beyond measures already in use
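A minimal sketch of an incremental-validity check in Python, comparing R² from an existing predictor alone with R² after adding the new test (simulated data):
```python
# The increase in R^2 when the new test is added is its incremental validity.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
n = 150
existing_test = rng.normal(size=n)
new_test = rng.normal(size=n)
criterion = 0.5 * existing_test + 0.3 * new_test + rng.normal(scale=0.8, size=n)

X_base = existing_test.reshape(-1, 1)
X_full = np.column_stack([existing_test, new_test])

r2_base = LinearRegression().fit(X_base, criterion).score(X_base, criterion)
r2_full = LinearRegression().fit(X_full, criterion).score(X_full, criterion)
print(f"R^2 existing only = {r2_base:.2f}, R^2 with new test = {r2_full:.2f}")
```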
Face validity
Determines if the test appears to measure what it is designed to measure
Not a true form of validity
Problem with tests high in face validity: examinees can more easily fake their responses
Internal vs. external validity
Internal: Does the measure work in ideal conditions?
External: Does it work in the real world?