TM T3 Flashcards
Validity
The degree to which empirical evidence and theoretical rationales support an assessment conclusion.
Not something a test HAS; a justification
Criterion, construct, content
Construct validity
Based on conceptual variables underlying a test.
Content validity
Based on subject matter of a test
Avoid saying
“the validity of the test”
When is validity wrong?
Wrong population -> different groups require special tools
Wrong task -> using the wrong test can lead to invalid results
Wrong context -> wrong testing environment
Wrong context example
Using a personality test for a hiring decision
Face validity
The APPARENT soundness of a test or measure, regardless of whether it is actually valid. Intuitive. A test can lack face validity but still be valid - that can even be BETTER, since test-takers cannot tell what is being measured
5 Sources of Evidence of Validity
- Test content
- Response processes
- Internal structure
- Relations with other variables
- Consequences of testing
Content validity details
Evidence based on a test’s content
Extent to which a test measures subject matter or behavior under investigation
Example: test for 3rd grade math… how well does it represent what we want kids to know?
Validity based on response processes
How do people actually respond to a test - is it measuring what it’s supposed to measure?
“Can you repeat the question in your own words?”
“What, to you, is ___”
“How sure are you of your answer?”
Revise based on answers
Evidence based on internal structure
Does the test align with the theoretical framework or construct it’s intended to measure? Is the construct represented in the patterns of responses that people give?
Factor analysis
Tool used to see which variables correlate with each other and whether they cluster around one common factor: "energy loss, appetite changes, difficulty concentrating" might cluster around somatic traits in the BDI-II
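The clustering idea behind factor analysis can be sketched with simulated data. This is a hypothetical illustration (the item names echo the BDI-II example above, but the numbers are made up): items driven by the same latent factor end up correlating strongly with each other and weakly with items driven by a different factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Simulated latent factors (illustrative only, not real BDI-II data)
somatic = rng.normal(size=n)
mood = rng.normal(size=n)

# Each observed item = its latent factor plus measurement noise
items = np.column_stack([
    somatic + rng.normal(scale=0.5, size=n),  # "energy loss"
    somatic + rng.normal(scale=0.5, size=n),  # "appetite changes"
    somatic + rng.normal(scale=0.5, size=n),  # "difficulty concentrating"
    mood + rng.normal(scale=0.5, size=n),     # "sadness"
    mood + rng.normal(scale=0.5, size=n),     # "pessimism"
])

r = np.corrcoef(items, rowvar=False)

# Items sharing a factor correlate highly; items from different
# factors correlate near zero - that pattern is the "cluster"
print(f"within-cluster r:  {r[0, 1]:.2f}")
print(f"between-cluster r: {r[0, 3]:.2f}")
```

A real factor analysis (e.g., with statistical software) extracts the factors from such a correlation matrix rather than simulating them, but the correlation pattern it looks for is the same.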
Criterion validity
How well a test correlates with a specific standard
4 methods
Predictive, concurrent, retrospective and incremental
If a measure of criminal behavior is valid, we should be able to tell if:
1. They will be arrested in the future
2. They are currently breaking the law
3. They have a previous criminal record
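In practice, each of these methods comes down to a correlation between test scores and a criterion measured at some point in time. A minimal sketch with simulated data (the variable names and numbers are hypothetical, chosen only to illustrate a predictive-validity check):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical screening-test scores and a later criterion
# (e.g., a count-based index of future offenses) - simulated
test_scores = rng.normal(50, 10, size=n)
criterion = 0.8 * test_scores + rng.normal(0, 8, size=n)

# The validity coefficient is just the Pearson correlation
# between the test and its criterion
r = np.corrcoef(test_scores, criterion)[0, 1]
print(f"criterion validity coefficient: r = {r:.2f}")
```

The same calculation serves all three timeframes; only when the criterion is measured changes - after the test (predictive), at the same time (concurrent), or before it (retrospective).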
2 ways of thinking of CONSTRUCT validity
Convergent - how does my test correlate to other similar tests?
Discriminant - can a test prove you’re measuring your construct and not something else?
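Both checks can be read off a correlation matrix. A hypothetical sketch (all measure names and data are invented): two tests of the same construct should correlate highly (convergent), while a test of an unrelated construct should correlate near zero (discriminant).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300

# Simulated latent construct and three hypothetical measures
aggression = rng.normal(size=n)
aggr_test_a = aggression + rng.normal(scale=0.4, size=n)  # my test
aggr_test_b = aggression + rng.normal(scale=0.4, size=n)  # similar test
vocab_test = rng.normal(size=n)                           # unrelated construct

# Convergent: high correlation with a similar measure
convergent = np.corrcoef(aggr_test_a, aggr_test_b)[0, 1]
# Discriminant: low correlation with an unrelated measure
discriminant = np.corrcoef(aggr_test_a, vocab_test)[0, 1]

print(f"convergent r:   {convergent:.2f}")
print(f"discriminant r: {discriminant:.2f}")
```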
Evidence based on consequences of testing
“Does this test produce situations for people that are not OK?”
If so -> not as strong as a test with no negative outcomes
Does this test promote fair outcomes?
Operational definitions
How you’re going to try to measure something - as an external, observable behavior
Example: aggression
How many times did they hit somebody
Construct
Abstract idea
Can’t directly measure it
Characteristics or attributes
EX: aggression, intelligence
Demonstrating content validity
Define the test universe
Develop the test specifications
Establish the test format
Construct the test questions
Construct validity: defining the test universe
- What relevant research is there to help develop the constructs?
- Who are the key experts? Can they evaluate items? Also - cite them
- Main construct aspects/dimensions
- Other validated instruments
Construct validity: Test specifications, format, questions
- Specify content areas
- Construct test questions
- Purpose and intended use of test
- Format and length
- Et cetera
IVR example
Constructs: minimization, violence recognition, partner blaming, distal blaming
IVR lacks some test specifications
Criterion-related validity
Correlation with an established CRITERION (standard of comparison) - how well a test correlates with its established standard
Examples of criterion validity
SAT performance correlating with academic success
Accidents on a job correlating with supervisor ratings