Comps - Assessment (Validity) Flashcards
Validity
The degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of a test
If validity is not a property of a test, what is it?
It is a property of how the clinician uses the test
Tests are not valid in and of themselves; __________ made about test scores and the use of test scores are more or less valid
Inferences
Content Validity
Examining the match between the content of the test and the content that should be included on a test of that particular attribute (extent to which questions on a test represent the construct being measured)
What is a good question to conceptualize content validity?
How well does the content match what should be on the test?
What do you want to avoid if you want a high content-valid test?
Construct-irrelevant content (items that are not related to the content being measured)
Threats to Content Validity
-Inaccurate knowledge
-Construct underrepresentation (something important about the content is missing)
Face Validity
The extent to which the items look like they are measuring the construct
Content Validity Example
If a therapist is using a questionnaire to assess depression, the questionnaire should include questions that cover all the symptoms of depression, such as mood, energy levels, sleep patterns, and appetite. It ensures that the test comprehensively covers all relevant aspects of depression
Criterion Validity
Correlation between a test score and some external criterion (e.g., a validated test/measure) – compares it against some other measure or outcome already considered to be valid
Forms of Criterion Validity
-Convergent Validity
-Discriminant Validity
-Concurrent Validity
-Predictive Validity
Convergent Validity
This checks if the test correlates well with other tests that measure the same or similar constructs
Discriminant Validity
This checks if the test does not correlate too strongly with tests that measure different constructs
Concurrent Validity
This checks if the test correlates well with a criterion measure taken at the same time
Predictive Validity
This checks if the test can predict future outcomes or behaviors
Convergent Validity Example
A new depression questionnaire (Designer Depression Inventory) should have similar results compared to an established depression test (Beck Depression Inventory)
Discriminant Validity Example
The depression questionnaire should not have high correlations with a test measuring something different, like delusions
Concurrent Validity Example
If a therapist gives a new depression test (Designer Depression Inventory) and the established depression test (Beck Depression Inventory) at the same time, they should both show similar results
Predictive Validity Example
If a therapist uses an assessment to predict how well a client will respond to a specific therapy in the future, the results should match the actual outcomes
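Convergent and concurrent checks like those above boil down to correlating two columns of scores from the same clients. A minimal sketch in Python with made-up numbers (the scores and measure names are illustrative, not real norms):

```python
import numpy as np

# Hypothetical scores for 8 clients on a new depression measure ("DDI")
# and an established one ("BDI") -- values invented for illustration.
ddi = np.array([10, 14, 7, 22, 18, 5, 12, 25], dtype=float)
bdi = np.array([12, 15, 9, 24, 17, 6, 13, 27], dtype=float)

# Convergent/concurrent validity evidence: a strong positive correlation
# between the two measures administered to the same people.
r = np.corrcoef(ddi, bdi)[0, 1]
print(round(r, 3))
```

A high positive r (e.g., above .70) would count as convergent evidence; a near-zero r with an unrelated measure (e.g., a delusions scale) would count as discriminant evidence.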
Factors impacting the size of the validity coefficient
-Restriction of range issues
-Reliability of predictor and criterion measures
-Sample size
-Screening people into dichotomous categories
-Sensitivity and specificity
Restriction of Range Issues
When the sample's scores cover a narrower range than the population's, the correlation is attenuated and may not generalize (e.g., studying depression only in college students restricts the range of scores and may understate the relationship that would appear in the broader adult population)
Reliability of Predictor and Criterion Measures
As the reliability for either the predictor or the criterion gets lower, the coefficient is less likely to be significant
Spearman’s Correction for Attenuation
-Correcting for measurement error to strengthen your predictor
-Takes out the effect of measurement error on the coefficient
-Reduces the weakening effect of measurement error and estimates the correlation between the constructs as if both measures were perfectly reliable and unaffected by error
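The correction divides the observed correlation by the square root of the product of the two reliabilities. A short sketch (the .42/.80/.70 values are hypothetical):

```python
import math

def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Spearman's correction for attenuation:
    estimated correlation between two constructs if both measures
    were perfectly reliable.
        r_corrected = r_xy / sqrt(r_xx * r_yy)
    """
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed validity coefficient .42,
# predictor reliability .80, criterion reliability .70.
corrected = correct_for_attenuation(0.42, 0.80, 0.70)
print(round(corrected, 3))  # -> 0.561
```

Note that the corrected value is always at least as large as the observed one, since reliabilities are at most 1.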
Sample Size
As the sample size increases, a smaller validity coefficient can reach statistical significance (larger samples reduce sampling error, so even modest correlations become detectable)
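This can be seen in the t test for a Pearson correlation: for a fixed r, the t statistic grows with n, so the same modest correlation becomes significant in larger samples. A quick sketch:

```python
import math

def t_stat(r: float, n: int) -> float:
    # t test for a Pearson correlation (df = n - 2):
    #   t = r * sqrt(n - 2) / sqrt(1 - r^2)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# The same r = .25 yields a larger t (and smaller p) as n grows:
for n in (20, 100, 500):
    print(n, round(t_stat(0.25, n), 2))
```

At n = 20 this r is not significant at the .05 level, but at n = 500 it easily is.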
Screening People into Dichotomous Categories
Involves setting a cut score between two categories
Sensitivity
Proportion of actual positives correctly identified; appropriately identifying people with the construct (high sensitivity means a low Type II error rate – few misses)
Specificity
Proportion of actual negatives correctly identified; appropriately identifying people without the construct (high specificity means a low Type I error rate – few false positives)
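Given a cut score, sensitivity and specificity fall out of the four cells of the classification table. A minimal sketch on invented screener data (the scores, true statuses, and cut score of 14 are all hypothetical):

```python
# Hypothetical screener scores and true clinical status (1 = has condition).
scores = [3, 8, 12, 15, 20, 25, 6, 18]
truth  = [0, 0, 1, 1, 1, 1, 0, 0]
CUT = 14  # screen positive at or above the cut score

tp = sum(1 for s, t in zip(scores, truth) if s >= CUT and t == 1)  # hits
fn = sum(1 for s, t in zip(scores, truth) if s < CUT and t == 1)   # misses (Type II)
tn = sum(1 for s, t in zip(scores, truth) if s < CUT and t == 0)   # correct rejections
fp = sum(1 for s, t in zip(scores, truth) if s >= CUT and t == 0)  # false alarms (Type I)

sensitivity = tp / (tp + fn)  # true positives / all actual positives
specificity = tn / (tn + fp)  # true negatives / all actual negatives
print(sensitivity, specificity)  # -> 0.75 0.75
```

Moving the cut score trades one off against the other: lowering it raises sensitivity but lowers specificity, and vice versa.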
Construct Validity
How well a test measures the theoretical construct it is supposed to measure. This involves ensuring that the test truly reflects the concept in question and that the concept itself is well-defined. Subsumes both content and criterion-related validity
Construct Validity Example
If a therapist is using a test to measure self-esteem, the test should measure self-esteem and not some other related concept like confidence or social skills
Messick’s Unified Theory of Construct Validity (MUTCV)
Six aspects to construct validity
MUTCV Aspect 1: Content Validity
-Content of a test
-Ex: content of Designer Depression Inventory represents all the facets of depression
MUTCV Aspect 2: Criterion Validity
-Relationship to external factors
-Ex. when the Designer Depression Inventory is compared to the BDI, clients should receive similar scores
MUTCV Aspect 3: Psychological Processes
-How well do processes that respondents actually use match the processes they should theoretically be using?
-Ex 1. does a processing speed test require processing speed skills/abilities?
-Ex 2. respondents completing the Designer Depression Inventory should be drawing on the negative affectivity and pessimistic thinking that the items are designed to tap
MUTCV Aspect 4: Internal Structure of a test
-How parts of the test are related
-Factor analysis (evaluates the internal structure/dimensionality of the test; the goal is to find the smallest number of meaningful underlying variables/factors that account for the pattern of relationships among responses)
-Ex. Designer Depression Inventory – when testing for depression, wanting to see if mood, cognitive, and behavioral overlap within the inventory or if they are separate
Exploratory Factor Analysis
Used to explore the underlying factor structure without a prior hypothesis – discovering how many factors there are and which items load on each
Confirmatory Factor Analysis
Used to test whether the data fit a factor structure hypothesized in advance
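The exploratory step can be sketched with just NumPy: eigendecompose the item correlation matrix and keep factors with eigenvalue greater than 1 (the Kaiser criterion, one common heuristic). This is a simplified stand-in for full EFA, on purely synthetic data; real analyses would use dedicated tools (e.g., the factor_analyzer package in Python, or lavaan in R for CFA):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 200 respondents on 6 items: items 0-2 load on one latent
# factor (say, "mood") and items 3-5 on another ("behavioral").
n = 200
f1 = rng.normal(size=n)
f2 = rng.normal(size=n)
items = np.column_stack(
    [f1 + rng.normal(scale=0.5, size=n) for _ in range(3)]
    + [f2 + rng.normal(scale=0.5, size=n) for _ in range(3)]
)

# Eigendecompose the item correlation matrix; count eigenvalues > 1.
corr = np.corrcoef(items, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]
n_factors = int(np.sum(eigvals > 1.0))
print(n_factors)  # -> 2 (the two simulated factors are recovered)
```

Here the mood and behavioral items would emerge as two separate factors; if they all collapsed onto one factor, that would suggest the inventory is unidimensional instead.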
MUTCV Aspect 5: Generalizability
-Extent to which score properties and interpretations generalize across population groups/settings/tasks
-Ex. Designer Depression Inventory – after testing on IUP students, you would want to make sure an urban/diverse population shows similar results
MUTCV Aspect 6: Consequences of Test Use
-Validity includes unintended and intended consequences of test use
-Does the test unfairly or adversely impact some people more than others?
-Ex. Designer Depression Inventory – developed on a Western population that emphasizes emotional symptoms of depression, whereas Latin American populations may express depression more through physiological (somatic) symptoms
Type I Error
False positives
Type II Error
False negatives