Validity (Convergent/ Discriminant Validity) Flashcards
Associative Validity
➢ Scores on a test are actually correlated (consistent) with other related measures
❖ And have no association (inconsistent) with other unrelated measures
➢ Example:
❖ We give a sample of people a measure of self-esteem…..
but also measures of happiness, life satisfaction, sadness, and depression
o We’d expect scores of self-esteem to be positively correlated with happiness & life satisfaction
o We’d also expect self-esteem to be negatively correlated (or uncorrelated) with sadness & depression
➢ Convergent Validity:
❖ Scores are correlated (positively or negatively) with other measures they should be
o e.g., Scores of self-esteem positively correlated with scores of life satisfaction
➢ Discriminant Validity:
❖ Scores are uncorrelated (or only weakly correlated) with measures they should not be
o e.g., Scores of academic engagement should not be correlated with sports motivation
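As a rough sketch (with hypothetical, simulated data), the convergent/ discriminant pattern described above can be checked by computing Pearson correlations between a target measure, a related measure, and an unrelated measure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical scores: a shared latent factor drives both the self-esteem
# and life-satisfaction measures (related constructs); reaction time is
# generated independently (unrelated construct)
latent = rng.normal(size=n)
self_esteem = latent + 0.5 * rng.normal(size=n)
life_satisfaction = latent + 0.5 * rng.normal(size=n)
reaction_time = rng.normal(size=n)

def pearson_r(x, y):
    """Pearson correlation between two score vectors."""
    return float(np.corrcoef(x, y)[0, 1])

r_convergent = pearson_r(self_esteem, life_satisfaction)  # should be strong
r_discriminant = pearson_r(self_esteem, reaction_time)    # should be near zero

print(f"convergent r = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```

The variable names and effect sizes here are invented for illustration; the point is simply that convergent evidence shows up as a sizeable correlation, while discriminant evidence shows up as a correlation near zero.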
Convergent Validity
➢ Degree to which test scores correlate with measures of related constructs
❖ Test scores are correlated positively with other measures they should be
o e.g., self-esteem positively correlated with life satisfaction
❖ Test scores are also correlated negatively with other measures they should be
o e.g., self-esteem negatively correlated with depression
➢ If not the case = we may question the validity of our test scores
➢ Convergent validity can be evident between tests measuring the same construct
❖ One measure of self-esteem should positively correlate with another self-esteem measure
❖ One measure of anxiety should positively correlate with another anxiety measure
➢ So convergent evidence can be indicated by:
❖ Two measures that assess different but related constructs
❖ Two different measures that assess the same construct
Discriminant Validity
➢ Degree to which scores are uncorrelated with measures of unrelated constructs
❖ e.g., Scores of reaction time should be uncorrelated with social support
❖ e.g., Scores of parental attachment should be uncorrelated with intelligence
➢ If not the case = we may question what concepts we have actually measured
➢ Discriminant validity can be quite subtle (harder to determine)
❖ New measures can be created for psychological attributes….
but which are very similar to other concepts
➢ Example: Self-esteem and perceived friendships at school
❖ We may find a strong correlation between self-esteem and school friendships….
therefore, conclude that those high in self-esteem have more friends at school
❖ However, the two measures actually tap into the same concept (i.e., social-esteem)
o Thus, they may not be discriminant measures of different concepts
o Worth checking whether the scores are uncorrelated with other unrelated constructs
Further Associations Between Variables
(Criterion Validity)
➢ Associative validity is often checked when a new measure is created
❖ As part of the validation process
➢ Closely linked to the idea of associative validity…
is that certain variables should be correlated with other essential variables
❖ That is, there are certain criterion variables that are important for certain constructs
➢ Criterion validity relates to:
❖ “Test scores being related to particularly important criterion variables”
o e.g., perceived academic ability should be correlated with academic attainment
➢ Criterion validity emerges in empirical studies & relates to the timing of scores
❖ Concurrent Validity (all measures at same timepoint)
❖ Predictive Validity (scores related to measures at future timepoints)
Concurrent Validity
➢ Concurrent Validity relates to:
❖ Test scores being correlated with relevant criterion variables measured at same time
➢ Both measures are taken at the same time (i.e., concurrently)
❖ Studies in which all measures are collected at one timepoint are called “cross-sectional”
❖ They allow a snapshot of one moment in time
➢ Example 1: Academic Ability
❖ Academic ability should positively correlate with academic achievement
o At the same point in time
➢ Example 2: Self-Esteem
❖ Self-esteem should positively correlate with life satisfaction
o At the same moment in time
➢ If they do not…we may question the validity of our test interpretations
Predictive Validity
➢ Predictive Validity relates to:
❖ Test scores being correlated with relevant criterion variables at future timepoints
➢ Here, the two measures are taken at different timepoints
❖ Studies including multiple timepoints are “longitudinal” (or “time-separated”) designs
❖ They explore how one variable may predict another variable in the future
➢ Example 1: Academic Ability
❖ Academic ability at the start of a year should predict future academic achievement
➢ Example 2: Self-Esteem
❖ Self-esteem at one timepoint should positively predict future life satisfaction
➢ If this is the case…the measures indicate a high degree of predictive validity
❖ Evidence for predictive validity often found in various studies using that measure
Predictive Validity (Sensitivity & Specificity)
➢ Some measures are used as screening tests for a specific condition/ disease
❖ i.e., they are used for the purpose of diagnosis
➢ These test scores are evaluated based on two factors:
❖ A reference standard of the target condition (i.e., does a person appear to have/ not have the condition)
❖ The test score yielding a positive or negative result (for the target condition)
➢ The accuracy/ validity of a screening test can be checked via two metrics:
Sensitivity
❖ A test’s ability to detect an individual with a true positive for the condition/ disease
o Reflects a test’s ability to correctly identify people who have the condition
o Highly sensitive tests = fewer false-negative results (i.e., fewer cases with the condition are missed)
Specificity
❖ A test’s ability to detect an individual with a true negative for the condition/ disease
o Reflects a test’s ability to correctly identify people who do not have the condition
o Highly specific tests = fewer false-positive results (i.e., fewer wrong diagnoses of people without the condition)
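The two metrics reduce to simple ratios over the confusion counts. A minimal sketch, with hypothetical screening counts:

```python
def sensitivity(tp, fn):
    """True-positive rate: proportion of people WITH the condition
    whom the test correctly flags as positive."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: proportion of people WITHOUT the condition
    whom the test correctly flags as negative."""
    return tn / (tn + fp)

# Hypothetical results against the reference standard:
# 90 true positives, 10 false negatives, 160 true negatives, 40 false positives
print(sensitivity(tp=90, fn=10))   # 0.9 -> misses few real cases
print(specificity(tn=160, fp=40))  # 0.8 -> some false alarms
```

The counts are invented for illustration; in practice they come from comparing test results against the reference standard for the condition.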
Criterion Validity (Summary)
➢ Criterion, concurrent, and predictive validity are all types of convergent validity
❖ They rely on test scores aligning with scores of other related variables
➢ Criterion validity relates to how scores relate to specific important variables
❖ Sometimes more important than the theoretical meaning of test scores
Example: Job Employer
➢ An employer may use a test to assess how likely an applicant may be to perform well
o They are not really concerned with associations between variables
o Their “criterion” is to identify high and low performers (on essential skills/ abilities)
➢ Hence, criterion validity may look different depending on the meaning of a test
❖ It may mean how test scores predict another variable
❖ It may mean differentiating people
Convergent & Discriminant Evidence
➢ Many scales may be developed displaying good content & structural validity
❖ They seem to include all the essential dimensions
❖ These dimensions emerge in the data
➢ Convergent & discriminant validity assess:
❖ If a measure shows the ‘correct pattern’ of associations with other measures
➢ However, there may be a few remaining questions:
❖ How do we know if we have tapped into every aspect of a psychological construct?
❖ How do we know which other measures to compare to?
❖ How do we know what convergent/ discriminant associations to expect?
➢ To answer these concerns…
❖ Test developers will consider a construct’s nomological network
A Construct’s Nomological Network
➢ A ‘network of meaning’ surrounding a construct
(Cronbach & Meehl, 1955)
➢ Situating a construct in the context of…
other constructs, behaviours, & properties
❖ Sharpens the meaning of the construct we wish to assess
➢ Consider which other constructs:
❖ Should be related to the target construct…
❖ And what relationships we would expect
➢ This process helps identify what tests to conduct
❖ Dictates a possible pattern of associations
❖ Makes us think more critically about the construct itself
Methods To Evaluate
Convergent/ Discriminant Validity
➢ There are 4 possible ways to test for convergent & discriminant validity:
1. Focused Associations
❖ Associations between test scores and particularly important criterion variables
2. Sets of Correlations
❖ Associations between test scores and a wide range of relevant variables
3. Multitrait–Multimethod Matrix
❖ Associations between multiple constructs (traits) measured using multiple methods
4. Quantifying Construct Validity
❖ Making a specific prediction of the convergent/ discriminant pattern
❖ Then testing the actual pattern of these associations to see if they align
- Focused Associations
➢ Some measures have clear relevance with other essential variables
❖ SAT scores should reflect cognitive processes that students need for college
o SAT scores should relate to measures of cognitive processes & college attainment
➢ Evaluate convergent validity by focusing on these specific associations
❖ “Make-or-break” indicators
❖ If they have no association, then this raises doubt over the validity
➢ This method focuses on testing correlations with very select variables
❖ These correlation values are sometimes referred to as validity coefficients
❖ A strong correlation = higher validity
➢ Studies may explore if these correlations emerge across multiple data sets
❖ Known as Validity Generalization
❖ Reveals the general level of convergent validity across various studies/ small samples
- Sets Of Correlations
➢ Some constructs may be linked to a wide variety of relatable variables
❖ They have a wide nomological network
➢ Researchers must therefore examine a wide range of criterion variables
➢ Researchers will compute a large number of correlations between different variables
❖ Then “eyeball” all the various correlations
❖ Make subjective judgements based on the associations
➢ This approach is commonly used
❖ Make conclusions based on patterns that generally make sense
❖ As long as they align with the theory and existing evidence
- Multitrait–Multimethod Matrix
(Campbell & Fiske, 1959)
➢ A statistical method to evaluate convergent and discriminant validity
❖ In the form of a Multitrait-Multimethod Matrix (MTMMM)
➢ Involves obtaining scores for several (connected) traits using multiple methods
Example: Creating a self-report measure of social skills….
❖ Then administer self-report measures for impulsivity, conscientiousness, & emotional stability
o Also ask parents to complete measures on these variables (about the participant)
o Also ask friends to complete measures of these variables (about the participant)
➢ Responses to multiple traits (social skills, impulsivity, conscientiousness, emotional stability)
❖ Via multiple methods (e.g., self-report, parents-report, peer-report)
➢ Therefore, a 4-trait/ 3-method MTMMM
❖ So, 12 different measurements
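The measurement count follows directly from crossing the traits with the methods. A small sketch (using the example traits and methods from above):

```python
from itertools import product

traits = ["social skills", "impulsivity", "conscientiousness", "emotional stability"]
methods = ["self-report", "parent-report", "peer-report"]

# Every trait crossed with every method gives one measurement in the matrix
measurements = list(product(traits, methods))
print(len(measurements))  # 12

# Each unordered pair of distinct measurements yields one correlation cell
n = len(measurements)
print(n * (n - 1) // 2)   # 66
```

So the full matrix contains 12 measurements and 66 distinct between-measurement correlations to inspect.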
- Multitrait–Multimethod Matrix
➢ MTMMM helps set guidelines for evaluating convergent/ discriminant validity
❖ Evaluates whether different sources of variance may affect correlations between traits
➢ Allows us to explore two key aspects:
❖ Trait Variance
o Are scores on two traits actually associated/ correlated with one another
❖ [Shared] Method Variance
o Do correlations between traits only exist when they are measured in the same way
➢ Is there a genuine association, or are correlations due to a shared method?
➢ Example
❖ Self-reported social skills may correlate with self-reported emotional stability
❖ Self-reported social skills may not correlate with parent-reported emotional stability
o People who rate themselves as more social may rate themselves as more emotionally stable
o Yet parents don’t see them as emotionally stable
- Multitrait–Multimethod Matrix
MTMMM allows us to identify 4 different types of correlations:
➢ Correlations between different traits measured by different methods
❖ Heterotrait-Heteromethod Correlations
➢ Correlations between different traits measured by the same method
❖ Heterotrait-Monomethod Correlations
➢ Correlations between same trait measured by different methods
❖ Monotrait-Heteromethod Correlations
➢ Correlations between the same trait by the same method (same score)
❖ Monotrait-Monomethod Correlations
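The four cell types above depend only on whether the traits match and whether the methods match. A minimal sketch that classifies any pair of measurements (trait/ method labels are hypothetical):

```python
def mtmm_cell_type(measure_a, measure_b):
    """Classify the correlation between two (trait, method) measurements
    into one of the four MTMMM cell types."""
    trait_a, method_a = measure_a
    trait_b, method_b = measure_b
    trait_part = "monotrait" if trait_a == trait_b else "heterotrait"
    method_part = "monomethod" if method_a == method_b else "heteromethod"
    return f"{trait_part}-{method_part}"

# Same trait, different methods -> the key convergent validity cells
print(mtmm_cell_type(("social skills", "self"), ("social skills", "parent")))
# monotrait-heteromethod

# Different traits, same method -> cells inflated by shared method variance
print(mtmm_cell_type(("social skills", "self"), ("impulsivity", "self")))
# heterotrait-monomethod
```

In a real MTMMM analysis, each pair of the matrix’s measurements would be classified this way before comparing the relative sizes of the correlations in each cell type.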
- Multitrait–Multimethod Matrix
➢ There are three aspects to determine convergent/ discriminant validity:
❖ Essentially: are scores for the same trait correlated regardless of the method?
1. Positive correlations between different methods measuring the same trait
❖ Scores for the same trait should be similar regardless of the method
o Self-reported social skills should correlate with both parent and peer-reported social skills
2. Same-trait, different-method correlations should exceed different-trait, different-method correlations
o Self-reported social skills should correlate with parent and peer-reported social skills
(more than with any other trait measured by another method)
3. Same-trait, different-method correlations should exceed different-trait, same-method correlations
o Self-reported social skills should correlate with parent and peer-reported social skills
(more than with any other self-reported trait)
- Quantifying Construct Validity
➢ An issue raised with the three previous methods is they are quite subjective
❖ Do we just select a series of variables and explore the correlations?
❖ MTMMM relies on noticing if correlations are “stronger” or “weaker”
➢ To help enhance the precision of convergent & discriminant validity
❖ Quantifying Construct Validity (QCV) was created (Westen & Rosenthal, 2003)
➢ QCV compares a PREDICTED pattern of convergent/ discriminant correlations
❖ With the ACTUAL pattern of these correlations
➢ Aiming to formally quantify the degree of “fit” between the two:
❖ Requires a precise theoretical prediction for convergent/ discriminant correlations
❖ Then compute the actual correlations obtained
- Quantifying Construct Validity
➢ Provides more precise & objective estimates of correlations
➢ QCV includes three specific steps:
1. Consider what criterion variables to include and generate predictions
❖ These predictions are very PRECISE
❖ Social motivation will positively correlate with a need to belong (r = .64)
2. Then collect data and compute the actual correlations
❖ So gain information on the actual associations
3. Compute the degree that the actual correlations fit the predicted correlations
❖ Calculating a correlation between the predicted and actual correlations
❖ This correlation reflects a validity coefficient
o A high correlation indicates a high degree of validity
o A poor fit indicates a low degree of validity
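The three steps above can be sketched as follows. This is a simplified illustration of the QCV idea (the predicted and actual values are hypothetical, and Westen & Rosenthal’s full method defines more refined indices than this single correlation):

```python
import numpy as np

# Step 1: precise predicted correlations between the target measure and
# each criterion variable (hypothetical values for illustration)
predicted = np.array([0.64, 0.40, -0.30, 0.05])

# Step 2: the correlations actually obtained from the collected data
actual = np.array([0.58, 0.45, -0.25, 0.10])

# Step 3: the validity coefficient is the correlation between the
# predicted pattern and the actual pattern
validity_coefficient = float(np.corrcoef(predicted, actual)[0, 1])
print(f"validity coefficient = {validity_coefficient:.2f}")  # 0.99 -> high fit
```

Here the actual correlations closely track the predicted pattern, so the validity coefficient is near 1; a scrambled or flat actual pattern would drive it toward zero, indicating poor fit.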
- Quantifying Construct Validity
➢ Can graphically plot to see how the prediction and actual correlations align
➢ When calculating the correlation to assess the alignment, QCV also includes:
❖ Effect sizes
o How strongly do they correlate?
❖ Significance tests
o Is the association statistically meaningful?
Associations Between Variables
(Summary)
➢ There is no perfect method
❖ Different constructs/ traits will never perfectly align
❖ If they did, we would question whether they are measuring the same thing
➢ Benefits of QCV:
❖ Requires careful consideration of the pattern of associations (based on theory)
❖ It is useful to make precise predictions before running correlations
o What is expected? Is anything unexpected?
❖ It retains the focus upon the target construct (the measure being evaluated)
❖ It provides a single correlation (validity coefficient)
o Reflects the degree to which predictions match the actual correlations
Consequences Of Use
➢ The consequences of a test can also have implications for validity
➢ Test developers have:
“…an obligation to review whether a practice has appropriate consequences for individuals
and institutions and especially to guard against adverse consequences”
Cronbach (1988)
➢ Consequential Validity refers to the intended & unintended consequences of test use
❖ What are the outlined consequences?
❖ Are there any unforeseen impacts of a test?
➢ Examples:
❖ If a test (or construct) inadvertently benefits more males than females
❖ If a test enables one group to gain higher scores than another
Consequential Validity
➢ The match between INTENDED consequences & ACTUAL consequences of a test
❖ Are there any unforeseen/ unfair (biased) consequences of a test’s use
➢ There is debate whether consequential validity should be included
❖ as part of the scientific evaluation of the meaning of tests scores
➢ This debate relates to two issues:
1. Test Score Interpretation
❖ Some believe only the psychological meaning of scores should be accounted for
❖ Others feel the consequences of a test are engrained with the validity
o But consequences for who? What criteria can be used?
2. Test Score Use
❖ What is the purpose of testing and interpreting scores?
❖ Should scientific validity be more important than social value?
o Many believe that science and personal/ social values cannot be separated
o Can test scores be valid if they potentially favour certain groups?
Evaluating Consequential Validity
Kane (2013) framework of consequential validity
➢ Evidence of intended effects
❖ The ‘benefits’ and ‘costs’ should be carefully assessed
❖ If costs outweigh the benefits, then we may question the validity of a test
o If a test is intended to help recruitment and does so then it meets its intended purpose
➢ Different (Adverse) Impact On Groups
❖ If decisions end up weighted towards certain groups = concerns over a test’s validity
❖ Is a test biased towards groups or are there genuine differences?
o If a test will make it less likely for men to be hired, then the test may not be valid
➢ Unintended Systematic Effects
❖ Does the use of a test unintentionally change an organisation?
❖ If the use of a test changes practice, then we may need to be cautious of the use of a test
o If teachers are assessed on students’ SATs, this may change their teaching to enhance SAT scores
o Medical students may only select subjects that they know will be covered in a medical exam