L3: Validity Flashcards
define validity (casual & APA definitions)
casual: the degree to which a psych test measures what it purports to measure
APA: the degree to which test scores are interpreted and used in ways consistent with empirical evidence and theory
aka construct validity
what does validity depend on?
how a researcher interprets the test scores
ex: the Raven test of logical reasoning is an invalid measure of neuroticism, since there's no theory or evidence for such an interpretation
What are the types of evidence you can evaluate to see how valid a test is?
- test content (content validity)
- response process
- internal structure of the test
- associations with other variables
- consequences of use
define content validity
the degree to which the content of a measure truly reflects the full domain of the construct for which it is being used -> expert judgment
define face validity & how it differs from content validity
validity in the eyes of the test user
different from content validity, because content validity is based on expert judgment
what do you need to watch out for in content validity?
- construct underrepresentation (ex: a personality test with no (or only a few) neuroticism questions)
- construct-irrelevant content (ex: a personality test with questions about mood)
How does the response process indicate validity? with examples of the desired process for self-report & ability/achievement questions
for the test score to have a valid interpretation, respondents should use the intended psychological response process to answer the items
ex:
- self-report questions: “if I leave the house, I often double check if I took my keys with me” (1 = never, 2 = hardly, 3 = sometimes, 4 = often, 5 = always): the desired response process is often based on memory retrieval (read the item -> memory retrieval -> match with response options -> respond)
- ability / achievement questions: the desired response process depends on the (cognitive) ability (ex: read the item -> logical reasoning -> match with response options -> respond)
How can you find out if the respondent used the desired process?
- direct evidence: think-aloud protocols (respondents have to say whatever they're thinking out loud), interviewing respondents
- indirect evidence: process data (response times, mouse movements, eye movements), statistical analysis of responses (like item-total correlations, reliability; see the sketch below), experimentally manipulating the response process
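a minimal sketch in Python (my own illustration, not from the notes; data and item set are simulated) of the item-total correlation idea: items answered through a process unrelated to the intended trait tend to show low or negative item-total correlations.

```python
# illustrative only: simulate 1-5 Likert responses driven by one latent trait,
# then compute corrected item-total correlations as indirect evidence that
# respondents answered the items through the intended process
import numpy as np

rng = np.random.default_rng(0)
n_respondents, n_items = 200, 5
trait = rng.normal(size=n_respondents)
responses = np.clip(
    np.round(3 + trait[:, None] + rng.normal(scale=1.0, size=(n_respondents, n_items))),
    1, 5,
)

for i in range(n_items):
    # correlation of each item with the sum of the remaining items
    rest_score = responses[:, [j for j in range(n_items) if j != i]].sum(axis=1)
    r = np.corrcoef(responses[:, i], rest_score)[0, 1]
    print(f"item {i + 1}: corrected item-total r = {r:.2f}")
```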
what are some threats to the response process validity?
- poorly designed items (misinterpretation of the item, unintended correct solution, multiple correct solutions, etc.)
- respondent-related reasons (lack of motivation, social desirability, guessing, etc.)
how does the internal structure of the test affect validity?
does the structure you find in practice match the theoretical structure?
every psych test has a theoretical internal structure (unidimensional if your test measures only a single construct, multidimensional if your test measures multiple constructs)
how can you see if the internal structure of the test is valid?
factor analysis should show (see the sketch after this list):
- the number of factors matches the theoretical structure of the construct
- the rotated factor loadings display the theoretical structure (the right items correlate with the right factor)
- the correlations between the factors are as expected based on theory (so if it's multidimensional and the factors should be moderately correlated, this should be reflected)
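a minimal sketch in Python (my own illustration; the items and the 2-factor structure are simulated) of checking internal structure with factor analysis: items 1-3 should load on factor 1 and items 4-6 on factor 2.

```python
# illustrative only: 6 simulated items generated from two latent factors
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 300
f1, f2 = rng.normal(size=n), rng.normal(size=n)  # two latent factors
items = np.column_stack([f1, f1, f1, f2, f2, f2]) + rng.normal(scale=0.7, size=(n, 6))

fa = FactorAnalysis(n_components=2, rotation="varimax")
fa.fit(items)

# rows = factors, columns = items; the "right" items should load on the "right" factor
print(np.round(fa.components_, 2))
```

note: varimax is an orthogonal rotation, so checking the expected correlations between factors (third point above) would require an oblique rotation instead.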
how does the test's association with other variables affect validity?
key question: do the test scores relate to other tests and variables in a theoretically meaningful way
ex: if you invent a new scale for weight, you should check whether the measured weight correlates moderately with height (as would be theoretically expected)
check with a nomological network, criterion validity, concurrent validity, predictive validity
what are the 4 types of validity evidence based on the relationships between the test score & other variables?
- convergent evidence: when test scores correlate highly with other measures of the same construct
- discriminant evidence: when test scores do not correlate with measures of unrelated constructs
- criterion evidence: when test scores correlate with specific outcomes or behaviours (predictive or concurrent)
predictive evidence: when the test scores predict future performance or outcomes
concurrent evidence: when test scores correlate with other measures taken at the same time
what is a nomological network?
summarizes all theoretical relations between the construct of interest and other constructs and variables
what is a nomological network used for?
used when establishing the validity of a new test: you look at the correlations/relations between the construct you want to study and the other constructs and variables in the network. the relations shown in the network come from literature & theory
each of the other constructs in the nomological network can be operationalized by another test (ex: the construct of depression via the items of a depression test)
apply these tests to a sample of subjects and see how well the actual results fit the theoretical relations (based on that you can see how valid your test is) (convergent & discriminant evidence are used for this)
- the network can also include observed variables next to the constructs (age, high school grades, educational attainment, etc.) and their relations to the constructs (like intelligence)
what is discriminant vs convergent evidence in a nomological network?
convergent evidence is when the observed positive correlations between 2 constructs (i.e., when you've applied their tests to a sample of subjects) fit the theoretically expected positive correlation
discriminant evidence is when the constructs are unrelated in practice (when applied to a sample) and also in theory
-> both are good for validity
define validity coefficients
the individual correlations between operationalized constructs (ex: the correlation between an IQ test and a critical thinking test)
define validity generalization
establishing whether the pattern of correlations found in the nomological network also holds for other tests (a different type of IQ test, a different critical thinking test, etc.)
how do you quantify construct validity?
by checking how well the predicted and observed correlations match (see the sketch below)
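a minimal sketch (the correlations are made up, just to show the idea) of quantifying construct validity by correlating the theoretically predicted correlations with the ones observed in a sample:

```python
# illustrative only: how well does the observed pattern of correlations
# match the pattern predicted by the nomological network?
import numpy as np

predicted = np.array([0.60, 0.45, 0.10, -0.30])  # from theory / literature
observed = np.array([0.55, 0.40, 0.05, -0.20])   # from applying the tests to a sample

fit = np.corrcoef(predicted, observed)[0, 1]
print(f"match between predicted and observed correlations: r = {fit:.2f}")
```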
define criterion validity
when you include observed variables (like age, high school grades, etc.) in the nomological network next to the constructs
you can then calculate the criterion validity: the association between the construct and an observed variable it should theoretically be related to
define concurrent validity
type of criterion validity
the association between the construct & an observed variable measured at the same time (ex: the correlation between an intelligence test & age)
define predictive validity
type of criterion validity
the association between the construct & an observed variable measured in the future (ex: the correlation between a primary school education test & salary in first job)
what are the 4 methods for evaluating convergent & discriminant validity?
- focused associations: look at the correlations between a test score & a few key variables (high correlations with related constructs indicate good convergent validity; low correlations with unrelated constructs indicate good discriminant validity)
- sets of correlations: here you evaluate a broader set of correlations to get a comprehensive picture of a test's convergent & discriminant validity
- multitrait-multimethod matrices: correlations of the same trait measured by different methods (convergent validity) should be high; correlations of different traits measured by the same or different methods (discriminant validity) should be low
- quantifying construct validity: statistical techniques using the multitrait-multimethod data
what are multitrait multimethod matrices?
they help us interpret validity coefficients
ex: you find a correlation of .62 between 2 self-report social skill tests
-> the correlation may be due to trait variance (shared variance in the test scores due to the same trait)
-> the correlation may be due to method variance (shared variance in the test scores due to the same method)
the matrix shows all constructs measured by all methods and the correlations between them (see the sketch below)
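a minimal sketch (the correlations are made up) of a small MTMM matrix with 2 traits (social skill, anxiety) measured by 2 methods (self-report, observer rating), and which cells give convergent vs discriminant evidence:

```python
# illustrative only: a 2-trait x 2-method MTMM matrix
import pandas as pd

labels = ["social_self", "anxiety_self", "social_obs", "anxiety_obs"]
mtmm = pd.DataFrame(
    [[1.00, 0.30, 0.62, 0.10],
     [0.30, 1.00, 0.12, 0.58],
     [0.62, 0.12, 1.00, 0.25],
     [0.10, 0.58, 0.25, 1.00]],
    index=labels, columns=labels,
)

# convergent validity: same trait, different method -> should be high
print("social skill, self-report vs observer:", mtmm.loc["social_self", "social_obs"])
# discriminant validity: different traits, same method -> should be low
# (if this is high, part of the shared variance is method variance)
print("social vs anxiety, both self-report:", mtmm.loc["social_self", "anxiety_self"])
```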