Ch 4 Flashcards
Define Validity
A measurement of an assessment’s accuracy; how accurately an instrument evaluates the trait/variable it was designed to assess
Also the TRUTHFULNESS of the data
Alt definition: the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of the test
–> to what extent are the inferences made from a test appropriate, meaningful, and useful?
True or False: instruments themselves are neither valid nor invalid
True; it is the APPLICATION of these instruments and how the results of the instruments are interpreted that can be validated
Describe the relationship between reliability and validity
- very related concepts
- Instrument MUST be reliable to be valid
- Instrument can be reliable without being valid
- Reliability is a necessary but not sufficient condition of validity
Unified Validity Theory
Validity is essentially about the construct being assessed and the meaning of the scores obtained to measure it
Components:
Content: Is the instrument relevant and representative of the construct being measured?
Substantive: What is the theoretical rationale for the consistency of the response set?
Structural: is internal structure of instrument consistent with internal structure of construct?
Generalizability: Can the info gathered be generalized to other populations?
External: What is the convergent and discriminant evidence of validity?
Consequential: What are the actual/potential consequences of using this instrument?
What are the three types of validity?
- Content Validity
- Criterion-Related Validity
- Construct Validity
Content Validity
Ability of instrument to FULLY assess a construct - is the content and composition of a tool appropriate to measure what you want to measure?
Do the items on a career test actually cover all the aspects of different careers?
SAMPLING ISSUE - how do you choose what should be on the test and what shouldn’t?
Evaluating content validity = very imprecise
What are the steps to evaluating content validity?
1) Make sure well-established construct is being used: narrowly define the construct
2) Identify subcategories of the construct (review research, other instruments, etc) –> number of items in each domain should be proportionate to the importance of that domain in the overall construct
3) Can use one of two approaches:
a) Panel of experts: simplest approach; use field experts and lay experts
b) Use the content validity ratio (scientific approach)
Explain the content validity ratio
Based on how many raters deem an item essential, out of the total number of raters who evaluate the item
(each rater rates the item as essential, useful but not essential, or not necessary)
CVR = (n_e - N/2) / (N/2), where n_e = number of raters who rate the item essential and N = total number of raters
Negative = fewer than half rated it essential (not essential), Positive = more than half rated it essential (could be essential)
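A minimal sketch of this calculation in Python, assuming Lawshe's standard formulation of the CVR; the panel size and rater counts below are hypothetical:

```python
# Content validity ratio (Lawshe): CVR = (n_e - N/2) / (N/2)
# n_e = raters calling the item "essential", N = total raters (values below are made up).

def content_validity_ratio(n_essential: int, n_raters: int) -> float:
    """CVR for a single item."""
    return (n_essential - n_raters / 2) / (n_raters / 2)

print(content_validity_ratio(8, 10))  # 0.6  -> most raters call it essential, keep
print(content_validity_ratio(3, 10))  # -0.4 -> negative, item is likely dropped
```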
Face Validity
Whether an instrument LOOKS like it measures what it is meant to measure
- no empirical way to measure this, therefore no longer deemed a legitimate form of validity evidence
Criterion-Related Validity
Reflects ABILITY; how do the test scores compare to a known standard or measurement
e.g. how well does the SAT predict college success
This approach more appropriately demonstrates empirical validity (because it relies on quantitative data) than theoretical validity
Criterion
A standard or outcome measure
Usually a score on a separate test that tries to measure the same construct/set of abilities as the test in question
What are the two ways to get evidence for establishing criterion-related validity?
- Concurrent Validity
- Predictive Validity
Only need to use one to establish criterion validity; choose based on purpose of test and decisions that need to be made from results
Concurrent Validity
to establish criterion-related validity
When the test score and the criterion performance measure happen at the same time
e.g. during an intake interview, a clinician assesses signs and symptoms and compares them to DSM criteria to make a diagnosis
Predictive Validity
to establish criterion-related validity
When the criterion measure will take place at some point in the future
What are the characteristics of a sound criterion? (4)
- Relevant/useful - the rationale for choosing the criterion should be explicit
- Reliable - There should not be large unsystematic error in the criterion because this makes it impossible for researchers to compare the test score to it
- Freedom from bias - criterion should be an objective measure of skill or ability
- Immune from contamination (contamination = when previous knowledge influences the gathering of criterion data)
Construct Validity
Extent to which a test can accurately/thoroughly measure a particular construct
Whether an instrument is measuring the CORRECT construct??
*most difficult to find validity evidence for
Define Construct
Abstraction that cannot be directly seen but helps to organize a ton of observations in the real world
e.g. intelligence
Test Homogeneity
The degree to which all items on a test appear to measure the same construct.
Increased test homogeneity = increased validity
How is test homogeneity measured?
Item analysis (many ways to do this)
- check internal consistency (high internal consistency = high test homogeneity)
- Item discrimination: How well does each individual item score correlate with the overall score of the test?
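A minimal sketch of both checks on a synthetic item-response matrix; the data, item count, and scoring scale are hypothetical, and only NumPy is assumed:

```python
import numpy as np

# Hypothetical item-response matrix: 200 respondents x 5 Likert-style items (0-4).
rng = np.random.default_rng(0)
ability = rng.normal(size=200)
items = np.clip(np.round(ability[:, None] + rng.normal(scale=0.8, size=(200, 5)) + 2), 0, 4)

# Internal consistency: Cronbach's alpha (higher alpha -> more homogeneous test).
k = items.shape[1]
alpha = (k / (k - 1)) * (1 - items.var(axis=0, ddof=1).sum() / items.sum(axis=1).var(ddof=1))

# Item discrimination: corrected item-total correlation
# (each item vs. the total of the remaining items).
totals = items.sum(axis=1)
item_total_r = [np.corrcoef(items[:, j], totals - items[:, j])[0, 1] for j in range(k)]

print(round(alpha, 2), [round(r, 2) for r in item_total_r])
```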
What are the ways to measure construct validity?
- Test homogeneity
- Item Discrimination
- Convergent Validity
- Discriminant Validity
- Group Differentiation studies
- Factor analysis
Define Convergent Validity
Comparing scores on a test to other tests that measure the same construct
If someone was depressed they would score high on the Beck AND my test –> my test has convergent validity
Discriminant Validity
When test scores on my test do NOT correlate with another test that measures an alternate construct
If someone was depressed they should score high on my test and low on a scale for happiness
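A minimal sketch of both checks, assuming hypothetical score vectors for my test, an established depression measure (the Beck mentioned above), and a happiness scale; all numbers are made up:

```python
import numpy as np

# Hypothetical scores for the same 8 respondents on three instruments.
my_test   = np.array([30, 25, 28, 10,  8, 22, 12, 27])
beck_like = np.array([29, 24, 30, 12,  9, 21, 14, 26])   # same construct (depression)
happiness = np.array([ 5, 10,  6, 30, 32, 12, 28,  8])   # different (opposite) construct

# Convergent evidence: high positive correlation with the same-construct measure.
print(np.corrcoef(my_test, beck_like)[0, 1])   # close to +1

# Discriminant evidence: scores should NOT track the other construct;
# with a happiness scale the correlation comes out low/negative rather than near +1.
print(np.corrcoef(my_test, happiness)[0, 1])
```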
How do group differentiation studies measure construct validity?
Administering the test to a normal group and a pathological group –> the test results of each group shouldn’t look like each other
Could also administer the test to see how developmental changes impact the group (i.e. as the group matures, scores should change because abilities should become better) (although this alone would not be enough to establish construct validity)
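A minimal sketch of a group-differentiation check, assuming hypothetical score samples for a non-clinical group and a clinical group; SciPy is used for an independent-samples t-test:

```python
import numpy as np
from scipy import stats

# Hypothetical test scores for a non-clinical group and a clinical (pathological) group.
rng = np.random.default_rng(1)
nonclinical = rng.normal(loc=10, scale=3, size=50)
clinical = rng.normal(loc=20, scale=3, size=50)

# If the test taps the construct, the two groups' score distributions should differ.
t, p = stats.ttest_ind(nonclinical, clinical)
print(f"t = {t:.2f}, p = {p:.4f}")  # a small p suggests the test separates the groups
```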
How does Factor analysis help establish construct validity?
Factor analysis: stat. technique used to mathematically group items together –> indicates the similarity of items (how well do they capture a specific construct)
Goal of factor analysis: is the test unidimensional or multidimensional? (all one construct or multiple constructs)
Can use:
Exploratory factor analysis - researcher doesn’t know how items on a test relate to each other
Confirmatory factor analysis - when the factor structure is known and the researcher is trying to see if it holds for a particular population
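A minimal sketch of the exploratory side using scikit-learn's FactorAnalysis; the two-factor synthetic data are hypothetical, and confirmatory factor analysis would normally use a dedicated SEM package rather than this sketch:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical data: 6 items driven by 2 latent constructs (300 respondents).
rng = np.random.default_rng(2)
f1, f2 = rng.normal(size=(2, 300))
items = np.column_stack([
    f1 + rng.normal(scale=0.4, size=300),  # items 1-3 load on factor 1
    f1 + rng.normal(scale=0.4, size=300),
    f1 + rng.normal(scale=0.4, size=300),
    f2 + rng.normal(scale=0.4, size=300),  # items 4-6 load on factor 2
    f2 + rng.normal(scale=0.4, size=300),
    f2 + rng.normal(scale=0.4, size=300),
])

# Exploratory factor analysis: how do the items group together?
fa = FactorAnalysis(n_components=2).fit(items)
print(np.round(fa.components_.T, 2))  # loadings; items should split into two clusters
```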
What does a validity coefficient indicate? How do you interpret it?
A measure of how accurate a test actually is (how strongly test scores relate to the criterion)
- Make sure it is statistically significant (p < .01 or p < .05)
(significance indicates that a relationship between the test score and the criterion has been established)
Very High: .50 and above
High: .40-.49
Moderate: .21-.39
Low: .20 or less
EXPECTED to be lower than reliability coefficients because they are more difficult to measure
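A minimal sketch of computing and interpreting a validity coefficient, assuming hypothetical test and criterion scores; SciPy's pearsonr gives both the correlation and its significance:

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical data: test scores and a criterion measure (e.g., GPA) for 12 people.
test_scores = np.array([52, 61, 48, 70, 66, 55, 43, 75, 58, 63, 50, 68])
criterion   = np.array([2.6, 3.1, 2.4, 3.6, 3.2, 2.9, 2.3, 3.8, 2.8, 3.0, 2.7, 3.4])

r, p = pearsonr(test_scores, criterion)
print(f"validity coefficient r = {r:.2f}, p = {p:.4f}")

# Rough interpretation bands from the card above (only meaningful if p < .05 or p < .01):
# r >= .50 very high, .40-.49 high, .21-.39 moderate, <= .20 low
```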