Chapter 6: Validity Flashcards
a judgment or estimate of how well a test measures what it purports to measure in a particular context
validity
The process of gathering and evaluating evidence about validity
validation
T or F: Both test developers and test users may play a role in the validation of a test
true
May yield insights regarding a particular population of test takers as compared to the norming sample described in a test manual
Local validation studies
three categories of validity
- content validity
- criterion-related validity
- construct validity
This measure of validity is based on an evaluation of the subjects, topics, or content covered by the items in the test
content validity
This measure of validity is obtained by evaluating the relationship of scores obtained on the test to scores on other tests or measures
criterion-related validity
This measure of validity is arrived at by executing a comprehensive analysis of:
- how scores on the test relate to other test scores and measures
- how scores on the test can be understood within some theoretical framework for understanding the construct that the test was designed to measure
construct validity
A judgment concerning how relevant the test items appear to be
face validity
If a test appears to measure what it purports to measure “on the face of it,” it could be said to be __________
high in face validity
A judgment of how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample
content validity
A plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, the organization of the items in the test, etc.
test blueprint
Culture and the relativity of content validity
- the content validity of a test varies across cultures and time
- political considerations may also play a role
measures agreement among raters regarding how essential an individual test item is for inclusion in a test
content validity ratio
range of values of the content validity ratio
-1 to +1
what does it mean if the content validity ratio is closer to +1
a majority of the experts agree that the item is essential to the domain being measured
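Worked sketch for the content validity ratio card. It assumes the formula usually attributed to Lawshe, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists who rate the item "essential" and N is the panel size. The function name and arguments are hypothetical, chosen only for illustration.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Content validity ratio for one item (assumed Lawshe-style formula).

    n_essential: panelists who rated the item "essential"
    n_panelists: total number of panelists who rated the item
    """
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    # Positive when more than half say "essential", negative when fewer than half do.
    return (n_essential - half) / half
```

With a 10-person panel, 10 "essential" ratings give +1, exactly 5 give 0, and none give -1, which matches the -1 to +1 range on the card above.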
the standard against which a test or a test score is evaluated
criterion
characteristics of an adequate criterion
- relevant to the matter at hand
- valid for the purpose for which it is being used
A judgment of how adequately a test score can be used to infer an individual’s most probable standing on some measure of interest
a. criterion-related validity
b. concurrent validity
c. predictive validity
a. criterion-related validity
An index of the degree to which a test score is related to some criterion measure obtained at the same time
a. criterion-related validity
b. concurrent validity
c. predictive validity
b. concurrent validity
An index of the degree to which a test score predicts some criterion measure
a. criterion-related validity
b. concurrent validity
c. predictive validity
c. predictive validity
Statistical evidence for concurrent and predictive validity
- expectancy data (expectancy table or chart)
- validity coefficient
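Worked sketch for expectancy data. An expectancy table groups test takers into test-score bands and reports, for each band, the proportion who ended up in each criterion category. The function name, the band layout, and the category labels are all assumptions made for illustration.

```python
from collections import Counter


def expectancy_table(records, bands):
    """Build a simple expectancy table.

    records: iterable of (test_score, criterion_category) pairs
    bands:   list of (low, high) inclusive test-score intervals
    Returns {band: {criterion_category: proportion_of_band}}.
    """
    counts = {band: Counter() for band in bands}
    for score, category in records:
        for low, high in bands:
            if low <= score <= high:
                counts[(low, high)][category] += 1
                break  # each score is counted in at most one band
    table = {}
    for band, counter in counts.items():
        total = sum(counter.values())
        # An empty band gets an empty row rather than a division by zero.
        table[band] = {cat: n / total for cat, n in counter.items()} if total else {}
    return table
```

For example, if two people scored in the 50 to 69 band and one of them later passed the criterion, that band's row reads pass 0.5, fail 0.5.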
A correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure
validity coefficient
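Worked sketch for the validity coefficient. It assumes the Pearson correlation is the coefficient in use, which is the typical choice; other coefficients may fit better depending on the type of data. The function name is hypothetical.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson r between test scores and criterion scores (assumed choice of coefficient)."""
    if len(test_scores) != len(criterion_scores):
        raise ValueError("both score lists must have the same length")
    n = len(test_scores)
    if n < 2:
        raise ValueError("at least two paired scores are required")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when either variable has no variance")
    return sxy / sqrt(sxx * syy)
```

Values near +1 or -1 indicate a strong linear relationship between the test and the criterion; values near 0 indicate little relationship.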
The degree to which an additional predictor explains something about the criterion measure that is not explained by predictors already in use
incremental validity
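Worked sketch for incremental validity. One common way to quantify it (an assumption here, not stated on the card) is the gain in R-squared when a second predictor is added to a regression that already contains the first. With exactly two predictors, R-squared can be computed from the three pairwise correlations. The function and argument names are hypothetical.

```python
def incremental_r_squared(r_y1: float, r_y2: float, r_12: float) -> float:
    """Gain in R^2 from adding predictor 2 to a model that already has predictor 1.

    r_y1: correlation of the criterion with the predictor already in use
    r_y2: correlation of the criterion with the additional predictor
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors are perfectly collinear; the gain is undefined")
    # Two-predictor multiple R^2 expressed through pairwise correlations.
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    # R^2 with only the first predictor is simply r_y1 squared.
    return r2_both - r_y1 ** 2
```

If the new predictor is uncorrelated with the old one (r_12 = 0), the gain equals r_y2 squared. If it overlaps heavily with the old one, the gain shrinks toward zero even when r_y2 itself looks respectable.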
Proportion of people a test accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute
hit rate
Proportion of people the test fails to identify as having or not having a particular characteristic or attribute
miss rate
the extent to which a particular trait, behavior, characteristic, or attribute exists in the population, expressed as a proportion; in personnel selection, the percentage of people hired under the existing system for a particular position
base rate
numerical value that reflects the relationship between the number of people to be hired and the number of people available to be hired
selection ratio
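Worked sketch for base rate and selection ratio. Both are simple proportions. The function names are hypothetical, and the base rate here follows the general "proportion of the population with the attribute" reading of the card.

```python
def base_rate(n_with_attribute: int, population_size: int) -> float:
    """Proportion of the population that has the trait, behavior, or attribute."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    return n_with_attribute / population_size


def selection_ratio(n_to_hire: int, n_available: int) -> float:
    """Number of people to be hired divided by the number available to be hired."""
    if n_available <= 0:
        raise ValueError("n_available must be positive")
    return n_to_hire / n_available
```

For instance, 50 openings and 100 applicants give a selection ratio of 0.5, and 20 people with an attribute in a population of 100 give a base rate of 0.2.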
a miss wherein the test predicted that the examinee did possess the particular characteristic being measured when the examinee did not
false positive (type 1 error)
a miss wherein the test predicted that the examinee did not possess the particular characteristic being measured when the examinee did
false negative (type 2 error)
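Worked sketch tying together hit rate, miss rate, false positives, and false negatives. It assumes a hit is any correct classification (a true positive or a true negative) and a miss is any incorrect one, with every rate expressed as a proportion of all people tested. If your text defines hit rate more narrowly, for example over true positives only, adjust accordingly. The function name and dictionary keys are hypothetical.

```python
def classification_rates(true_pos: int, false_pos: int, true_neg: int, false_neg: int) -> dict:
    """Hit, miss, false-positive, and false-negative rates as proportions of everyone tested."""
    total = true_pos + false_pos + true_neg + false_neg
    if total <= 0:
        raise ValueError("at least one person must have been tested")
    return {
        # Assumption: hits are all correct classifications.
        "hit_rate": (true_pos + true_neg) / total,
        # Misses are the two kinds of error combined.
        "miss_rate": (false_pos + false_neg) / total,
        # Test said "has it", examinee did not (Type I error).
        "false_positive_rate": false_pos / total,
        # Test said "does not have it", examinee did (Type II error).
        "false_negative_rate": false_neg / total,
    }
```

Under these assumptions the hit rate and miss rate always sum to 1, and the miss rate is the sum of the false-positive and false-negative rates.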
Judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a construct
construct validity
T or F: If a test is a valid measure of a construct, then high scorers and low scorers should behave as theorized
true
evidence of construct validity
- homogeneity
- changes with age
- pretest-posttest changes
- evidence from distinct groups
evidence of construct validity: How uniform a test is in measuring a single concept
evidence of homogeneity
evidence of construct validity: Some constructs are expected to change over time (e.g., reading rate)
evidence of changes with age
evidence of construct validity: Test scores change as a result of some experience between a pretest and a posttest (e.g., therapy)
evidence of pretest-posttest changes
evidence of construct validity: Scores on a test vary in a predictable way as a function of membership in some group
evidence from distinct groups
Scores on the test undergoing construct validation tend to correlate highly in the predicted direction with scores on older, more established tests designed to measure the same (or a similar) construct
convergent evidence
A validity coefficient showing little relationship between test scores and other variables with which scores on the test should not theoretically be correlated
discriminant/divergent evidence
Class of mathematical procedures designed to identify specific variables on which people may differ
factor analysis
A factor inherent in a test that systematically prevents accurate, impartial measurement
bias
A judgment resulting from the intentional or unintentional misuse of a rating scale
rating error
Raters may be either too lenient, too severe, or reluctant to give ratings at the extremes
leniency (generosity) error, severity error, and central tendency error, respectively
A tendency to give a particular person a higher rating than he or she objectively deserves because of a favorable overall impression
halo effect
The extent to which a test is used in an impartial, just, and equitable way
fairness