W2 - Chapter 6 - Validity (DN) Flashcards
base rate
- an index of the extent to which a particular trait, behaviour, characteristic, or attribute exists in a population
- usually expressed as a proportion
p. 193, 219
bias
- as applied to tests
- a factor inherent in a test which systematically prevents accurate, impartial measurement
p. 204-206
central tendency error
- a type of rating error
- a rater exhibits a general reluctance to rate at either the positive or negative extreme
- so all or most ratings end up in the middle of the rating continuum
p.205
concurrent validity
- a form of criterion-related validity
- an index of the degree to which a test score relates to some criterion measure obtained at the same time (concurrently)
p. 190, 191-192
confirmatory factor analysis (CFA)
- a class of mathematical procedures
- employed when an explicitly hypothesised factor structure is tested for its fit with an observed relationship between variables
p. 203, 345
construct
- an informed, scientific idea developed or generated to explain behaviour
- e.g., ‘intelligence’, ‘personality’, ‘anxiety’, and ‘job satisfaction’
p.119, 198
construct validity
- a judgement about the appropriateness of inferences drawn from test scores
- regarding individual standings on a variable (construct)
p.198-204
content validity
- a judgement regarding how adequately a test samples behaviour representative of the universe of behaviour it was designed to measure
p. 184-189
content validity ratio (CVR)
- a FORMULA developed by C. H. Lawshe
- used to GAUGE AGREEMENT among RATERS regarding how ESSENTIAL an INDIVIDUAL TEST ITEM is for INCLUSION in a test
p. 187-188
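The card names the formula without stating it. A minimal sketch, assuming the commonly cited form of Lawshe's ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating the item "essential" and N is the total number of panelists. The function and parameter names are illustrative, not from the textbook.

```python
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    """Lawshe's CVR for one item: (n_e - N/2) / (N/2).

    n_essential: panelists who rated the item 'essential' (n_e)
    n_panelists: total panelists who rated the item (N)
    """
    if n_panelists <= 0:
        raise ValueError("n_panelists must be positive")
    if not 0 <= n_essential <= n_panelists:
        raise ValueError("n_essential must be between 0 and n_panelists")
    half = n_panelists / 2
    return (n_essential - half) / half


# Hypothetical panel: 9 of 10 raters call the item essential.
print(content_validity_ratio(9, 10))  # 0.8
```

The ratio runs from -1.0 (no rater says "essential") through 0.0 (exactly half do) to 1.0 (all do).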
convergent evidence
- with reference to CONSTRUCT VALIDITY
- data from OTHER MEASUREMENT INSTRUMENTS designed to measure THE SAME or SIMILAR CONSTRUCT (as the test being ‘CONSTRUCT-VALIDATED’)
convergent validity
- data indicating that a test measures THE SAME CONSTRUCT as ANOTHER TEST purporting to measure the SAME CONSTRUCT
p. 202n3
criterion
- the STANDARD against which a test or test score is EVALUATED
- this standard may take many forms
- (e.g., a specific BEHAVIOUR, or SET of BEHAVIOURS)
p.139, 190, 421
criterion contamination
- a state in which a CRITERION MEASURE is ITSELF based, in whole or in part, on a PREDICTOR MEASURE
p.190
criterion-related validity
- a JUDGEMENT regarding how ADEQUATELY a score or index on a test or other measurement tool CAN be used to INFER an individual’s MOST PROBABLE standing on some measure of interest (the criterion)
p. 190-198
discriminant evidence
- with reference to CONSTRUCT VALIDITY
- DATA from a test or other measurement instrument SHOWING LITTLE RELATIONSHIP between test scores and other VARIABLES with which the scores should NOT theoretically be correlated
- in other words, there is little or no relationship where little or no relationship is expected
- contrast with convergent evidence
p.202-203
expectancy chart
- a graphic representation of an EXPECTANCY TABLE
p.196
expectancy data
- information, usually in the form of an EXPECTANCY CHART
- illustrates the likelihood that an INDIVIDUAL TESTTAKER will score WITHIN some INTERVAL of SCORES on a CRITERION MEASURE
p.196-198, 219, 229
expectancy table
- information presented in TABULAR FORM
- illustrates the likelihood that an INDIVIDUAL TESTTAKER will score WITHIN some INTERVAL of SCORES on a CRITERION MEASURE
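As a rough illustration of how such a table could be built, the sketch below groups testtakers into test-score intervals and reports, for each interval, the proportion falling into each criterion category. The function name, the score intervals, and the data are all made up for the example.

```python
from collections import Counter


def expectancy_table(pairs, score_bins):
    """Build a simple expectancy table.

    pairs: iterable of (test_score, criterion_category)
    score_bins: list of (low, high) inclusive test-score intervals
    Returns {(low, high): {criterion_category: proportion}}.
    """
    counts = {b: Counter() for b in score_bins}
    for score, outcome in pairs:
        for low, high in score_bins:
            if low <= score <= high:
                counts[(low, high)][outcome] += 1
                break  # each score is counted in at most one interval
    table = {}
    for interval, tally in counts.items():
        total = sum(tally.values())
        table[interval] = {k: v / total for k, v in tally.items()} if total else {}
    return table


# Hypothetical data: (test score, later job rating)
data = [(12, "satisfactory"), (15, "unsatisfactory"),
        (25, "satisfactory"), (28, "satisfactory")]
for interval, row in expectancy_table(data, [(10, 19), (20, 29)]).items():
    print(interval, row)
```

Each row reads as "of the testtakers scoring in this interval, this proportion ended up in each criterion category".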
exploratory factor analysis
- a class of MATHEMATICAL PROCEDURES
- employed to ESTIMATE FACTORS, EXTRACT FACTORS, or DECIDE HOW MANY FACTORS TO RETAIN
p.203
face validity
- a judgement (perception) regarding how well a test or other tool measures what it purports to measure
- based solely on ‘APPEARANCES’, such as the content of the test’s items
p. 183-184
factor analysis
- a class of MATHEMATICAL PROCEDURES
- frequently employed as DATA REDUCTION methods
- designed to IDENTIFY VARIABLES (factors) on which people may DIFFER
p.203-204
factor loading
- in FACTOR ANALYSIS
- a metaphor suggesting that a test (or test item) carries with it, or ‘loads’ on, a certain amount of ONE or MORE ABILITIES
- these abilities in turn have a DETERMINING INFLUENCE on the test score (or on the response to an individual test item)
p.203
fairness
- as applied to TESTS
- the extent to which a test is used in an IMPARTIAL, JUST, and EQUITABLE way
p.206-210
false negative
- a specific type of MISS
- an assessment tool indicates that a testtaker DOES NOT possess or exhibit a particular trait, ability, behaviour, or attribute when, in fact, they DO POSSESS it
p.193, 590
false positive
- an ERROR in measurement
- an assessment tool indicates that a testtaker DOES POSSESS or EXHIBIT a particular trait, ability, behaviour, or attribute when, in fact, they DO NOT
p.193, 590
generosity error
- also referred to as LENIENCY ERROR
- a less than accurate rating or evaluation by a rater
- due to the RATER’S general tendency to be LENIENT or INSUFFICIENTLY CRITICAL
- contrast with severity error
p.203, 403
halo effect
- a type of RATING ERROR
- the RATER VIEWS the OBJECT being rated with EXTREME FAVOUR and tends to bestow ratings INFLATED in a POSITIVE DIRECTION
- a set of circumstances resulting in a RATER’S tendency to be POSITIVELY DISPOSED and INSUFFICIENTLY CRITICAL
p.206, 403
hit rate
- the PROPORTION of people who are ACCURATELY IDENTIFIED as POSSESSING or NOT POSSESSING a particular trait, behaviour, characteristic, or attribute BASED on TEST SCORES
p.193
homogeneity
- the DEGREE to which a test measures a SINGLE FACTOR, i.e., the extent to which the items in a scale are UNIFACTORIAL
- a homogeneous test contains ITEMS that MEASURE a SINGLE TRAIT
- the more HOMOGENEOUS a test, the MORE INTER-ITEM CONSISTENCY it is expected to have (i.e., more than a HETEROGENEOUS TEST)
- desirable because it allows straightforward INTERPRETATION (i.e., similar scores = similar abilities on the variable of interest)
p.154-155
incremental validity
- used in conjunction with PREDICTIVE VALIDITY
- an INDEX of the EXPLANATORY POWER of ADDITIONAL PREDICTORS over and above the predictors already in use
p. 195-196
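One common way to put a number on this idea (an assumption here, since the card does not name a statistic) is the gain in R² when a second predictor is added to a regression that already contains the first. For two predictors, R² can be written in terms of the three pairwise correlations, so the sketch below takes those correlations as inputs. All names and values are hypothetical.

```python
def incremental_validity(r_y1: float, r_y2: float, r_12: float) -> float:
    """Gain in R^2 from adding predictor 2 to a model that already uses predictor 1.

    r_y1: correlation between the criterion and predictor 1 (already in use)
    r_y2: correlation between the criterion and predictor 2 (the addition)
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    # Squared multiple correlation for two predictors
    r2_both = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return r2_both - r_y1**2


# Hypothetical correlations: the new predictor overlaps with the old one (r_12 = .30)
print(round(incremental_validity(0.5, 0.4, 0.3), 3))  # 0.069
```

The more the new predictor overlaps with the one already in use, the less it tends to add, even when its own correlation with the criterion is substantial.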
inference
- a LOGICAL RESULT or a DEDUCTION in a REASONING PROCESS
p.181
intercept bias
- a reference to the INTERCEPT of a REGRESSION LINE
- exhibited by a test or measurement procedure that SYSTEMATICALLY UNDER-PREDICTS or OVER-PREDICTS the performance of members of a group
- contrast with slope bias
p.571
leniency error
- also referred to as GENEROSITY ERROR
- a less than accurate rating or evaluation by a rater
- due to the RATER’S general tendency to be LENIENT or INSUFFICIENTLY CRITICAL
- contrast with severity error
p.203, 403
local validation study
- the process of GATHERING EVIDENCE relevant to HOW WELL a test measures what it PURPORTS to MEASURE
- PURPOSE: evaluating the VALIDITY of a TEST or other MEASUREMENT TOOL
- WHY: typically done in conjunction with a population DIFFERENT from the POPULATION for whom the test was ORIGINALLY validated
- basically, validating it on a local (new) population
p.182
method of contrasted groups
- a system of COLLECTING DATA on a PREDICTOR of INTEREST from groups KNOWN to POSSESS, and KNOWN NOT to POSSESS, a trait, attribute, or ability of interest
p.236-237
miss rate
- the PROPORTION of people a test or other measurement procedure FAILS to IDENTIFY ACCURATELY with respect to the possession or exhibition of a trait, behaviour, characteristic, or attribute
- a MISS in this context is an INACCURATE CLASSIFICATION or PREDICTION
- may be sub-divided into FALSE POSITIVES and FALSE NEGATIVES
p.193
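The base rate, hit rate, miss rate, false positive, and false negative cards all describe the same cross-tabulation of what the test said against what was actually true. A minimal sketch with made-up data (function and key names are illustrative only):

```python
def classification_rates(predicted, actual):
    """Proportions from test classifications vs. true status.

    predicted: list of bools, True if the test says the trait is present
    actual:    list of bools, True if the person really has the trait
    """
    if len(predicted) != len(actual) or len(actual) == 0:
        raise ValueError("need two non-empty lists of equal length")
    n = len(actual)
    # Test says 'present' but the trait is absent
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    # Test says 'absent' but the trait is present
    false_neg = sum(1 for p, a in zip(predicted, actual) if a and not p)
    misses = false_pos + false_neg
    return {
        "base_rate": sum(1 for a in actual if a) / n,
        "hit_rate": (n - misses) / n,
        "miss_rate": misses / n,
        "false_positive_proportion": false_pos / n,
        "false_negative_proportion": false_neg / n,
    }


# Hypothetical results for 8 testtakers
predicted = [True, True, False, False, True, False, False, False]
actual    = [True, False, False, True, True, False, False, False]
print(classification_rates(predicted, actual))
```

Here every proportion is taken over all testtakers, so the hit rate and miss rate sum to 1 and the two kinds of miss sum to the miss rate.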
multitrait-multimethod matrix
- a method of evaluating CONSTRUCT VALIDITY by simultaneously examining both CONVERGENT EVIDENCE and DIVERGENT (discriminant) EVIDENCE
- HOW: by means of a TABLE of CORRELATIONS between TRAITS and METHODS
p.203
predictive validity
- a form of CRITERION-RELATED VALIDITY
- an INDEX of the DEGREE to which a test score PREDICTS some FUTURE CRITERION MEASURE
p.190
ranking
- the ORDINAL ordering of persons, scores, or variables into RELATIVE POSITIONS or DEGREES of VALUE
p.206
rating
- a NUMERICAL or VERBAL JUDGEMENT
- it places a person or attribute along a CONTINUUM, identified by a scale of NUMERICAL or WORD DESCRIPTORS (called a RATING SCALE)
p.205
rating error
- a JUDGEMENT that results from the intentional or unintentional MISUSE of a RATING SCALE
- types include 1) LENIENCY (GENEROSITY) ERROR, 2) SEVERITY ERROR, 3) CENTRAL TENDENCY ERROR, 4) HALO EFFECT
p.205
rating scale
- a SYSTEM of ORDERED NUMERICAL or VERBAL descriptors
- judgements about the PRESENCE/ABSENCE or MAGNITUDE of a particular trait, attitude, emotion, or other variable are indicated on it by RATERS, judges, examiners, or (when the rating scale reflects self-report) the assessee
p.205, 247, 371
severity error
- a less than accurate rating or error in evaluation
- due to the RATER’S tendency to be OVERLY CRITICAL
- contrast with generosity error
p.205, 403
slope bias
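- a reference to the SLOPE of a REGRESSION LINE being DIFFERENT between groups
- exhibited by a test or measurement procedure that SYSTEMATICALLY yields DIFFERENT VALIDITY COEFFICIENTS for members of different groups
- contrast with intercept bias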
test blueprint
- a detailed plan of the CONTENT, ORGANISATION, and QUANTITY of the ITEMS that a test will contain
p.184, 186
validation
- the process of GATHERING and EVALUATING validity evidence
p.182
validation study
- research that entails GATHERING EVIDENCE relevant to HOW WELL a test measures what it PURPORTS to measure
- PURPOSE: EVALUATING the VALIDITY of a test or other measurement tool
p.182
validity
- a JUDGEMENT regarding HOW WELL a test MEASURES what it PURPORTS to MEASURE
- this judgement has important implications regarding the APPROPRIATENESS of INFERENCES MADE and ACTIONS TAKEN on the basis of measurements
p.125
validity coefficient
- a CORRELATION COEFFICIENT that provides a measure of the RELATIONSHIP between TEST SCORES and SCORES on a CRITERION MEASURE
p.192-195
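A minimal sketch of one way to compute such a coefficient, assuming the Pearson r is the appropriate statistic for the data (other correlation coefficients may suit other types of data). The scores are invented for the example.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and criterion scores."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: aptitude test scores and later supervisor ratings
test = [10, 12, 15, 18, 20]
ratings = [2, 3, 3, 4, 5]
print(round(validity_coefficient(test, ratings), 2))
```

Values near 0 suggest the test says little about standing on the criterion; values nearer 1 (or -1) suggest a stronger relationship.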