W2 - Chapter 6 - Validity (DN) Flashcards
1
Q
base rate
A
- an index
- usually expressed as a proportion of the extent to which a particular trait, behaviour, characteristic, or attribute exists in a population
p. 193, 219
2
Q
bias
A
- as applied to tests
- a factor inherent in a test which systematically prevents accurate, impartial measurement
p. 204-206
3
Q
central tendency error
A
- a type of rating error
- a rater exhibits a general reluctance to rate at either the positive or negative extreme
- so all or most ratings end up in the middle of the rating continuum
p.205
4
Q
concurrent validity
A
- a form of criterion-related validity
- an index of the degree that a test score relates to some criterion measure obtained at the same time (concurrently)
p. 190, 191-192
5
Q
confirmatory factor analysis (CFA)
A
- class of mathematical procedures
- employed when an explicitly hypothesised factor structure is tested for its fit with an observed relationship between variables
p. 203, 345
6
Q
construct
A
- an informed, scientific idea developed or generated to explain behaviour
- e.g., ‘intelligence’, ‘personality’, ‘anxiety’, and ‘job satisfaction’
p.119, 198
7
Q
construct validity
A
- a judgement about the appropriateness of inferences drawn from test scores
- regarding individual standings on a variable (construct)
p.198-204
8
Q
content validity
A
- a judgement regarding how adequately a test samples behaviour representative of the universe of behaviour it was designed to measure
p. 184-189
9
Q
content validity ratio (CVR)
A
- a FORMULA developed by C. H. Lawshe
- used to GAUGE AGREEMENT among RATERS regarding how ESSENTIAL an INDIVIDUAL TEST ITEM is for INCLUSION in a test
p. 187-188
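Lawshe's formula can be written as CVR = (n_e - N/2) / (N/2), where n_e is the number of raters who judge the item 'essential' and N is the total number of raters. A minimal Python sketch follows; the function and parameter names are illustrative and not taken from the textbook.

```python
def content_validity_ratio(n_essential: int, n_raters: int) -> float:
    """Lawshe's CVR for one item: (n_e - N/2) / (N/2)."""
    if n_raters <= 0:
        raise ValueError("n_raters must be positive")
    if not 0 <= n_essential <= n_raters:
        raise ValueError("n_essential must lie between 0 and n_raters")
    half = n_raters / 2
    return (n_essential - half) / half


# All raters say 'essential' -> 1.0; exactly half -> 0.0; none -> -1.0
print(content_validity_ratio(10, 10))
print(content_validity_ratio(5, 10))
print(content_validity_ratio(0, 10))
```

A CVR above zero means more than half of the raters judged the item essential.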
10
Q
convergent evidence
A
- with reference to CONSTRUCT VALIDITY
- data from OTHER MEASUREMENT INSTRUMENTS designed to measure THE SAME or SIMILAR CONSTRUCT (as the test being ‘CONSTRUCT-VALIDATED’)
11
Q
convergent validity
A
- data indicating that a test measures THE SAME CONSTRUCT as ANOTHER TEST purporting to measure the SAME CONSTRUCT
p. 202n3
12
Q
criterion
A
- the STANDARD against which a test or test score is EVALUATED
- this standard may take many forms
- (e.g., a specific BEHAVIOUR, or SET of BEHAVIOURS)
p.139, 190, 421
13
Q
criterion contamination
A
- a state in which a CRITERION MEASURE is ITSELF based, in whole or in part, on a PREDICTOR MEASURE
p.190
14
Q
criterion-related validity
A
- a JUDGEMENT regarding how ADEQUATELY a score or index on a test or other measurement tool CAN be used to INFER an individual’s MOST PROBABLE standing on some measure of interest (the criterion)
p. 190-198
15
Q
discriminant evidence
A
- with reference to CONSTRUCT VALIDITY
- DATA from a test or other measurement instrument SHOWING LITTLE RELATIONSHIP between test scores and other VARIABLES with which the scores should NOT theoretically be correlated
- in other words, there is little or no relationship where little or no relationship is expected
- contrast with convergent evidence
p.202-203
16
Q
expectancy chart
A
- graphic representation of an EXPECTANCY TABLE
p.196
17
Q
expectancy data
A
- information, usually in the form of an EXPECTANCY CHART
- illustrates the likelihood that an INDIVIDUAL TESTTAKER will score WITHIN some INTERVAL of SCORES on a CRITERION MEASURE
p.196-198, 219, 229
18
Q
expectancy table
A
- information presented in TABULAR FORM
- illustrates the likelihood that an INDIVIDUAL TESTTAKER will score WITHIN some INTERVAL of SCORES on a CRITERION MEASURE
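As a rough illustration of how an expectancy table can be built, the Python sketch below groups testtakers by test-score interval and reports the proportion in each interval who met the criterion. It assumes a simple pass/fail criterion; the function name, the score intervals, and the sample data are all hypothetical.

```python
def expectancy_table(test_scores, criterion_success, intervals):
    """Map each (low, high) test-score interval to the proportion of
    testtakers in that interval who succeeded on the criterion."""
    table = {}
    for low, high in intervals:
        outcomes = [
            ok
            for score, ok in zip(test_scores, criterion_success)
            if low <= score <= high
        ]
        # None marks an interval that contains no testtakers
        table[(low, high)] = sum(outcomes) / len(outcomes) if outcomes else None
    return table


scores = [10, 20, 30, 40, 50, 60]                   # made-up test scores
success = [False, False, True, False, True, True]   # made-up criterion outcomes
bands = [(0, 29), (30, 49), (50, 100)]
print(expectancy_table(scores, success, bands))
# {(0, 29): 0.0, (30, 49): 0.5, (50, 100): 1.0}
```

Plotting these proportions against the score intervals would give the corresponding expectancy chart.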
19
Q
exploratory factor analysis
A
- a class of MATHEMATICAL PROCEDURES
- employed to ESTIMATE FACTORS, EXTRACT FACTORS, or DECIDE HOW MANY FACTORS TO RETAIN
p.203
20
Q
face validity
A
- a judgement (perception) regarding how well a test or other tool measures what it purports to measure
- based solely on ‘APPEARANCES’, such as the content of the test’s items
p. 183-184
21
Q
factor analysis
A
- a class of MATHEMATICAL PROCEDURES
- frequently employed as DATA REDUCTION methods
- designed to IDENTIFY VARIABLES (factors) on which people may DIFFER
p.203-204
22
Q
factor loading
A
- in FACTOR ANALYSIS
- a metaphor suggesting that a test (or test item) carries with it or ‘loads’ on a certain amount of ONE or MORE ABILITIES
- these abilities in turn have a DETERMINING INFLUENCE on the test score (or on the response to an individual test item)
p.203
23
Q
fairness
A
- as applied to TESTS
- the extent to which a test is used in an IMPARTIAL, JUST, and EQUITABLE way
p.206-210
24
Q
false negative
A
- a specific type of MISS
- an assessment tool indicates a testtaker DOES NOT possess or exhibit a particular trait, ability, behaviour, or attribute when, in fact, they DO possess it
p.193, 590
25
false positive
- an ERROR in measurement
- an assessment tool indicates a testtaker DOES possess or exhibit a particular trait, ability, behaviour, or attribute when, in fact, they DO NOT
p.193, 590
26
generosity error
- also referred to as LENIENCY ERROR
- a less than accurate rating or evaluation by a rater
- due to the RATER'S general tendency to be LENIENT or INSUFFICIENTLY CRITICAL
- contrast with severity error
p.203, 403
27
halo effect
- a type of RATING ERROR
- the RATER VIEWS the OBJECT being rated with EXTREME FAVOUR and tends to bestow ratings INFLATED in a POSITIVE DIRECTION
- a set of circumstances resulting in a RATER'S tendency to be POSITIVELY DISPOSED and INSUFFICIENTLY CRITICAL
p.206, 403
28
hit rate
- the PROPORTION of people who are ACCURATELY IDENTIFIED as POSSESSING or NOT POSSESSING a particular trait, behaviour, characteristic, or attribute BASED on TEST SCORES
p.193
29
homogeneity
- when a test contains ITEMS that MEASURE a SINGLE TRAIT
- i.e., the DEGREE to which a test measures a SINGLE FACTOR (the extent to which items in a scale are UNIFACTORIAL)
- the more HOMOGENEOUS a test, the greater its INTER-ITEM CONSISTENCY; it is expected to have higher IC than a HETEROGENEOUS TEST
- desirable because it allows straightforward INTERPRETATION (i.e., similar scores = similar abilities on the variable of interest)
p.154-155
30
incremental validity
- used in conjunction with PREDICTIVE VALIDITY
- an INDEX of the EXPLANATORY POWER of ADDITIONAL PREDICTORS over and above the predictors already in use
p. 195-196
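One common way to quantify incremental validity is the gain in R² (proportion of criterion variance explained) when a new predictor is added to a regression. The sketch below assumes the simplest case, one existing predictor plus one new predictor, and uses the standard two-predictor R² identity based on the three pairwise correlations. The function and argument names are illustrative only.

```python
def incremental_validity(r_y1: float, r_y2: float, r_12: float) -> float:
    """Gain in R^2 from adding predictor 2 to a model that already uses predictor 1.

    r_y1: correlation between the criterion and the existing predictor
    r_y2: correlation between the criterion and the new predictor
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    r2_existing = r_y1 ** 2
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_both - r2_existing


# Hypothetical correlations: the new predictor adds a little under 7%
# of explained criterion variance.
print(round(incremental_validity(0.5, 0.4, 0.3), 4))  # 0.0687
```

The more the new predictor overlaps with the existing one (larger r_12), the less it tends to add.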
31
inference
- a LOGICAL RESULT or a DEDUCTION in a REASONING PROCESS
p.181
32
intercept bias
- a reference to the INTERCEPT of a REGRESSION LINE exhibited by a test or measurement procedure that SYSTEMATICALLY UNDER-PREDICTS or OVER-PREDICTS the performance of members of a group
- contrast with slope bias
p.571
33
leniency error
- also referred to as GENEROSITY ERROR
- a less than accurate rating or evaluation by a rater
- due to the RATER'S general tendency to be LENIENT or INSUFFICIENTLY CRITICAL
- contrast with severity error
p.203, 403
34
local validation study
- the process of GATHERING EVIDENCE relevant to HOW WELL a test measures what it PURPORTS to MEASURE
- PURPOSE: evaluating the VALIDITY of a TEST or other MEASUREMENT TOOL
- WHY: typically done in conjunction with a population DIFFERENT from the POPULATION for whom the test was ORIGINALLY validated
- basically, validating it on a local (new) population
p.182
35
method of contrasted groups
- a system of COLLECTING DATA on a PREDICTOR of INTEREST from groups KNOWN to POSSESS, and groups KNOWN NOT to POSSESS, a trait, attribute, or ability of interest
p.236-237
36
miss rate
- the PROPORTION of people a test or other measurement procedure FAILS to IDENTIFY ACCURATELY with respect to the possession or exhibition of a trait, behaviour, characteristic, or attribute
- a MISS in this context is an INACCURATE CLASSIFICATION or PREDICTION
- may be sub-divided into FALSE POSITIVES and FALSE NEGATIVES
p.193
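The relationship between base rate, hit rate, miss rate, false positives, and false negatives can be sketched in Python as below. The sketch assumes each testtaker has a test-based classification and a true status, both recorded as booleans; the function name, dictionary keys, and sample data are hypothetical.

```python
def classification_rates(predicted, actual):
    """Proportions of hits and misses when a test classifies people as
    possessing (True) or not possessing (False) an attribute."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(actual)
    # False positive: test says 'possesses' but the person does not
    false_pos = sum(1 for p, a in zip(predicted, actual) if p and not a)
    # False negative: test says 'does not possess' but the person does
    false_neg = sum(1 for p, a in zip(predicted, actual) if a and not p)
    hits = n - false_pos - false_neg
    return {
        "base_rate": sum(1 for a in actual if a) / n,
        "hit_rate": hits / n,
        "miss_rate": (false_pos + false_neg) / n,
        "false_positive_proportion": false_pos / n,
        "false_negative_proportion": false_neg / n,
    }


predicted = [True, True, False, False, True, False, True, False]
actual = [True, False, False, True, True, False, True, False]
print(classification_rates(predicted, actual))
```

With this made-up data there is one false positive and one false negative among eight testtakers, so the hit rate is 0.75, the miss rate is 0.25, and the hit rate and miss rate sum to 1.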
37
multitrait-multimethod matrix
- a method of evaluating CONSTRUCT VALIDITY by simultaneously examining both CONVERGENT EVIDENCE and DIVERGENT (DISCRIMINANT) EVIDENCE
- HOW: by means of a TABLE of CORRELATIONS between TRAITS and METHODS
p.203
38
predictive validity
- a form of CRITERION-RELATED VALIDITY
- an INDEX of the DEGREE to which a test score PREDICTS some FUTURE CRITERION MEASURE
p.190
39
ranking
- the ORDINAL ordering of persons, scores, or variables into RELATIVE POSITIONS or DEGREES of VALUE
p.206
40
rating
- a NUMERICAL or VERBAL JUDGEMENT
- places a person or attribute along a CONTINUUM, identified by a scale of NUMERICAL or WORD DESCRIPTORS (called a RATING SCALE)
p.205
41
rating error
- a JUDGEMENT that results from the intentional or unintentional MISUSE of a RATING SCALE
- types include: 1) LENIENCY (GENEROSITY) ERROR 2) SEVERITY ERROR 3) CENTRAL TENDENCY ERROR 4) HALO EFFECT
p.205
42
rating scale
- a SYSTEM of ORDERED NUMERICAL or VERBAL descriptors
- judgements about the PRESENCE/ABSENCE or MAGNITUDE of a particular trait, attitude, emotion, or other variable are indicated on it by RATERS, judges, examiners, or (when the rating scale reflects self-report) the assessee
p.205, 247, 371
43
severity error
- a less than accurate rating or error in evaluation
- due to the RATER'S tendency to be OVERLY CRITICAL
- contrast with generosity error
p.205, 403
44
slope bias
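- a reference to the SLOPE of a REGRESSION LINE being DIFFERENT between groups
- a test or measurement procedure SYSTEMATICALLY yields DIFFERENT VALIDITY COEFFICIENTS for members of different groups
- contrast with intercept bias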
45
test blueprint
- a detailed plan of the CONTENT, ORGANISATION, and QUANTITY of the ITEMS that a test will contain
p.184, 186
46
validation
- the process of GATHERING and EVALUATING validity evidence
p.182
47
validation study
- research that entails GATHERING EVIDENCE relevant to HOW WELL a test measures what it PURPORTS to measure
- PURPOSE: EVALUATING the VALIDITY of a test or other measurement tool
p.182
48
validity
- a JUDGEMENT regarding HOW WELL a test MEASURES what it PURPORTS to MEASURE
- this judgement has important implications regarding the APPROPRIATENESS of INFERENCES MADE and ACTIONS TAKEN on the basis of measurements
p.125
49
validity coefficient
- a CORRELATION COEFFICIENT that provides a measure of the RELATIONSHIP between TEST SCORES and SCORES on a CRITERION MEASURE
p.192-195
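A validity coefficient is often computed as a Pearson correlation between test scores and criterion scores, although other correlation coefficients may suit other kinds of data. The Python sketch below assumes Pearson r; the function name and the sample scores are made up for illustration.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson r between test scores and criterion scores."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    sxx = sum((x - mean_x) ** 2 for x in test_scores)
    syy = sum((y - mean_y) ** 2 for y in criterion_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


test = [52, 61, 70, 75, 88]          # hypothetical test scores
criterion = [2.1, 2.8, 3.0, 3.4, 3.9]  # hypothetical criterion scores
print(round(validity_coefficient(test, criterion), 2))
```

Values closer to 1 (or -1) indicate a stronger linear relationship between the test and the criterion.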