Module 3: Validity and Utility Flashcards

Question 1

Q

Validity

Answer

A

a judgement or estimate of how well a test measures what it purports to measure.

Question 2

Q

Validation

Answer

A

the process of gathering and evaluating evidence about validity.

Question 3

Q

Validity is often conceptualised as three categories:

Answer

A

Content validity: based on an evaluation of the content covered by a test.
Criterion validity: obtained by evaluating the relationship between scores on your test and scores on other tests/measures.
Construct validity: arrived at by a comprehensive analysis of:
o How scores on the test relate to other scores and measures, and
o How scores on the test can be understood within some theoretical framework for understanding the construct the test was designed to measure.
*Face validity (the test appears to cover relevant content) is often mentioned alongside these three, but it concerns appearance only.

Question 4

Q

Face Validity

Answer

A

Face validity: a judgement concerning how relevant the test items appear to be.

If a test appears to measure what it purports to measure ‘on the face of it’, it has high face validity.

Do these have high face validity?
o Personality tests (e.g., NEO)? Yes
o Rorschach inkblot test? Low
o IQ tests? Some yes, some no

Question 5

Q

Content Validity

Answer

A

Content validity: a judgement of how adequately a test samples behaviour representative of the universe of behaviour that the test was designed to sample.

Test blueprint: a plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, the organisation of the items in the test, etc.

Lawshe’s (1975) content validity ratio (CVR)

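Select a set of panel members who are experts in the content area
Ask panellists to rate each item as one of
a. Essential
b. Useful but not essential
c. Not necessary
Use the content validity ratio (CVR) formula, given by:
CVR = (n_e - N/2) / (N/2)
where n_e is the number of panellists rating the item ‘essential’ and N is the total number of panellists.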
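The CVR arithmetic can be sketched in a few lines of Python. This is a minimal illustration assuming Lawshe's standard formula, CVR = (n_e - N/2) / (N/2); the function name and the panel ratings below are hypothetical, not part of the original cards.

```python
def content_validity_ratio(n_essential: int, n_panellists: int) -> float:
    """Lawshe's CVR = (n_e - N/2) / (N/2) for a single item."""
    if n_panellists <= 0:
        raise ValueError("panel must have at least one member")
    if not 0 <= n_essential <= n_panellists:
        raise ValueError("n_essential must be between 0 and n_panellists")
    half = n_panellists / 2
    return (n_essential - half) / half


# Hypothetical ratings from a 10-person panel for one item
ratings = ["essential"] * 8 + ["useful but not essential", "not necessary"]
cvr = content_validity_ratio(ratings.count("essential"), len(ratings))
print(cvr)
```

CVR runs from -1 (no panellist rates the item essential) through 0 (exactly half do) to +1 (all do).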

Question 6

Q

Criterion Validity

Answer

A

A criterion is the standard against which a test or a test score is evaluated.

Characteristics of an adequate criterion:

Relevant for the matter at hand
Valid for the purpose for which it is being used
Uncontaminated (i.e., the criterion measure is not itself based on the predictor)

Question 7

Q

The validity coefficient

Answer

A

The validity coefficient: a correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure.
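As a rough sketch, a validity coefficient can be computed as a Pearson correlation between test scores and criterion scores. Pearson's r is assumed here as the correlation type (other coefficients may suit other data), and the helper name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: selection-test scores and later supervisor ratings
test_scores = [10, 12, 15, 18, 20, 23]
criterion_scores = [2.1, 2.0, 3.2, 3.0, 4.1, 3.9]
validity_coefficient = pearson_r(test_scores, criterion_scores)
print(round(validity_coefficient, 2))
```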

Question 8

Q

Incremental validity

Answer

A

the degree to which an additional predictor explains additional variation in the criterion measure. In other words: does your test add valid predictive value beyond existing tests?
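One common way to quantify incremental validity is the gain in R² when the new predictor is added to an existing one. The sketch below assumes exactly two predictors and uses the standard two-predictor formula R² = (r_y1² + r_y2² - 2·r_y1·r_y2·r_12) / (1 - r_12²); the function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def incremental_validity(criterion, existing, new):
    """Gain in R^2 from adding `new` to a model that already uses `existing`."""
    r_y1 = pearson_r(criterion, existing)
    r_y2 = pearson_r(criterion, new)
    r_12 = pearson_r(existing, new)
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    r2_existing = r_y1 ** 2
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_both - r2_existing


# Hypothetical data: the criterion is exactly the sum of two uncorrelated predictors
existing_test = [1, 2, 3, 4]
new_test = [1, -1, -1, 1]
criterion = [a + b for a, b in zip(existing_test, new_test)]
gain = incremental_validity(criterion, existing_test, new_test)
print(round(gain, 3))
```

A gain near zero would suggest the new test adds little beyond what the existing test already predicts.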

Question 9

Q

Expectancy table

Answer

A

shows the proportion of people within test-score intervals who were subsequently placed in various categories of the criterion (e.g., ‘passed’ vs ‘failed’ category)
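A minimal sketch of building such a table, assuming a two-category criterion (passed/failed) and inclusive score intervals. The function name, scores, outcomes, and intervals are all hypothetical.

```python
def expectancy_table(scores, passed, intervals):
    """Map each inclusive (low, high) score interval to the proportion who passed."""
    table = {}
    for low, high in intervals:
        outcomes = [p for s, p in zip(scores, passed) if low <= s <= high]
        # None marks an interval that contains no testtakers
        table[(low, high)] = sum(outcomes) / len(outcomes) if outcomes else None
    return table


# Hypothetical test scores and later pass/fail outcomes on the criterion
scores = [35, 42, 48, 55, 58, 63, 67, 72, 78, 85]
passed = [False, False, True, False, True, True, False, True, True, True]
intervals = [(30, 49), (50, 69), (70, 89)]

table = expectancy_table(scores, passed, intervals)
for (low, high), proportion in table.items():
    print(f"{low}-{high}: {proportion:.2f} passed")
```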

Question 10

Q

Construct validity

Answer

A

Construct validity: the ability of a test to measure the theorised construct (e.g., intelligence, aggression, personality, etc.) that it purports to measure.

If a test is a valid measure of a construct, high scorers and low scorers should behave as theorised.

All types of validity evidence, including evidence from the content- and criterion-related varieties of validity, come under the umbrella of construct validity.

Question 11

Q

Evidence of homogeneity

Answer

A

Evidence of homogeneity: how uniform a test is in measuring a single concept.

Question 12

Q

Evidence of changes with age

Answer

A

Evidence of changes with age: some constructs are expected to change over time (e.g., reading rate).

Question 13

Q

Evidence of pretest/posttest changes:

Answer

A

Evidence of pretest/posttest changes: test scores change as a result of some experience between a pretest and a posttest (e.g., therapy)

Question 14

Q

Evidence from distinct groups:

Answer

A

Evidence from distinct groups: scores on a test vary in a predictable way as a function of membership of a group (e.g., impulsivity should be higher in substance users).

Question 15

Q

Convergent validity:

Answer

A

Convergent evidence: scores on a test undergoing construct validation tend to correlate highly, in the predicted direction, with scores on older, more established tests designed to measure the same (or similar) constructs.

Question 16

Q

Discriminant evidence:

Answer

A

Discriminant evidence: a validity coefficient showing little relationship between test scores and other variables with which scores on the test should not theoretically be correlated.

Question 17

Q

Validity and test bias

Answer

A

Bias: a factor inherent in a test that systematically prevents accurate, impartial measurement.
- Bias implies systematic variation in test scores.

Fairness: the extent to which a test is used in an impartial, just, and equitable way.

Rating error: a judgement resulting from intentional or unintentional misuse of a rating scale.

Raters may be too lenient, too severe, or reluctant to give ratings at the extremes (central tendency error).
Halo effect: the tendency to give a particular person a higher rating than they objectively deserve because of a favourable overall impression.

Question 18

Q

Utility of tests

Answer

A

Utility: the usefulness or practical value of testing to improve efficiency.

Question 19

Q

Factors affecting utility

Answer

A

Psychometric soundness:

Generally, higher validity = greater utility.
But many factors affect utility, and utility is assessed in different ways.
Valid tests are not always useful.

Costs:
- Economic costs? E.g., purchasing a test and scoring sheets, training programs, software, hardware, cost of not using the best test.
- Non-economic costs? E.g., time, ethical considerations, face validity, poor data acquisition.

Benefits:

Do the benefits justify the costs?

What are the profits, gains, or advantages?
Better data?
More reliable assessment?
Increased validity of measurement?
Appropriate testing for your population (e.g., specific norms)?
Non-economic benefits, e.g., cutting-edge assessment?

Question 20

Q

Utility analysis

Answer

A

Utility analysis: a family of techniques that entail a cost-benefit analysis to assist in decisions about the usefulness of an assessment tool.

Some utility analyses are straightforward; others are more sophisticated (e.g., using mathematical models).
Often utility analyses address the question of ‘which test gives us the most bang for the buck?’
The endpoint of a utility analysis is an educated decision as to which of several alternative courses of action is optimal (in terms of costs and benefits).

Expectancy data: likelihood that a testtaker will score within some interval of scores on a criterion measure.

Question 21

Q

Determining cut score/cut points

Answer

A

Cut scores: what score will be used to differentiate people on your test? (i.e., only for categorical outcomes)

Relative cut scores: determined in reference to normative data.
Fixed cut scores: made on the basis of a minimum acceptable level.
Multiple cut scores: use of multiple cut points for a single predictor (e.g., grades A, B, C, etc.; categorised outcomes mild, moderate, etc.).
Multiple hurdles: testtakers need to achieve the cut point at one stage before advancing to the next stage of testing.
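The four kinds of cut score above can be contrasted in a short sketch. Every function name, threshold, and score here is hypothetical, and the relative cut is assumed to mean "the score needed to fall in a chosen top fraction of the normative sample", which is only one way of referencing norms.

```python
def fixed_cut(score, cut):
    """Fixed cut score: pass if the minimum acceptable level is met."""
    return score >= cut


def relative_cut_score(norm_scores, top_fraction):
    """Relative cut score: lowest score inside the top fraction of the norm sample."""
    ranked = sorted(norm_scores, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return ranked[k - 1]


def grade(score, bands):
    """Multiple cut scores: bands is a list of (minimum score, label), highest first."""
    for minimum, label in bands:
        if score >= minimum:
            return label
    return None  # below every band


def multiple_hurdles(stage_scores, stage_cuts):
    """Multiple hurdles: each stage's cut must be met before moving to the next."""
    for score, cut in zip(stage_scores, stage_cuts):
        if score < cut:
            return False
    return True


norm_sample = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]  # hypothetical norms
bands = [(80, "A"), (65, "B"), (50, "C")]               # hypothetical grade bands

print(fixed_cut(64, 65))
print(relative_cut_score(norm_sample, 0.2))
print(grade(72, bands))
print(multiple_hurdles([70, 40], [60, 50]))
```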

Question 22

Q

Methods of setting cut scores

Answer

A

The Angoff Method: judgements of experts are averaged to yield cut scores for the test.
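A small sketch of the averaging step, assuming one common variant of the Angoff method in which each expert estimates, for every item, the probability that a minimally competent testtaker answers it correctly. The card itself only says that expert judgements are averaged, so the per-item structure, the function name, and the estimates below are assumptions for illustration.

```python
from statistics import mean


def angoff_cut_score(expert_item_estimates):
    """Average, across experts, of each expert's summed item probabilities."""
    per_expert_totals = [sum(items) for items in expert_item_estimates]
    return mean(per_expert_totals)


# Hypothetical estimates from three experts for a four-item test
estimates = [
    [0.5, 0.75, 1.0, 0.25],   # expert A expects 2.5 items correct
    [0.5, 0.5, 0.75, 0.25],   # expert B expects 2.0 items correct
    [0.75, 0.75, 1.0, 0.5],   # expert C expects 3.0 items correct
]
cut = angoff_cut_score(estimates)
print(cut)
```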

The Known Groups Method: entails collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest.

After analysis of the data, a cut score is chosen that best discriminates between the groups.
One problem with the known groups method is deciding which ‘known groups’ to select.
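The step of choosing the cut score that best discriminates the groups can be sketched as a simple search. This assumes that higher scores indicate the trait and that "best" means the highest proportion of correct classifications; other criteria are possible. The function name and both groups' scores are hypothetical.

```python
def best_cut_score(with_trait, without_trait):
    """Try each observed score as the cut; scores >= cut are classed as 'has trait'."""
    candidates = sorted(set(with_trait) | set(without_trait))
    total = len(with_trait) + len(without_trait)
    best_cut, best_accuracy = None, -1.0
    for cut in candidates:
        correct = sum(s >= cut for s in with_trait) + sum(s < cut for s in without_trait)
        accuracy = correct / total
        if accuracy > best_accuracy:  # ties keep the lowest cut score
            best_cut, best_accuracy = cut, accuracy
    return best_cut, best_accuracy


# Hypothetical predictor scores from the two known groups
with_trait = [14, 16, 17, 19, 21]
without_trait = [8, 10, 11, 13, 15]

best = best_cut_score(with_trait, without_trait)
print(best)
```

Because the two groups overlap (one person without the trait scores 15), no cut score classifies everyone correctly, which is typical of real data.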


(22 cards)