Module 3: Validity and Utility Flashcards

1
Q

Validity:

A

a judgement or estimate of how well a test measures what it purports to measure.

Does it measure what it says it measures?

2
Q

Validation:

A

the process of gathering and evaluating evidence about validity.

3
Q

Face Validity:

A

the test appears to cover relevant content.

4
Q

Content validity: See Notes

A

based on the evaluation of content covered by a test.

5
Q

Criterion validity:

A

obtained by evaluating the relationship between scores on your test and scores on other tests/measures.

6
Q

Construct validity:

A

the superordinate concept that ties together all types of validity: all validity is construct validity.

arrived at by comprehensive analysis of:

a. How scores on the test relate to other test scores and measures, and
b. How scores on the test can be understood within some theoretical framework for understanding the construct the test was designed to measure.

7
Q

Face validity:

A

a judgment concerning how relevant the test items appear to be.
• If a test appears to measure what it purports to measure on the face of it, it has face validity.
• A patient could be offended or put off if a test lacks face validity.
• Face validity is not necessary for a test to be valid.

8
Q

Test blueprint:

A

a plan regarding the types of information to be covered by the items, the number of items tapping each area of coverage, and the organization of the items in the test.

9
Q

Criterion-related validity:

A

a criterion is the standard against which a test or a test score is evaluated.
• Characteristics of an adequate criterion:
o Relevant to the matter at hand.
o Valid for the purpose for which it is being used.
o Uncontaminated (i.e., it is not part of the predictor).

10
Q

The validity coefficient:

A

a correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure.
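The definition above can be sketched numerically. A minimal Python example with invented test and criterion scores, computing the validity coefficient as a Pearson correlation:

```python
# Minimal sketch (invented data): the validity coefficient as the Pearson
# correlation between scores on a test and scores on a criterion measure.
import statistics

test_scores = [10, 12, 9, 15, 11, 14, 8, 13]   # predictor test scores
criterion = [22, 25, 20, 30, 24, 28, 19, 27]   # criterion scores

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

r = pearson_r(test_scores, criterion)
print(f"validity coefficient r = {r:.2f}")
```

A coefficient near 1.0 (as with these invented numbers) would indicate strong criterion-related validity; real validity coefficients are usually far more modest.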

11
Q

Incremental validity:

A

the degree to which an additional predictor explains additional variation in the criterion measure.
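A hedged sketch of this idea in Python, using invented data and the standard two-predictor multiple-correlation formula: the incremental validity of a second test is the gain in R² over the first predictor alone.

```python
# Sketch of incremental validity (invented data): the extra criterion
# variance explained when a second predictor is added to the first.
import statistics

def pearson_r(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def incremental_r2(y, x1, x2):
    r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)
    r2_first = r_y1 ** 2                      # R^2 with the first predictor alone
    # standard squared multiple correlation for two predictors
    r2_both = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return r2_both - r2_first                 # incremental validity

criterion = [3, 5, 4, 8, 6, 9, 7, 10]
test_a = [2, 4, 3, 7, 6, 8, 5, 9]             # established predictor
test_b = [1, 3, 5, 2, 6, 4, 8, 7]             # candidate additional predictor
print(f"incremental R^2 = {incremental_r2(criterion, test_a, test_b):.3f}")
```

An incremental R² near zero suggests the extra test adds little beyond what the first already captures.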

12
Q

Construct validity:

A

the ability of a test to measure the theorized construct (e.g., intelligence, aggression, personality, etc.) that it purports to measure.
• If the test is a valid measure of the construct, high scorers and low scorers should behave as theorized.
• All types of validity evidence, including evidence from the content and criterion-related varieties of validity, come under the umbrella of construct validity.

13
Q

Evidence of construct validity:

A

• Evidence of homogeneity: how uniform a test is in measuring a single concept.
• Evidence of changes with age: some constructs are expected to change over time (e.g., reading rate).
• Evidence of pretest/post-test changes: test scores change as a result of some experience between a pretest and post-test (e.g., therapy).
• Evidence from distinct groups: scores on a test vary in a predictable way as a function of group membership (e.g., impulsivity should be higher in substance users).
• Convergent evidence: scores on a test undergoing construct validation tend to correlate highly in the predicted direction with scores on older, more established tests designed to measure the same (or similar) construct.
• Discriminant evidence: validity coefficients show little relationship between test scores and other variables with which scores on the test should not theoretically be correlated.
o WE DO NOT NEED FACE VALIDITY

14
Q

Bias:

A

a factor inherent in a test that systematically prevents accurate, impartial measurement.
• Bias implies systematic variation in test scores.
• We want only random error.
• We do not want systematic error.

15
Q

Fairness:

A

the extent to which a test is used in an impartial, just, and equitable way.

16
Q

Rating error:

A

a judgment error resulting from the intentional or unintentional misuse of a rating scale.
• Raters may be too lenient, too severe, or reluctant to give ratings at the extremes (central tendency error).
• Halo effect: the tendency to give a particular person a higher rating than they objectively deserve because of a favorable overall impression.

17
Q

Halo effect:

A

the tendency to give a particular person a higher rating than they objectively deserve because of a favorable overall impression.

18
Q

Utility:

A

the usefulness or practical value of testing to improve efficiency.

19
Q

Psychometric soundness:

A
  • Generally, high validity = greater utility.
  • But many factors affect utility, and utility is assessed in different ways.
  • Valid tests are not always useful.
20
Q

Types of Costs:

A
  • Economic costs: e.g., purchasing tests and scoring sheets, training programs, software/hardware, and the cost of not using the best test.
  • Non-economic costs: e.g., time, ethical considerations, face validity, poor data acquisition.
21
Q

Utility analysis:

A

a family of techniques that entail a cost-benefit analysis to assist in decisions about the usefulness of an assessment tool.
• Some utility analyses are straightforward; others are more sophisticated (e.g., using mathematical models).
• Utility analyses often address the question, “Which test gives us the most bang for the buck?”
• The endpoint of a utility analysis is an educated decision as to which of several alternative courses of action is optimal (in terms of costs and benefits).

22
Q

Expectancy data:

A

likelihood that a test taker will score within some interval of scores on a criterion measure.

23
Q

Cut scores:

A

the score that will be used to differentiate people on your test; applies only to categorical outcomes.

24
Q

Relative cut scores:

A

determined in reference to normative data.

25
Q

Fixed cut scores

A

set on the basis of a minimum acceptable level of performance (e.g., a driving test).

26
Q

Multiple cut scores:

A

use of multiple cut points for a single predictor (e.g., grades A, B, C, etc., or categorical outcomes such as mild and moderate).

27
Q

Multiple hurdles:

A

the test taker must meet a cut score at an earlier stage before advancing to the next stage of testing.

28
Q

Methods of setting cut scores:

A
  • The Angoff method
  • The Known groups method
  • IRT Based Methods
  • Discriminant analysis
  • Receiver operating characteristic (ROC) curves
  • Youden index
29
Q

The Angoff method:

A

judgements of experts are averaged to yield cut scores for the test.
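A minimal illustration in Python, with invented expert ratings: each expert estimates the probability that a minimally competent test taker answers each item correctly; averaging across experts and summing across items yields the Angoff cut score.

```python
# Hypothetical sketch of the Angoff method. Ratings are invented:
# rows = experts, columns = items; each entry is an expert's estimated
# probability that a minimally competent test taker answers the item correctly.
import statistics

expert_ratings = [
    [0.90, 0.70, 0.50, 0.80, 0.60],
    [0.80, 0.60, 0.60, 0.90, 0.50],
    [0.85, 0.65, 0.55, 0.85, 0.55],
]

# average each item's ratings across experts, then sum across items:
# the expected score of a minimally competent test taker
item_means = [statistics.mean(col) for col in zip(*expert_ratings)]
cut_score = sum(item_means)
print(f"Angoff cut score: {cut_score:.2f} out of {len(item_means)}")
```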

30
Q

The Known groups method

A

entails collection of data on the predictor of interest from groups known to possess, and not to possess, a trait, attribute, or ability of interest.
o After analysis of data, a cut score is chosen that best discriminates the groups.
o One problem with the known groups method: how do you know which “known groups” to select?

31
Q

IRT Based Methods:

A

use the item difficulty parameter.
o In order to “pass” the test, the test taker must answer items deemed above some minimum level of difficulty (hence, a specific point along the latent trait, theta).

32
Q

Discriminant analysis:

A

statistical techniques used to quantify how well a set of identified variables (such as scores on a battery of tests) can predict membership in groups of interest.

33
Q

Receiver operating characteristic (ROC) curves

A

derives the sensitivity and specificity associated with different cut points that classify individuals as having or not having a condition of interest.

o Sensitivity: proportion of people correctly identified as having condition.
o Specificity: proportion of people correctly identified as not having the condition.

34
Q

Youden index:

A

can be used to select an appropriate cut point by maximising the combined sensitivity and specificity of a test (Youden's J = sensitivity + specificity - 1).
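A small Python sketch with invented scores and condition labels: compute sensitivity and specificity at each candidate cut point and pick the one that maximises Youden's J (sensitivity + specificity - 1).

```python
# Sketch of cut-point selection via Youden's index (invented data).
scores = [2, 3, 4, 5, 5, 6, 7, 8, 9, 10]
has_condition = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]   # 1 = truly has the condition

def sens_spec(cut):
    # classify as "has condition" when score >= cut
    tp = sum(1 for s, y in zip(scores, has_condition) if y == 1 and s >= cut)
    fn = sum(1 for s, y in zip(scores, has_condition) if y == 1 and s < cut)
    tn = sum(1 for s, y in zip(scores, has_condition) if y == 0 and s < cut)
    fp = sum(1 for s, y in zip(scores, has_condition) if y == 0 and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)

# Youden's J = sensitivity + specificity - 1, maximised over candidate cuts
best_cut = max(set(scores), key=lambda c: sum(sens_spec(c)) - 1)
sens, spec = sens_spec(best_cut)
print(f"best cut = {best_cut}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```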

35
Q

Why Cut Points Matter:

A

• Accuracy of classification will affect reliability and validity.
• Need to consider base rates when selecting cut points and interpreting scores.
o Base rate: the true prevalence of the condition in the population
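The role of the base rate can be illustrated with a short Python sketch (illustrative numbers): holding sensitivity and specificity fixed at 0.9, the positive predictive value, i.e. the chance that a positive result reflects a true case, drops sharply as the condition becomes rarer.

```python
# Sketch: positive predictive value (PPV) as a function of base rate,
# with sensitivity and specificity held fixed. Numbers are illustrative.

def ppv(sensitivity, specificity, base_rate):
    true_pos = sensitivity * base_rate              # P(test+, condition+)
    false_pos = (1 - specificity) * (1 - base_rate) # P(test+, condition-)
    return true_pos / (true_pos + false_pos)

for base_rate in (0.50, 0.10, 0.01):
    print(f"base rate {base_rate:.0%}: PPV = {ppv(0.9, 0.9, base_rate):.2f}")
```

With a 1% base rate, most positive classifications are false positives even for a quite accurate test, which is why cut points cannot be interpreted apart from prevalence.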