PA3 Flashcards

1
Q

Define validity and validation

A

Validity: a judgment or estimate of how well a test measures what it purports to measure.
Validation: the process of gathering and evaluating evidence about validity

2
Q

What are the 4 main types of validity, and which of these is the most important “umbrella” term?

A

Face validity
Content validity
Criterion validity
Construct validity (the “umbrella” term)

3
Q

What is face validity?

A

How relevant the test items APPEAR to be.

If a test appears to measure what it purports to measure “on the face of it,” it has high face validity.

4
Q

What is content validity?

A

How adequately a test samples the whole “universe” of the behaviour or criterion that the test was designed to sample (including the types of information to be covered, the number of items tapping each area of coverage, the organisation of the items in the test, etc.)

5
Q

Describe how Lawshe’s (1975) content validity ratio (CVR) works

A
  1. Select a panel of experts in the content area
  2. Ask them to rate each item as one of:
    a) essential
    b) useful but not essential
    c) not necessary
  3. Use the content validity ratio (CVR) formula: CVR = (n_e − N/2) / (N/2), where n_e = number of panelists rating the item “essential” and N = total number of panelists.
    If CVR is:
    • Negative = fewer than half the panelists chose “essential”
    • Zero = exactly half the panelists chose “essential”
    • Positive = more than half the panelists chose “essential”
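
As an illustration, the ratio above can be computed directly (a minimal sketch using made-up panel numbers):

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's (1975) CVR: (n_e - N/2) / (N/2)."""
    half = n_panelists / 2
    return (n_essential - half) / half

# Hypothetical panel: 8 of 10 experts rate an item "essential"
print(content_validity_ratio(8, 10))   # 0.6 (positive: more than half chose "essential")
```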
6
Q

Describe two ways content validity can be variable

A
  1. Can change over time as construct research evolves
  2. Can vary between cultures

7
Q

Define criterion-related validity and criterion

A

Criterion-related validity: a judgment of how adequately a test score correlates with some measure of interest (i.e. the criterion).
Criterion: an accepted standard against which a test or a test score is evaluated.

8
Q

Describe and give examples of the two main types of criterion-related validity

A

Concurrent validity: an index of the degree to which a test score is related to some criterion measured at the same time, e.g. a client does two tests in the same session, one new and one the “gold standard”.
Predictive validity: an index of the degree to which a test score predicts some criterion, or outcome, measured in the future, e.g. how well an IQ test predicts future exam results

9
Q

In criterion‐related validity, what is the validity coefficient?

A

A correlation coefficient that provides a measure of the relationship between test scores and scores on the criterion measure.
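
A validity coefficient is an ordinary Pearson correlation between the two sets of scores; a minimal sketch with made-up test and criterion scores:

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

test_scores = [10, 12, 15, 18, 20]   # hypothetical new test
criterion   = [55, 60, 64, 71, 75]   # hypothetical criterion measure
print(round(pearson_r(test_scores, criterion), 3))   # a strong validity coefficient
```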

10
Q

In criterion‐related validity, what is incremental validity?

A

The degree to which an added predictor in a test explains additional variation in the criterion (outcome) measure
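
Incremental validity is commonly quantified as the gain in R² when the extra predictor is added to a regression on the criterion; a sketch with made-up data (the variable labels are illustrative only):

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

x1 = np.array([1, 2, 3, 4, 5], float)   # e.g. an aptitude test score
x2 = np.array([2, 1, 2, 1, 2], float)   # e.g. an added interview rating
y  = x1 + 2 * x2                        # hypothetical criterion (outcome)

r2_one = r_squared(x1, y)
r2_two = r_squared(np.column_stack([x1, x2]), y)
print(r2_two - r2_one)   # incremental validity: extra variance explained by x2
```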

11
Q

In criterion‐related validity, what is an expectancy table?

A

A table showing the proportions of people within different test-score intervals who were subsequently rated in various categories of the criterion (e.g. “passed” vs “failed” a job interview)

12
Q

What is construct validity?

A

The ability of a test to measure the theorized construct (e.g. intelligence, aggression, personality, etc.) that it purports to measure.
If a test is a valid measure of a construct, high scorers and low scorers should behave as theorized.
All other types of validity “feed into” construct validity.

13
Q

What are convergent evidence and discriminant evidence, in relation to construct validity?

A

Convergent evidence: scores on the test undergoing construct validation correlate highly, in the predicted direction, with scores on older, more established tests measuring the same (or a similar) construct.
Discriminant evidence: the validity coefficient shows little relationship between test scores and other variables with which scores on the test should NOT theoretically be correlated

14
Q

Apart from the main types of convergent and discriminant evidence, what are 4 other types of evidence for construct validity, and what do they mean?

A
  1. Evidence of homogeneity: how uniform a test is in measuring a single concept.
  2. Evidence of changes with age: some constructs are expected to change over time (e.g. reading rate).
  3. Evidence of pretest/posttest changes: test scores change as a result of some experience between a pretest and a posttest (e.g. therapy).
  4. Evidence from distinct groups: scores on a test vary in a predictable way as a function of group membership (e.g. impulsivity should be higher in substance users)
15
Q

Out of the different main types of validity, which is most closely and least closely linked to construct validity?

A

Most linked: criterion validity

Least linked: face validity

16
Q

In test validity, what does bias mean?

A

A factor inherent in a test that systematically prevents accurate, impartial measurement. Implies systematic variation (bad) as opposed to random variation (expected)

17
Q

In test validity, what does rating error mean? Describe two common types.

A

Rating error: a judgment resulting from intentional or unintentional misuse of a rating scale, e.g.:

  1. Raters may be too lenient (leniency error), too severe (severity error), or reluctant to give ratings at the extremes (central tendency error).
  2. Halo effect: the tendency to give a particular person a higher rating than they deserve because of a favourable overall impression.
18
Q

In test validity, what does “fairness” mean?

A

The extent to which a test is used in an impartial, just, and equitable way.

19
Q

In test terminology, what is “utility”?

A

The usefulness or practical value of a test

20
Q

Describe three of the main factors affecting test utility

A
  1. Psychometric soundness: generally, higher validity = greater utility, but valid tests are not always useful.
  2. Costs:
    • Economic costs, e.g. purchasing the test, training, etc.
    • Non-economic costs, e.g. time, ethics, etc.
  3. Benefits, e.g. cheaper, better data, more reliable, more valid, better for a specific population
21
Q

In testing, what is utility analysis?

A

A family of techniques entailing a cost–benefit analysis to assist in decisions about the usefulness of an assessment tool.

22
Q

In utility analysis, what is expectancy data?

A

The likelihood that a test-taker will score within some interval of scores on a criterion measure. Various reference tables have been created to check expectancy data.

23
Q

In testing, what is a cut score and why is its accuracy important?

A

A score on a continuous scale that will be used to differentiate people into categorical outcomes.
The accuracy of the cut score will affect the reliability and validity of outcomes.

24
Q

What is a relative cut score?

A

A cut score determined in reference to normative data

25
Q

What is a fixed cut score?

A

A cut score based on a minimum acceptable level

26
Q

What are multiple cut scores?

A

The use of multiple cut points for a single predictor e.g. grades A, B, C, D; or categorized outcomes such as mild, moderate, severe

27
Q

In testing, what are multiple hurdles?

A

A requirement to meet or exceed a cut score at one stage of testing before advancing to the next stage

28
Q

What is the Angoff Method for setting cut scores?

A

Recruiting a panel of experts who take the test while role-playing a hypothetical test taker with certain characteristics (e.g. minimal competence); the experts’ scores are then averaged to yield the cut score for the test.

29
Q

What is the Known Groups Method for setting cut scores?

A

Collecting data on the predictor of interest from groups known to possess (and not to possess) a trait, attribute, or ability of interest. Works for some variables (e.g. clinical conditions such as depression) but not others (e.g. impulsivity)

30
Q

What is the IRT method for setting cut scores?

A

Item Response Theory (IRT): cut scores are based on item-difficulty ratings. To “pass” the test, the test taker must answer items deemed above some minimum level of difficulty, as worked out using IRT

31
Q

What is the “discriminant analysis” method for setting cut scores?

A

Statistical techniques used to quantify how well a set of identified variables (such as scores on a battery of tests) can predict membership of groups of interest

32
Q

What is the “receiver operating characteristic (ROC) curve” method for setting cut scores, and what are the two main factors used in this method?

A

Derives the sensitivity and specificity associated with different cutpoints:
Sensitivity = proportion of people correctly identified as having condition
Specificity = proportion of people correctly identified as not having the condition
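
Both quantities come straight from the 2×2 classification table at a given cutpoint; a minimal sketch with made-up scores and condition statuses:

```python
def sens_spec(scores, has_condition, cutpoint):
    """Classify score >= cutpoint as 'positive'; return (sensitivity, specificity)."""
    tp = sum(s >= cutpoint and c for s, c in zip(scores, has_condition))
    fn = sum(s < cutpoint and c for s, c in zip(scores, has_condition))
    tn = sum(s < cutpoint and not c for s, c in zip(scores, has_condition))
    fp = sum(s >= cutpoint and not c for s, c in zip(scores, has_condition))
    return tp / (tp + fn), tn / (tn + fp)

scores        = [3, 5, 6, 8, 9, 10]                       # hypothetical test scores
has_condition = [False, False, True, False, True, True]   # hypothetical true status
print(sens_spec(scores, has_condition, cutpoint=7))       # both 2/3 at this cutpoint
```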

33
Q

What is the Youden Index?

A

An index used to select the optimal cutpoint by maximizing the combined sensitivity and specificity of a test (J = sensitivity + specificity − 1).
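
A sketch of cutpoint selection via Youden’s J (= sensitivity + specificity − 1), again on made-up data:

```python
def youden_best_cut(scores, has_condition, candidate_cuts):
    """Return (cutpoint, J) for the candidate cut maximizing Youden's J."""
    best = None
    for cut in candidate_cuts:
        pos = [s >= cut for s in scores]
        tp = sum(p and c for p, c in zip(pos, has_condition))
        fn = sum((not p) and c for p, c in zip(pos, has_condition))
        tn = sum((not p) and (not c) for p, c in zip(pos, has_condition))
        fp = sum(p and (not c) for p, c in zip(pos, has_condition))
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        if best is None or j > best[1]:
            best = (cut, j)
    return best

scores        = [3, 5, 6, 8, 9, 10]                       # hypothetical test scores
has_condition = [False, False, True, False, True, True]   # hypothetical true status
print(youden_best_cut(scores, has_condition, [4, 6, 7, 9]))
```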

34
Q

What is a base rate and why is it important in setting a cutpoint?

A

Base rate = the true prevalence of the condition in the population. Base rate can have a big effect on the validity of different cutpoints.
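
One way to see the effect: holding sensitivity and specificity fixed, the positive predictive value (the chance a positive result is a true positive, via Bayes’ rule) collapses at low base rates. A sketch with assumed test characteristics:

```python
def ppv(sensitivity, specificity, base_rate):
    """P(condition | positive test) via Bayes' rule."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# The same hypothetical test (90% sensitive, 90% specific) at two base rates:
print(ppv(0.9, 0.9, 0.50))   # ~0.90: common condition
print(ppv(0.9, 0.9, 0.01))   # ~0.08: rare condition, most positives are false
```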