Chapter 5: Validity Flashcards

1
Q

Validity def

A

The extent to which a test measures the attribute/construct it is designed to measure
Does the test measure what it was designed to measure?
-> Not a yes or no question - question of DEGREE
-> One of the most important characteristics of the test.

2
Q

Guidelines regarding validity (3)

A

(1) Do NOT accept a test’s name as an indicator of what the test measures.
(2) Validity is NOT a yes/no decision
(3) Validity evidence tells how well the test measures what it is intended to measure.
-> Different types of evidence can be generated for different types of validity

3
Q

What do we mean when we say that “Validity is NOT a yes/no decision”

A
  • It comes in degrees and applies to a particular USE and a particular POPULATION
  • It is a process: An ongoing, dynamic effort to accumulate evidence for a sound scientific basis for proposed test score interpretations
4
Q

3 Types of Validity

A

Content, Criterion, Construct

5
Q

Subtypes of Criterion validity

A

Concurrent, Predictive

6
Q

Subtypes of Construct validity

A

Convergent, Divergent

7
Q

Face Validity

A

Whether a test appears to measure what it is supposed to measure (does it appear valid).
Mere appearance that a measure has validity.
=> Not sufficient evidence of validity

8
Q

A test with high face validity may: (3)

A

(1) Induce cooperation and positive motivation before and during test administration
(2) Reduce dissatisfaction and feelings of injustice among low scorers
(3) Convince policymakers, employers, and administrators to implement the test

9
Q

Test designers sometimes make a test with low face validity ON PURPOSE. Why?

A

Sometimes a test with low face validity can elicit more honest responses

10
Q

Content Validity def

A

Degree to which ELEMENTS OF A TEST are representative of the domain/construct of interest.
-> Evaluate how adequately the test samples the domain or content of the construct.
-> More QUALITATIVE than Quantitative

11
Q

Establishing content validity (3)

A

(1) Describe the content domain: Identify the boundaries of the content domain + Determine the structure of the content domain.
(2) Inspect test - Expert judgment
(3) Form judgment that the test measures what it is supposed to measure… without gathering any external evidence
+ Content of the items must be carefully evaluated (wording appropriateness…).

12
Q

When is content validity high?

A

When test content is a representative sample of the tasks that define the content domain
+
When the items do not measure something else

13
Q

However, content validity is not enough to determine that the test is valid. Why?

A

No information about relation of test to external constructs or external variables

14
Q

Criterion Validity def

A

Effectiveness of the test in predicting narrowly/specifically identified variables that are thought to be DIRECT measures of the construct.
-> How well a test corresponds with a particular criterion.

15
Q

Criterion def

A

A standard that researchers use to measure outcomes such as performance or attitude.
-> Standard against which the test is compared

16
Q

Objective criterion characteristics (2)

A

Observable and Measurable
E.g., Number of accidents, days of absence

17
Q

Types of criterion

A

Objective & Subjective criterion

18
Q

Subjective criterion

A

Based on a person’s judgement
E.g., Supervisor ratings, peer ratings

19
Q

Concurrent Validity

A

Comes from assessments of the simultaneous relationship between the test and the criterion.
Criterion available at the SAME time as the test
-> Can also be used when a person does NOT know how they will respond to the criterion measure.

20
Q

Predictive validity

A

The forecasting function of tests.
Degree to which test scores accurately predict scores on a criterion measure.
-> Criterion measure available in the future

21
Q

What happens if the criterion measures FEWER dimensions than those measured by the test?

A

This weakens the content-based evidence of validity, because the criterion underrepresents some important characteristics that the test measures
-> Underrepresentative

22
Q

Criterion contamination def

A

Occurs when the criterion measures MORE dimensions than those measured by the test

23
Q

Validity coefficient def

A

Relationship between a test and a criterion.
-> Tells the extent to which the test is valid for making statements about the criterion.
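As a rough illustration of the card above, a validity coefficient can be computed as the Pearson correlation between test scores and criterion scores. This is a minimal sketch: the helper `pearson_r` and both data lists are hypothetical and made up for the example, not taken from the chapter.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: selection-test scores and later supervisor ratings
test_scores = [12, 15, 9, 20, 17, 11, 14, 18]
supervisor_ratings = [3, 4, 2, 4, 5, 3, 2, 4]

# The validity coefficient is the test-criterion correlation
validity_coefficient = pearson_r(test_scores, supervisor_ratings)
print(round(validity_coefficient, 2))
```

The closer the value is to 1 (or -1), the more the test scores tell us about the criterion.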

24
Q

What’s the range of validity coefficients?

A

Validity coefficients: Correlation between test and criterion
-> Rarely greater than r = .60
-> If higher than that → alternative test

25
Q

Comparison of validity coefficient in psychology and medicine

A

Psychological tests can provide information that is AS VALID as common medical tests

26
Q

Factors Limiting Validity Coefficients (3)

A

(1) Range of Scores
(2) Unreliability of Test Scores
(3) Unreliability in Criterion

27
Q

[Factors Limiting Validity Coefficients] Explain (1) Range of Scores

A

A restricted range of scores lowers validity coefficients because it reduces the correlation between test scores and criterion scores
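The effect of range restriction can be sketched with a small made-up dataset. Everything here is an assumption for illustration: the applicant scores, the cutoff of 7, and the helper names `pearson_r` and `validity`.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def validity(pairs):
    """Validity coefficient for a list of (test, criterion) pairs."""
    tests = [t for t, _ in pairs]
    criteria = [c for _, c in pairs]
    return pearson_r(tests, criteria)

# Hypothetical applicants: (test score, later criterion score)
applicants = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6),
              (6, 5), (7, 8), (8, 7), (9, 10), (10, 9)]

r_full = validity(applicants)

# Range restriction: only applicants scoring 7 or higher were selected,
# so criterion data exist only for this narrow band of test scores.
selected = [(t, c) for t, c in applicants if t >= 7]
r_restricted = validity(selected)

print(round(r_full, 2), round(r_restricted, 2))
```

With this toy data the coefficient drops from about .94 in the full group to .60 in the selected group, even though the underlying test-criterion relationship is the same.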

28
Q

Explain (2) Unreliability of Test Scores and how we deal with it

A

Low reliability decreases validity coefficients.
Solution: Correction for attenuation - validity coefficient if we had perfect reliability of test scores

29
Q

Explain (3) Unreliability in Criterion and how we deal with it

A

Low reliability decreases validity coefficients
Solution: Correction for attenuation - Correcting for unreliability in test (predictor) & criterion

30
Q

What’s the correction for attenuation - Validity coeff

A

Observed validity coeff / sqrt(test reliability × criterion reliability)
-> Shows what the validity coeff would be if the test and/or criterion were perfectly reliable
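A small worked sketch of the correction for attenuation, using the standard form r_xy / sqrt(r_xx * r_yy). The symbols r_xy (observed validity), r_xx (test reliability), and r_yy (criterion reliability) are labels chosen here, and the numbers are hypothetical. Leaving a reliability at its default of 1.0 corrects for only the other source of unreliability.

```python
from math import sqrt

def correct_for_attenuation(r_xy, r_xx=1.0, r_yy=1.0):
    """Estimate the validity coefficient under perfect reliability.

    r_xy: observed validity coefficient (test-criterion correlation)
    r_xx: reliability of the test (predictor)
    r_yy: reliability of the criterion
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_xy / sqrt(r_xx * r_yy)

# Hypothetical values: observed validity .30, test reliability .81,
# criterion reliability .64
corrected_both = correct_for_attenuation(0.30, r_xx=0.81, r_yy=0.64)

# Correcting for unreliability in the test only
corrected_test_only = correct_for_attenuation(0.30, r_xx=0.81)

print(round(corrected_both, 3), round(corrected_test_only, 3))
```

Here sqrt(.81 × .64) = .72, so the corrected coefficient is .30 / .72, or roughly .42. Correcting for the test alone gives .30 / .90, or roughly .33.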

31
Q

Take aways of Criterion Validity (2)

A

(1) Importance of choosing appropriate criterion
(2) Small validity coeff can have practical utility

32
Q

Construct Validity def

A

“All encompassing” type of validity. Process of Construct validation
-> In construct validity evidence, no single variable can serve as the criterion.
-> Extent to which your test or measure accurately assesses what it’s supposed to

33
Q

Construct validation def

A

Assembling evidence about what a test means. Done by showing the relationship between a test and other tests and measures.
-> Over a series of studies, the meaning of the test gradually begins to take shape.

34
Q

‘Limitations’ of criterion validity

A

For some constructs it is easy to find a criterion; for others it is difficult.
-> E.g. criterion for trust, chronic pain (…)? Hard to identify appropriate criterion.
-> Thus, other ways of establishing validity of test scores → construct validity

35
Q

Psychological Constructs def (4)

A

(1) Abstract concepts used to refer to psychological attributes (e.g. intelligence, love, beauty)
(2) Exist in theory; generally LATENT; not directly observable (EXCEPTIONS: e.g. reaction time, absenteeism)
(3) Important in describing or understanding human behavior -> Its existence EXPLAINS why/how something is happening
(4) Can observe and measure the behaviors that show evidence of these constructs

36
Q

Construct Explication

A

How a particular construct is manifested and how such manifestations can be measured.

37
Q

How to gather Evidence of Construct Validity (2)

A

(1) Gathering Theoretical evidence
(2) Gather Psychometric evidence

38
Q

Explain how we gather THEORETICAL evidence of construct validity (2)

A

(1) Establish nomological network - identifying all possible relationships
(2) Based on this theoretical work → Propose experimental hypotheses
-> If what we think is true, what would be the evidence to support this relationship

39
Q

Nomological Network (3)

A

Consists of:
(1) Constructs (e.g. job satisfaction)
(2) Their observable manifestations (e.g. smiles, productivity, positive feedback)
(3) The relations within and between constructs and their observable manifestations (e.g. positive feedback related to productivity)

40
Q

Explain how we gather PSYCHOMETRIC evidence (6)

A

(1) Content validity
(2) Criterion validity
(3) Reliability of the test
(4) Experimental interventions
(5) Convergent evidence of validity
(6) Discriminant evidence of validity

41
Q

[Gathering psychometric evidence] Evidence of validity based on content (2)

A

(1) No construct underrepresentation: Does the test sample adequately from the construct domain?
(2) No irrelevant construct representation: Does the test properly exclude content that is unrelated to the construct?

42
Q

[Gathering psychometric evidence] Evidence of validity based on relations with criteria

A

Are the relations of the test with the external criteria as would be expected based on theory?

43
Q

[Gathering psychometric evidence] Evidence of validity based on reliability of the test

A

E.g. test-retest/internal consistency not too low or too high given the construct

44
Q

[Gathering psychometric evidence] Evidence of validity based on experimental interventions

A

Provide evidence of situational changes that should influence test scores based on theory
-> E.g. education influencing scores on an achievement test
-> Medication influencing scores on an anxiety test

45
Q

[Gathering psychometric evidence] Convergent Validity (2)

A

Extent to which two measures that are supposed to be related are actually correlated
When test scores correlate with:
(1) Other measures of the SAME construct, or
(2) Measures of constructs to which the test should be related based on theory (think nomological network)

46
Q

[Gathering psychometric evidence] Discriminant (Divergent) Validity

A

Test scores are uncorrelated with:
Measures of constructs to which the construct should NOT be related based on theory (think nomological network)
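Convergent and discriminant evidence can be checked side by side by correlating a new test with one measure it should relate to and one it should not. The sketch below uses invented scores for a hypothetical new anxiety scale, an established anxiety scale, and a theoretically unrelated measure. All names and numbers are assumptions for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same six people on three measures
new_anxiety_scale = [10, 14, 8, 20, 16, 12]
established_anxiety_scale = [11, 15, 9, 19, 17, 12]  # same construct
unrelated_measure = [8, 10, 9, 9, 8, 10]             # should NOT relate

# Convergent evidence: expect a high correlation
r_convergent = pearson_r(new_anxiety_scale, established_anxiety_scale)

# Discriminant evidence: expect a correlation near zero
r_discriminant = pearson_r(new_anxiety_scale, unrelated_measure)

print(round(r_convergent, 2), round(r_discriminant, 2))
```

A high first value together with a near-zero second value is the pattern that supports construct validity. Either one alone is weaker evidence.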

47
Q

Problems with Content validity (3)

A

(1) Educational setting: content validity has been of greatest concern in educational testing (a test score is taken to represent comprehension of the subject), BUT many other factors can limit performance on the test
(2) Unclear boundaries: hard to separate types of validity
-> It’s often hard to separate “content coverage” (content validity) from whether the test actually measures the underlying concept (construct validity), leading to blurred boundaries.
(3) Doesn’t consider the relationship of the construct with external variables/constructs

48
Q

[Problems with Content validity] How can we respond to the problem of unclear boundaries (hard to separate types of validity)?

A

Content validity evidence offers some unique features. Logical rather than statistical.

49
Q

Construct underrepresentation

A

CONTENT validity. Failure to capture important components of a construct.

50
Q

Construct-irrelevant variance

A

CONTENT validity. Occurs when scores are influenced by factors irrelevant to the construct.
-> E.g., a test of intelligence might be influenced by reading comprehension, test anxiety, or illness.

51
Q

Would a validity coeff of .40 always be considered good?

A

NO. Not all validity coefficients have the same meaning.

52
Q

Several issues of concern when interpreting validity coefficients (9)

A

(1) Not all validity coefficients have the same meaning
(2) The conditions of a validity study are never exactly reproduced. E.g. If you take the GRE to gain admission to graduate school, the conditions under which you take the test may not be exactly the same as those in the studies that established the validity of the GRE.
(3) Criterion-related validity studies mean nothing UNLESS the criterion is valid and reliable.
(4) Validity study might have been done on a population that does not represent the group to which inferences will be made.
(5) Be sure the sample size was adequate
(6) Never Confuse the Criterion with the Predictor (GRE & success in grad school example)
(7) Check for Restricted Range on Both Predictor and Criterion: Correlation requires that there be variability in both the predictor and the criterion.
(8) Review Evidence for Validity Generalization (may not be generalized to other similar situations)
(9) Consider Differential Prediction: Predictive relationships may not be the same for all demographic groups.

53
Q

Differential Prediction

A

Predictive relationships may NOT be the same for all demographic groups.
-> The validity for men could differ in some circumstances from the validity for women.
-> Under these circumstances, separate validity studies for different groups may be necessary.
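One simple way to look for differential prediction is to compute the validity coefficient separately within each group. This is only a sketch with invented data. The group labels, scores, and helper `pearson_r` are hypothetical, and a real study would also need adequate sample sizes in every group.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical test and criterion scores for two demographic groups
groups = {
    "group_a": {"test": [10, 12, 14, 16, 18], "criterion": [1, 3, 2, 4, 5]},
    "group_b": {"test": [10, 12, 14, 16, 18], "criterion": [3, 1, 4, 2, 5]},
}

# Separate validity coefficient for each group
validity_by_group = {
    name: pearson_r(data["test"], data["criterion"])
    for name, data in groups.items()
}

for name, r in validity_by_group.items():
    print(name, round(r, 2))
```

In this toy data the same test predicts the criterion much better in group_a (r = .90) than in group_b (r = .50), which is the kind of gap that would call for separate validity studies.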

54
Q

Criterion-Referenced Tests

A

Have items that are designed to match certain specific instructional objectives. Designed to measure student performance against a fixed set of predetermined criteria.

55
Q

Validity studies for the criterion-referenced tests

A

Would compare scores on the test to scores on other measures that are believed to be related to the test.

56
Q

Validity & Reliability relationship

A

A measurement procedure MUST BE RELIABLE (consistent) in order to be VALID.
-> A measurement procedure can be reliable, but not necessarily valid.