Ch 6 Validity Flashcards
Validity
- Extent to which a test measures what it claims to measure in a specific context
- Are the inferences appropriate?
Tests are not universally valid; a test is valid for…
- a particular purpose
- a particular population of people
- a particular time
A test is only valid under…
particular circumstances
Validation
Gathering and evaluating validity evidence
Whose responsibility to check for validity?
- Test developers
- Test users
- Local validation studies (necessary when the test is altered in some way or used with a different population)
Face Validity (FV)
How valid a test looks to those involved
- Judgement concerning item relevance
- The judgment comes from the test taker, not the test user
- Lack of FV may lead to lack of confidence in the test’s effectiveness
- FV is not correlated with psychometric soundness
Content
The actual items that make up the test (specific questions)
Content Validity
- How adequately a test samples behavior it is designed to sample
- Like FV except you are attempting to cover every possible aspect of what you are trying to measure
- Content validity heavily stresses comprehensiveness
Ex: Does a comprehensive final cover topics presented during a course?
Content Validity
- Expert opinion
- Test developer creates many potential items
- Experts rate each item as
– essential
– useful, but not necessary
– not necessary
- Remaining items are ranked and ordered
Content Validity Ratio
- Computed per item from the proportion of judges who agree the item is essential
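Lawshe's CVR formula makes this concrete: for each item, CVR = (nₑ − N/2) / (N/2), where nₑ is the number of judges rating the item essential and N is the total number of judges. A minimal sketch with made-up panel numbers:

```python
def content_validity_ratio(n_essential, n_judges):
    """Lawshe's CVR for a single item: (n_e - N/2) / (N/2).

    Ranges from -1 (no judge rates the item essential) to +1
    (every judge does); 0 means exactly half the panel agrees.
    """
    half = n_judges / 2
    return (n_essential - half) / half

# Hypothetical panel: 8 of 10 judges rate an item "essential"
print(content_validity_ratio(8, 10))  # 0.6
```

Items with low or negative CVR are candidates for removal before the remaining items are ranked and ordered.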
Relativity of content validity
- A measure that is content valid in one culture may not have content validity in another culture
Ex: A content-valid measure of depression for Western cultures may need different content than one for Eastern cultures
Criterion
Standard against which a test or score is evaluated
Ex: Mental disorder diagnosis of depression, anxiety, etc.
Criterion-Related Validity
- How adequately a score can be used to infer an individual’s standing on some criterion
- Measured with validity coefficient
– Correlation between test score and score on criterion measure
Ex: Correlation between depression score and DSM-5 depression diagnosis
Criterion-Related Validity: Two Types
- Concurrent
- Predictive
Concurrent
Test scores & criterion measure at same time
Ex:
- New diagnostic tool vs. existing diagnosis
- Test A vs. Test B
Predictive
- Test scores, then criterion measure taken in future
- Ex: ACT score to predict freshman GPA (the ACT was taken in the past; GPA is the criterion)
- A test with high predictive validity is useful
Incremental Validity (A type of predictive validity)
- Degree to which an additional predictor explains variance in the criterion beyond the original predictor
Ex: High School GPA added to the above example
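Incremental validity can be illustrated as the gain in explained variance (R²) when a second predictor enters a regression. A sketch with made-up ACT, high-school GPA, and freshman GPA numbers:

```python
import numpy as np

# Hypothetical data: ACT scores, high-school GPA, and freshman GPA
act    = np.array([20, 24, 28, 30, 33, 25, 22, 29])
hs_gpa = np.array([2.8, 3.1, 3.6, 3.5, 3.9, 3.0, 3.2, 3.7])
fr_gpa = np.array([2.5, 2.9, 3.4, 3.3, 3.8, 2.8, 3.0, 3.6])  # criterion

def r_squared(X, y):
    """Proportion of criterion variance explained by a linear model."""
    X = np.column_stack([np.ones(len(y)), X])   # add intercept column
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_act  = r_squared(act.reshape(-1, 1), fr_gpa)
r2_both = r_squared(np.column_stack([act, hs_gpa]), fr_gpa)

# Incremental validity of HS GPA = gain in explained variance
print(f"ACT alone: {r2_act:.3f}, ACT + HS GPA: {r2_both:.3f}")
```

If `r2_both` is meaningfully larger than `r2_act`, high-school GPA adds incremental validity over the ACT alone.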
Criterion Characteristics
- Uncontaminated
- Relevant
- Valid (ratings/tests used as criteria must themselves be valid)
Criterion Contamination
- The criterion is based in part on the predictor (i.e., the test itself influences the criterion)
- Ex: Develop a depression measure based on the BDI (Beck Depression Inventory) and then validate it against patient diagnoses that were also based on the BDI
Standard Error of Estimate (SEE)
- Margin of error expected in the predicted criterion score
- Decreases as the correlation between the test score and criterion increases
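The standard textbook formula is SEE = SDy · √(1 − r²), where SDy is the criterion's standard deviation and r is the validity coefficient. A minimal sketch with hypothetical values:

```python
import math

def standard_error_of_estimate(sd_criterion, r):
    """SEE = SD_y * sqrt(1 - r^2): expected spread of actual criterion
    scores around the predicted score. Higher validity -> smaller SEE."""
    return sd_criterion * math.sqrt(1 - r ** 2)

# With a perfectly valid test (r = 1) there is no prediction error
print(standard_error_of_estimate(15, 1.0))  # 0.0
# With r = 0 the error equals the criterion's full SD
print(standard_error_of_estimate(15, 0.0))  # 15.0
# A moderately valid test shrinks, but does not remove, the error
print(standard_error_of_estimate(15, 0.6))  # 12.0
```

This is why the SEE belongs to validity: it quantifies how far off predictions of the criterion are likely to be.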
Standard Error of Measurement (SEM)
For reliability
Standard Error of Estimate (SEE)
For Validity
- When the measure tries to predict the criterion
Construct
An informed, scientific idea hypothesized to describe or explain behavior
Ex: intelligence, leadership
Construct Validity
- Appropriateness of inferences drawn from test scores regarding standing on a construct
- Referred to as “Umbrella Validity”
Evidence of Construct Validity
- Evidence of homogeneity - How uniform a test is in measuring a single concept
- Evidence of changes with age - Some constructs are expected to change over time (e.g., reading rate), while others are not (e.g., marital satisfaction)
- Evidence of pretest/posttest changes - test scores change as a result of some experience between a pretest and a posttest (therapy)
- Evidence from distinct groups - Scores on a test vary in a predictable way as a function of membership in some group (scores on the psychopathy checklist for prisoners vs. civilians)
To test for construct validity, look to the theory
- No one way to test, solely based on your construct and the theory behind your construct
- Does it theorize that your construct is a single construct?
- Does it theorize it should look different between people who have been treated and those who have not?
- Does it theorize it should change over time?
Other Forms of Evidence
Your test scores should correlate with other “tried and true” tests of the same thing
Convergent Validity
A test that correlates highly with other tests of the same construct
Should your tests correlate with scores of other non-related variables?
No
Discriminant (Divergent) Validity
A test does not correlate with other tests of different constructs
Factor
Characteristics, dimensions, or attributes that people differ on
Factor Analysis
- A family of methods that classifies several items into groups of related factors or latent variables
- Done by finding similarities among the items
- Most often used in survey research to see if a long series of questions can be grouped into smaller sets of questions
Two Types of Factor Analysis
- Exploratory
- Confirmatory
Exploratory
You do not have a hypothetical model and let data guide the factors
Confirmatory
You have a hypothetical model and test that
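Real factor analysis uses dedicated estimation methods, but the core idea can be sketched with a PCA-style eigendecomposition of a made-up correlation matrix: items that correlate strongly with each other cluster onto the same factor, and large eigenvalues flag how many factors the data support.

```python
import numpy as np

# Hypothetical correlation matrix for 4 survey items: items 1-2
# correlate strongly with each other, as do items 3-4, hinting at
# two underlying factors (latent variables)
R = np.array([
    [1.0, 0.8, 0.1, 0.1],
    [0.8, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
])

# PCA-style extraction (a simplification of exploratory factor
# analysis): eigenvalues show how much variance each candidate
# factor accounts for; eigenvectors show which items load on it
eigenvalues, eigenvectors = np.linalg.eigh(R)
eigenvalues = eigenvalues[::-1]  # eigh returns ascending; sort descending
print(np.round(eigenvalues, 2))  # [2.  1.6 0.2 0.2]
```

By the common "eigenvalue greater than 1" rule of thumb, two factors would be retained here, matching the two item clusters built into the matrix.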
Reliability - Validity Relationship
Without strong reliability a test cannot be valid; however, a test can be reliable without strong validity
- Attenuation is the weakening of validity due to poor reliability
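Attenuation can be quantified with Spearman's correction formula, r_true = r_xy / √(r_xx · r_yy), which estimates what the validity coefficient would be if both the test and the criterion were perfectly reliable. A sketch with hypothetical reliabilities:

```python
import math

def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman's correction: estimated true-score correlation, given
    the observed validity coefficient r_xy and the reliabilities of
    the test (rel_x) and the criterion (rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)

# Hypothetical: observed validity .42 with test reliability .70 and
# criterion reliability .80 - unreliability hid a stronger relation
print(round(correct_for_attenuation(0.42, 0.70, 0.80), 2))  # 0.56
```

The same formula explains the reliability ceiling: since r_true cannot exceed 1, the observed validity coefficient can never exceed √(r_xx · r_yy).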