Psychometrics: validity Flashcards
What is validity?
refers to whether a test measures what it is intended to measure
aim of establishing validity
to be able to make accurate inferences from scores on a test and to give meaning to test scores
-indicates the usefulness of a test
relationship between validity and reliability
- if a test is not reliable, it cannot be valid, so there is no point testing validity without first establishing reliability
- a test can, however, be reliable without being valid
4 types of validity
- face validity
- content validity
- criterion validity
- construct validity
Face Validity
- when a test seems on the surface to measure what it is supposed to measure
- a test can have good face validity but still not actually be valid
how face validity is measured
- researchers simply look at the items and give their opinion on whether the items appear to measure what they are intended to measure
- least scientific
4 aspects of evaluating face validity
- readability
- layout and style
- clarity of wording
- feasibility
disadvantages of face validity
- many don't consider this a measure of validity at all
- refers to what a test appears to measure rather than what it actually measures
- determined through review and not statistical analysis
Content validity
- the degree to which a test measures an intended content area
- non-statistical
- Do the questions/items on a test make up a representative sample of the attribute the test is supposed to measure
how to reach content validity
- Specifying the content area covered by the phenomenon when developing the construct definition
- Writing questionnaire or scale items that are relevant to each of the content areas
- Developing a measure of the construct that includes the best (most representative) items from each content area
construct under-representation (aspect of content validity)
the test fails to capture important components of the construct
construct-irrelevant variance (aspect of content validity)
when test scores are influenced by things other than the construct the test is supposed to measure
How is content validity established?
- judgement by expert judges
- content validity = number of relevant items / total number of items
- can also use statistical methods like factor analysis
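The relevant/total formula above can be sketched in a few lines of Python. The judge ratings below are made up for illustration; an item is counted as relevant when a majority of judges rate it relevant:

```python
# Hypothetical sketch: a simple content validity index from expert
# judges' ratings (1 = relevant, 0 = not relevant). Data is invented.
ratings = {
    "item_1": [1, 1, 1],   # all three judges rate item 1 as relevant
    "item_2": [1, 0, 1],
    "item_3": [0, 0, 1],
}

# An item counts as relevant if a majority of judges rate it relevant.
relevant = [item for item, votes in ratings.items()
            if sum(votes) > len(votes) / 2]

# content validity = number of relevant items / total number of items
content_validity = len(relevant) / len(ratings)
print(content_validity)  # 2 relevant items out of 3
```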
Criterion validity
- how well a test score estimates/predicts a criterion behaviour or outcome, now or in the future
- eg. depression inventory
- easy for ability tests but hard for personality/attitude tests
Why would we be interested in using criteria to create a new measurement procedure?
- Create a shorter version of a well-established measure
- To account for a new context, location and/or culture
- To help test the theoretical relatedness of a well-established measurement procedure
concurrent validity
- the extent to which test scores can correctly identify the current state of individuals
- Measure concurrent criterion validity by correlating scores on our new test to scores on an already established test
predictive validity
- do scores on a test predict a future event successfully?
- the test is the predictor
- the future event is the criterion
How is Criterion Validity Evaluated?
- correlation coefficients
- coefficient of determination (the square of the validity coefficient)
- Standard error of estimate (SEE) - a high SEE means greater deviation of criterion scores from predicted criterion scores (bad)
- success ratio (SR) - the proportion of predicted successes on the criterion that turned out to actually be successful
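The first three statistics above can be computed directly. A minimal Python sketch with invented test and criterion scores (the validity coefficient is the Pearson correlation between them, the coefficient of determination is its square, and the SEE shrinks as the correlation grows):

```python
import math

# Hypothetical sketch with made-up data: 'test' holds predictor scores
# on the new test, 'criterion' holds the criterion outcome scores.
test      = [10, 12, 14, 16, 18, 20]
criterion = [22, 25, 27, 33, 35, 38]

n = len(test)
mx, my = sum(test) / n, sum(criterion) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(test, criterion))
sxx = sum((x - mx) ** 2 for x in test)
syy = sum((y - my) ** 2 for y in criterion)

r = sxy / math.sqrt(sxx * syy)     # validity coefficient
r2 = r ** 2                        # coefficient of determination
sd_y = math.sqrt(syy / n)
see = sd_y * math.sqrt(1 - r2)     # standard error of estimate

print(round(r, 3), round(r2, 3), round(see, 3))
```

A high r with a small SEE means predicted criterion scores stay close to actual criterion scores, which is what good criterion validity evidence looks like.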
Construct validity
- the degree to which a test measures the theoretical construct it is intended to measure
- a construct is something that we think exists, but is not directly observable or measurable
e.g., we can directly measure 10ml of water – water is directly observable and measurable
BUT we cannot directly measure 10ml of depression – depression is a construct, it is not directly observable and measurable
How do we measure constructs?
- we look at the relationship between the construct and other constructs
- What observable behaviours can we expect if a person has a high (or low) score on a test measuring this construct?
The relationships between one construct and others
- look for convergent validity evidence AND divergent/discriminant validity evidence
- for a test to have good validity, it needs to have both convergent AND discriminant validity evidence
Convergent validity
Scores on a test have high correlations with other tests that measure the similar constructs
e.g., depression tests should correlate highly with tests of sadness or anxiety
Discriminant validity (divergent)
Scores on a test have low correlations with other tests that measure different constructs
e.g., A questionnaire on racism should have little or no correlation with gender, for example
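As a rough Python sketch, convergent evidence shows up as a high correlation with a similar construct and discriminant evidence as a low correlation with an unrelated one. The depression, anxiety, and maths scores below are invented for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data: anxiety tracks depression closely, maths does not.
depression = [4, 9, 6, 12, 15, 7]
anxiety    = [5, 10, 7, 11, 14, 8]
maths      = [55, 60, 80, 47, 66, 71]

print(round(pearson(depression, anxiety), 2))  # high -> convergent evidence
print(round(pearson(depression, maths), 2))    # near zero -> discriminant evidence
```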
Criterion-groups validity
- groups that are expected to differ should score differently on tests
- e.g., people with autism would be expected to score differently on an empathy scale than groups expected to be high in empathy (e.g. counsellors)
Validity for Criterion-Referenced Tests
- Criterion-referenced tests compare performance with some clearly defined criterion for learning
- Are often ‘high stakes’ tests – e.g., pass a test before you can practice in some discipline, like being an electrician etc.
- Measure proficiency in something – this ranges from no proficiency at all to perfect proficiency
Establishing Validity for Criterion-Referenced Tests
- Compare scores on the test before and after a program of instruction
- Compare scores on the test with scores on a test related to the criterion
factors affecting validity
- reliability (Can have reliability without validity BUT must demonstrate reliability before validity)
- social diversity (tests may not be equally valid for different social/cultural groups)
- variability (a restricted range of scores lowers validity coefficients)