Reliability and Validity Flashcards
What is a psychometric test?
A standardised test that uses psychological measurements to quantify a person’s ability, strengths or characteristics
What does a psychometric test consist of?
One or more stimuli that people respond to. Responses can be overt (e.g., key press) or covert (e.g., skin conductance response)
How are psychometric tests standardised?
Data is collected from a large number of people to establish norms, including cut-offs. This allows identification of individuals who fall inside or outside the population norms.
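To make the idea of norms concrete, here is a minimal sketch (not from the cards; the sample data and the 2-SD cut-off are hypothetical) of how a raw score is compared against population norms via a z-score:

import numpy as np

# Hypothetical norm data from a large standardisation sample
rng = np.random.default_rng(0)
norm_scores = rng.normal(loc=100, scale=15, size=10_000)
norm_mean, norm_sd = norm_scores.mean(), norm_scores.std()

# Express an individual's raw score relative to the norms as a z-score
raw_score = 130
z = (raw_score - norm_mean) / norm_sd

# An example cut-off: flag scores more than 2 SDs from the norm mean
print(f"z = {z:.2f}, outside norms: {abs(z) > 2}")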
What are two key properties of a good psychometric test?
Reliability (internal properties of the scale) and validity (external properties of the scale)
What is reliability?
Consistency/stability of a measure across:
- Time
- Setting
- Individuals
‘Quality’ of measurement (the extent to which you can measure a person’s ‘true’ score each time)
What is True Score Theory?
A person’s ‘true’ score reflects their genuine ability, characteristics, or potential. Every observed score is made up of the true score plus some measurement error.
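In standard true-score notation this decomposition is:

Observed score (X) = True score (T) + Measurement error (E)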
What are the two sources of variability (error) in observed scores?
- Variability in ‘true’ scores within a population (individual differences)
- Fluctuations in measurement error (systematic and random)
What are the two types of measurement error that affect reliability?
Systematic error
Random error
How does systematic error affect reliability?
Systematic error consistently affects measurement (bias), e.g.:
- A fire alarm during an exam
- Race/gender bias in the test items
- Driving errors because the test equipment has a faulty controller
Systematic errors shift the mean of the data: everyone's score shifts in the same direction.
How does random error affect reliability?
- Random variations are not consistent across the sample (noise/chance)
- Test takers might be hungry, tired, nervous, etc., but not everyone taking the test might be in the same state
- Random variations increase the variability of the data
- (error is added to or subtracted from each person's true score at random)
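A small simulation (illustrative only; all numbers are hypothetical) makes the contrast concrete: systematic error shifts the mean while leaving the spread alone, whereas random error leaves the mean alone but inflates the variability:

import numpy as np

rng = np.random.default_rng(1)
true_scores = rng.normal(loc=50, scale=10, size=1_000)  # hypothetical 'true' scores

# Systematic error: the same bias applied to everyone (e.g., a fire alarm)
systematic = true_scores - 5
# Random error: independent noise for each person (e.g., hunger, fatigue)
noisy = true_scores + rng.normal(loc=0, scale=5, size=true_scores.size)

print(f"true:       mean={true_scores.mean():.1f}, sd={true_scores.std():.1f}")
print(f"systematic: mean={systematic.mean():.1f}, sd={systematic.std():.1f}")  # mean shifts
print(f"random:     mean={noisy.mean():.1f}, sd={noisy.std():.1f}")            # sd grows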
What is reliability determined by?
How much error variance is in the measurement
How do we calculate reliability?
Reliability = variability of true scores / variability of observed scores
*The greater the denominator relative to the numerator (i.e., the more error variance), the less reliable the measure
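Writing this out with the true-score decomposition (X = T + E, assuming T and E are uncorrelated):

Reliability = Var(T) / Var(X) = Var(T) / (Var(T) + Var(E))

So with no error variance, reliability is 1; as Var(E) grows, the denominator grows and reliability falls towards 0.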
What are the types of reliability?
Test-retest
Alternate/parallel forms
Internal consistency (including coefficient alpha, split-half/inter-item reliability, and item-total reliability)
Inter-rater agreement
What are the types of validity?
Construct validity (including content and face validity)
Criterion validity (including predictive, known groups, convergent and discriminant validity)
What is predictive validity?
The extent to which data from a measure can predict something it should theoretically be able to predict. For example, does a high extraversion score predict the number of friends someone has?
What is convergent validity?
The degree to which multiple measures of the same construct show similar results. For example, do self-report and clinician-rated measures of depression show similar scores?
What is discriminant validity?
The extent to which a measure does NOT relate to constructs it should NOT relate to. For example, a measure of verbal ability should not be strongly correlated with measures of athletic ability.
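In practice, convergent and discriminant validity are often checked with simple correlations. A minimal sketch (all variables and data here are hypothetical):

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(2)
n = 200
depression = rng.normal(size=n)                           # latent construct
self_report = depression + rng.normal(scale=0.5, size=n)  # measure 1
clinician = depression + rng.normal(scale=0.5, size=n)    # measure 2
athletic = rng.normal(size=n)                             # unrelated construct

r_conv, _ = pearsonr(self_report, clinician)  # convergent: should be high
r_disc, _ = pearsonr(self_report, athletic)   # discriminant: should be near 0
print(f"convergent r = {r_conv:.2f}, discriminant r = {r_disc:.2f}")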
What are the two types of errors in measurement?
Systematic error (bias)
Random error (noise/chance)
What is systematic error?
Error that consistently affects measurement. Examples include a fire alarm during an exam or race/gender bias in test items. Systematic errors shift the mean of the data in the same direction for everyone.
What is random error?
Random variations in scores that don’t have a consistent effect across the sample. Examples include test-takers being hungry, tired or nervous. Random error increases the variability of the data.
What is test-retest reliability?
A type of reliability assessed by administering the same test to the same people on multiple occasions and calculating the correlation between scores. A higher correlation indicates higher reliability.
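Computationally this is just a correlation between the two administrations; a minimal sketch with hypothetical scores:

from scipy.stats import pearsonr

# Hypothetical scores for the same people at time 1 and time 2
time1 = [12, 18, 25, 30, 22, 15, 28, 20]
time2 = [14, 17, 27, 29, 21, 16, 26, 22]

r, p = pearsonr(time1, time2)
print(f"test-retest reliability r = {r:.2f}")  # closer to 1 = more reliable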
What considerations are important for test-retest reliability?
The time interval between tests is important. Longer intervals allow more room for natural change, while shorter intervals can lead to carryover effects. Test-retest reliability is only useful for stable characteristics.
What is alternate/parallel forms of reliability?
A type of reliability assessed by developing multiple versions of a test, administering each version to the same participants on separate occasions, and correlating the results. A higher correlation indicates better reliability.
What is split-half reliability?
A type of reliability assessed by administering a test to participants, splitting the test in half, and computing the correlation between the two halves.
What are the problems with split-half reliability?
The way that the test is split is important, as different splits can yield different reliability values. Reliability is also reduced because the number of items in each half is smaller.
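A minimal sketch of a split-half computation, using an odd/even item split plus the Spearman-Brown correction to compensate for the halved test length (the data are hypothetical, simulated so that the items share a common trait):

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
trait = rng.normal(size=(100, 1))                       # shared 'true' trait
items = trait + rng.normal(scale=1.0, size=(100, 10))   # 10 noisy items

half1 = items[:, 0::2].sum(axis=1)  # odd-numbered items
half2 = items[:, 1::2].sum(axis=1)  # even-numbered items

r_half, _ = pearsonr(half1, half2)
# Spearman-Brown correction: estimate reliability of the full-length test
r_full = (2 * r_half) / (1 + r_half)
print(f"split-half r = {r_half:.2f}, corrected = {r_full:.2f}")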
What is coefficient alpha?
A measure of internal consistency reliability based on how strongly the test items correlate with one another. The most common version is Cronbach's alpha.
What is a problem with coefficient alpha?
It is sensitive to test length: adding more items can inflate the reliability estimate even if the items are poorly inter-correlated.
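For reference, Cronbach's alpha can be computed directly from its standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); a minimal NumPy sketch with hypothetical data:

import numpy as np

def cronbach_alpha(items):
    # Cronbach's alpha for an (n_people, n_items) score matrix
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
trait = rng.normal(size=(100, 1))
scores = trait + rng.normal(scale=1.0, size=(100, 8))  # 8 noisy items
print(f"alpha = {cronbach_alpha(scores):.2f}")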
What are the two main types of validity?
Construct validity (how well a measure reflects the intended construct) and criterion validity (how well a measure relates to concrete, observable criteria)
What are the subtypes of construct validity?
Content validity
Face validity
What is content validity?
The extent to which the content of each test item measures the intended construct
What is face validity?
A superficial measure of whether the test appears to measure the intended construct. It can be important because it can affect how respondents approach a test.
What is known-groups validity?
The extent to which a measure differentiates between groups who should theoretically perform differently on it
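One common way to check known-groups validity is to compare group means, e.g., with a t-test; a minimal sketch with hypothetical data:

from scipy.stats import ttest_ind

# Hypothetical anxiety-scale scores for a clinical group and a control group
clinical = [32, 28, 35, 30, 33, 29, 31]
control = [18, 22, 20, 17, 21, 19, 23]

t, p = ttest_ind(clinical, control)
print(f"t = {t:.2f}, p = {p:.4f}")  # a clear difference supports known-groups validity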
What are some good practices for constructing valid test items?
Avoid ambiguous language
Avoid items that might cause response bias
Use a blend of positively and negatively keyed items
Ensure items assess all aspects of a construct
What is social desirability and why is it important to consider in test construction?
The tendency for people to want to present themselves in a positive light. Items that no one wants to rate themselves high or low on tend to be poor items because they fail to capture the range of variability in the construct. These items can be high in reliability but low in validity.