Week 2 - Reliability Flashcards

1
Q

Most common misconception about reliability and validity

A

Reliability and validity are not properties of a test itself; they are properties of the test in a particular situation or for a particular purpose.

2
Q

Define reliability

A

The consistency with which a test measures what it purports to measure in any given set of circumstances. If something is reliable, it can be depended on.

Does the test produce consistent responses?

3
Q

What is social desirability bias?

A

A form of method variance, common in the construction of psychological tests of personality, that arises when people respond to questions in a way that places them in a favourable light rather than truthfully.

4
Q

The domain-sampling model

A

The test or assessment device draws a sample from a larger set of possible items to give a score, so the score is an estimate.
If all possible questions had been asked, we would have the true position.
Thus, reliability becomes a question of sampling test items from a domain of all possible items.

5
Q

Standard error of measurement

A

an expression of the precision of an individual test score as an estimate of the trait it purports to measure
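The standard error of measurement is commonly computed as SEM = SD × √(1 − r), where SD is the test's standard deviation and r its reliability coefficient. A minimal sketch; the test values (SD = 15, r = 0.91) are invented for illustration:

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - r): the expected spread of observed scores
    around a person's true score."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical test with SD = 15 and reliability r = 0.91:
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
```

Note how higher reliability shrinks the SEM: with r = 1.0 the SEM is zero, so each observed score would equal the true score.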

6
Q

Reliability coefficient

A

an index, usually Pearson's r, of the ratio of true score variance to observed score variance in a test given in a particular set of circumstances.
The proportion of observed score variance that is due to true score variance. A reliability of 0.5 is sometimes taken as the minimum acceptable level.
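The ratio interpretation can be illustrated by simulation under classical test theory (observed = true + error): generate true scores, add random error, and the reliability is the share of observed variance contributed by the true scores. All numbers here are invented:

```python
import random

random.seed(1)  # reproducible example
true = [random.gauss(100, 15) for _ in range(20000)]   # true scores
error = [random.gauss(0, 5) for _ in range(20000)]     # measurement error
observed = [t + e for t, e in zip(true, error)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true variance / observed variance; with SDs of 15 and 5
# the theoretical value is 225 / (225 + 25) = 0.90.
print(round(variance(true) / variance(observed), 2))
```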

7
Q

What’s the oldest way to calculate reliability of a test?

A

Construct two forms of the same test and see whether the different items agree in the scores they yield.

  • Minimises practice effects.
  • However, if the forms lead to different scores, then one of them (but we don't know which) cannot be depended on.
8
Q

What is split-half reliability?

A

Split the test in half and compare the scores, e.g. scores on odd-numbered items compared with scores on even-numbered items. By correlating the two half-scores (with a large enough sample of participants) you get an estimate of reliability.
(When you have larger samples, just use the whole test.)
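A sketch of the odd/even split with made-up item data. The final step applies the Spearman-Brown correction, since the correlation between two half-tests underestimates the reliability of the full-length test:

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(item_scores):
    """item_scores: one list of item scores per participant."""
    odd = [sum(p[0::2]) for p in item_scores]    # items 1, 3, 5, ...
    even = [sum(p[1::2]) for p in item_scores]   # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    # Spearman-Brown: step the half-test correlation up to full length.
    return 2 * r_half / (1 + r_half)

# Invented right/wrong scores for 5 participants on a 4-item test:
scores = [[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(round(split_half_reliability(scores), 2))  # 0.89
```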

9
Q

Pros and cons of split half reliability?

A

With the odd-even method, fatigue effects are the same for both halves of the test.
But for speeded tests (those with a time limit), it is not recommended.

The odd-even split is also arbitrary, and different methods of splitting have different pros and cons.

10
Q

How to work out Cronbach’s alpha?

A

Split the test into subtests of one item each. Correlate every subtest with every other subtest; the average correlation is the reliability,
i.e. the ‘internal consistency’ of the test.
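The average-correlation description above corresponds to the "standardized" form of alpha; the raw formula usually computed in practice uses item variances and the total-score variance. A minimal sketch with invented data:

```python
def cronbach_alpha(item_scores):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    item_scores: one list of item scores per participant."""
    k = len(item_scores[0])  # number of items

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [variance([p[i] for p in item_scores]) for i in range(k)]
    total_var = variance([sum(p) for p in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Invented right/wrong scores for 5 participants on a 4-item test:
scores = [[1, 1, 1, 0], [1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 1, 1], [0, 0, 0, 0]]
print(round(cronbach_alpha(scores), 2))  # 0.79
```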

11
Q

Cons of Cronbach’s alpha? (4)

A
  • Tests with high internal consistency can simply have items with similar content.
  • Although faithfully sampling a domain, the domain might be trivial.
  • High internal consistency does not mean the test is measuring the intended construct. The items might be interrelated but not homogeneous/unidimensional.
  • If multiple factors (traits) underlie performance on a test, alpha can overestimate the reliability of the factor thought to underlie the test.
    So Confirmatory Factor Analysis might be better.
12
Q

What is test-retest reliability?

A

the estimate of reliability obtained by correlating scores on the test from two or more occasions of testing. The stronger the correlation, the more reliable the test.

Important when retesting patients to see whether they are getting worse, etc. If scores drift over time, a more reliable test is needed.

13
Q

What is generalisability theory?

A

Cronbach (1972): in obtaining scores from a test, the user seeks to generalise beyond the particular score to some wider universe of behaviour. Users must SPECIFY the desired range of conditions over which this is to hold.

14
Q

What is inter-rater reliability? What is the best method of obtaining it?

A

Correlating scores across different judges or raters; consistency between raters.

15
Q

How reliable does a test need to be?

A

VERY reliable, if the test has serious consequences for the individual; but if the test is still being developed, a lower level of reliability will suffice.
Nunnally's (1967) rule of thumb: 0.5 or better for a test developer, 0.7 or better for research, and better than 0.9 for individual assessment.

16
Q

Which domains have most reliable tests?

A

Cognitive abilities, followed by personality tests, followed by projective techniques (it has even been questioned whether projective techniques should be psychometrically evaluated at all).

17
Q

What is parallel/alternate-forms reliability?

A

the estimate of reliability obtained by comparing two forms of a test constructed to measure the same construct from the same pool of items. Used to eliminate practice effects when the same group takes the test more than once.

18
Q

What can you do to increase reliability?

A

Add more items, although the gain is not linear: doubling the number of items does not double the reliability. For inter-rater reliability, improve the raters' training on the characteristic being judged and the meaning of the points on the rating scale being used.
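The non-linear effect of lengthening a test is captured by the Spearman-Brown prophecy formula. A sketch with an invented starting reliability of 0.6, showing that doubling the items does not double the reliability:

```python
def spearman_brown(r: float, length_factor: float) -> float:
    """Projected reliability when test length changes by length_factor
    (e.g. 2.0 = doubling the number of items)."""
    return (length_factor * r) / (1 + (length_factor - 1) * r)

print(round(spearman_brown(0.6, 2.0), 2))  # 0.75 (not 1.2)
print(round(spearman_brown(0.6, 4.0), 2))  # 0.86 (diminishing returns)
```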

19
Q

Internal consistency

A

How well each item correlates with each other item; whether the items are measuring the same thing.

Correlation between performance on each item and overall performance; Cronbach's α.

20
Q

Confidence and reliability (3)

A
  • The acceptable range of reliability depends on the variable being measured (see Field, 2013, p. 709)
    » Unstable (dynamic) aspects are less reliable than stable (static) traits
    » Lower levels of reliability = less confidence in the test data