Psychometrics: reliability Flashcards

1
Q

What is a reliable test?

A
  • consistency in measurement
  • the precision with which the test score measures achievement

2
Q

What is reliability?

A
  • the desired consistency or reproducibility of test scores (does it give the same accurate measurement each time it is used?)
  • no test is free from error
3
Q

Reliability formula

A

X = T + e

X - the observed score
T - the true score
e - the error
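A made-up worked example of the formula: if a person's true score is T = 50 and the random error on one sitting is e = +3, the observed score is X = 50 + 3 = 53; on another sitting the error might be e = -2, giving X = 48, even though T has not changed.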

4
Q

The Four Assumptions of Classical Test Score Theory

A
  1. Each person has a true score that we could obtain if there were no measurement error
  2. there is measurement error, but this error is random
  3. the true score of an individual doesn't change with repeated applications of the same test, even though their observed score does
  4. the distribution of random errors, and thus of observed test scores, will be the same for all people
5
Q

Standard Error of measurement (SEM)

A

-estimates how much measurement error a test has by working out how much, on average, an observed score on the test differs from the true score
(it is the standard deviation of the error distribution)

6
Q

Problems with Classical Test Score Theory

A
  1. Population dependent
  2. Test dependent
  3. Assumption of equal measurement error for everyone
7
Q

Domain Sampling Model

A
  • a central concept of Classical Test Theory
  • we can't ask all possible questions on a test, so we only use a sample of test items
  • using fewer test items can introduce error
  • as the sample of items gets larger, the estimate becomes more accurate
8
Q

4 Types of reliability

A
  1. Test-retest reliability
  2. Parallel forms reliability
  3. Internal consistency
  4. Inter-rater reliability
9
Q

Test-retest reliability

A
  • give someone a test and then give them the same test again later on
  • if the two sets of scores are highly correlated, the test has good test-retest reliability (see the sketch below)
  • correlation between the 2 scores = coefficient of stability
  • time sampling
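A minimal sketch of the calculation in Python, assuming NumPy and made-up scores for six people tested twice; the numbers are illustrative only:

    import numpy as np

    # Hypothetical scores for 6 people at time 1 and time 2 (made-up data)
    time1 = np.array([12, 18, 25, 30, 22, 15])
    time2 = np.array([14, 17, 27, 29, 21, 16])

    # Pearson correlation between the two administrations
    # = coefficient of stability (test-retest reliability)
    r = np.corrcoef(time1, time2)[0, 1]
    print(f"coefficient of stability: {r:.2f}")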
10
Q

Issues with test-retest

A
  • can it be used when measuring mood/stress?
  • scores may increase simply because people have done the test before (practice effects)
  • what if the thing being measured changes?
  • what if an event happens between test administrations that changes the thing being tested?
11
Q

Parallel forms reliability

A
  • 2 forms of the same test (questionnaires with different items)
  • correlation between the two = coefficient of equivalence
  • item sampling
12
Q

Ways to change test in parallel forms reliability

A
  • question response alternatives are reworded
  • item order is changed
  • wording of questions is changed
13
Q

Issues with parallel forms reliability

A
  • what if different forms are given at two different times?
  • do you give the form to the same or different people?
  • what if people work out how to answer the one form from doing the other form?
  • do you already have two forms of the test, or do you need to develop a second form of the same test?
14
Q

Internal Consistency

A

-do different items within a test all measure the same thing, to an extent?

15
Q

Examples of internal consistency tests

A
  • split-half reliability
  • KR20
  • coefficient alpha
16
Q

Split-half reliability

A
  • test is split in half and each half is scored separately
  • total scores for the two halves are correlated (see the sketch below)
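A minimal sketch of split-half scoring in Python, assuming a made-up people-by-items response matrix; the odd/even split shown is only one of many possible splits:

    import numpy as np

    # Hypothetical item responses: 5 people (rows) x 6 items (columns), made-up data
    scores = np.array([
        [1, 0, 1, 1, 0, 1],
        [1, 1, 1, 1, 1, 1],
        [0, 0, 1, 0, 0, 0],
        [1, 1, 0, 1, 1, 1],
        [0, 1, 0, 1, 0, 1],
    ])

    # Split the test into odd and even items and score each half
    half1 = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5
    half2 = scores[:, 1::2].sum(axis=1)   # items 2, 4, 6

    # Correlate total scores on the two halves
    r_half = np.corrcoef(half1, half2)[0, 1]
    print(f"split-half correlation: {r_half:.2f}")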

17
Q

advantage of split-half reliability

A

-only need one test (don't need 2 forms)

18
Q

challenge of split-half reliability

A

-how to divide the test into equivalent halves

19
Q

issues with split-half reliability

A
  • by splitting the test, each half has fewer items, and fewer items means lower reliability
  • the correlation changes depending on how the items are split into halves
20
Q

Spearman-Brown formula

A

the solution to the problem with split tests: each half will have reduced reliability compared to the full-length test (see the sketch below)
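A small sketch of the standard two-half Spearman-Brown correction (the formula, 2r / (1 + r), is standard background rather than something stated on this card), where r is the correlation between the two halves:

    def spearman_brown(r_half):
        # Spearman-Brown corrected reliability of the full-length test,
        # given the correlation between its two halves
        return (2 * r_half) / (1 + r_half)

    print(spearman_brown(0.70))  # roughly 0.82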

21
Q

Coefficient/Cronbach’s Alpha

A
  • estimates the consistency of responses to different scale items
  • takes the average of all possible split-half correlations for a test
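A minimal sketch of one standard way to compute coefficient alpha in Python, from item variances and total-score variance rather than by literally averaging all split-half correlations; the response matrix is made up:

    import numpy as np

    # Hypothetical responses: 5 people (rows) x 4 items (columns), made-up data
    items = np.array([
        [3, 4, 3, 4],
        [2, 2, 3, 2],
        [5, 4, 5, 5],
        [1, 2, 1, 2],
        [4, 4, 3, 4],
    ])

    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores

    # Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / total variance)
    alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
    print(f"alpha: {alpha:.2f}")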
22
Q

What do the coefficient results mean? Cronbach's A

A

0- no consistency in measurement

1- perfect consistency in measurement

23
Q

what level of reliability is appropriate? Cronbach’s A

A
  • .7 - exploratory research
  • .8 - basic research
  • .9 - applied scenarios
24
Q

Cronbach’s alpha can be affected by

A
  1. multidimensionality
  2. bad test items
  3. number of items
25
Q

Inter-rater reliability

A
  • measures how consistently 2 or more judges agree on rating something
  • assessed by correlating the raters' scores
26
Q

Cohen’s kappa

A

-for 2 judges/raters
-ranges from 1 (perfect agreement) to -1 (agreement less than would be expected by chance)
>0.75 - excellent agreement
0.4-0.7 - satisfactory
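A minimal sketch of Cohen's kappa for two raters in Python, computed from observed and chance agreement; the ratings are made-up categorical labels:

    from collections import Counter

    # Hypothetical categorical ratings by two judges for 10 cases (made-up data)
    rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "no", "yes", "yes"]
    rater2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no", "yes", "yes"]

    n = len(rater1)

    # Observed agreement: proportion of cases where the raters give the same label
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n

    # Expected agreement by chance, from each rater's marginal label proportions
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[label] / n) * (c2[label] / n) for label in c1)

    # Cohen's kappa: agreement beyond chance, relative to the maximum possible
    kappa = (p_o - p_e) / (1 - p_e)
    print(f"kappa: {kappa:.2f}")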

27
Q

Fleiss’ kappa

A

for 2 or more judges/raters

28
Q

Intra-class correlation (ICC)

A

used for inter-rater reliability when rating interval and ordinal measurements

29
Q

ICC vs Cohen/Fleiss kappa

A
  • ICC for continuous data (interval and ordinal)
  • kappa for observations in a category (nominal/categorical data)

30
Q

SEM calculation

A

SEM = s * sqrt(1 - r)

s - standard deviation of the test scores
r - reliability of the test
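A made-up worked example: with s = 15 and r = 0.91, SEM = 15 * sqrt(1 - 0.91) = 15 * 0.3 = 4.5.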

31
Q

confidence intervals using SEM

A
  • z score for a 95% confidence interval = 1.96
  • lower bound = X - 1.96 * SEM
  • upper bound = X + 1.96 * SEM
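Continuing the made-up example above (observed score X = 100, SEM = 4.5): lower bound = 100 - 1.96 * 4.5 ≈ 91.2, upper bound = 100 + 1.96 * 4.5 ≈ 108.8.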
32
Q

Factors influencing reliability

A
  1. number of items in scale
  2. variability of the sample (reliability estimates are higher with a more heterogeneous sample)
  3. extraneous variables (testing situation, ambiguous items, unstandardised procedures, perceived demand effect)
33
Q

how to improve reliability

A
  1. item analysis
  2. Use identical instructions
  3. Eliminate questions that evoke inconsistent responses
  4. Cover entire range of the dimension
  5. Clear conceptualization
  6. Standardization
  7. Inter-rater training
  8. Use more precise measurement
  9. Use multiple indicators
  10. Pilot-testing