Lecture 4: Essentials of Reliability Flashcards

1
Q

Reliability

A
  • suggests trustworthiness
  • quality of test scores that suggests they are sufficiently consistent and free from measurement error
  • consistency and precision of the results of the measurement process
2
Q

Measurement error

A

any fluctuation in scores that results from factors related to the measurement process that are irrelevant to what is being measured
- reliable scores should be free of measurement error

3
Q

True score

A
  • hypothetical entities that would result from error-free measurement
  • goal of reliability analysis: to estimate true scores
4
Q

Individual’s true score

A

the average score in a hypothetical distribution of scores that would be obtained if the individual took the same test an infinite number of times

5
Q

observed score

A

derived from tests (= scores that the individuals actually obtain)

6
Q

any observed score (Xo) is made up of two components

A
  • the true score component
  • the error score component

7
Q

True score component (Xtrue)

A

is construed to be that portion of observed score that reflects whatever ability, trait, or characteristic the test assesses

8
Q

error score component (Xerror)

A

difference between the observed score and the true score

- any other factor that may enter into the observed score as a consequence of the measurement process
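
The relation between the three score components on these cards can be written as one equation. This is only a sketch in LaTeX notation: the observed score is written as X_o here, and the second form is simply the definition of the error component rearranged.

```latex
X_{o} = X_{\text{true}} + X_{\text{error}}
\qquad\Longleftrightarrow\qquad
X_{\text{error}} = X_{o} - X_{\text{true}}
```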

9
Q
sample variance 
(true scores in group data)
A

the average amount of variability in a group of scores

10
Q

sample variance consists of (two components)

A
  • a portion that is true variance
  • a portion that is error variance

11
Q

True variance

A

differences among the scores of individuals within a group that reflect their standing or position in whatever characteristic the test assesses

12
Q

error variance

A

differences among test scores that reflect factors irrelevant to what the test assesses
- the reliability of scores increases as the error component decreases

13
Q

Reliability coefficient (reliability)

A

defined as the ratio of true score variance to total test score variance
- if total test score variance = true variance, then reliability = 1
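
As a worked illustration of this ratio, here is a minimal Python sketch. It assumes that total score variance is the sum of true variance and error variance; the function name and the numbers are hypothetical and chosen only for illustration.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Ratio of true score variance to total test score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    # Assumption: total variance = true variance + error variance.
    total_variance = true_variance + error_variance
    if total_variance == 0:
        raise ValueError("total variance must be greater than zero")
    return true_variance / total_variance


# Hypothetical numbers: 80 units of true variance, 20 units of error variance.
print(reliability_coefficient(80.0, 20.0))  # 0.8
# No error variance at all: total variance equals true variance.
print(reliability_coefficient(50.0, 0.0))   # 1.0
```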

14
Q

Two-step process (Evaluation of reliability)

A
  1. What are possible sources of error?
  2. What is the magnitude of those errors?

15
Q

The relativity of reliability

A
  • tests themselves are neither reliable nor unreliable; reliability is a property of test scores
  • scores might be unreliable (due to the test taker or the testing situation)

16
Q

3 sources of error which can enter the test score

A
  • context in which testing takes place
  • test taker
  • specific characteristics of the test itself
17
Q

Random measurement error vs. systematic measurement error

A
  • some errors can be minimized (through proper testing practices etc.)
  • others cannot be eliminated but may be detected by various types of checks built into the test
18
Q

Sources of error

A
  1. Interscorer difference
  2. Time sampling error
  3. Content sampling error
  4. Inter-item inconsistency
  5. Inter-item inconsistency and content heterogeneity
  6. Time and content sampling error
19
Q

Interscorer difference

A
  • errors entering into scores whenever the element of subjectivity influences scoring
  • refers to the variations in scores that stem from differences in the subjective judgements of the scorers
20
Q

Scorer Reliability

A
  • method for estimating error due to interscorer differences
  • 2 independent scorers (two independent scores are generated)
  • correlation between the two sets of scores
    (for metric variables)
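
The correlation step can be sketched in Python as follows. This is a minimal sketch that assumes the Pearson product-moment correlation is the coefficient used for metric scores; the function name `pearson_r` and the ratings are hypothetical.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations and the two sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sp / sqrt(ss_x * ss_y)


# Hypothetical ratings of the same five essays by two independent scorers.
scorer_a = [10, 12, 15, 18, 20]
scorer_b = [11, 12, 14, 19, 20]
scorer_reliability = pearson_r(scorer_a, scorer_b)
print(round(scorer_reliability, 2))  # 0.98
```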
21
Q

Time sampling error

A

variability in test scores as a function of the fact that they are obtained at one point in time rather than at another

22
Q

Concept of time sampling error

- hinges on two related notions

A
  1. Constructs/behaviors are liable to fluctuate in time
  2. Constructs/behaviors change at different paces in time

23
Q

Test-retest reliability

A
  • test is administered twice on two different occasions to one or more groups of individuals
  • correlation between the scores obtained from the two administrations
    = test-retest reliability coefficient
  • crucial: length of time interval!
24
Q

Content sampling error

A

term used to label the trait-irrelevant variability that can enter into test scores as a result of fortuitous factors related to the content of the specific items included in a test

25
Content sampling error can be due to ...
  1. faulty test construction
  2. specific content that favors some test takers
26
Alternate-form reliability
  • intended to estimate the amount of error in test scores that is attributable to content sampling error
  • two or more forms of a test (different in specific content) need to be prepared and administered to the same group of subjects
  • the scores on the forms are correlated (= alternate-form reliability coefficient)
27
Split-half reliability
  • a test is administered to a group of individuals, and two scores are created for each person by splitting the test into two halves
  • the scores on the two halves are then correlated (= split-half reliability coefficient)
28
Spearman-Brown (S-B) formula
  • based on the notion that, all other things being equal, a score based on a longer test will be closer to the true score than one based on a shorter test
  • the formula estimates the effect that changing the length of a test has on the obtained reliability coefficient
29
Spearman-Brown formula (does what?)
the formula estimates the effect that lengthening a test by any amount, or shortening a test to any fraction of its original size, will have on the obtained reliability coefficient
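
A minimal Python sketch of the general Spearman-Brown formula, r_new = n * r / (1 + (n - 1) * r), where n is the factor by which the test length changes. The function and parameter names are hypothetical, and so are the coefficients in the examples.

```python
def spearman_brown(r_obtained: float, length_factor: float) -> float:
    """Estimate the reliability of a test whose length changes by length_factor.

    length_factor = 2 doubles the test; length_factor = 0.5 halves it.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * r_obtained) / (1 + (length_factor - 1) * r_obtained)


# Hypothetical split-half coefficient of 0.60, stepped up to full test length.
print(round(spearman_brown(0.60, 2), 2))    # 0.75
# Hypothetical full-test coefficient of 0.80, test cut to half its length.
print(round(spearman_brown(0.80, 0.5), 2))  # 0.67
```

A length factor of 1 leaves the coefficient unchanged, which is a quick sanity check on the formula.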
30
Some solutions to the problem of how to split a test into halves ...
  1. odd-even split, or two halves
  2. for speed tests: two-trial reliability
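
The odd-even split can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, the data are made up, and each inner list is assumed to hold one person's item scores in test order.

```python
def odd_even_halves(item_scores: list[list[int]]) -> tuple[list[int], list[int]]:
    """Return (odd_half_totals, even_half_totals), one total per test taker."""
    odd_totals = [sum(person[0::2]) for person in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in item_scores]  # items 2, 4, 6, ...
    return odd_totals, even_totals


# Hypothetical right/wrong (1/0) answers of three people on a six-item test.
responses = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
]
odd_half, even_half = odd_even_halves(responses)
print(odd_half)   # [2, 3, 1]
print(even_half)  # [2, 2, 0]
```

The two lists of half scores would then be correlated to obtain the split-half coefficient.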
31
Inter-item inconsistency
error in scores that results from fluctuations in items across an entire test (low correlations among test items)
32
What is inter-item inconsistency due to?
  1. content sampling
  2. content heterogeneity
33
Content heterogeneity
inclusion of items or sets of items that tap content knowledge or psychological functions that differ from those tapped by other items in the same test (a source of error only when the test is meant to be homogeneous)
34
Internal consistency measures
  • statistical procedures designed to assess the extent of inconsistency across test items (split-half reliability coefficients accomplish this to some extent)
  • formulas that take into account the inter-item correlation
35
Inter-item correlation
the correlation between performance on all the items within a test
36
Kuder-Richardson formula 20 (K-R 20) and coefficient alpha (Cronbach's alpha) - function of two factors
  • number of items in the test
  • the ratio of variability in test takers' performance across all the items in the test to total test score variance
37
most frequently used formulas to calculate inter-item consistency
  • Kuder-Richardson formula 20 (K-R 20)
  • coefficient alpha (Cronbach's alpha)
38
Kuder-Richardson formula
  • applied to tests whose items are scored as right or wrong (dichotomous)
  • dependent on the inter-item variability within a test
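
A minimal Python sketch of K-R 20 in its usual textbook form, k / (k - 1) * (1 - sum(p * q) / total variance). It assumes the population variance is used for total scores; textbooks differ on this convention. The function name and the data are hypothetical.

```python
from statistics import pvariance


def kr20(item_scores: list[list[int]]) -> float:
    """K-R 20 for a persons-by-items table of 0/1 item scores."""
    n_persons = len(item_scores)
    n_items = len(item_scores[0])
    if n_persons < 2 or n_items < 2:
        raise ValueError("need at least two persons and two items")
    # p = proportion passing an item, q = 1 - p; p * q is that item's variance.
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in item_scores) / n_persons
        sum_pq += p * (1 - p)
    # Assumption: population variance of the total scores.
    total_variance = pvariance([sum(person) for person in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical right/wrong (1/0) answers of five people on a four-item test.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(answers), 2))  # 0.8
```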
39
Coefficient alpha (Cronbach's alpha)
  • used for tests whose items have multiple possible responses
  • dependent on the inter-item variability within a test
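
A minimal Python sketch of coefficient alpha in its usual textbook form, k / (k - 1) * (1 - sum of item variances / total variance). It assumes population variances throughout; the function name and the rating data are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Coefficient alpha for a persons-by-items table of item scores."""
    n_items = len(item_scores[0])
    if len(item_scores) < 2 or n_items < 2:
        raise ValueError("need at least two persons and two items")
    # Variance of each item across persons (assumption: population variance).
    item_variances = [
        pvariance([person[j] for person in item_scores]) for j in range(n_items)
    ]
    total_variance = pvariance([sum(person) for person in item_scores])
    if total_variance == 0:
        raise ValueError("total scores must vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings (1 to 5) given by five people on a three-item scale.
ratings = [
    [4, 5, 4],
    [3, 4, 3],
    [2, 2, 3],
    [5, 5, 4],
    [1, 2, 2],
]
print(round(cronbach_alpha(ratings), 2))  # 0.94
```

With items scored 0/1 and population variances, this computation gives the same value as K-R 20, because the variance of a dichotomous item is p * q.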
40
Heterogeneity vs. homogeneity (used in reference to the composition of ...)
  1. the behavior samples (items) of a test
  2. the group of test takers
41
Time sampling and content sampling error combined
both can be estimated in a combined fashion for tests that require both stability and consistency of results
42
Delayed Alternate-Form Reliability
  • these coefficients can be calculated when two or more alternate forms of the same test are administered on two different occasions
  • additional source of error: practice effects!