Week 3 - 4: Reliability Flashcards

1
Q

The reliability coefficient varies between

A

0 and 1

2
Q

Reliability refers to

A

the consistency of a measuring tool

3
Q

Is reliability all-or-nothing or on a continuum?

A

Continuum. That is, a measuring tool or test will be more or less reliable

4
Q

Two components that CTT assumes are present in an observed score

A

true score + measurement error

5
Q

True score

A

the actual amount of the psychological characteristic being measured by a test that a respondent possesses.

6
Q

Measurement error

A

the component of the observed score that does not have to do with the psychological characteristic being measured

7
Q

According to CTT, reliability is the extent to which differences in respondents’ ______ scores are attributable to differences in their ______ scores, as opposed to ______ ______

A

observed, true, measurement error

8
Q

3 sources of measurement error

A
  1. Test construction
  2. Test administration
  3. Test scoring and interpretation
9
Q

Examples of sources of measurement error in Test construction

A

• item sampling (variation among items in a test)
• content sampling (variation among items between tests)

10
Q

Examples of sources of measurement error in test administration

A

• test environment (temperature, lighting, noise)
• events of the day (positive vs. negative events)
• test-taker variables (physical discomfort, lack of sleep)
• examiner-related variables (physical appearance & demeanour)

11
Q

Examples of sources of measurement error in test scoring and interpretation

A

• subjectivity in scoring (grey area responses)
• recording errors (technical glitches)

12
Q

Reliability depends on two things:

A
  1. The extent to which differences in test scores can be attributed to real individual differences
  2. The extent to which differences in test scores are due to error
  Expressed as: Xo = Xt + Xe
13
Q

2 key assumptions of CTT

A
  1. Observed scores on a psychological measure are determined by a respondent’s true score plus measurement error
  2. Measurement error is random: it is just as likely to inflate a score as to deflate it
     - error tends to cancel itself out across respondents
     - error scores are uncorrelated with true scores
14
Q

A reliability coefficient acceptable for research purposes

A

.7 or .8

15
Q

A reliability coefficient needed for applied purposes

A

.9

16
Q

Tau Equivalence

A

participants’ true scores on one test must be exactly equal to their true scores on the other test

17
Q

Parallel tests must satisfy the assumptions of CTT as well as further assumptions which are

A
  1. participants’ true scores on one test must be exactly equal to their true scores on the other test (known as “tau equivalence”)
  2. the tests must have the same level of error variance
18
Q

The standard deviation of error scores tells us, in “test score units”, the…

A

average size of the error scores we can expect to find when a test is administered to a group of people

19
Q

The standard deviation of error is also known as

A

the standard error of measurement

20
Q

the correlation between parallel test scores is equal to

A

the reliability

21
Q

Thus, parallel forms of a test exist when, for each form, the observed score means and variances are

A

the same

22
Q

Different content problem

A

Two forms of a test may meet the requirements of CTT but not measure the same psychological attribute, because they possess different content

23
Q

carryover effects examples

A

For example, a respondent’s memory for test content, attitudes, or mood state might similarly affect performance on both forms of a test

24
Q

According to CTT, error scores on one form of a test should be ______________ with error scores on a second form of the test

A

uncorrelated

25
Q

4 assumptions of CTT and parallel tests

A

• the observed scores on each form are the sum of the true scores and error scores
• the true scores are the same for the two forms
• the error scores for each form sum to 0 and have the same variance
• true scores are uncorrelated with error scores

26
Q

How is test-retest reliability estimated?

A

by correlating respondents’ test and retest scores

27
Q

The test-retest method depends on the same assumptions as the alternate forms method. These are (2):

A

1 people’s true scores should not change between the two testing occasions
2 the error variances of the two tests should be identical
• The observed test-retest scores should therefore have the same means and variances

28
Q

3 threats to the “true score stability” assumption

A

1 construct instability
2 length of test-retest interval
3 developmental changes

29
Q

test-retest correlation is sometimes known as

A

the coefficient of stability

30
Q

If the true scores change during the test-retest interval, then the reliability coefficient will reflect two factors:

A

1 the degree of measurement error
2 the amount of change in true scores

31
Q

two factors that determine the internal consistency reliability of test scores:

A

1 The consistency among parts of a test:
• if the test items are strongly correlated with each other, the test is likely to be reliable
2 The test’s length:
• all things being equal, a longer test will be more reliable than a shorter test

32
Q

Four methods of estimating internal consistency

A

1 Split-Half Reliability
2 Coefficient α
3 Standardised Coefficient α
4 KR-20

33
Q

three steps to computing the split-half reliability

A

1 Divide the test into equal halves
2 Calculate the correlation between scores on the two halves of the test
3 Adjust the half-test reliability using the Spearman-Brown formula

34
Q

Variance Covariance Matrix - The diagonal elements

A

The diagonal elements in the matrix are the “item variances”

35
Q

Variance Covariance Matrix - The off-diagonal elements

A

The off-diagonal elements in the matrix are the “inter-item covariances” (the associations between each item and every other item, as measured by covariance)
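
As a hedged sketch of how the diagonal and off-diagonal elements are used, coefficient α can be computed directly from the variance-covariance matrix with the standard formula α = k/(k − 1) × (1 − sum of item variances / sum of all matrix elements). The function name and the matrix values below are hypothetical.

```python
def coefficient_alpha(cov: list[list[float]]) -> float:
    """Coefficient alpha from a k-by-k item variance-covariance matrix."""
    k = len(cov)
    # Diagonal elements: the item variances.
    sum_item_variances = sum(cov[i][i] for i in range(k))
    # All elements (variances plus covariances) sum to the total score variance.
    total_variance = sum(sum(row) for row in cov)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical matrix: 3 items, each with variance 1.0 and covariances of 0.5.
cov_matrix = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
alpha = coefficient_alpha(cov_matrix)
print(alpha)
```

Larger inter-item covariances relative to the item variances push α upward.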

36
Q

coefficient α typically ranges in value from

A

0 to 1

37
Q

Coefficient α assumptions

A

1 The α method assumes that test items are essentially tau equivalent
• each item is an equally strong indicator of the true score, but item true scores may differ by an additive constant (in other words, the items can have different means)
2 Items can have different error variances
3 Error scores should be uncorrelated with true scores: error should be random (an assumption for all forms of reliability)
4 Coefficient α assumes that all items used to generate a composite score measure the same attribute or construct

38
Q

What level of α may be “too high” and indicate redundancy in the items

A

.9 or greater

39
Q

does coefficient α measure “unidimensionality”?

A

no

40
Q

What formula was developed for determining the internal consistency reliability of composite scores based on dichotomously scored items?

A

KR-20; however, Cronbach’s α works too
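
A minimal sketch of KR-20 for items scored 0/1, using the standard formula KR-20 = k/(k − 1) × (1 − Σpq / variance of total scores), where p is the proportion scoring 1 on an item and q = 1 − p. The function name and data are hypothetical, and the population variance is used for the totals so that it matches the p × q item variances (an assumption about the variance convention).

```python
from statistics import pvariance


def kr20(responses: list[list[int]]) -> float:
    """KR-20 for a respondents-by-items table of 0/1 scores."""
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents scoring 1 on each item.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    # p * q is the (population) variance of a dichotomous item.
    sum_pq = sum(p_i * (1 - p_i) for p_i in p)
    # Population variance of the total scores.
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: 4 respondents, 3 dichotomous items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
kr20_value = kr20(answers)
print(kr20_value)
```

Applied to the same 0/1 data with the same variance convention, coefficient α gives the same value.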

41
Q

Why is a long test more reliable than a short test?

A

Increasing the length of a test by adding new items that measure the same construct as the original items will increase the true score variance more than the error variance
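
The following sketch illustrates the point under simplifying assumptions that are not stated on the card: all items are parallel (same true score variance, same error variance) and their errors are uncorrelated. In that case the true score variance of the composite grows with the square of the number of items, while the error variance grows only linearly. The function name is hypothetical.

```python
def composite_reliability(n_items: int, true_var: float, error_var: float) -> float:
    """Reliability of a sum of n parallel items with uncorrelated errors."""
    # True scores are perfectly correlated across parallel items,
    # so their variance scales with n squared.
    composite_true_var = n_items ** 2 * true_var
    # Errors are uncorrelated, so their variances simply add up.
    composite_error_var = n_items * error_var
    return composite_true_var / (composite_true_var + composite_error_var)


# Each item alone has a reliability of .50 (true and error variance both 1.0).
for n in (1, 2, 4, 8):
    print(n, round(composite_reliability(n, 1.0, 1.0), 3))
```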

42
Q

All types of reliability are estimated:

A

quantitatively

43
Q

Reliability is a property of:

A

scores, not a test. Strictly speaking, a test is not found to be reliable.

44
Q

True score:

A

A true score is a hypothetical score devoid of measurement error.

45
Q

Observed Score:

A

Observed scores are the scores we obtain from tests or instruments.

46
Q

The discrepancy between observed scores and true scores is considered to be due to

A

measurement error.

47
Q

Error Scores assumptions:

A

• Error scores should have a mean of zero
• Error scores should be a random process
• Error scores should be uncorrelated with true scores

48
Q

Four ways to think about reliability:

A
  1. The ratio of true score variance to observed score variance
  2. Reliability as a lack of error variance
  3. Reliability as the (squared) correlation between observed scores and true scores
  4. Reliability as the lack of (squared) correlation between observed scores and error scores
49
Q

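Interpretation guidelines for reliability:
• Unacceptably low
• Minimum for beginning stage research
• Good level for research purposes
• Necessary for important decisions

A

• Unacceptably low: .60
• Minimum for beginning stage research: .70
• Good level for research purposes: .80
• Necessary for important decisions: .90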

50
Q

The ratio of true score variance to observed score variance

A

This conceptualisation is similar to eta squared: the ratio of SSEffect to SSTotal

Conceptually, in the reliability case, it is the ratio of SSTrue to SSObserved

51
Q
Reliability as a lack of error variance
A

Instead of the ratio of true score variance to observed variance, in this case we speak of the ratio of error variance to observed variance.

We subtract this ratio from 1 to express it in terms of reliability (rather than error).

52
Q
Reliability as the (squared) correlation between observed scores and true scores
A
53
Q
Reliability as the lack of (squared) correlation between observed scores and error scores
A

If reliability is the (squared) correlation between true scores and observed scores, then it is necessarily also the relative absence of a (squared) correlation between observed scores and error scores.

54
Q

Parallel Tests

A

Essentially, two tests are considered parallel if they are identical to each other psychometrically, but differ in the actual items that make up each test.

55
Q

All tau-equivalence assumptions:

A

Tau-equivalence, in this context, implies that the true scores associated with each test represent the same construct.

Thus, a person’s true score on one test would be expected to be identical on the other test.

In addition, the two tests are assumed to have equal error variances.

56
Q

According to CTT, the correlation between the composite scores on Test 1 and the composite scores on Test 2:

A

represents the reliability associated with the scores. Thus, the closer the correlation is to 1.0, the more reliable we consider the scores to be. If the correlation is very high, it tells us that the test scores represent “something” in a very precise way.

57
Q

Two sources of information can help us evaluate an individual’s test score

A

1 a point estimate: a “best estimate” of a person’s true score
2 a confidence interval: the range in which the true score is likely to fall

58
Q

Point estimates and confidence intervals are directly affected by the test score ______________ ________________

A

reliability coefficient

59
Q

two kinds of point estimates of a person’s true score that can be computed from a person’s observed score:

A

1 An individual’s observed test score
2 An adjusted true score estimate

60
Q

A point estimate based solely on a person’s observed score on a test fails to account for

A

measurement error

The second point estimate—known as an adjusted true score estimate—takes such measurement error into account

61
Q

The adjusted true score estimate reflects an effect called

A

regression to the mean

62
Q

The adjusted score estimate reflects the discrepancy in an individual’s observed score that is likely to arise between two testing occasions. The size and direction of this discrepancy is a function of three factors:

A

1 the size of the reliability coefficient
• Poor reliability produces bigger discrepancies between the estimated true score and the observed score

2 the size of the difference between an individual’s observed test score and the mean
• The difference between the estimated true score and the observed score will be larger for relatively extreme observed scores (high or low) than for relatively moderate scores

3 the direction of the difference between an individual’s observed test score and the mean (whether the score was above or below the mean)
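
The three factors above can be seen in the usual regression-based estimate, adjusted estimate = mean + reliability × (observed score − mean). The sketch below assumes that formula (the card does not write it out); the function name and the example numbers are hypothetical.

```python
def adjusted_true_score(observed: float, mean: float, reliability: float) -> float:
    """Regress an observed score toward the group mean in proportion to unreliability."""
    return mean + reliability * (observed - mean)


test_mean = 100.0

# Factor 1: lower reliability pulls the estimate further from the observed score.
print(adjusted_true_score(130, test_mean, 0.9))
print(adjusted_true_score(130, test_mean, 0.5))

# Factor 2: extreme scores are adjusted more than moderate ones.
print(adjusted_true_score(105, test_mean, 0.8))
print(adjusted_true_score(130, test_mean, 0.8))

# Factor 3: scores above the mean are adjusted down, scores below are adjusted up.
print(adjusted_true_score(70, test_mean, 0.8))
```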

63
Q

The adjusted score estimate is the best estimate of

A

a predicted true score

64
Q

Confidence intervals reflect

A

the precision of the point estimate of an individual’s true score

65
Q

Point estimates of an individual’s true score are usually reported with

A

true score confidence intervals

66
Q

Confidence intervals are constructed using the

A

standard error of measurement

67
Q

The sem is the __________ ______________ of a theoretically normal distribution of test scores obtained by one person on equivalent tests

A

standard deviation

68
Q

In accordance with CTT, an observed test score is one point in the theoretical _________ of ____________ the test-taker could have obtained

A

distribution, scores

69
Q

The sem allows us to estimate, with a specific level of confidence (typically 95%),….

A

the range in which the true score is likely to exist

70
Q

To use the sem to estimate the confidence interval of the true score, we make an assumption

A

If the individual were to take a large number of equivalent tests, scores on those tests would tend to be normally distributed, with the individual’s true score as the mean

Since the sem functions like a standard deviation in this context, we can use it to predict what would happen if an individual took additional equivalent tests

71
Q

Approximately _______ of the scores would be expected to occur within ±1sem of the true score

A

68%

72
Q

Approximately _____ of the scores would be expected to occur within ±2sem of the true score

A

95%

73
Q

Approximately _____ of the scores would be expected to occur within ±3sem of the true score

A

99.7%

74
Q

Suppose an individual obtained a score of 50 on one spelling test and that test had a sem of 4, then using 50 as the point estimate we can be 68% (±1sem) confident that the true score falls between

A

46 and 54

75
Q

95% confidence interval formula

A

95% CI = Xo ± (z95%)(sem)
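
A minimal sketch of the interval calculation follows. The interval function implements the formula above; the SEM function uses the standard relation sem = SD × √(1 − reliability), which is not stated on these cards and is included here as an assumption. Function names and the SD and reliability values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM from the observed score standard deviation and the reliability."""
    return sd * sqrt(1 - reliability)


def confidence_interval(observed: float, sem: float, z: float = 1.96) -> tuple[float, float]:
    """CI = Xo +/- z * sem; z = 1.96 gives the 95% interval."""
    margin = z * sem
    return observed - margin, observed + margin


# Hypothetical test: SD of 10 and reliability of .84 give a SEM of about 4.
sem = standard_error_of_measurement(10, 0.84)

# Spelling test example: observed score 50, SEM 4.
low_68, high_68 = confidence_interval(50, 4, z=1.0)
low_95, high_95 = confidence_interval(50, 4)
print(round(sem, 2))
print(low_68, high_68)
print(round(low_95, 2), round(high_95, 2))
```

With these numbers the 68% interval runs from 46 to 54 and the 95% interval from about 42.16 to 57.84. A more reliable test yields a smaller SEM and therefore a narrower interval.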

76
Q

z95% is the z score from a normal distribution table that bounds the central 95% of the area of the normal distribution (that is, 95% of the area lies between −z and +z). This equals

A

1.96

77
Q

z68% =

A

1

78
Q

z75% =

A

1.15

79
Q

z85% =

A

1.44
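
The z values on these cards can be checked with Python's standard library. This sketch assumes that each z refers to the central area of the standard normal distribution (the area between −z and +z); the function name is hypothetical.

```python
from statistics import NormalDist


def z_for_central_area(level: float) -> float:
    """z such that `level` of the standard normal area lies between -z and +z."""
    # A central area of `level` leaves (1 - level) / 2 in each tail.
    return NormalDist().inv_cdf((1 + level) / 2)


for level in (0.68, 0.75, 0.85, 0.95):
    print(f"z{level:.0%} = {z_for_central_area(level):.2f}")
```

The 68% value comes out slightly below 1, which is why it is conventionally rounded to 1.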

80
Q

Highly reliable tests will produce________confidence intervals than less reliable tests

A

narrower

81
Q

According to CTT, the correlation between the observed scores on two measures (rxo yo ) is determined by two things:

A

1 the correlation between the true scores on the two psychological constructs being assessed by the measures (rxt yt )
2 the reliabilities of the two measures (Rxx, Ryy)

82
Q

observed associations (i.e., between measures) will always be weaker than true associations because

A

of measurement error

83
Q

it is possible to estimate the true association between a pair of constructs by employing a formula known as

A

the correction for attenuation
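
A short sketch of the correction, assuming the standard form rxt yt = rxo yo / √(Rxx × Ryy). The formula itself is not written out on the card; the function names and example values are hypothetical.

```python
from math import sqrt


def attenuated_correlation(r_true: float, rxx: float, ryy: float) -> float:
    """Observed correlation expected from a true correlation and two reliabilities."""
    return r_true * sqrt(rxx * ryy)


def correct_for_attenuation(r_observed: float, rxx: float, ryy: float) -> float:
    """Estimate the true-score correlation from an observed correlation."""
    return r_observed / sqrt(rxx * ryy)


# Hypothetical example: observed r = .40, both measures have reliability .80.
r_true_estimate = correct_for_attenuation(0.40, 0.80, 0.80)
print(round(r_true_estimate, 3))

# Going the other way recovers the weaker observed association.
print(round(attenuated_correlation(r_true_estimate, 0.80, 0.80), 3))
```

Because the reliabilities are below 1, the corrected estimate is larger than the observed correlation.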