L2: Classical Test Theory Flashcards

1
Q

What is the central statistic in classical test theory?

A

The sum score

2
Q

True Score - Definition

A

The score that would be obtained if a perfect measurement instrument were used

3
Q

Classical Test Theory - Core Assumptions

A

1) Observed score = true score + measurement error
2) Measurement error is random

4
Q

What are the implications of measurement error being random?

A

1) The mean measurement error is 0
(it cancels out because it is random)
2) There is no correlation between the true score & measurement error
3) All of the observed variance can be explained by the true score variance & measurement error variance (there are no other sources of noise)

5
Q

Classical Test Theory - Definition

A

Measurement theory that defines the conceptual basis of reliability & outlines procedures for estimating the reliability of psychological test scores

6
Q

Measurement Error - Definition

A

Extent to which other characteristics contribute random noise to the differences in observed scores

7
Q

Reliability - Definition

A
  • Measure of whether something is consistent (stays the same)
  • Results are considered reliable if they are similar each time the same design, procedures, & measurements are used
  • Extent to which differences in respondents' observed scores are consistent with differences in their true scores
8
Q

What are two ways of defining reliability?

A

1) As a proportion of variance
2) As a proportion of shared variance

9
Q

Reliability (as a proportion of variance) - Formula

A

True score variance/observed score variance

OR: 1 - (error score variance/observed score variance)

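The proportion-of-variance formula can be checked with a small simulation (toy numbers invented here, not from the deck): generate true scores and random error, then compare both forms of the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
true = rng.normal(100, 15, size=10_000)    # hypothetical true scores
error = rng.normal(0, 5, size=10_000)      # random error with mean ~0
observed = true + error                    # CTT: observed = true + error

# Reliability as true score variance / observed score variance
rel_true = np.var(true) / np.var(observed)

# Equivalent form: 1 - (error score variance / observed score variance)
rel_err = 1 - np.var(error) / np.var(observed)
```

Both estimates land near the theoretical value 15²/(15² + 5²) = 0.9 and differ only by sampling noise, since true scores and error are uncorrelated by construction.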
10
Q

Reliability (as a proportion of variance) - Definition

A

The proportion of observed score variance that is attributable to true score variance

11
Q

Reliability (as a proportion of shared variance) - Definition

A

The proportion of variance shared between the true scores & observed scores

12
Q

Reliability (as a proportion of shared variance) - Formula

A

(correlation between observed & true scores)²

OR: 1 - (correlation between observed & error scores)²

Note: squaring a correlation gives you the amount of variance shared by those variables

13
Q

What are the four test models of reliability?

A

1) Parallel Test Model
2) Tau Equivalent Test Model
3) Essential Tau Equivalent Test Model
4) Congeneric Test Model

14
Q

List the test models of reliability from most to least restrictive

A

1) Parallel Test Model
2) Tau Equivalent Test Model
3) Essential Tau Equivalent Test Model
4) Congeneric Test Model

15
Q

Parallel Test Model - Assumptions

A

True Scores:
equal means
equal variances

Observed Scores:
equal means
equal variances

Error Scores:
equal variances

Correlation:
equal true-observed correlations

Reliability:
equal reliability (R test 1 = R test 2)

16
Q

Parallel Test Model - Assumptions True Scores

A

equal means
equal variances

17
Q

Parallel Test Model - Assumptions Observed Scores

A

equal means
equal variances

18
Q

Parallel Test Model - Assumptions Error Scores

A

equal variances

19
Q

Parallel Test Model - Assumptions Reliability

A

equal reliability across tests

20
Q

Parallel Test Model - Tests

A

1) Alternate Forms
2) Split-Halves
3) Test-Retest

21
Q

Tau Equivalent Test Model - Assumptions

A

True Scores:
= mean
= variance

Observed Scores:
= mean

22
Q

Tau Equivalent Test Model - Assumptions True Scores

A

equal means
equal variances

23
Q

Tau Equivalent Test Model - Assumptions Observed Scores

A

equal means

24
Q

Tau Equivalent Test Model - Assumptions Error Scores

A

none

25
Q

Tau Equivalent Test Model - Assumptions Reliability

A

none

26
Q

Tau Equivalent Test Model - Tests

A

1) Alpha

27
Q

Essential Tau Equivalent Test Model - Assumptions

A

True Scores:
equal variances

28
Q

Essential Tau Equivalent Test Model - Assumptions True Scores

A

equal variances

29
Q

Essential Tau Equivalent Test Model - Assumptions Observed Scores

A

none

30
Q

Essential Tau Equivalent Test Model - Assumptions Error Scores

A

none

31
Q

Essential Tau Equivalent Test Model - Assumptions Reliability

A

none

32
Q

Essential Tau Equivalent Test Model - Tests

A

Cronbach’s Alpha

33
Q

Congeneric Test Model - Assumptions

A

none

34
Q

Congeneric Test Model - Assumptions True Scores

A

none

35
Q

Congeneric Test Model - Assumptions Observed Scores

A

none

36
Q

Congeneric Test Model - Assumptions Error Scores

A

none

37
Q

Congeneric Test Model - Assumptions Reliability

A

none

38
Q

Congeneric Test Model - Tests

A

Omega

39
Q

Parallel Test Model - True Score Formula

A

Xt2 = Xt1

40
Q

Tau Equivalent Test Model - True Score Formula

A

Xt2 = Xt1

41
Q

Essential Tau Equivalent Test Model - True Score Formula

A

Xt2 = a + Xt1

42
Q

Congeneric Test Model - True Score Formula

A

Xt2 = a + bXt1

43
Q

Parallel Test Model - Observed Score Formula

A

Xo1 = Xt1 + Xe1
Xo2 = Xt1 + Xe2

44
Q

Tau Equivalent Test Model - Observed Score Formula

A

Xo1 = Xt1 + Xe1
Xo2 = Xt1 + Xe2

45
Q

Essential Tau Equivalent Test Model - Observed Score Formula

A

Xo1 = Xt1 + Xe1
Xo2 = a + Xt1 + Xe2

46
Q

Congeneric Test Model - Observed Score Formula

A

Xo1 = Xt1 + Xe1
Xo2 = a + bXt1 + Xe2

47
Q

What are the main methods of reliability estimation?

A

1) Alternate Forms
2) Test-Retest
3) Internal Consistency

48
Q

What are the different tests in the alternate forms method?

A

2 alternate forms/versions of the same test

49
Q

What test model does the alternate forms method follow?

A

parallel test model

50
Q

What is the reliability in the alternate forms method?

A

reliability is the correlation (between the 2 test versions)

51
Q

What are limitations of the alternate forms method?

A
  • It is very difficult in practice to create two versions of a test that are unique yet still parallel
  • Carryover effects
52
Q

Carryover Effects - Definition

A

An effect of being tested in one condition on participants' behaviour in later conditions

53
Q

What are the different tests in the test-retest method?

A

the same person takes the same test on more than one occasion

54
Q

What test model does the test-retest method follow?

A

Parallel test model

55
Q

What is the reliability in the test-retest method?

A

Reliability is the correlation (of the two test taking occasions)

56
Q

What are limitations of the test-retest method?

A
  • Difficult to do for constructs that naturally fluctuate over time (change in true scores)
  • Carryover effects
  • People might not want to take the test a second time
57
Q

What are the different tests that fall under the general internal consistency method?

A

1) Split-Half
2) Cronbach’s Alpha
3) Omega

58
Q

What are limitations of the general internal consistency method?

A

Carryover effects can cause there to be a correlation between the error scores of different items

59
Q

What are the different tests in the internal consistency method?

A

(Blocks of) items are treated as separate tests

60
Q

What test model does the internal consistency method follow?

A

Parallel test model OR essential tau equivalent model

61
Q

What are the different reliability measures under Cronbach’s Alpha?

A
  • Raw Alpha
  • Standardized Alpha
  • KR20
62
Q

What test model does Cronbach’s Alpha follow?

A

Essential tau equivalent model

63
Q

What are limitations of Cronbach’s Alpha?

A
  • Assumptions are hardly ever met in reality
  • Cronbach’s Alpha is a lower bound to the reliability (it will underestimate the reliability)
64
Q

What are the different tests in Cronbach’s Alpha?

A

1) Raw Alpha
2) Standardised Alpha
3) KR20

65
Q

When should you use Raw Alpha?

A

For tests with items that do not substantially differ in their variances

66
Q

Raw Alpha - Consistency Index

A

Sum of all covariances amongst items (ΣCᵢⱼ)

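Raw alpha can be sketched from the equivalent variance form α = (k/(k−1)) · (1 − Σ item variances / variance of the sum score); the item matrix below is invented toy data, not from the deck:

```python
import numpy as np

def raw_alpha(items):
    """Raw Cronbach's alpha; rows = respondents, columns = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the sum score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: 5 respondents x 3 items that covary strongly
data = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4], [5, 5, 5]]
```

This form is algebraically equivalent to the covariance-sum index, because the variance of the sum score equals the item variances plus all inter-item covariances.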
67
Q

When should you use Standardised Alpha?

A

For tests with items that substantially differ in their variances, which causes the test scores to only reflect items with very high variances

68
Q

Standardised Alpha - Consistency Index

A

Average of all correlations amongst items (r̄ᵢⱼ)

69
Q

When should you use KR20?

A

For binary items

70
Q

KR20 - Consistency Index

A

Sum of binary item variances (Σpq, where p = proportion correct & q = 1 - p)

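KR20 follows the same template as raw alpha, with Σpq as the sum of the binary item variances; the data below are invented for illustration:

```python
import numpy as np

def kr20(items):
    """KR20 for binary (0/1) items; rows = respondents, columns = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    p = items.mean(axis=0)                      # proportion answering correctly
    pq_sum = (p * (1 - p)).sum()                # sum of binary item variances
    total_var = items.sum(axis=1).var(ddof=0)   # variance of the sum score
    return (k / (k - 1)) * (1 - pq_sum / total_var)

# 5 respondents x 3 binary items
data = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 1], [0, 0, 0]]
# kr20(data) -> 0.75
```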
71
Q

What are the different tests in Omega?

A

Each item of the test is considered to be a separate test

72
Q

What test model does Omega follow?

A

Congeneric test model

73
Q

What is reliability in Omega?

A

Reliability is the:
true score variance/observed score variance

74
Q

What reliability estimation method is omega a part of?

A

Internal Consistency

75
Q

What reliability estimation method is split-halves a part of?

A

Internal Consistency Method

76
Q

What reliability estimation method is alpha a part of?

A

Internal Consistency Method

77
Q

What are the different tests in split-halves?

A

Test is split into 2 parts

78
Q

What test model does split-halves follow?

A

Parallel test model

79
Q

What is the consistency index in split-halves?

A

Reliability is the correlation (between test halves)

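A minimal split-half sketch (odd vs even items is one possible split; the data are invented here):

```python
import numpy as np

def split_half_r(items):
    """Correlation between odd-item and even-item half scores."""
    items = np.asarray(items, dtype=float)
    half1 = items[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    half2 = items[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    return np.corrcoef(half1, half2)[0, 1]

# 5 respondents x 4 items
data = [[2, 3, 3, 2], [4, 4, 5, 4], [1, 2, 2, 1], [3, 3, 4, 3], [5, 5, 5, 5]]
```

A different split (e.g. first half vs second half) would generally give a different estimate, which is exactly the limitation the deck lists for this method.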
80
Q

What are the limitations of split-halves?

A
  • Reliability is heavily influenced by the type of split done
  • It cannot be used for speeded tests, as you will almost always get a correlation close to 1.0. This is because response speeds are consistent throughout the entire test.
  • Other methods utilise more information about the test
81
Q

What are factors that affect reliability?

A

1) Test Length
2) Sample Heterogeneity
3) Reliability of Difference Scores

82
Q

How does test length influence the reliability of a test?

A

Reliability will increase as more items are added

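This length effect is commonly quantified with the Spearman-Brown prophecy formula (the formula itself is an addition here, not stated on the card): R_k = k·R / (1 + (k−1)·R), where k is the factor by which the test is lengthened.

```python
def spearman_brown(reliability, k):
    """Projected reliability of a test lengthened by factor k
    (assumes the added items are parallel to the existing ones)."""
    return k * reliability / (1 + (k - 1) * reliability)

# Doubling a test with reliability .60:
# spearman_brown(0.60, 2) -> 0.75
```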
83
Q

How does sample heterogeneity influence the reliability of a test?

A

In homogeneous samples, the reliability is lower because of lower true score variance. In heterogeneous samples, the reliability is higher because of greater true score variance. Neither of these is desirable, as reliability should be a property of the test, not a property of the sample being examined.

84
Q

Reliability Generalisation Study - Definition

A

Study intended to reveal the degree to which a test produces differing reliability estimates across different kinds of research uses & populations (aka how sample characteristics affect the reliability of test scores)

85
Q

Difference Score - Formula

A

posttest score - pretest score

86
Q

What influences the reliability of difference scores?

A

The correlation between pretest & posttest. If the correlation is very high, the reliability is low.

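Under the simplifying assumption of equal pretest and posttest variances (an assumption added here, not stated on the card), difference-score reliability can be written as R_D = (½(R₁₁ + R₂₂) − r₁₂) / (1 − r₁₂):

```python
def diff_score_reliability(r11, r22, r12):
    """Reliability of (posttest - pretest) difference scores,
    assuming equal pretest and posttest variances."""
    return ((r11 + r22) / 2 - r12) / (1 - r12)

# Two reliable tests (.80 each) with a high pre-post correlation (.70)
# still give a low difference-score reliability (~0.33):
low_rd = diff_score_reliability(0.80, 0.80, 0.70)
```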
87
Q

What is the difference score sensitive to?

A

The variance between pretest & posttest. However, this is not as relevant for pretest-posttest designs as the difference will never be that large

88
Q

What are the approaches to true score estimation?

A

1) True score estimate is the summed item score
2) True score estimate is the summed item score, corrected for regression to the mean. The lower the reliability is, the more the true score estimate is corrected to be closer to the mean.

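Approach 2 can be written as T̂ = mean + R·(X − mean); the numbers below are illustrative, not from the deck:

```python
def true_score_estimate(observed, group_mean, reliability):
    """Sum score corrected for regression to the mean:
    lower reliability pulls the estimate closer to the mean."""
    return group_mean + reliability * (observed - group_mean)

# An extreme observed score of 130 against a group mean of 100:
# true_score_estimate(130, 100, 0.75) -> 122.5
# true_score_estimate(130, 100, 0.5)  -> 115.0
```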
89
Q

Regression to the mean - Definition

A

If one sample of a variable is extreme, the next sampling is likely to be closer to the mean.

90
Q

What does the standard error of measurement (SEM) represent?

A

The average size of error scores

91
Q

What is the relation between reliability & standard error?

A

The higher the reliability, the lower the standard error, and vice versa

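The usual formula (an addition here, consistent with the card) is SEM = s_observed · √(1 − R), which makes the inverse relation explicit:

```python
import math

def sem(sd_observed, reliability):
    """Standard error of measurement: the typical size of error scores."""
    return sd_observed * math.sqrt(1 - reliability)

# With an observed SD of 15:
# sem(15, 0.91) ~ 4.5   (high reliability, small SEM)
# sem(15, 0.51) ~ 10.5  (low reliability, large SEM)
```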
92
Q

Attenuation - Definition

A

Lessening/weakening in the intensity, value, or quality of a stimulus. In terms of reliability, attenuation refers to the fact that the effect sizes/correlations of the observed scores will always be smaller than those of the true scores, due to the inclusion of measurement error.

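The classical attenuation formula (an addition here, consistent with the card): the observed correlation equals the true correlation shrunk by the square root of the product of the two measures' reliabilities.

```python
import math

def attenuated_r(true_r, rel_x, rel_y):
    """Observed correlation after attenuation by measurement error."""
    return true_r * math.sqrt(rel_x * rel_y)

# A true correlation of .50 measured with two tests of reliability .81:
# attenuated_r(0.50, 0.81, 0.81) ~ 0.405
```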
93
Q

What is the effect of reliability on statistical significance?

A

Reliability has a direct effect on statistical significance. With high reliability, larger observed effect sizes are possible, which increases the likelihood of a significant result.

94
Q

Point estimate - Definition

A

Specific value that is interpreted to be the best estimate of an individual's standing on a particular psychological attribute

95
Q

Internal Consistency - Definition

A

Degree to which differences amongst participants' responses to one item are consistent with differences amongst their responses to other items on the test (how consistent the test items are with each other)

96
Q

Item Discrimination - Definition

A

Degree to which an item differentiates people who score high on the total test from those who score low on the total test

97
Q

Item-total Correlation - Definition

A

Degree to which differences amongst participants' responses to the item are consistent with differences in their total test scores. If the correlation is high, then the item is highly consistent with the total test scores.

98
Q

Corrected Item-total Correlation - Definition

A

The consistency between an item and the other items on a test (correlation between responses to one item and the sum of all other items on the test)

99
Q

How to interpret “Cronbach’s Alpha if Item Deleted” on an Item-Total Statistics Table?

A

Items that increase alpha when dropped should be removed. However, only remove these items if the increase in reliability is deemed important enough.

100
Q

How to interpret “Corrected Item-Total Correlation” on an Item-Total Statistics Table?

A

This column tells you the consistency between that item & the other items on the test. Items with a high corrected item-total correlation also have high item discrimination.

101
Q

When do you use the Discrimination Index?

A

When analysing the Internal Consistency of a test with binary items

102
Q

What does the Discrimination Index show?

A

The proportion of high test scorers that answered the item correctly compared with the proportion of low test scorers that also answered it correctly.

103
Q

How do you interpret a Discrimination Index value?

A

Higher DI values are indicative of greater internal consistency, as high and low test scorers differ substantially in their likelihood of answering the item correctly

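A sketch of the discrimination index; the 27% top/bottom group cutoff is a common convention, not something stated in the deck, and the data are invented:

```python
import numpy as np

def discrimination_index(item_correct, total_scores, frac=0.27):
    """p(item correct | top scorers) - p(item correct | bottom scorers)."""
    item_correct = np.asarray(item_correct, dtype=float)
    order = np.argsort(total_scores)            # indices, lowest total first
    n = max(1, int(round(frac * len(order))))   # group size
    low, high = order[:n], order[-n:]
    return item_correct[high].mean() - item_correct[low].mean()

# 8 test takers; this item separates high and low scorers perfectly
item = [1, 1, 1, 0, 0, 0, 1, 0]
totals = [10, 9, 8, 5, 4, 3, 7, 6]
# discrimination_index(item, totals) -> 1.0
```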
104
Q

What is COTAN?

A

A Dutch committee involved in the evaluation of (new) psychological tests

105
Q

What are the COTAN guidelines for high impact inferences at the individual level?

A

good = r ≥ .90
satisfactory = .80 ≤ r < .90
insufficient = r < .80

106
Q

What are the COTAN guidelines for lower impact inferences at the individual level?

A

good = r ≥ .80
satisfactory = .70 ≤ r < .80
insufficient = r < .70

107
Q

What are the COTAN guidelines for inferences at the group level?

A

good = r ≥ .70
satisfactory = .60 ≤ r < .70
insufficient = r < .60