Reliability Flashcards
an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
reliability coefficient
the prerequisite of validity
high reliability
reliability increases with [ ]
test length
standard deviation squared; it is useful because it can be broken down into components
variance
variance from true differences
true variance
variance from irrelevant, random sources
error variance
refers to the proportion of the total variance attributed to true variance
reliability
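The ratio of true variance to total variance can be sketched numerically; the variance figures below are made up purely for illustration:

```python
# Reliability as the proportion of total variance that is true variance.
# The variance values are hypothetical, chosen only to illustrate the ratio.
true_var = 16.0   # variance from true individual differences
error_var = 4.0   # variance from irrelevant, random sources

total_var = true_var + error_var        # observed-score variance
reliability = true_var / total_var      # 16 / 20 = 0.80
```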
sources of variance
test construction
administration
scoring
interpretation
variance in test construction
item sampling or content sampling
variance in test administration
test environment
test-taker variables
examiner-related variables
test scoring and interpretation
scorers and scoring system
an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test
test-retest reliability
how stable the construct or measure is over time
coefficient of stability
If the interval between test and retest is too short, there is a tendency for [ ]
carryover effect/practice effect
test-retest is not applicable for [ ]
states
how to measure test-retest reliability
pearson r or spearman rho
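A minimal sketch of the test-retest computation, assuming numpy is available; the scores for six test takers on two administrations are invented for illustration:

```python
import numpy as np

# Hypothetical scores for six test takers on two administrations of the same test.
time1 = np.array([12, 15, 11, 18, 14, 16])
time2 = np.array([13, 14, 12, 19, 15, 17])

# Coefficient of stability: Pearson r between the two administrations.
r = np.corrcoef(time1, time2)[0, 1]
```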
The consistency of test results between two different – but equivalent – forms of a test.
parallel forms and alternate-forms reliability
for each form of the test, the means and the variances of observed test scores are equal.
parallel forms
are simply different versions of a test that have been constructed so as to be parallel.
alternate forms
coefficient for parallel and alternate forms
coefficient of equivalence
the advantage of having another form
eliminates carryover/practice effects
how to measure parallel and alternate forms reliability
pearson r or spearman rho
Defines measurement error strictly in terms of consistency or inconsistency in the content of the test.
internal consistency reliability
obtained by correlating two sets of scores obtained from equivalent halves of a single test administered once.
split-half reliability estimate
three steps in split-half reliability estimate
Step 1. Divide the test into equivalent halves.
Step 2. Calculate a Pearson r between scores on the two halves of the test.
Step 3. Adjust the half-test reliability using the Spearman-Brown formula.
allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test.
spearman-brown formula
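The three steps can be sketched as follows, assuming numpy; the 0/1 item-response matrix is invented, and an odd-even split is used as one common way of forming equivalent halves:

```python
import numpy as np

# Hypothetical 0/1 item responses: rows = test takers, columns = items.
X = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

# Step 1: divide the test into equivalent halves (odd-even split).
odd = X[:, ::2].sum(axis=1)
even = X[:, 1::2].sum(axis=1)

# Step 2: Pearson r between scores on the two halves.
r_half = np.corrcoef(odd, even)[0, 1]

# Step 3: Spearman-Brown adjustment to estimate full-test reliability.
r_full = 2 * r_half / (1 + r_half)
```

Note that the adjusted coefficient is always at least as large as the half-test correlation when that correlation is positive, reflecting the gain in reliability from the longer (full-length) test.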
-Used with ratio or interval data.
-Mean of all possible split-half correlations.
-Preferred statistic for obtaining an estimate of internal consistency reliability.
-Typically ranges in value from 0 to 1.
cronbach’s coefficient alpha
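Coefficient alpha can be sketched directly from item and total-score variances; the data are made up, and sample variances (ddof=1) are used as one common convention:

```python
import numpy as np

def cronbach_alpha(X):
    """Cronbach's alpha: X has rows = test takers, columns = items."""
    k = X.shape[1]
    item_vars = X.var(axis=0, ddof=1)        # variance of each item
    total_var = X.sum(axis=1).var(ddof=1)    # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical 0/1 item responses: rows = test takers, columns = items.
X = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

alpha = cronbach_alpha(X)
```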
used for tests with dichotomous items, primarily those that can be scored right or wrong (such as multiple-choice items). useful for evaluating the internal consistency of highly homogeneous items
kuder-richardson formula
used for items that have varying difficulty (for example, some items might be very easy, others more challenging). it should only be used if there is a correct answer for each question
kr-20
used for a test where the items are all of approximately the same difficulty.
kr-21
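Both Kuder-Richardson formulas can be sketched on the same hypothetical right/wrong data, assuming numpy; population variance (ddof=0) is used as one common convention. Because these made-up items vary in difficulty, KR-21 comes out lower than KR-20:

```python
import numpy as np

# Hypothetical 0/1 (right/wrong) responses: rows = test takers, columns = items.
X = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

n, k = X.shape
p = X.mean(axis=0)                  # proportion passing each item
q = 1 - p
total_var = X.sum(axis=1).var()     # variance of total scores (ddof=0)

# KR-20: item difficulties may differ (uses each item's p*q).
kr20 = k / (k - 1) * (1 - (p * q).sum() / total_var)

# KR-21: assumes all items are equally difficult (uses only the mean total score).
M = X.sum(axis=1).mean()
kr21 = k / (k - 1) * (1 - M * (k - M) / (k * total_var))
```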
refers to the degree of correlation among all the items on a scale.
inter-item consistency
Ideally, the average inter-item correlation for a set of items should be between [ ] and [ ], suggesting that while the items are reasonably homogenous, they do contain sufficiently unique variance so as to not be isomorphic with each other.
.20 and .40
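The average inter-item correlation can be sketched from an item-by-item correlation matrix, assuming numpy; the response matrix is invented, and its average (around .5) falls above the .20-.40 band, suggesting these hypothetical items are more homogeneous than is ideal:

```python
import numpy as np

# Hypothetical 0/1 item responses: rows = test takers, columns = items.
X = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
])

R = np.corrcoef(X, rowvar=False)            # item-by-item correlation matrix
upper = R[np.triu_indices_from(R, k=1)]     # each item pair counted once
avg_inter_item_r = upper.mean()
```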
The degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.
inter-scorer reliability
how to measure inter-scorer reliability
pearson r or spearman rho
A reliability coefficient of .80 indicates that 20% of the variability in test scores is due to [ ].
measurement error
Coefficient of inter-rater reliability provides information about error as a result of [ ]
test-scoring
Coefficient of stability provides information on error as a result of [ ]
the length of time between administrations
Coefficient of equivalence provides information on error as a result of [ ]
instrument (items) itself
high degree of internal consistency
test homogeneity
low degree of internal consistency
test heterogeneity
a characteristic where the best estimate of reliability would be obtained from a measure of internal consistency.
dynamic characteristic
a characteristic where the test-retest or the alternate-forms method would be appropriate.
static characteristic
If the variance of either variable in a correlational analysis is [ ] by the sampling procedure used, then the resulting correlation coefficient tends to be lower.
restricted
If the variance of either variable in a correlational analysis is [ ] by the sampling procedure, then the resulting correlation coefficient tends to be higher.
inflated (increased)
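The effect of restriction of range can be sketched with a small simulation, assuming numpy; the data-generating model and the selection cutoff are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate two correlated variables for a full sample (hypothetical model).
x = rng.normal(size=2000)
y = 0.7 * x + rng.normal(scale=0.5, size=2000)

r_full = np.corrcoef(x, y)[0, 1]

# Restrict the range of x, e.g., keep only high scorers (as a selective
# sampling procedure would).
keep = x > 1.0
r_restricted = np.corrcoef(x[keep], y[keep])[0, 1]
```

With the restricted sample, the correlation drops noticeably even though the underlying relationship between the variables is unchanged.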
all the items are of the same degree of difficulty. there is a time limit within which the test taker is required to answer all the items.
speed test
assesses the underlying ability of individuals by allowing them sufficient time; no time limit.
power test
designed to provide an indication of where a test-taker stands with respect to some variable or criterion, such as an educational or a vocational objective.
criterion-referenced test