Chapter 4.2: Reliability Flashcards
Refers to stability or consistency of the measurement
Reliability
An index of reliability, a proportion that indicates the ratio
between the true score variance on a test and the total variance
Reliability Coefficient
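The ratio described on this card can be sketched in a few lines of Python. This is a minimal illustration that assumes the classical model in which total variance is the sum of true score variance and error variance. The function name and the example variances are hypothetical.

```python
def reliability_coefficient(true_variance: float, error_variance: float) -> float:
    """Ratio of true score variance to total variance (true + error)."""
    total_variance = true_variance + error_variance
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical variances, chosen only for illustration.
r_xx = reliability_coefficient(true_variance=8.0, error_variance=2.0)
print(r_xx)  # 0.8
```

The larger the share of total variance that is true variance, the closer the result gets to 1.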
○ Refers to the proportion of the total variance attributed to true
variance
○ The greater the proportion of the total variance attributed to
true variance, the more reliable the test
Reliability
Goals of Reliability
○ Estimate errors in psychological measurement
○ Devise techniques to improve testing so errors are reduced
All of the factors associated with the process of measuring
some variable, other than the variable being measured
Measurement Error
Categories of Errors:
➢ Source of error in measuring a targeted variable caused by
unpredictable fluctuations and inconsistencies of other
variables in the measurement process
➢ Primarily influences the reliability of the measurement
Random Error
Categories of Errors:
➢ Source of error in measuring a variable that is typically
constant or proportionate to what is presumed to be the
true value of the variable being measured
➢ Primarily influences the validity of the measurement
Systematic Error
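One way to see why systematic error mainly threatens validity rather than reliability is to add a constant or proportional error to a set of scores and check the correlation. The sketch below uses made-up scores and a hand-written Pearson correlation (the helper name `pearson_r` is an assumption, not part of the source).

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


true_scores = [10, 12, 15, 18, 20]             # hypothetical true values
constant_error = [t + 3 for t in true_scores]  # every score is 3 points too high
proportional_error = [t * 1.1 for t in true_scores]  # every score is 10% too high

r_constant = pearson_r(true_scores, constant_error)
r_proportional = pearson_r(true_scores, proportional_error)
bias = mean(constant_error) - mean(true_scores)

print(r_constant, r_proportional, bias)
```

Both correlations stay at (approximately) 1 because the rank order and relative spacing of scores are unchanged, yet the constant-error scores are all off by 3 points.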
Sources of Error Variance:
The extent to which the score is affected by the content
sampled in the test and by the way the content is
sampled (Item sampling or Content sampling)
Test Construction
Sources of Error Variance:
– Test environment: Room temperature, level of lighting, amount of ventilation
and noise
– Test taker Variables: Pressing emotional variables, physical discomfort, lack of
sleep, and effect of drugs or medications
– Examiner-related variables: Examiner’s physical appearance and demeanor
Test Administration
Sources of Error Variance:
– Not all tests can be scored by computer
– Scorers and scoring systems are potential sources of error
variance
– If subjectivity is involved in scoring, then the rater can be a
source of error variance
– Subjectivity in scoring can even enter into behavioral
assessment
Test Scoring and Interpretation
Reliability Estimates:
– Using the same instrument to measure the same thing at
two points in time
– The resulting estimate is called test-retest reliability
– 1 group; 2 different administrations
– Measure something that is relatively stable over time
such as personality
Test-Retest Reliability Estimates/ Time Sampling
Method
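A test-retest estimate is typically obtained by correlating the scores from the two administrations. The sketch below assumes a Pearson correlation is used and works on invented scores for five test takers. The helper name and data are illustrative only.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical scores: same group, same test, two points in time.
time_1 = [12, 15, 11, 18, 14]
time_2 = [13, 14, 12, 19, 15]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

A coefficient close to 1 suggests that test takers kept roughly the same relative standing across the two administrations.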
interval between testing is greater than 6
months
Coefficient of Stability
most appropriate for measures of reaction time
and perceptual judgment
Test-Retest Reliability
Reliability Estimates:
For each form of the test, the means and the variances of
observed test scores are equal
Parallel Forms
Reliability Estimates:
– Simply different versions of a test that have been
constructed so as to be parallel
– Applicable to relatively stable traits
Alternate Forms
Uses a Coefficient of equivalence
Parallel Forms and Alternate Forms Reliability Estimates/
Item Sampling
Uses a Coefficient of Stability
Test-Retest Reliability Estimates/ Time Sampling
Method
Assesses the correlation between multiple items in a
test that are intended to measure the same construct
○ Split-half reliability
○ Kuder-Richardson formula
○ Cronbach Alpha
Internal Consistency
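Of the three methods listed on this card, the Kuder-Richardson approach can be illustrated with KR-20, which applies to items scored right/wrong (1/0). The sketch below is one possible implementation that assumes population variance for the total scores. The function name and the response matrix are hypothetical.

```python
from statistics import pvariance


def kr20(item_matrix):
    """KR-20 for a matrix of 0/1 item scores (one row per test taker)."""
    n_people = len(item_matrix)
    n_items = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    total_variance = pvariance(totals)  # population variance of total scores
    sum_pq = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in item_matrix) / n_people  # proportion passing item j
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical responses: 5 test takers, 4 dichotomous items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

kr20_value = kr20(responses)
print(round(kr20_value, 3))
```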
– Correlating two pairs of scores obtained from equivalent
halves of a single test administered once
– The reliability of the test is directly related to the length
of the test
– The source of error variance is content sampling
Split-Half Reliability Estimates
Allows a test developer or user to estimate internal
consistency reliability from a correlation of two halves of a
test
Spearman-Brown Formula
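The Spearman-Brown adjustment can be written as a short function. In this sketch, `r_half` is the correlation between the two halves and `n` is the factor by which the test is lengthened (2 when stepping up from a half to the full test). The parameter names and the example value are assumptions for illustration.

```python
def spearman_brown(r_half: float, n: float = 2.0) -> float:
    """Estimated reliability after lengthening a test by a factor of n."""
    return (n * r_half) / (1 + (n - 1) * r_half)


# Hypothetical correlation between two halves of a test.
full_test_reliability = spearman_brown(0.6)
print(round(full_test_reliability, 2))  # 0.75
```

With `n = 2` this reduces to `2r / (1 + r)`, the form used for split-half estimates.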
– Used to determine the number of items needed to attain a
desired level of reliability
Spearman-Brown Prophecy
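Solving the Spearman-Brown formula for the lengthening factor gives the number of items needed to reach a target reliability. The sketch below assumes that added items are comparable to the existing ones. The function name and the numbers are hypothetical.

```python
import math


def items_needed(current_r: float, desired_r: float, current_items: int) -> int:
    """Items required to reach desired_r, given current_r on current_items."""
    # Lengthening factor from the Spearman-Brown formula solved for n.
    factor = (desired_r * (1 - current_r)) / (current_r * (1 - desired_r))
    return math.ceil(factor * current_items)


# Hypothetical case: a 10-item test with reliability .60, target .80.
required = items_needed(current_r=0.6, desired_r=0.8, current_items=10)
print(required)  # 27
```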
Items in a scale are unifactorial
Homogeneous
Composed of items that measure more than one trait
Heterogeneous
- Developed by Cronbach
- It is the preferred statistic for obtaining an estimate of internal
consistency reliability
Coefficient Alpha
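Coefficient alpha compares the sum of the item variances with the variance of the total scores. The following sketch is one way to compute it, assuming sample variances are used throughout. The function name and the rating data are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(item_matrix):
    """Coefficient alpha for a matrix of item scores (one row per test taker)."""
    n_items = len(item_matrix[0])
    # Variance of each item (column) across test takers.
    item_variances = [
        variance([row[j] for row in item_matrix]) for j in range(n_items)
    ]
    # Variance of each test taker's total score.
    total_variance = variance([sum(row) for row in item_matrix])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings: 5 test takers, 3 items on a 1-5 scale.
ratings = [
    [4, 5, 4],
    [3, 3, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 2],
]

alpha = cronbach_alpha(ratings)
print(round(alpha, 3))
```

Unlike KR-20, this form works for items that are not scored dichotomously.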
A coefficient alpha value of .90 or above indicates _______ __ ____
redundancy of items
❏A relatively new measure for evaluating the internal
consistency of a test
❏A measure used to evaluate the internal consistency of a test
that focuses on the degree of difference that exists
between item scores
Average Proportional Distance (APD)
It is the degree of agreement or consistency
between two or more scorers (judges or raters)
concerning a particular measure
Inter-Scorer Reliability
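Agreement between two scorers can be quantified in several ways. The sketch below shows simple percent agreement and Cohen's kappa, which corrects for agreement expected by chance. Kappa is one common choice and is not named in the source. The ratings are invented.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Proportion of cases on which the two raters gave the same rating."""
    return sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Chance-corrected agreement between two raters on categorical ratings."""
    n = len(rater_1)
    p_observed = percent_agreement(rater_1, rater_2)
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(rater_1) | set(rater_2)
    # Agreement expected if the raters assigned categories independently.
    p_expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical pass/fail judgments by two scorers on 10 responses.
rater_a = ["pass", "pass", "fail", "pass", "fail",
           "fail", "pass", "pass", "fail", "pass"]
rater_b = ["pass", "pass", "fail", "fail", "fail",
           "fail", "pass", "pass", "pass", "pass"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(agreement, round(kappa, 3))
```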
Reliability Ranges:
1 =
perfect reliability
Reliability Ranges:
≥ 0.9 =
excellent reliability
Reliability Ranges:
≥ 0.8 and < 0.9 =
good reliability
Reliability Ranges:
≥ 0.7 and < 0.8 =
acceptable reliability
Reliability Ranges:
≥ 0.6 and < 0.7 =
questionable reliability
Reliability Ranges:
≥ 0.5 and < 0.6 =
poor reliability
Reliability Ranges:
< 0.5 =
unacceptable reliability
Reliability Ranges:
0 =
no reliability
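The reliability ranges on these cards can be collected into a small lookup function. This is only a convenience sketch: the function name is hypothetical, and the cutoffs are the ones listed on the cards.

```python
def describe_reliability(r: float) -> str:
    """Map a reliability coefficient in [0, 1] to the label from the cards."""
    if not 0 <= r <= 1:
        raise ValueError("reliability coefficient must be between 0 and 1")
    if r == 1:
        return "perfect reliability"
    if r >= 0.9:
        return "excellent reliability"
    if r >= 0.8:
        return "good reliability"
    if r >= 0.7:
        return "acceptable reliability"
    if r >= 0.6:
        return "questionable reliability"
    if r >= 0.5:
        return "poor reliability"
    if r == 0:
        return "no reliability"
    return "unacceptable reliability"


print(describe_reliability(0.85))  # good reliability
```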