Chapter 4: Reliability Flashcards
Classical Test Theory (CTT): Assumptions (4)
(1) Each person has a true score that would be obtained if there were no errors in measurement. Observed test score (X) = True test score (T) + Error (E)
(2) Measurement errors are random
(3) Measurement error is normally distributed
(4) Variance of OBSERVED scores = Variance of true scores + Error variance
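The four assumptions above can be checked with a small simulation. This is only a sketch: the function name `simulate_ctt`, the sample size, and the chosen means and standard deviations are illustrative assumptions, not values from the chapter.

```python
import random
import statistics

def simulate_ctt(n_people=20000, true_mean=100.0, true_sd=15.0,
                 error_sd=5.0, seed=0):
    """Simulate X = T + E with random, normally distributed error (hypothetical parameters)."""
    rng = random.Random(seed)
    # Each person has a true score T.
    true_scores = [rng.gauss(true_mean, true_sd) for _ in range(n_people)]
    # Errors are random, normal, with mean 0, drawn independently of T.
    errors = [rng.gauss(0.0, error_sd) for _ in range(n_people)]
    # Observed score = true score + error.
    observed = [t + e for t, e in zip(true_scores, errors)]
    return true_scores, errors, observed

true_scores, errors, observed = simulate_ctt()
var_t = statistics.pvariance(true_scores)
var_e = statistics.pvariance(errors)
var_x = statistics.pvariance(observed)

print("mean error:", statistics.fmean(errors))
print("var(X):", var_x)
print("var(T) + var(E):", var_t + var_e)
```

In a finite sample the two variance figures agree only approximately, because the sample covariance between true scores and errors is close to zero but not exactly zero.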
Reliable test
One we can trust to give us the same score for a person every time it is used.
Can a measurement instrument be perfectly reliable?
No. No measurement instrument is perfectly reliable.
A person’s true score def
The hypothetical or ideal measure of a person’s attribute we aim to capture with a psychological test.
=> FREE FROM ERROR
Expected score over an INFINITE number of independent administrations of the test
Independent administration def
Each time the test is taken is UNRELATED to PREVIOUS or FUTURE administrations
-> The person’s performance on one occasion doesn’t influence their performance on another.
Mean error of measurement = ____
Errors are _______ with each other
True scores and errors are _______
0; UNcorrelated; UNcorrelated
Two tests are parallel if: (3)
(1) EQUAL observed score MEANS
-> Comes from the assumption that True scores would be the same
(2) EQUAL ERROR VARIANCE
(3) SAME CORRELATIONS with other tests
Random error characteristics (3)
(1) Random
(2) Cancels itself out over repeated administrations
(3) Lowers reliability of the test
Systematic error characteristic
Occurs when a source of error always pushes the observed score above, or always below, the true score
-> DOESN’T LOWER RELIABILITY of a test since the test is RELIABLY INACCURATE by the same amount each time
Sources of Measurement Error (3)
(1) CONTENT Sampling Error
(2) TIME Sampling Error
(3) Other Sources of Error (e.g. observer differences)
Content Sampling Error characteristics (3)
(1) Results from differences between the SAMPLE of items (i.e., the test) and the DOMAIN of items (i.e., all the possible items)
(2) When test items may not be representative of the domain from which they are drawn
(3) Low when test items are representative of the domain
Time Sampling Error characteristics (2)
(1) Results from the choice of a particular time to administer the test
(2) Random fluctuations in performance from one situation or time to another
Other Sources of Error characteristics (2)
(1) Scoring or administrative error
E.g., adding up scores incorrectly
(2) Tests scored or graded by different scorers
Reliability Coefficient
Proportion of the variability in OBSERVED test scores accounted for by variability in TRUE scores.
=> Ratio of the variance of the true scores on a test to the variance of the observed scores
=> Measure of the consistency of a test or measuring instrument, estimated in practice by measuring the same individuals twice and computing the correlation between the two sets of measures
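The ratio definition above can be written as a short function. This is a minimal sketch; the function name and the example variances are assumptions made for illustration. In practice true-score variance is not observed directly, which is why the coefficient is estimated from correlations.

```python
def reliability_coefficient(true_variance, error_variance):
    """Reliability = var(T) / var(X), where var(X) = var(T) + var(E)."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances must be non-negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance

# Hypothetical example: true-score variance 225, error variance 25.
print(reliability_coefficient(225.0, 25.0))
```

With no error variance the ratio is 1, and as error variance grows relative to true-score variance the ratio falls toward 0.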
Standard Error of Measurement (SEM) def
Indicates the amount of uncertainty/error expected in an individual’s observed test score.
=> Corresponds to the SD of the distribution of scores one would obtain by repeatedly testing a person.
=> SD of the distribution of random errors around the true score!!
Standard Error of Measurement (SEM) allows us to quantify the _______.
Amount of variation in a person’s observed score that measurement error would most likely cause
High Reliability = ___ SEM
Low Reliability = ___ SEM
High reliability = Low SEM
Low reliability = High SEM
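The inverse relationship between reliability and SEM follows from the standard formula SEM = SD × √(1 − reliability), where SD is the standard deviation of observed scores. The sketch below assumes that formula; the function name and the example values are illustrative only.

```python
import math

def standard_error_of_measurement(observed_sd, reliability):
    """SEM = SD of observed scores * sqrt(1 - reliability coefficient)."""
    if observed_sd < 0:
        raise ValueError("observed_sd must be non-negative")
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return observed_sd * math.sqrt(1.0 - reliability)

# Hypothetical test with SD = 15 at two reliability levels.
high = standard_error_of_measurement(15.0, 0.91)  # roughly 4.5
low = standard_error_of_measurement(15.0, 0.64)   # roughly 9.0
print(high, low)
```

Holding the SD fixed, the more reliable version of the test yields the smaller SEM.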
Confidence Interval (CI) def
Confidence interval (CI) is a range of scores that we feel confident will include the true score.
CI is used to compare scores to avoid ______________.
over-emphasizing differences
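One common way to build such an interval is observed score ± z × SEM. The sketch below assumes that form (some textbooks center the interval on an estimated true score instead). The function names and example scores are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(observed_score, sem, confidence=0.95):
    """Return (lower, upper) = observed score -/+ z * SEM."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if sem < 0:
        raise ValueError("sem must be non-negative")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * sem
    return observed_score - margin, observed_score + margin

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical scores of 100 and 106 on a test with SEM = 4.5.
ci_a = confidence_interval(100.0, 4.5)
ci_b = confidence_interval(106.0, 4.5)
print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

When the two intervals overlap, as in this example, a six-point gap between the scores is within the range that measurement error alone could plausibly produce.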
Reliability of test can be increased by _______.
adding items
Spearman-Brown formula def
Predicts the effect of lengthening or shortening a test on reliability.
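The prediction can be sketched with the standard Spearman-Brown prophecy formula, r_new = k·r / (1 + (k − 1)·r), where r is the current reliability and k is the factor by which test length changes. The function name and example values are assumptions for illustration. The formula presumes the added items are comparable to the existing ones.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability after changing test length by length_factor (k)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    k = length_factor
    return (k * reliability) / (1.0 + (k - 1.0) * reliability)

# Hypothetical test with reliability 0.80.
print(spearman_brown(0.80, 2.0))   # doubled in length: about 0.89
print(spearman_brown(0.80, 0.5))   # halved in length: about 0.67
```

A factor above 1 models lengthening the test and raises the predicted reliability. A factor below 1 models shortening it and lowers the predicted reliability.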