Reliability Flashcards
Who was Francis Galton?
child prodigy; cousin of Charles Darwin; studied human differences; first person to apply statistical methods to the study of differences between humans; found that the mean of the crowd’s guesses of an ox’s weight was closest to the ox’s actual weight; the median might have been expected to do better because it discounts extreme scores, but in the mean the extreme guesses offset one another; plotted the guesses and showed that they formed a bell curve; came up with the idea that “error cancels out”
What is Galton’s Ox?
a contest in which many people guessed the weight of an ox; law of errors: the average of the guesses lands very close to the correct weight
What is reliability?
reliability refers to the degree to which test scores are free from errors of measurement; the accuracy of a mean score from an unreliable test depends on how the scores are distributed, e.g., if they are unevenly distributed it will be harder to tell the true score
What is the goal of psychological measurement?
To detect psychological differences, and to do so reliably.
test scores are used to indicate levels of psychological attributes; differences among people’s test scores are used to indicate true psychological differences among people; to what degree are differences in observed (test) scores consistent with differences in (true) levels of psychological attributes?
What is Classical Test Theory?
X (observed score) = T (true score, i.e., true ability) + E (random error)
What are the assumptions of classical test theory?
- Observed scores on a psychological measure are determined by a respondent’s true score and by measurement error
- Error is random (consequences: 1. Error tends to cancel itself out across respondents 2. Error scores are uncorrelated with true scores)
classical test theory says X = T + E; the expected value of the observed score is the true score, E(X) = T; if a person were to take a test repeatedly, the average would cancel out any error (provided each administration is independent of the others AND error is random and independent), e.g., the people guessing the ox’s weight did not confer beforehand
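A minimal sketch of the E(X) = T idea, assuming a hypothetical true score of 100 and normally distributed random error with a standard deviation of 10 (both numbers, and the function name, are made up for illustration). It simulates one person taking the same test many times under independent conditions and averages the observed scores.

```python
import random
import statistics


def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return observed scores X = T + E for repeated, independent administrations."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]


TRUE_SCORE = 100.0  # hypothetical T
ERROR_SD = 10.0     # hypothetical spread of the random error E

few = simulate_observed_scores(TRUE_SCORE, ERROR_SD, 5)
many = simulate_observed_scores(TRUE_SCORE, ERROR_SD, 10_000)

# With only a few administrations the mean can still sit well away from T;
# with many, the positive and negative errors largely cancel.
print("mean of 5 administrations:     ", round(statistics.mean(few), 2))
print("mean of 10,000 administrations:", round(statistics.mean(many), 2))
```

The larger the number of administrations, the closer the mean is expected to sit to the true score, which is the sense in which random error "cancels out".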
What is signal and noise referring to?
true score is “signal”; measurement error is “noise,” obscuring the signal; observed score is affected by both signal and noise; reliability = signal / (signal + noise); if a person takes a test on two different occasions we would expect the same result, provided that 1. the conditions under which the test was taken were exactly the same and 2. the person’s underlying true score did not in fact change; reliability is about how closely observed scores (X) approach true scores (T); for a perfectly reliable test, X = T
How do you estimate reliability?
Repeated Measures: logic: if the test were not reliable, you would expect a person’s observed score to change, such that the score from Time 1 would differ from the score from Time 2; if the test conditions were not the same, any apparent unreliability might be due to the difference in conditions and not to the test itself; it is also possible that the scores from T1 and T2 are both very poor estimates of the person’s true score for reasons other than the test; the mean of repeated measures is believed to be a better estimate of the person’s true ability: sometimes the error would increase the score, sometimes decrease it, but over many administrations the errors would average out
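One common way to put the repeated-measures logic into numbers is the correlation between Time 1 and Time 2 scores (a test-retest coefficient; the card does not name it, so treat that choice as an assumption). The sketch below simulates a hypothetical group whose true scores do not change between occasions and whose errors are random and independent; all means, standard deviations, and names are invented for illustration.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5


rng = random.Random(1)  # fixed seed for reproducibility
TRUE_SD = 15.0          # hypothetical spread of true scores
ERROR_SD = 5.0          # hypothetical spread of random error
N_PEOPLE = 20_000

# Each person keeps the same true score; only the random error differs.
true_scores = [rng.gauss(100.0, TRUE_SD) for _ in range(N_PEOPLE)]
time1 = [t + rng.gauss(0.0, ERROR_SD) for t in true_scores]
time2 = [t + rng.gauss(0.0, ERROR_SD) for t in true_scores]

retest_r = pearson_r(time1, time2)
# Ratio of true-score variance to observed-score variance: 225 / 250
expected = TRUE_SD**2 / (TRUE_SD**2 + ERROR_SD**2)

print("test-retest correlation:", round(retest_r, 3))
print("true / observed variance:", round(expected, 3))
```

Under these assumptions the two printed values should be close to each other; if conditions or true scores changed between occasions, the correlation would no longer isolate the test's own unreliability.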
What is the relationship between error on different test scores?
Errors in the test score (E) are not correlated with the true score (T): the size of the error should not depend on whether a person’s true level is high or low; correlation of E and T = 0 (this holds if error is random and independent); errors from two different tests are uncorrelated (parallel tests: errors from one test are also uncorrelated with true scores on the other test); conditions of test administration have to be exactly the same (very unlikely, however)
What is the goal of reliability theory?
estimate errors in measurement to suggest ways of improving tests so that errors are minimized; errors are random across a large number of individuals; variance of obtained scores: S_X^2 = S_T^2 + S_E^2, i.e., observed-score variance = true-score variance + error-score variance
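The variance decomposition follows from the classical test theory assumption that error scores are uncorrelated with true scores. A short derivation, writing S_{TE} for the covariance of true and error scores (a symbol introduced here only for illustration):

```latex
\begin{aligned}
X     &= T + E \\
S_X^2 &= S_T^2 + S_E^2 + 2\,S_{TE} \\
      &= S_T^2 + S_E^2
         \qquad \text{because } S_{TE} = 0 \text{ when error is uncorrelated with true score}
\end{aligned}
```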
What is variance?
reflects the extent to which individuals differ (compared to the test mean); it is determined by the degree to which the scores in the group actually differ
How does random error impact distribution?
distributions of scores with random error have higher variance than distributions of scores without it; the mean stays the same, however, because the error cancels out
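A small simulation can illustrate this, assuming hypothetical true scores with mean 50 and standard deviation 10, plus random error with standard deviation 6 (all values invented for the sketch).

```python
import random
import statistics

rng = random.Random(2)  # fixed seed for reproducibility
N_PEOPLE = 20_000

# Hypothetical error-free scores, then the same scores with random error added.
true_scores = [rng.gauss(50.0, 10.0) for _ in range(N_PEOPLE)]
observed = [t + rng.gauss(0.0, 6.0) for t in true_scores]

mean_true = statistics.mean(true_scores)
mean_obs = statistics.mean(observed)
var_true = statistics.pvariance(true_scores)
var_obs = statistics.pvariance(observed)

# The means should nearly coincide; the observed variance should be larger,
# roughly by the error variance (6 squared = 36).
print("means:    ", round(mean_true, 2), round(mean_obs, 2))
print("variances:", round(var_true, 1), round(var_obs, 1))
```

The error widens the distribution without moving its center, which is why random error harms reliability but not the group mean.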
What is the reliability coefficient?
the ratio of the true-score variance to the total variance of test scores; that is, the proportion of all the variability in test scores that is due to (accounted for by) true scores. Variance of true scores over variance of observed scores (true-score variance plus error variance).
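As a worked example of the ratio, here is a minimal sketch; the function name and the variance values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def reliability_coefficient(true_variance, error_variance):
    """Proportion of observed-score variance that is true-score variance."""
    if true_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    observed_variance = true_variance + error_variance
    if observed_variance == 0:
        raise ValueError("observed variance must be positive")
    return true_variance / observed_variance


print(reliability_coefficient(80.0, 20.0))   # 0.8: most variability is true score
print(reliability_coefficient(50.0, 50.0))   # 0.5: half signal, half noise
print(reliability_coefficient(100.0, 0.0))   # 1.0: no measurement error
```

The coefficient runs from 0 (all observed variability is error) to 1 (no error at all).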
What are the two types of error?
- Random error
- Systematic error
X = T + e, where e = e_r + e_s (e_r is random error and e_s is systematic error)
What is systematic error?
Any bias; unlike random error, systematic error does not cancel out: it shifts the average, so we call it a bias
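To contrast the two types of error, the sketch below assumes a hypothetical true score of 70, random error with standard deviation 4, and a constant systematic error (bias) of +5 on every administration; all of these values are invented for illustration.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed for reproducibility
TRUE_SCORE = 70.0       # hypothetical T
RANDOM_SD = 4.0         # hypothetical spread of random error
BIAS = 5.0              # hypothetical systematic error added every time
N = 10_000

# Random error only: X = T + e_r
random_only = [TRUE_SCORE + rng.gauss(0.0, RANDOM_SD) for _ in range(N)]
# Random plus systematic error: X = T + e_r + e_s
with_bias = [TRUE_SCORE + BIAS + rng.gauss(0.0, RANDOM_SD) for _ in range(N)]

mean_random_only = statistics.mean(random_only)
mean_with_bias = statistics.mean(with_bias)

# Averaging removes most of the random error but none of the bias.
print("mean, random error only:", round(mean_random_only, 2))
print("mean, with bias:        ", round(mean_with_bias, 2))
```

The first mean should sit near the true score, while the second stays about 5 points too high no matter how many scores are averaged.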