Study Guide 9: Reliability: Estimation, Interpretation, & Impact Flashcards
Adjusted true score estimate
it takes measurement error into account, adjusting the point estimate toward the group mean: Xest = M + Rxx(Xo − M), where M is the group mean, Rxx the reliability, and Xo the observed score.
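A minimal sketch of the adjustment in Python (the function name and example numbers are illustrative):

```python
def adjusted_true_score(observed, group_mean, reliability):
    """Regress an observed score toward the group mean in proportion
    to the test's reliability: X_est = M + Rxx * (Xo - M)."""
    return group_mean + reliability * (observed - group_mean)

# e.g., observed score 120, group mean 100, reliability .90:
# 100 + .90 * (120 - 100) = 118
print(adjusted_true_score(120, 100, 0.90))  # 118.0
```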
Alternate forms reliability
method for estimating the reliability of test scores by obtaining scores from two different forms of a test, administered to the same examinees, and computing the correlation between them.
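A quick illustration in Python (the scores below are hypothetical):

```python
import numpy as np

# Hypothetical scores for the same eight examinees on two forms
form_a = np.array([12, 15, 9, 20, 17, 11, 14, 18])
form_b = np.array([13, 14, 10, 19, 18, 10, 15, 17])

# Alternate forms reliability = correlation between the two forms
r_ab = np.corrcoef(form_a, form_b)[0, 1]
print(round(r_ab, 3))
```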
(Cronbach’s) coefficient alpha
most widely used method for estimating reliability; it indexes the extent to which responses to the items on a test (or, in some applications, ratings from multiple raters) are internally consistent.
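One common computational form is alpha = k/(k−1) × (1 − sum of item variances / variance of total scores); a minimal sketch with hypothetical data:

```python
import numpy as np

def cronbach_alpha(items):
    """items: 2-D array, rows = respondents, columns = items.
    alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of composite scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5 respondents x 4 items
data = [[3, 4, 3, 4],
        [2, 2, 3, 2],
        [4, 5, 4, 4],
        [1, 2, 2, 1],
        [3, 3, 4, 3]]
print(round(cronbach_alpha(data), 3))
```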
Cohen’s Kappa
measures agreement between two raters who each classify items into mutually exclusive categories. It is a more robust measure of agreement than simple percentage agreement because it corrects for agreement that would be expected by chance alone. Thus kappa = 0 is agreement at chance level only, and 1.0 = perfect chance-corrected agreement. Common benchmarks: 0-0.20 = slight, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = substantial, 0.81-1.00 = almost perfect.
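A sketch of the two-rater computation, kappa = (p_o − p_e) / (1 − p_e), with hypothetical ratings:

```python
import numpy as np

def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement between two raters:
    kappa = (p_o - p_e) / (1 - p_e)."""
    rater1, rater2 = np.asarray(rater1), np.asarray(rater2)
    p_o = np.mean(rater1 == rater2)  # observed agreement
    # expected chance agreement, assuming the raters classify independently
    p_e = sum(np.mean(rater1 == c) * np.mean(rater2 == c)
              for c in np.union1d(rater1, rater2))
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of ten cases into two categories
r1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
r2 = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]
print(round(cohens_kappa(r1, r2), 3))  # 0.583 -> "moderate"
```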
Composite score
if a test includes multiple items, and the overall score for the test is computed from the responses to those items, that overall score is the composite score.
Confidence interval or error band
reflects the accuracy or precision of the point estimate of an individual's true score. The greater the SEM, the greater the average difference between observed scores and true scores.
95% confidence interval = Xo ± (1.96)(SEM)
68% (± 1 SEM), 95% (± 1.96 SEM), 99% (± 2.58 SEM)
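Using the standard result SEM = SD × sqrt(1 − Rxx), a small sketch (the example numbers are hypothetical):

```python
import math

def score_confidence_interval(observed, sd, reliability, z=1.96):
    """Error band around an observed score:
    SEM = SD * sqrt(1 - Rxx); CI = Xo +/- z * SEM."""
    sem = sd * math.sqrt(1 - reliability)
    return observed - z * sem, observed + z * sem

# Example: Xo = 110, SD = 15, Rxx = .90 -> SEM ~ 4.74
low, high = score_confidence_interval(110, 15, 0.90)
print(round(low, 1), round(high, 1))  # 100.7 119.3
```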
Correction for attenuation
is a statistical procedure, due to Spearman (1904), to “rid a correlation coefficient from the weakening effect of measurement error.”
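Spearman's formula divides the observed correlation by the square root of the product of the two measures' reliabilities; a one-line sketch:

```python
import math

def disattenuate(r_xy, r_xx, r_yy):
    """Spearman's correction: r_true = r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / math.sqrt(r_xx * r_yy)

# Example: observed correlation .40 between two measures with
# reliabilities .80 and .70 -> estimated true correlation ~ .53
print(round(disattenuate(0.40, 0.80, 0.70), 2))
```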
Essential tau equivalence
when two tests measure the same psychological construct on the same metric, though possibly with different error variances. It rests on more liberal assumptions than parallel tests (tau equivalence); i.e., the assumption of equal error variance is not required. Thus, estimates from alpha are likely to be accurate more often than those from methods like the split-half approach.
Internal consistency reliability ·
practical alternative to the alternate forms and test-retest procedures; reliability is estimated from a single test administration by examining the consistency of responses across the items within the test.
Inter-rater reliability
how repeatable the scores are when two or more different people score or observe the same behavior.
Kuder-Richardson formula 20 or KR-20
measure of internal consistency reliability for measures with binary items. It is analogous to Cronbach’s alpha, but for dichotomous choices.
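A minimal sketch, using KR-20 = k/(k−1) × (1 − sum(pq) / total-score variance), where p is the proportion passing each item and q = 1 − p (the data are hypothetical):

```python
import numpy as np

def kr20(items):
    """items: 2-D 0/1 array, rows = examinees, columns = items.
    KR-20 = k/(k-1) * (1 - sum(p*q) / total-score variance)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    p = items.mean(axis=0)                     # proportion passing each item
    q = 1 - p
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Hypothetical 5 examinees x 4 right/wrong items
data = [[1, 1, 0, 1],
        [0, 1, 0, 0],
        [1, 1, 1, 1],
        [0, 0, 0, 0],
        [1, 0, 1, 1]]
print(round(kr20(data), 3))
```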
Point estimate
single best estimate of the quantity of an underlying psychological attribute at the moment the individual took the test
Random (unsystematic) error
is caused by any factors that randomly affect measurement of the variable across the sample. It does not have any consistent effects across the entire sample. Instead, it pushes observed scores up or down randomly. This means that if we could see all of the random errors in a distribution they would have to sum to 0: there would be as many negative errors as positive ones. The important property of random error is that it adds variability to the data but does not affect average performance for the group.
Regression to the mean
the tendency, upon a second testing, for an individual's score to be closer to the group mean than his or her first score was.
Spearman-Brown correction
formula that allows you to calculate the estimated reliability of a revised test (i.e., a test that has been lengthened or shortened).
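The prophecy formula is r_new = n·r / (1 + (n − 1)·r), where n is the factor by which the test's length changes; a short sketch:

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability of a test whose length is changed by
    `length_factor` (e.g., 2.0 = doubled, 0.5 = halved):
    r_new = n * r / (1 + (n - 1) * r)."""
    n, r = length_factor, reliability
    return n * r / (1 + (n - 1) * r)

# Doubling a test with reliability .70 -> ~ .82
print(round(spearman_brown(0.70, 2.0), 2))
# Halving it -> ~ .54
print(round(spearman_brown(0.70, 0.5), 2))
```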