reliability pt 3 Flashcards
inter-rater reliability
-Applies when judgment must be exercised in scoring responses (e.g., WAIS-IV VCI subtests)
- Item level (agreement on item scores)
- How much agreement on each individual item?
- Correlation between raters on assigned item scores
- Scale level (total score on scale)
- How much agreement on the total score?
- Correlation between raters on total scores
If there are more than two raters, take the mean of the correlations for each pair of raters (A & B, B & C, A & C)
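A minimal Python sketch of this pairwise-correlation approach (the rater labels and scores below are made-up illustrations, not data from the source):
from itertools import combinations
import numpy as np

ratings = {                     # total scale scores assigned by each rater
    "A": [12, 15, 9, 20, 14],
    "B": [11, 16, 10, 19, 15],
    "C": [13, 14, 9, 21, 13],
}

# Pearson correlation for each pair of raters (A & B, A & C, B & C), then the mean
pair_rs = [np.corrcoef(ratings[x], ratings[y])[0, 1]
           for x, y in combinations(ratings, 2)]
inter_rater_r = np.mean(pair_rs)
print(round(inter_rater_r, 2))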
inter-rater reliability: categorical decisions
When there is a finite number of categories to which each person being rated can be assigned
- Items: Pass/Fail (0,1)
- Items: 0, 1, 2
- Diagnosis: Present/Absent
Two methods for assessing:
- Percent Agreement
- Kappa
inter-rater reliability: percent agreement
- Percentage of all cases for which both raters make the same decision (i.e., both assign a score of 0 or both assign a score of 1)
- Problem: Raters could agree simply by chance
- Percent agreement can OVERESTIMATE inter-rater reliability
- Kappa (κ) takes chance agreement into account and is the preferred method for assessing inter-rater reliability
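In symbols (the standard two-rater formula for Cohen's kappa): κ = (Po − Pe) / (1 − Pe), where Po is the observed (percent) agreement and Pe is the total chance agreement; the cards that follow show how Po and Pe are calculated.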
how to calculate percent agreement
Two raters independently decide whether an item score should be 0 or 1 for N individuals who complete the item
A = number of times item was scored 0 by both #1 and #2
B = number of times item was scored 0 by #1 and 1 by #2
C = number of times item was scored 1 by #1 and 0 by #2
D = number of times item was scored 1 by both #1 and #2
Percent agreement = proportion of cases for which both raters gave the same score (either both 0 or both 1) = (A+D)/N (multiply by 100 to express as a percentage)
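For example (made-up counts): if N = 50 with A = 20, B = 5, C = 5, D = 20, then percent agreement = (20 + 20)/50 = 0.80, i.e., 80%.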
calculating chance agreement for a score = 0
Total scores of 0 given for Rater #1 = A + B
Total scores of 0 given for Rater #2 = A + C
Proportion of cases given a score of 0 by Rater #1 = (A+B)/N
Proportion of cases given a score of 0 by Rater #2 = (A+C)/N
Chance agreement for a score of 0 =
(A+B)/N times (A+C)/N
calculating chance agreement for a score = 1
Total scores of 1 given for Rater #1 = C + D
Total scores of 1 given for Rater #2 = B + D
Proportion of cases given a score of 1 by Rater #1 = (C+D)/N
Proportion of cases given a score of 1 by Rater #2 = (B+D)/N
Chance agreement for a score of 1 =
(C+D)/N times (B+D)/N
calculating total chance agreement
Add the chance agreement for a score of 0 to the chance agreement for a score of 1
(A+B)/N times (A+C)/N
PLUS
(C+D)/N times (B+D)/N
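Putting the cards together, a minimal Python sketch that computes percent agreement, total chance agreement, and kappa from the A/B/C/D counts (the example counts are made up):
def cohens_kappa(a, b, c, d):
    # a = both raters score 0; b = #1 scores 0, #2 scores 1;
    # c = #1 scores 1, #2 scores 0; d = both raters score 1
    n = a + b + c + d
    p_o = (a + d) / n                            # observed (percent) agreement
    p_chance_0 = ((a + b) / n) * ((a + c) / n)   # chance agreement for a score of 0
    p_chance_1 = ((c + d) / n) * ((b + d) / n)   # chance agreement for a score of 1
    p_e = p_chance_0 + p_chance_1                # total chance agreement
    return (p_o - p_e) / (1 - p_e)               # kappa corrects agreement for chance

print(round(cohens_kappa(20, 5, 5, 20), 2))      # 0.6 for the made-up counts above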
implications of reliability
- There is no single value that represents the reliability of a test … we must specify which type of reliability we are estimating
- The methods we have considered all permit us to estimate a specific type or source of error
- To estimate multiple sources of error simultaneously, use Generalizability Theory
- Test manuals will report all relevant types of reliability (test/retest; split-half; internal consistency; inter-rater)
standard error of measurement
- Reliability coefficients apply to the test itself
- The SEM permits us to estimate how much error is likely to be present in an individual examinee’s score
SEM in words
- Step 1. Subtract the reliability of the test from 1.
- Step 2. Take the square root of Step 1.
- Step 3. Multiply the standard deviation of the test by Step 2.
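In symbols: SEM = SD × sqrt(1 − rtt). A minimal Python sketch of the three steps (the SD and reliability values are illustrative):
import math

def sem(sd, reliability):
    # Steps 1-3: SD times the square root of (1 minus reliability)
    return sd * math.sqrt(1 - reliability)

print(round(sem(15, 0.90), 2))   # ≈ 4.74 for an IQ-style scale (SD = 15, rtt = .90)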
SEM and reliability
- The SEM is INVERSELY related to reliability:
- If reliability is high, SEM is low
- If reliability is low, SEM is high
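For example (illustrative values, SD = 15): if rtt = .95, SEM = 15 × sqrt(.05) ≈ 3.35; if rtt = .70, SEM = 15 × sqrt(.30) ≈ 8.22.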
standard error of measurement according to classical reliability theory
According to Classical Test Theory:
- Error is normally distributed around a mean of 0
- SEM = the standard deviation of the distribution of error scores
Using the probabilities associated with the normal curve
- The probability is 68% that the amount of error is within 1 SEM
- The probability is 95% that the amount of error is within 2 SEM
estimating error
We can use the SEM to make probability statements about the amount of error associated with an observed score
NOTE: To do this accurately, we have to use the exact values rather than the “approximate” values we used in Chapter 1 of the Manual
- The probability is 68% that the amount of error associated with an observed score is no more than +/- 1 SEM
- The probability is 90% that the amount of error associated with an observed score is no more than +/- 1.65 times SEM.
- The probability is 95% that the amount of error associated with an observed score is no more than +/- 1.96 times SEM.
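Continuing the illustrative example (SD = 15, rtt = .90, SEM ≈ 4.74): the 90% band is +/- 1.65 × 4.74 ≈ 7.8 points and the 95% band is +/- 1.96 × 4.74 ≈ 9.3 points around the observed score.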
confidence intervals for estimated true score
We can also construct confidence intervals around the estimated true score
-We can’t know the actual true score, but we can estimate it.
These confidence intervals tell us the range in which the person’s true score is likely to fall with a specified degree of certainty (probability)
These are the CIs that are given in the table in the WAIS-IV Manual
Step 1. Calculate the estimated true score
Step 2. Calculate the standard error of estimate
Step 3. Calculate the desired confidence interval
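A minimal Python sketch of these three steps (the example values are made up, and the standard error of estimation formula used here, SD × sqrt(rtt × (1 − rtt)), is a common classical-test-theory formula that may differ from the exact procedure in a given manual):
import math

def true_score_ci(x_obs, mean, sd, r_tt, z=1.96):
    # Step 1: estimated true score (observed score regressed toward the mean)
    t_hat = mean + r_tt * (x_obs - mean)
    # Step 2: standard error of estimation (assumed formula: SD * sqrt(r * (1 - r)))
    see = sd * math.sqrt(r_tt * (1 - r_tt))
    # Step 3: confidence interval constructed around the estimated true score
    return t_hat - z * see, t_hat + z * see

print(true_score_ci(120, 100, 15, 0.90))   # roughly (109.2, 126.8) for this example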
estimating the true score formula in words
Step 1. Subtract the Mean (M) from the observed score (Xo)
Step 2. Multiply Step 1 by the reliability of the test (rtt)
Step 3. Add the Mean to Step 2.
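For example (illustrative values): with an observed score of 120, M = 100, and rtt = .90, the estimated true score = 100 + .90 × (120 − 100) = 118.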