Reliability Flashcards
Mastery
Reliable = consistent ≠ perfect
* Top three most reliable cars according to Consumer Reports (2019, based on 10 years of service history)?
* Toyota 4Runner
* Toyota Prius
* Toyota Camry
- Is any measurement perfect?
…
- What is an attribute that makes a measurement good?
…
- Is any measurement perfect?
Probably not…
- What is an attribute that makes a measurement good?
If it’s reliable…
- Reliability
- what is it?
- If you measure the same thing, would you…
- Can our measure be confirmed by further measurements or observations?
- Reliability
- … in scores
- Sources of errors
- Participant
- Test administrator
- Testing
- Scoring
- Instrumentation
- Environment
- Systematic vs. Unsystematic Errors
- Measurement error
- Bias
- Confounding variability
- Reliability
- Consistency or stability of measurement
- If you measure the same thing, would you get the same score?
- Can our measure be confirmed by further measurements or observations?
- Reliability
- Consistency in scores
- Sources of errors
- Participant
- Test administrator
- Testing
- Scoring
- Instrumentation
- Environment
- Systematic vs. Unsystematic Errors
- Measurement error
- Bias
- Confounding variability
Measurement Error
* A measurement error is the difference between…
* They almost always …
* Includes systematic and random error
* The estimate of measurement error is Reliability
* Measured score = …
- Define an operational definition that minimizes error.
- Develop a protocol specific enough to objectively measure the concept/variable being studied, then check the reliability of the measurement and the measurer(s)
- Method/instruments used to measure something, described with sufficient detail/standardization to allow replication of the results by properly trained evaluators/testers
- Standard of measurement conducted by trained individuals applying a standard set of criteria
Measurement Error
* A measurement error is the difference between an observed value and an actual value
* They almost always occur
* Includes systematic and random error
* The estimate of measurement error is Reliability
* Measured score = true score + measurement error
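The relationship on this card can be checked with a few lines of Python. This is only a minimal sketch: the function name `measurement_error` and the body-mass readings are made up for illustration and are not from the course material.

```python
def measurement_error(observed, actual):
    """Measurement error = observed value - actual (true) value."""
    return observed - actual

# Hypothetical example: a person whose true body mass is 70.0 kg
# is weighed three times on the same scale.
actual_mass_kg = 70.0
readings_kg = [70.5, 69.5, 71.0]

errors = [measurement_error(r, actual_mass_kg) for r in readings_kg]

# Measured score = true score + measurement error
rebuilt = [actual_mass_kg + e for e in errors]

print(errors)   # [0.5, -0.5, 1.0]
print(rebuilt)  # [70.5, 69.5, 71.0]
```

In practice the true score is unknown, so the error can only be estimated, which is where reliability comes in.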
Systematic Error vs. Random Error
* Systematic Error (… error)
* A … can sometimes result in a systematic error
* Systematic Errors are:
* …able
* One …
* … true score
* Can correct for or recalibrate
- Random Error (… variable present)
- Due to …
- …able
- Can occur for a variety of reasons
- With many trials, it would eventually …, or cancel out, so average score is a good estimate of the true score
- Can be the result of … variability
Systematic Error vs. Random Error
* Systematic Error (bias error)
* A bias can sometimes result in a systematic error
* Systematic Errors are:
* Predictable
* One direction
* Over or underestimate true score
* Can correct for or recalibrate
- Random Error (confounding variable present)
- Due to chance
- Unpredictable
- Can occur for a variety of reasons
- With many trials, it would eventually mitigate, or cancel out, so average score is a good estimate of the true score
- Can be the result of confounding variability
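A small simulation can make the contrast concrete. The sketch below assumes random error is normally distributed around zero and systematic error is a constant offset. Both are simplifying assumptions, and the names (`simulate_readings`, `bias`, `noise_sd`) and numbers are hypothetical.

```python
import random
import statistics

def simulate_readings(true_score, bias, noise_sd, n_trials, seed=0):
    """Return n_trials measured scores: true score + bias + random noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [true_score + bias + rng.gauss(0.0, noise_sd) for _ in range(n_trials)]

TRUE_SCORE = 50.0
KNOWN_BIAS = 3.0  # e.g. an instrument that always reads 3 units high

# Random error only: over many trials the errors cancel out.
random_only = simulate_readings(TRUE_SCORE, bias=0.0, noise_sd=2.0, n_trials=10_000)
mean_random_only = statistics.mean(random_only)

# Random error plus systematic error: the average stays off in one direction.
with_bias = simulate_readings(TRUE_SCORE, bias=KNOWN_BIAS, noise_sd=2.0, n_trials=10_000)
mean_with_bias = statistics.mean(with_bias)

# A known systematic error can be corrected for (recalibration).
mean_corrected = mean_with_bias - KNOWN_BIAS

print(round(mean_random_only, 1), round(mean_with_bias, 1), round(mean_corrected, 1))
```

Averaging more trials does nothing to the biased mean, which is why systematic error has to be corrected for rather than averaged away.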
Inter-rater reliability (…)
* A coefficient that assesses the agreement of observations made by…
Inter-rater reliability =
Inter-rater reliability (between)
* A coefficient that assesses the agreement of observations made by 2 or more raters
Inter-rater reliability = (number of agreements / number of possible agreements) × 100%
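As a rough illustration of the percent-agreement formula, the sketch below compares two raters who scored the same five performances. The function name `percent_agreement` and the ratings are invented for the example.

```python
def percent_agreement(rater_a, rater_b):
    """Agreements divided by possible agreements, expressed as a percentage."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same, non-empty set of observations")
    agreements = sum(a == b for a, b in zip(rater_a, rater_b))
    return agreements * 100 / len(rater_a)

# Hypothetical ratings of the same five performances by two raters.
rater_a = ["pass", "fail", "pass", "pass", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail"]

agreement = percent_agreement(rater_a, rater_b)
print(agreement)  # 80.0 (the raters agree on 4 of 5 observations)
```

The same function could be applied to one rater's scores from two separate trials to get a simple intra-rater figure.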
Intra-rater reliability (…)
The stability of data recorded by …
Reliability is established with … trials
This could be a source of … Bias
Intra-rater reliability (within the rater)
The stability of data recorded by one individual across two or more trials
Reliability is established with multiple trials
This could be a source of Rater Bias
Inter-rater reliability (…)
Intra-rater reliability (…)
Evaluation by different “measurers” several times
Evaluation by the same “measurer” several times
Test-Retest
* The results of both …
* Reliable over …
After a day/week/month
Change the order of the test procedure
Test-Retest
* The results of both test and retest should be consistent if the test is reliable
* Reliable over time
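One common way to quantify how consistent test and retest scores are is a Pearson correlation between the two sets of scores. The card itself does not name a statistic, so treat that choice as an assumption. The scores below are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five participants, tested twice a week apart.
test_scores = [12, 15, 9, 20, 17]
retest_scores = [13, 14, 10, 19, 18]

r = pearson_r(test_scores, retest_scores)
print(round(r, 2))  # close to 1.0 when the test is reliable over time
```

A value near 1.0 suggests participants kept roughly the same rank order and spacing across the two sessions.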
Alternate forms reliability
* Use …
* Reliable over … (…)
Alternate forms reliability
* Use alternate forms of the testing instrument for one measurement
* Reliable over time (body composition - %body fat)
Internal Consistency
* Ideal cake should taste …
* Split half reliability…
* Cronbach’s Alpha (similar to split half but more general)
* The average of all possible split halves
* How closely related a set of items are as a group.
* Considered to be a measure of scale reliability
Internal consistency Split Half example
* Quiz 4 stats: Class average 84.02% (corrected)
* Questions 1-13 class average …%
* Questions 14-26 class average …%
* There was very good split half reliability in our quiz
- .00 to 1.0
- .00 …
- 1.0 … consistency in measurement
- 0.70 means that 70% of the variance is reliable variance
- It also indicates that 30% of the variance is due to error.
- How are items correlated
- If 80% of the people who got question 1 correct also got questions 4 and 5 correct and question 3 wrong, the correlation between those values would be 0.8 and the error would be 0.2
Internal Consistency
* Ideal cake should taste the same throughout
* Split-half reliability: compare scores of each half
* Cronbach’s Alpha (similar to split half but more general)
* The average of all possible split halves
* How closely related a set of items are as a group.
* Considered to be a measure of scale reliability
Internal consistency Split Half example
* Quiz 4 stats: Class average 84.02% (corrected)
* Questions 1-13 class average 84.01%
* Questions 14-26 class average 84.03%
* There was very good split half reliability in our quiz
- .00 to 1.0
- .00 no consistency
- 1.0 perfect consistency in measurement
- 0.70 means that 70% of the variance is reliable variance
- It also indicates that 30% of the variance is due to error.
- How are items correlated
- If 80% of the people who got question 1 correct also got questions 4 and 5 correct and question 3 wrong, the correlation between those values would be 0.8 and the error would be 0.2
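For a concrete sense of an internal-consistency coefficient, the sketch below computes Cronbach's alpha with the usual variance-based formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The formula is not given on the cards, so this is an assumption about how it would be calculated, and the five-person, four-item quiz data are invented.

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per person, one column per item."""
    k = len(scores[0])  # number of items
    item_variances = [
        statistics.variance([row[i] for row in scores]) for i in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical quiz: 1 = correct, 0 = wrong, five people by four questions.
quiz_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(quiz_scores)
print(round(alpha, 2))  # 0.8 for this toy data
```

Read on the 0 to 1.0 scale above, 0.8 would mean about 80% of the score variance is reliable variance and about 20% is due to error.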
TEST
Methods of Establishing Reliability
Determining stability:
* …, … over time
* Same-day …—…
Constructing alternate forms:
* Determining … Standard
* Construct … forms and check reliability to … standard
Obtaining internal consistency:
* … technique
Determining stability:
* Test, Retest over time
* Same-day test—retest
Constructing alternate forms:
* Determining Gold Standard
* Construct alternate forms and check reliability to gold standard
Obtaining internal consistency:
* Split-half technique