Week 6: Reliability, Validity, Epidemiologic Analysis and Dichotomizing Treatment Effect Flashcards
What is reliability?
Extent to which a measurement is consistent and free from error
All reliability scores have…
signal and noise
What is signal?
true score
What is noise?
error
Reliability is the ratio of…
signal to noise
relative reliability
ratio of the variability between subjects to the total variability in scores (between-subject plus error variance)
unitless coefficient
ICC and kappa
absolute reliability
indicates how much of a measured value is likely due to error
expressed in the original unit
SEM is commonly used
Statistic used as a relative measure of reliability
ICC (and kappa)
Statistic used as an absolute measure of reliability
SEM (standard error of measurement)
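A minimal sketch of how the SEM is typically estimated from the score standard deviation and a relative reliability coefficient; the SD and ICC values below are illustrative, not from the cards:

```python
import math

# SEM = SD * sqrt(1 - reliability coefficient), reported in the original units
sd = 12.0   # illustrative standard deviation of the observed scores
icc = 0.85  # illustrative relative reliability coefficient, e.g., ICC(2,1)

sem = sd * math.sqrt(1 - icc)
print(f"SEM = {sem:.2f} (same units as the measurement)")
```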
Most common types of reliability
test-retest, inter-rater, intra-rater, internal consistency
inter-rater
two or more raters who measure the same group of people
intra-rater
the degree to which an examiner agrees with his or her own previous ratings
two or more measurements taken by the same rater on the same subjects
in measurement validity, the test should…
discriminate, evaluate, and predict
reliability is a __________ for validity
prerequisite
content validity
establishes that the multiple items that make up a questionnaire, inventory, or scale adequately sample the universe of content that defines the construct being measured
Criterion-related Validity
establishes the correspondence between a target test and a reference or ‘gold’ standard measure of the same construct
concurrent validity
the extent to which the target test correlates with a reference standard taken at relatively the same time
predictive validity
the extent to which the target test can predict a future reference standard
construct validity
establishes the ability of an instrument to measure the dimensions and theoretical foundation of an abstract construct
convergent validity
the extent to which a test correlates with other tests of closely related constructs
divergent validity
the extent to which a test is uncorrelated with tests of distinct or contrasting constructs
quantifying reliability: ‘old approach’
Pearson’s r
assesses strength of relationship (association) only, not agreement
only two raters (or two trials) can be compared
Quantifying reliability: ‘modern’ approach
intraclass correlation coefficients (ICC)
Cohen’s kappa coefficients
both the ICC and kappa give a single index of reliability that captures the strength of the relationship and the degree of agreement
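A minimal sketch of Cohen’s kappa for two raters’ nominal ratings, using scikit-learn’s cohen_kappa_score; the ratings below are made up:

```python
from sklearn.metrics import cohen_kappa_score

# Made-up nominal ratings (e.g., "normal"/"abnormal") from two raters on 8 subjects
rater_a = ["normal", "abnormal", "normal", "normal", "abnormal", "normal", "abnormal", "normal"]
rater_b = ["normal", "abnormal", "normal", "abnormal", "abnormal", "normal", "normal", "normal"]

# Kappa corrects observed agreement for the agreement expected by chance
kappa = cohen_kappa_score(rater_a, rater_b)
print(f"Cohen's kappa = {kappa:.2f}")
```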
ICC
values from 0 - 1.0
measures degree of relationship and agreement
can be used for > 2 raters
interval/ratio data
ICC types
six types depending on purpose, design, and type of measurements
ICC type is defined by
two numbers in parentheses
e.g., ICC(2,1); the format is ICC(model, form)
model 1
raters are chosen from a larger population; each subject may be assessed by a different set of raters (rarely used)
model 2
each subject is assessed by the same set of raters, who are considered a random sample from a larger population of raters
when is model 2 used
for test-retest and inter-rater reliability
model 3
each subject is assessed by the same set of raters, but the raters represent the only raters of interest
when do you use model 3
used for intra-rater reliability or when you do not wish to generalize the scores to other raters
ICC forms
second number in parentheses represents number of observations used to obtain reliability estimate
form 1
scores represent a single measurement
form k
scores based on mean of several (k) measurements
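A minimal sketch of ICC(2,1), i.e., model 2 (same set of raters for every subject) with form 1 (single measurement), computed from the two-way ANOVA mean squares; the score matrix is made up:

```python
import numpy as np

# Rows = subjects, columns = raters (made-up scores); the same raters rate every subject
scores = np.array([
    [9.0, 2.0, 5.0, 8.0],
    [6.0, 1.0, 3.0, 2.0],
    [8.0, 4.0, 6.0, 8.0],
    [7.0, 1.0, 2.0, 6.0],
    [10.0, 5.0, 6.0, 9.0],
    [6.0, 2.0, 4.0, 7.0],
])
n, k = scores.shape
grand = scores.mean()
row_means = scores.mean(axis=1)   # subject means
col_means = scores.mean(axis=0)   # rater means

# Two-way ANOVA mean squares
ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)    # between subjects
ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)    # between raters
resid = scores - row_means[:, None] - col_means[None, :] + grand
ms_error = np.sum(resid ** 2) / ((n - 1) * (k - 1))         # residual

# Shrout & Fleiss ICC(2,1): two-way random effects, single rating
icc_2_1 = (ms_rows - ms_error) / (
    ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
)
print(f"ICC(2,1) = {icc_2_1:.3f}")
```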
ICC interpretation
no absolute standards
ICC > 0.90
best for clinical measurements
ICC > 0.75
good reliability
ICC < 0.75
poor to moderate reliability
Cronbach’s alpha (α)
represents correlation among items and correlation of each individual item with the total score
acceptable values generally fall between 0.70 and 0.90
if Cronbach’s alpha is too low, it means
the items are not all measuring the same construct
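A minimal sketch of Cronbach’s alpha computed from the item variances and the variance of the total score; the questionnaire responses are made up:

```python
import numpy as np

# Rows = respondents, columns = questionnaire items (made-up 5-point ratings)
items = np.array([
    [4, 5, 4, 4],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 5],
])
k = items.shape[1]

item_vars = items.var(axis=0, ddof=1)        # variance of each item
total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed total score

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```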