Reliability & Validity Flashcards
What can high reliability guarantee?
Consistency
How can one test reliability?
Test-retest correlation
How does test-retest correlation work?
Administering the instrument twice to the same population
How does one avoid the practice effect when doing test-retest?
The time difference must be long enough for the practice effect to wear off,
but short enough that the underlying state does not change
Time difference for test-retest in psychiatric studies
2-14 days
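A minimal sketch of a test-retest check, assuming hypothetical paired scores from the same subjects at two administrations; the Pearson correlation is one common reliability estimate for this design.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical scale scores from the same 8 subjects, administered
# twice about a week apart (illustrative values only).
time1 = np.array([12, 18, 25, 9, 30, 22, 15, 27])
time2 = np.array([14, 17, 24, 11, 28, 23, 13, 29])

r, p_value = pearsonr(time1, time2)
print(f"test-retest correlation r = {r:.2f} (p = {p_value:.3f})")
```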
What measures internal consistency of a test?
Cronbach's alpha
How does Cronbach's alpha test internal consistency?
By correlating each item with the total score and averaging the correlation coefficients
Values of Cronbach's alpha
Negative infinity to 1
What Cronbach's alpha values make sense?
Positive values only
Cut-off for Cronbach's alpha for a test to be internally consistent?
0.7
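The card above describes alpha in terms of item-total correlations; the usual computational formula is variance-based (alpha = k/(k-1) * (1 - sum of item variances / variance of total score)). A minimal sketch of that variance-based formula, assuming a hypothetical subjects-by-items score matrix:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (subjects x items) score matrix."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses: 6 subjects x 4 items (illustrative values only).
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")  # >= 0.7 suggests internal consistency
```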
What is split half reliability?
Splitting the scale into two parts and examining the correlation between them
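A minimal split-half sketch using the same kind of hypothetical score matrix: sum the odd-numbered and even-numbered items separately and correlate the two halves (the odd/even split is just one common way of dividing the scale).

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical responses: 6 subjects x 4 items (illustrative values only).
scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
])

half_a = scores[:, ::2].sum(axis=1)   # odd-numbered items
half_b = scores[:, 1::2].sum(axis=1)  # even-numbered items
r, _ = pearsonr(half_a, half_b)
print(f"split-half correlation = {r:.2f}")
```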
What is intraclass correlation coefficient used for?
Continuous variables
What is the intraclass correlation coefficient?
Proportion of the total variance of a measurement that reflects true between-subject variability
Range of intraclass correlation coefficient?
0 (unreliable) - 1 (perfect reliability)
What can ICC be measured for?
Relative or absolute agreement
Difference between relative and absolute agreement
Absolute agreement requires raters to give the same values, whereas relative agreement only requires consistent ranking; relative ICC is therefore always higher (or equal)
Levels of ICC and their meanings
- 0.6 = fair
- 0.8 = very good
- 0.9 = excellent
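The variance-components definition above can be written as ICC = between-subject variance / (between-subject variance + error variance). A minimal sketch of one common variant, the one-way random-effects ICC(1,1) computed from ANOVA mean squares, applied to a hypothetical subjects-by-measurements matrix (not necessarily the exact form a given study would use):

```python
import numpy as np

def icc_one_way(ratings: np.ndarray) -> float:
    """One-way random-effects ICC(1,1) for a (subjects x measurements) matrix."""
    n, k = ratings.shape
    subject_means = ratings.mean(axis=1)
    grand_mean = ratings.mean()
    # Between-subject and within-subject mean squares from a one-way ANOVA.
    ms_between = k * ((subject_means - grand_mean) ** 2).sum() / (n - 1)
    ms_within = ((ratings - subject_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical continuous scores: 5 subjects, each measured twice (illustrative only).
ratings = np.array([
    [10.0, 11.0],
    [14.0, 13.5],
    [20.0, 21.0],
    [8.0,  9.0],
    [16.0, 15.0],
])
print(f"ICC = {icc_one_way(ratings):.2f}")  # 0 = unreliable, 1 = perfect reliability
```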
What is the ANOVA intraclass correlation coefficient used for?
Quantitative data with more than 2 raters/groups
What is used to test reliability for nominal data with more than 2 categories?
Kappa or weighted kappa
What is face validity?
A subjective judgement, made at face value, of whether a test measures the construct of interest
Types of construct validity
Content, criterion, convergent, discriminant & experimental
What is criterion validity made up of?
Concurrent
Predictive
What is construct validity?
Measures whether a test really measures the construct of interest
What is unified construct validity?
Both content and criterion validity
What is content validity?
Whether the contents of the test are in line with the specifications of what the test was designed to measure
What does content validity look for?
Good coverage of all domains thought to be related to the measured condition
How does one measure content validity?
Cannot be statistically tested
Experts are called to comment on this validity
What is criterion validity?
Performance of a test against an external criterion, such as another instrument or a future diagnostic outcome
What is concurrent validity?
Ability of a test to distinguish between subjects who differ concurrently in other measures (using other instruments)
What is predictive validity?
Ability of a test to predict future group differences according to current group scores
What is incremental validity?
Ability of a measure to predict or explain variance over and above other measures
What can one divide construct validity into?
Concurrent & predictive
Convergent, discriminant & experimental
Factorial
What is convergent validity?
Agreement between instruments that measure the same construct
What is discriminant validity?
Degree of disagreement between two scales measuring different constructs
What is experimental validity?
Sensitivity to change.
What is factorial validity?
Established via factor analysis of items in a scale
What is precision?
Degree to which the sample mean varies with repeated sampling (the less it varies, the more precise the estimate)
What leads to imprecision?
Random errors
Factors that reduce precision
Wide confidence interval limits
Requiring a higher level of confidence (e.g. 99% rather than 95%), which widens the interval
What is accuracy?
Correctness of the mean value, i.e. how close it is to the true population value
What compromises both validity and accuracy?
Bias
Disadvantages of percent agreement?
It overestimates the degree of agreement, because some agreement occurs by chance alone
What does kappa indicate?
The level of agreement achieved beyond that expected by chance
What is kappa used for?
Agreement on categorical variables
What is weighted kappa used for?
Ordinal variables
What is used for beyond chance agreement in continuous variables?
Bland-Altman plot
Degree of agreement if kappa is 0
None
Degree of agreement if kappa is 0-0.2
Slight
Degree of agreement if kappa is 0.2-0.4
Fair
Degree of agreement if kappa is 0.4-0.6
Moderate
Degree of agreement if kappa is 0.6-0.8
Substantial
Degree of agreement if kappa is 0.8-1.0
Almost perfect
What affects kappa?
Prevalence of the outcome studied - when the outcome is very common (or very rare), the agreement expected by chance is high, which lowers kappa
Calculations for kappa
(observed agreement beyond chance) / (maximum agreement beyond chance)
OR
(observed agreement - agreement by chance) / (100% - agreement expected by chance)
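A minimal worked example of the second formula, using hypothetical figures: two assessors agree on 80% of patients while 60% agreement would be expected by chance.

```python
def cohens_kappa(observed_agreement: float, chance_agreement: float) -> float:
    """Kappa = (observed - chance) / (1 - chance), all expressed as proportions."""
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)

# Hypothetical figures: 80% observed agreement, 60% expected by chance.
print(cohens_kappa(0.80, 0.60))  # 0.5 -> "moderate" agreement
```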
What numerical values are needed to calculate kappa?
Percentage of patients on which the 2 assessors agreed (observed agreement)
Agreement expected by chance
What is kappa dependent on?
Prevalence of measured condition
What type of disorders will kappa be low for?
Common disorders
Disadvantage of kappa
One cannot test statistical significance from kappa values
What is another way of calculating beyond-chance agreement for nominal data?
Phi
Advantages of phi
Statistical significance testing is possible
Small sample size can be used
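A minimal sketch of the phi coefficient for a hypothetical 2x2 agreement table, with a chi-square test as the significance check (phi^2 = chi-square / n, so significance testing is straightforward):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = rater A (yes/no), columns = rater B (yes/no).
table = np.array([[30, 5],
                  [10, 55]])

a, b = table[0]
c, d = table[1]
phi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Phi relates directly to chi-square, so the ordinary chi-square test of the
# table provides the significance test.
chi2, p_value, _, _ = chi2_contingency(table, correction=False)
print(f"phi = {phi:.2f}, chi-square p = {p_value:.4f}")
```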
What is plotted in a Bland-Altman plot?
The difference between each pair of scores is plotted against the pair's mean
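A minimal sketch of a Bland-Altman plot for hypothetical paired continuous measurements, drawing the mean difference (bias) and the conventional 95% limits of agreement (mean difference +/- 1.96 SD):

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical paired measurements of the same subjects by two methods.
method_a = np.array([10.2, 14.1, 20.3, 8.4, 16.0, 12.5, 18.8, 9.9])
method_b = np.array([10.8, 13.5, 21.0, 8.0, 15.2, 13.1, 19.5, 9.2])

means = (method_a + method_b) / 2          # x-axis: mean of each pair
diffs = method_a - method_b                # y-axis: difference of each pair
bias = diffs.mean()
loa = 1.96 * diffs.std(ddof=1)             # 95% limits of agreement

plt.scatter(means, diffs)
plt.axhline(bias, linestyle="-")           # mean difference (bias)
plt.axhline(bias + loa, linestyle="--")    # upper limit of agreement
plt.axhline(bias - loa, linestyle="--")    # lower limit of agreement
plt.xlabel("Mean of the two measurements")
plt.ylabel("Difference between measurements")
plt.title("Bland-Altman plot")
plt.show()
```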