Measurement Reliability Flashcards
Measurement Reliability
- extent to which repeated measurements agree with one another and are believable and useful
- also referred to as stability, consistency, and reproducibility
- sources of error include errors made by examiners, subject variability, and instrumentation flaws or failures
Types of Measurement Reliability
- instrument: test-retest, internal consistency, parallel forms, split-half
- rater: intra-tester (within), inter-tester (between or among)
Test-Retest Reliability
- obtained by administering the same test twice, separated by a period of time
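The agreement between the two administrations is commonly quantified with a correlation coefficient. A minimal sketch in Python; the grip-strength scores and one-week scenario are fabricated illustrations, not from these notes:

```python
def pearson_r(x, y):
    """Pearson correlation between paired measurements."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Same (hypothetical) grip-strength test given to five subjects one week apart
day1 = [30.0, 42.0, 25.0, 38.0, 33.0]
day2 = [31.0, 41.0, 26.0, 39.0, 32.0]

r = pearson_r(day1, day2)
print(round(r, 3))  # close to 1.0, indicating high test-retest reliability
```

Note that a correlation captures only relative agreement (rank order); intraclass correlation coefficients (ICC) are often preferred because they also penalize systematic shifts between sessions.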
Instrument Reliability
- internal consistency: measure of reliability used to evaluate the degree to which different test items covering the same construct produce similar results; common for self-report instruments; items are grouped into domains, each measuring a different construct or concept; items within one domain should relate to one another but not to items in another domain
- parallel forms: two equivalent forms of an instrument are administered; subjects should score equally on both
- split half: combine two forms of an instrument that cover the same concept into one longer version; compare scores on one half with the other
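The correlation between the two halves underestimates the reliability of the full-length instrument, so it is typically adjusted with the Spearman-Brown formula. A short sketch; the 0.80 half-test correlation is a made-up value:

```python
def spearman_brown(r_half):
    """Estimate full-length reliability from the correlation
    between the two halves of a split-half design."""
    return 2 * r_half / (1 + r_half)

# Fabricated half-test correlation of 0.80
print(round(spearman_brown(0.80), 3))  # 0.889
```

The correction reflects that longer tests are more reliable: the full instrument has twice as many items as either half.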
Quantification of Reliability
- relative: if a measurement is reliable, individual measurements within a group will maintain their position within the group upon repeated measurement; rank order can be very reliable even when individual scores are not close to the original measurement values
- absolute: the extent to which a score varies upon repeated measurement; ideally no change from day one to day two
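The distinction can be shown with a toy data set in which every subject's day-2 score shifts upward by the same amount: rank order (relative reliability) is preserved perfectly even though the scores themselves change (poor absolute agreement). All numbers are fabricated:

```python
# Fabricated scores for four subjects on two days
day1 = [20, 30, 40, 50]
day2 = [25, 35, 45, 55]

def ranks(xs):
    """Position of each score within the group (no ties assumed)."""
    return [sorted(xs).index(v) for v in xs]

same_order = ranks(day1) == ranks(day2)                          # relative reliability
mean_shift = sum(b - a for a, b in zip(day1, day2)) / len(day1)  # absolute change

print(same_order, mean_shift)  # True 5.0
```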
Measurement Validity
- degree to which a measurement captures what it is intended to measure
- reliability is a necessary but not sufficient condition for validity…
- a test may be reliable because it consistently reports the same measurement
- however it may not be valid because the measurement is incorrect
Types of Measurement Validity
- face validity
- content validity
- construct validity: convergent, discriminant
- criterion related validity: concurrent validity, predictive validity
Face Validity
- does the test or instrument appear, on the face of it, to assess what is intended?
- addressed from the standpoint of the tester and from the standpoint of the patient or family member
Content Validity
- extent to which an instrument reflects all the meaningful elements of a variable
- judged by content experts or people with experience with the variable
- usually only pertinent to multidimensional measurements
- disability measures, functional measures, self-reported tools, knowledge assessment
Construct Validity
- degree to which a measure reflects the operational definition of the concept it is said to represent
- achieved via operational definitions, logical arguments, theoretical arguments, and research evidence
Forms of Construct Validity-Convergent Validity
- comparison of scores between two similar instruments expected to produce similar results
- positively correlate with each other
Forms of Construct Validity-Discriminant Validity
- differentiation among different levels of the characteristic of interest
- e.g., degree of disability
- does an instrument or test differentiate between individuals with shoulder impingement and those without?
Criterion Validity
- extent to which one measure is systematically related to other measures or outcomes
- requires direct comparison of the index measure with a standard (criterion) measure
Forms of Criterion Validity-Concurrent Validity
- ability of an index measure to capture an outcome similar to that of another measure
- compare the index measure to the criterion measure obtained at the same time
Forms of Criterion Validity-Predictive Validity
- the ability of an index measure to predict a future outcome
- compare the index measure to the criterion measure that was obtained at a later point in time
Responsiveness to Change
- ability of a measure (instrument) to detect change in the phenomenon of interest
- depends upon…
- the fit between the instrument and the operational definition of the variable (construct validity)
- the number of values on the measurement scale: the more values on the scale, the greater the opportunity to detect change
- standard error of measurement (SEM): extent to which observed scores are dispersed around the true score
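The SEM is commonly computed from the standard deviation of observed scores and a reliability coefficient. Assuming the notes refer to the standard textbook formula SEM = SD × √(1 − r), a sketch with made-up numbers:

```python
import math

def sem(sd, reliability):
    """Standard error of measurement from the standard deviation of
    observed scores (sd) and a reliability coefficient such as a
    test-retest ICC (reliability)."""
    return sd * math.sqrt(1 - reliability)

# Fabricated example: SD of 10 points, reliability of 0.91
print(round(sem(10.0, 0.91), 2))  # 3.0
```

A smaller SEM means observed scores cluster tightly around the true score, so smaller real changes can be distinguished from measurement noise.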
Floor and Ceiling Effects
- floor effect: failure of a measure to detect lower scores for patients whose status has declined
- ceiling effect: failure of a measure to detect higher scores for patients whose status has improved
Threats to Research Validity
- subjects: selection, assignment, attrition, maturation, compensatory rivalry/resentful demoralization, diffusion or imitation of treatment, statistical regression to the mean
- investigators: compensatory equalization of treatments
- study logistics: history, instrumentation, testing
Threat-Selection
- problem: selection process leads to sample that is not representative of the population from which it is drawn
- potential solutions: multiple study sites, probabilistic selection
Threat-Assignment
- problem: group assignment process leads to unequal distribution of subject characteristics
- solution: probabilistic assignment
Threat-Maturation
- problem: changes over time that are internal to the subjects and may influence the outcome
- solutions: control or comparison group, repeated baseline measures, scheduling
Threat-Compensatory Rivalry/Resentful Demoralization
- problem: subjects change behavior in response to learning they are in control group
- solutions: masking, instructions re: adherence to protocol, separation of subjects
Threat-Diffusion or Imitation of Treatment
- problem: contact among subjects from different groups
- solutions: masking, instructions re: adherence to protocol, separation of subjects
Threat-Statistical Regression to the Mean
- problem: appearance of change due to an extreme score for the baseline measure on the outcome of interest
- solutions: trim outliers, average repeated baseline measures
Threat-Compensatory Equalization
- problem: purposeful or inadvertent supplementation of the control or comparison group
- solutions: masking, protocols for intervention administration, different locations
Threat-History
- problem: events unrelated to the study that are outside the investigators’ control and may influence the outcome
- solutions: control or comparison group, scheduling
Threat-Instrumentation
- problem: wrong measurement approach or device, limitation of measure, malfunction, inaccurate application
- solutions: operational definitions, selection of most rigorous instrument, calibration, protocols, training and verification
Threat-Testing
- problem: appearance of improvement due to familiarity with test procedures or in response to different instructions/cues
- solutions: practice sessions, repeated measures, protocols for test administration, training and verification
Investigator Bias
- purposeful or inadvertent interference with the study’s procedures…
- selection
- assignment
- instrumentation
Statistical Adjustment
- method for controlling extraneous variables during statistical analysis
- used when design features cannot or do not successfully control confounding influences
Threats to Construct Validity
- construct under-representation
- experimenter expectancies
- interaction between different treatments
Study Relevance
- external validity (quantitative)/transferability (qualitative)
- extent to which results of the research study can be generalized: across groups, settings, times vs. to particular persons, settings, times
Threats to Study Relevance
- inadequate sample selection
- differences in settings
- differences in circumstances due to passage of time