2 - Measurement Flashcards
what is a discriminative instrument?
used to sort individuals into groups (ie based on whether they meet the criteria of interest), eg diagnostic tests, screening tools, methods of evaluating eligibility criteria
what are 2 ways to determine responsiveness? which is more common?
- anchor-based approach (more common) and distribution-based approach
what is more important reliability or validity?
- see answer to forum question
describe standard error of measurement
- estimate of the measure’s ability to differentiate among patients
- determines whether true change has occurred
- closer to 0 is better
- ie looking at people who haven’t changed (a blood glucose reading says x, but that doesn’t mean there is no error in that reading)
how do I make readers understand my results for comparing btw 2 groups?
- provide mean difference and 95% CI around that mean diff
- tell them the MCID and whether the MCID falls inside or outside the 95% CI (inside = inconclusive, outside = conclusive)
- provide the number needed to treat (based on the proportion of patients in the experimental vs control group who changed by an important amount)
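The NNT line above can be sketched as code; the proportions are made up for illustration.

```python
# Number needed to treat (NNT): reciprocal of the difference in the proportion
# of patients who improved by at least the MCID in each group. Hypothetical numbers.

def nnt(prop_treatment: float, prop_control: float) -> float:
    return 1 / (prop_treatment - prop_control)

# eg 60% of treated vs 40% of controls improve by at least the MCID:
print(round(nnt(0.60, 0.40)))  # -> 5: treat 5 patients for 1 extra important improvement
```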
what type of differences does mean difference look for? Is this a validity or reliability issue?
systematic differences (ie diff in the way people are measuring - a validity issue)
what is disease-specific HRQOL? examples?
- measures specific aspects of health (ie specific to the disease of interest)
- can’t compare across clinical areas (only within - for example, which treatment offers more relief)
- easier to detect change bc questions are more specific
- eg: WOMAC (for patients w osteoarthritis)
define cost analysis
does not consider the effect of treatment
from chart: examines only costs but there is a comparison btw 2 or more alternatives
define reliability in terms of the formula
- a ratio of the true score variance to the true score variance plus its associated error variance
what is sensitivity to change
the ability to measure change
define face validity
- face value (patients)
- are questions asked reflective of what they experience with this particular disease?
describe the information about incremental cost-effectiveness quadrants
- upper left (more costly, less effective: reject) and lower right (less costly, more effective: adopt) are easiest to make a decision on; the other 2 quadrants involve trade-offs between cost and effect
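As a sketch (a hypothetical helper, not from the chart), the quadrant logic can be written out:

```python
# Classify an intervention by its incremental cost and incremental effect
# relative to the comparator. Upper left and lower right give easy decisions.

def quadrant(delta_cost: float, delta_effect: float) -> str:
    if delta_cost > 0 and delta_effect < 0:
        return "upper left: more costly, less effective - reject"
    if delta_cost < 0 and delta_effect > 0:
        return "lower right: less costly, more effective - adopt"
    if delta_cost > 0 and delta_effect > 0:
        return "upper right: more costly, more effective - trade-off"
    return "lower left: less costly, less effective - trade-off"

print(quadrant(-500, 0.2))  # saves money and adds effect -> easy adoption
```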

what is agreement
- how 2 things change according to each other, taking into account systematic differences (the y-intercept)
- good for reliability (btw 0 and 1, 1 being perfect agreement)
- ICC/Kappa
- can’t have more validity than reliability

what is internal consistency reliability? most common example? what should values be at?
- extent to which items on the questionnaire are associated w each other
- eg a correlation of 100% means if you answer yes to 1 question, you will answer yes to the next etc - these questions are redundant, so take one out
- values should be 80-90% (0.8/0.9)
- common example = Cronbach’s alpha
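As a sketch, Cronbach's alpha can be computed by hand on made-up questionnaire data; the near-1 value here illustrates the redundancy point above:

```python
from statistics import pvariance

# rows = respondents, columns = the 3 questionnaire items (made-up scores)
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 4, 3],
]

k = len(scores[0])                                   # number of items
item_vars = [pvariance(col) for col in zip(*scores)] # variance of each item
total_var = pvariance([sum(row) for row in scores])  # variance of total scores
alpha = k / (k - 1) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))  # ~0.99: well above 0.9, so the items are likely redundant
```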
How does one use a standard error of measure with a confidence interval? How to calculate 95% CI for score of 64, SEM 5.
SEM x 1.96 = 95% CI
SEM x 1.64 = 90% CI
SEM x 1.28 = 80% CI
- note the multiplier is the z-value, which is constant!
- 5 x 1.96 = +/- 10 and therefore 95% confident that score is btw 54 and 74
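The worked example above, written out as code:

```python
# 95% CI around an observed score of 64 with SEM = 5
score, sem, z95 = 64, 5, 1.96

margin = sem * z95                     # 9.8, roughly +/- 10
lower, upper = score - margin, score + margin
print(round(lower), round(upper))      # -> 54 74
```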
define pearson’s r, intraclass correlation coefficient (ICC), spearman’s rho and weighted kappa wrt association vs agreement and continuous vs categorical
pearson’s r: association, continuous
intraclass correlation coefficient (ICC): agreement, continuous
spearman’s rho: association, categorical
weighted kappa: agreement, categorical
what is criterion validity?
- predictive vs concurrent
- behaves as expected compared to gold standard (predictive/concurrent)
- the correlation of a scale with some other measure of the trait or disorder (ideally a gold standard or criterion measure) * gold standard needed for this!!
- predictive = administer new scale and see how well it predicts the event in the future
- concurrent = simultaneously administer the new scale with the criterion measure and determine the association
for the ICF (international classification of functioning, disability, and health), what are the 4 defining health areas? what are the modifiers?
1) body function: physiological/psychological (includes pain and mental disorder)
2) body structures: anatomical
3) activity: performance of a task or action
4) participation: involvement in meaningful, fulfilling, and satisfying activities
contextual factors (the modifiers): age, coping strategies, social attitudes, education, experience etc (can modify your health in any of these areas)
what is a predictive instrument?
used to predict the future (or the result/product of the experiment) - measure something now that predicts something happening in the future; an important validity indicator; eg MCAT, LSAT etc
challenges: applicability (costs vary)

compare and contrast self-reported function and performance based measures
- both attempt to measure activity limitations
- performance: ie walk test, strength, ROM
- self-reported function: a patient-reported outcome measure (PRO), more clinically relevant, eg lower extremity functional scale
what are the 3 types of full economic evaluation?
- cost-effectiveness analysis
- cost-utility analysis
- cost-benefit analysis
why do we use surrogate outcomes?
- they increase efficiency
- easier faster and cheaper to measure
describe the distribution-based approach for measuring responsiveness
- for people who aren’t expected to change (ie maybe chronic disease) - average of T1-T2 will be 0 (not expected to change)
- again measuring at 2 different time points
- plot the distribution and decide a cut-off point above which significant change has occurred
- for people who are expected to change, same thing but this time arbitrary line is to left of bell curve
what is the tool’s metric?
- interpreting your results or making sure your results are interpretable to readers
define: precision
- a measure of the extent to which repeated measurements come up with the same value
- this is about the error - how much can you trust that the value is representative of the true score?
what does PRO stand for? example?
patient reported outcome measure
eg health related QOL
what is a surrogate outcome? examples?
outcome measures that are not of direct practical importance but are believed to reflect outcomes that are important
- it is only indirectly important to patients (they don’t care about the surrogate itself)
- these outcomes aren’t perfect; improving the surrogate doesn’t prove an effect on the patient-important outcome
eg: cholesterol level
from what perspective can costs be represented as an outcome? (4) - what is the most common?
- individual
- ministry of health (most common)
- society (sick days etc)
- third-party payer (insurance company)
* there is usually more than 1 of these views being represented
describe STC wrt responsiveness
- STC is a necessary but insufficient condition for responsiveness
- the problem with responsiveness is how are we going to determine/define what is clinically important?
- see lecture notes p 22, last slide
what are systematic errors a measure of?
validity
examples of continuous outcomes
- weight, blood pressure, etc
what does it mean if your score exceeds MDC (CI = 95%)
- we can be 95% confident that a true change has occurred
- OR upon repeated assessments, 95% of stable patients will change by less than the reported interval
How do we use SEM to detect real change (ie change assessed over time)?
- calculate the difference btw the present score and the previous score
- use Minimal Detectable Change (MDC) (aka smallest detectable difference)
- SEM x 1.96 x √2 = MDC95
- then take difference in score (ie first score was 64, next was 80, so 16 diff) and compare w MDC95
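The MDC95 example above, as code:

```python
from math import sqrt

sem = 5
mdc95 = sem * 1.96 * sqrt(2)       # ~13.9
observed_change = 80 - 64          # 16-point change between assessments

# the change exceeds MDC95, so we can be 95% confident a true change occurred
print(round(mdc95, 1), observed_change > mdc95)  # -> 13.9 True
```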
what is a patient-important outcome? examples? what part of ICF is this related to?
- outcome measures that are of direct practical importance (patients consider them to be important)
- eg: survival, pain, PROs (patient-reported outcome measure) (eg QOL, functional ability)
- related to ICF activity/participation
what is health related QOL
an attempt to measure the broad concept of health (physical mental social)
define: cost effectiveness
- measurement of resource consumption and outcome of the intervention
- requires a common outcome btw interventions being compared
- eg effect per unit cost (life year gained per dollar spent), costs per unit of effect (cost per case detected etc)
describe pearson’s R in terms of whether or not it is a good measure of validity
- pearson’s r is good for validity (association) but it is not the best measure for reliability (precision/agreement)
What is construct validity?
- convergent vs discriminant
- like a mini-theory to explain the relationships among various behaviours or attitudes
- more abstract than criterion validity
- convergent = a measure of construct x correlates w other measures of the same construct (eg using participant observation and a survey to assess anger) - they change in the same way
- discriminant = a measure of construct x does not correlate with measurements of dissimilar/unrelated constructs (eg a measurement of age should not change in the same way as a survey measurement of anger) - predicting change in one instrument while the other stays the same
define accuracy
- a measure of how close a measurement comes to a true score for a variable
- ie how accurately a measure measures what you want it to
what is something that can greatly enhance instrument interpretability?
- knowing MID (minimally important difference)
- this is the smallest difference in the score that informed patients perceive as important, leading patients or clinicians to consider a change in management
inter- vs intra- rater reliability
- both = test-retest (need a time 1 and time 2 measurement - either by same person or diff person)
- inter = between 2 different people and how well they agree
- intra = the same rater measuring the same thing at diff times (of the day, for example)
describe mean difference
- systematic difference between groups (ie not at the individual level!)
- a t-test will give us a p-value saying whether there is a statistically significant diff btw the 2 group means, but that’s it
- closer to 0 is better
- ie take 100 patients and measure at t1 then colleague measures at t2 and compare results
what is validity
the extent to which an instrument measures what it is intended to measure
informing applicability - what is a sensitivity analysis?
- substituting uncertainty in cost based on differences btw places (or a reflection of the uncertainty of the analysis - ie uncertainty around the treatment effect)
- helps us to increase readership in terms of applicability
- uncertainty around many things, could be methods of administration, unsure about proportions of patients who will experience an adverse effect, etc
what terms go together: accuracy, precision, validity, and reliability?
accuracy = validity
precision = reliability
what is reliability
the extent to which an instrument yields the same results in repeated administrations in a stable population
explain pros and cons of trying to improve precision with increased measurements (n-size)
pro: reduces the amount of random error in the study, narrows the CI
con: if experiment contains systematic errors (procedural or measurement), these are not corrected by increasing n-size, you are simply increasing your ability to reproduce a measurement of the wrong thing!
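A small sketch (made-up numbers) of the point above: the random-error part of the uncertainty shrinks as SD/√n, but a systematic bias in the measurements never does:

```python
from math import sqrt

sd, bias = 12.0, 5.0   # spread of individual measurements; fixed measurement bias

for n in (25, 100, 400):
    se = sd / sqrt(n)             # standard error: this is what narrows the CI
    print(n, round(se, 1), bias)  # the bias column never shrinks
```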
describe the anchor-based approach for responsiveness
- measure at T1 before anything has changed, then again at T2 after (ie at time 2 use the original questionnaire with a global rating of change (GRC) scale)
- can get an idea of MCID (difference in averages at t1 and t2 for people who scored 2 or 3 on the GRC)
- can also give yourself some construct validity using this method
- for people whose scores are the same, can’t use them for responsiveness but can use them for reliability
- see notes pg 23, slide 1 and 2
define content validity
- representative of the content domains of the construct (experts)
- the same thing as face validity but from experts (more broad experience w disease as opposed to a single patient)
define cost minimization analysis
- between 3b and 4 on chart
- when the effect of treatment is similar across groups (no longer need to consider effect bc we know it’s the same, so this is better than strict cost analysis)
what are spearman and pearsons r examples of?
criterion validity
- look at how strongly related or correlated 2 measures are (expected and new measure)
what is generic health related QOL
- measures general health status, very vague, can span across diff medical conditions (can compare across diff states of health)
- relevant to all health states
- eg SF-12
what is an evaluative instrument?
used to evaluate change, can track change over time (must have properties that can detect change) - therapy studies use this
what are the 4 features of a good outcome measure?
- validity
- reliability
- sensitivity to change
- responsiveness
what are the 4 ways of measuring validity?
- face
- content
- criterion
- construct
another way of defining costs = methods of evaluation - review chart!
- top left 2, if there is no comparison group
- bottom left = RCT
- bottom right = study with more than 1 group and also includes both cost and effectiveness
- rarely see just cost analysis

what is association
- how 2 things change according to each other
- good for validity (btw 0 and 1, 1 being perfect association)
- can’t have more validity than reliability

define: cost utility analysis - common measures
- the value you place on health benefits and avoiding poor health outcomes (measuring the value people place on certain health outcomes)
- how they would value avoiding another poor outcome - not direct, requires different measures
- can measure impacts of different interventions on different diseases
- common measures: EQ5D (most common) or HUI; results expressed as QALYs - quality-adjusted life years (utility!)
difference btw kappa and weighted kappa
- kappa is dichotomous (categorical) and weighted kappa is ordered (ordinal)
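A sketch of plain (unweighted) kappa on dichotomous ratings; weighted kappa extends the same idea by giving partial credit to near-misses on ordered categories. The raters and data are made up.

```python
def cohens_kappa(a, b):
    """Chance-corrected agreement between two raters' categorical ratings."""
    n = len(a)
    categories = set(a) | set(b)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    p_chance = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

rater1 = ["yes", "yes", "no", "no", "yes", "no"]
rater2 = ["yes", "no", "no", "no", "yes", "yes"]
print(round(cohens_kappa(rater1, rater2), 2))  # -> 0.33
```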
what is responsiveness
the ability to measure clinically meaningful change
what makes a surrogate endpoint valid?
- a causal relationship btw changes in the surrogate and changes in the patient-important outcome (strongly predictive)
what are the common summary measures for reporting reliability? association or agreement for bottom 3?
- mean difference and standard error of measurement (for both the greater deviation from 0 the worse the agreement)
- pearson’s R (association), intraclass correlation coefficient (agreement), and kappa/weighted kappa (agreement) (range from 0 no agreement to 1 perfect agreement)
what is sensitivity to change (STC)/how is it measured?
- often represented by standard response mean (SRM)
- administered before and after change in a population expected to change
- calculate mean change (T1avg-T2avg) over SD change (>1 = good)
- this is saying whether we can see the signal over the noise (>1), and if so, we have an instrument that is sensitive to change
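The SRM calculation above, on made-up before/after scores for a group expected to improve:

```python
from statistics import mean, stdev

t1 = [40, 35, 50, 45, 38]   # scores before treatment
t2 = [55, 48, 60, 62, 50]   # scores after treatment

change = [after - before for before, after in zip(t1, t2)]
srm = mean(change) / stdev(change)   # mean change over SD of change
print(round(srm, 1))                 # > 1: signal exceeds noise
```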
ICF model of disability

define: cost benefit
- value of resources used up compared to those saved or created (eg willingness to pay)
- rare to use this one
what are dichotomous outcomes? disadvantages/advantages to this?
- referred to as events (dead/alive, healed/not healed, etc)
- disadvantage = can’t detect change easily
- advantages = easily interpretable (even without CIs)