5010: Reliability, Responsiveness Flashcards

1
Q

Types of Measurement Error

A

Systematic
(consistent, unidirectional, biased, “constant”)

Random
(inconsistent, either direction equally likely, try to minimize, on average- will cancel out)

2
Q

Sources of Measurement Error

A

Rater (stabilization, recording)

Meas. Instrument or Method (faulty goni, consistency b/w raters)

Subject (clothing, m. mass, gender, time of day, meds)

3
Q

Types of Reliability

A

Intrarater (usu. MOST reliable)
Interrater
Test-retest (suggests no rater involvement, self-reported data)

4
Q
Intraclass Correlation Coefficient
(ICC)
A

True score variance b/w subjects
= ————————————————
Total variance

5
Q

Reliability Coefficient

A

True score variance b/w subjects
= ————————————————
True score variance + error variance
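
The two ratios on the ICC and Reliability Coefficient cards can be sketched in code. This is a minimal illustration, assuming the classical model in which total variance = true score variance + error variance. The one-way ICC(1,1) form is only one of several ICC models and is an assumption here, as are the function names and the sample ROM values.

```python
from statistics import mean

def reliability_coefficient(true_var, error_var):
    """True score variance divided by (true score variance + error variance)."""
    return true_var / (true_var + error_var)

def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for a subjects x trials table.

    scores: list of rows, one row per subject, one column per repeated trial.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # trials per subject
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    # Between-subject mean square (reflects true differences b/w subjects)
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Within-subject mean square (treated as error)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical ROM data (degrees): 4 subjects, 2 trials each
rom = [[40, 42], [55, 53], [60, 61], [35, 36]]
print(reliability_coefficient(9.0, 1.0))  # 0.9
print(round(icc_oneway(rom), 3))
```

Large differences b/w subjects with small trial-to-trial differences push the value toward 1.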

6
Q

Variance

A

Measure of avg. variability of sample data. Ideally for a clinical measure.

7
Q

Interpretation of Reliability Stats

A

Range: 0-1

8
Q

Statistical Measures of Reliability

A

ICC- for Continuous (sometimes Ordinal)
SEM for Continuous
Kappa for categorical
Cronbach’s alpha for Multiple items, one meas.
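
A rough sketch of three of these statistics, using their common textbook forms (an assumption, since the card does not give formulas): SEM = sd x sqrt(1 - ICC), Cohen's kappa for two raters, and Cronbach's alpha from item variances. Function names and sample data are hypothetical.

```python
from math import sqrt
from statistics import variance

def sem(sd, icc):
    """Standard error of measurement, in the units of the measure."""
    return sd * sqrt(1 - icc)

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement b/w two raters on categorical ratings."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category proportions
    p_expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)

def cronbach_alpha(items):
    """Internal consistency; items is a list of per-item score lists."""
    k = len(items)
    totals = [sum(subject) for subject in zip(*items)]  # total score per subject
    item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

# Hypothetical ratings: 1 = test positive, 0 = test negative
print(cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # 0.5
```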

9
Q

Bland-Altman Plots

A

Plot difference between test-retest

Shows repeatability and any bias over time
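
The numbers behind a Bland-Altman plot can be sketched as below. The 1.96 x sd limits of agreement are the conventional choice and are an assumption here (the card does not mention them); the function name and data are hypothetical, and the plotting step is left out.

```python
from statistics import mean, stdev

def bland_altman(test, retest):
    """Return plot coordinates plus bias and limits of agreement."""
    diffs = [b - a for a, b in zip(test, retest)]        # y-axis: retest - test
    means = [(a + b) / 2 for a, b in zip(test, retest)]  # x-axis: mean of the pair
    bias = mean(diffs)               # systematic shift b/w sessions
    spread = 1.96 * stdev(diffs)     # assumed 95% limits of agreement
    return means, diffs, bias, (bias - spread, bias + spread)

# Hypothetical test-retest scores
means, diffs, bias, limits = bland_altman([10, 20, 30], [12, 21, 33])
print(bias)    # 2
print(limits)
```

A bias far from 0 suggests systematic error over time; wide limits suggest poor repeatability.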

10
Q

Validity

A

Are we really measuring what we think we are measuring?

11
Q

4 Categories of Validity testing

A

Face Validity
Content Validity
Criterion-related validity
Construct Validity

12
Q

Face Validity & How to Test

A

Does it appear to be valid for this measurement (subjective)

  1. Have clinicians look it over & give opinion
  2. Perform the test on patients & ask their opinion
  3. Clinician or Pt may reject it
13
Q

Content Validity & How to Test

A

Does instrument address all aspects and only aspects of the attribute being measured?

  1. 1st author must give definition of what they intend to meas.
  2. Use a thorough, organized, comprehensive development process.
  3. May sample expert opinion
  4. Test for ceiling & floor effects
  5. Analyze data using factor analysis
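
Step 4 can be sketched as a simple proportion check: what share of subjects score at the lowest or highest possible value. This is only an illustration; the function name and data are hypothetical, and what proportion counts as a problem is left to the reader.

```python
def floor_ceiling(scores, min_score, max_score):
    """Proportion of subjects at the floor and at the ceiling of the scale."""
    n = len(scores)
    at_floor = sum(s == min_score for s in scores) / n
    at_ceiling = sum(s == max_score for s in scores) / n
    return at_floor, at_ceiling

# Hypothetical scores on a 0-10 scale
print(floor_ceiling([0, 5, 10, 10], 0, 10))  # (0.25, 0.5)
```

A large proportion at the ceiling means the instrument cannot register further improvement in those subjects.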
14
Q

What is Factor Analysis

A

Testing tool used to test CONTENT VALIDITY

  • used for multi-item meas. tools
  • complex statistical analysis based on correlation among items
  • will identify # and type of underlying dimensions being meas
  • may identify items that do not correlate/fit w/ other items

e.g.: balance test -> might also identify strength -> but that is not what we are trying to measure.

15
Q

Criterion-Related Validity & Test

A

this test requires a “gold standard” to serve as criterion.
Goniometry meas- gold standard is w/ x-ray

We don’t always use them because of cost, time, practicality, and they may be uncomfortable

2 Ways to Test:
1. Concurrent- gold standard used @ same time
2. Predictive- gold standard is some future event
(GRE to predict PT success)
To measure, calculate the correlation b/w the measurement and the gold standard.
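
The correlation step can be sketched with a Pearson coefficient. Pearson r is an assumption (the card only says "correlation"), and the goniometer and x-ray values below are made up for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation b/w the measurement (x) and the gold standard (y)."""
    mx, my = mean(x), mean(y)
    covariation = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return covariation / sqrt(
        sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)
    )

# Hypothetical joint angles (degrees): goniometer vs. x-ray
goni = [42, 55, 61, 70, 88]
xray = [40, 57, 60, 73, 90]
print(round(pearson_r(goni, xray), 3))
```

For concurrent validity both lists are collected at the same time; for predictive validity the second list is the future outcome.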

16
Q

Construct Validity

A

Used for abstract attributes that are difficult to define, where there is NO “gold standard”
Construct = logical argument/hypothesis about how a given measure should behave (if it is measuring what you think it is measuring)
Researcher constructs argument, then tests that argument via hypothesis testing.
NEVER QUITE FULLY PROVEN

17
Q

Theoretical Models (Framework)

A

helps define a variable by stating its relationship to other variables and phenomena

provides a basis for construct validity testing

e.g.: personal/environmental factors
Fear-Avoidance Model of Pain

18
Q

Construct Validity: Convergent vs. Discriminant

A

Convergent: correlation w/ other established balance measures (positive or negative)

Discriminant: No (or low) correlation w/ measures of attributes other than the one we are testing.
(e.g., if we are testing balance, the measure should not correlate w/ a measure of another attribute like “cognition”)

19
Q

Look at Self-Check on pg 168 of course pack

A

:)

20
Q

Responsiveness

A

Ability of an instrument to detect clinically important change over time.

Essential for outcome measure when you expect to see progress in response to your treatment

21
Q

Things to ask…

A

Time Frame?

Function Measure Has Adequate Responsiveness?

22
Q

How do researchers meas/compare responsiveness?

A

I. Must have intervention &/or pt pop. where change is expected.
II. Must follow pts over sufficient time to see change
III. Calc. responsiveness for a # of outcome measures and compare.

Baseline———->Discharge Meas.s & Global Change Meas.s

23
Q

Global Change Measure /

External Criterion

A

Responsiveness involves detecting change in subjects who have actually changed.

In a sample, many pts improve, but some stay same or get worse.

Authors may use global change meas. to determine change status. (Improved/Same or stable/Worse)

24
Q

GROC

A

Global Rating of Change

A transitional measure

25
Q

Other Transitional Measures

A

How have your symptoms changed since you began treatment?

i. complete recovery
ii. much improved
iii. slightly improved
iv. no change
v. slightly worse
vi. much worse
vii. worse than ever

26
Q

Responsiveness Statistics

&

Advantage

A

Effect Size
Standardized Response Mean (SRM)

Advantage: Both statistics are unit-less, making comparisons b/w outcome measures easier (even if scales are different)

27
Q

Effect Size

A

$$\text{Effect Size} = \frac{\text{Mean Change}}{\text{sd of baseline scores}} = \frac{\text{mean}(\text{Final} - \text{Initial})}{\text{sd of baseline scores}}$$
28
Q

Standardized Response Mean (SRM)

A

$$\text{SRM} = \frac{\text{Mean Change}}{\text{sd of change scores}} = \frac{\text{mean}(\text{Final} - \text{Initial})}{\text{sd of change scores}}$$
29
Q

Interpretation of Cohen’s Effect Sizes

A

Effect Size &/or SRM*

.8 Large
.5 Moderate
.2 Small

*Both normalized (in sd units)
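
Both statistics and the Cohen cutoffs can be sketched together. This is a minimal illustration with made-up scores; the function names are hypothetical, and the "below small" label for values under .2 is an assumption, since the card does not name that range.

```python
from statistics import mean, stdev

def effect_size(initial, final):
    """Mean change divided by the sd of the baseline (initial) scores."""
    change = [f - i for i, f in zip(initial, final)]
    return mean(change) / stdev(initial)

def srm(initial, final):
    """Mean change divided by the sd of the change scores."""
    change = [f - i for i, f in zip(initial, final)]
    return mean(change) / stdev(change)

def cohen_label(value):
    """Map an effect size or SRM onto Cohen's cutoffs (.2 / .5 / .8)."""
    magnitude = abs(value)
    if magnitude >= 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below small"  # assumed label; not named on the card

# Hypothetical baseline and discharge scores
baseline = [10, 20, 30]
discharge = [15, 24, 36]
print(effect_size(baseline, discharge), cohen_label(effect_size(baseline, discharge)))  # 0.5 moderate
print(srm(baseline, discharge), cohen_label(srm(baseline, discharge)))                  # 5.0 large
```

The same data give different values because the denominators differ: the effect size scales change by how spread out pts were at baseline, the SRM by how consistent the change was.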

30
Q

Population

A
  • remember reliability, validity, and responsiveness are specific to a given population
  • can’t assume it will work in a diff. context
  • -> try to reference a study where the population is similar to the pts in your clinical care
31
Q

Practicality

A

Consider

  • time
  • expense
  • setting

May depend on context of your measurement
(research study, clinical practice)