Reliability Versus Validity Lecture Dr Wofford Flashcards
Test-Retest Reliability
- Coefficient: test-retest reliability coefficient
- Common with self-report survey instruments
- i.e., a subject takes an identical test on two different occasions under identical testing conditions
- Considerations:
- Test-retest intervals
- Carryover and testing effects
Very important to talk about test-retest intervals, i.e., how long it has been since the subject last took the test. (For example, I probably could not retake the cognitive tests Dr. Esmat used in her study for a long time, if ever.)
Testing effect: did I do better because I learned the test, or because I actually improved?
Carryover effects work the same way: something about the first administration carries over into the second.
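A minimal sketch of the idea with made-up scores (none of these numbers come from the lecture): measure the same subjects twice under identical conditions and correlate the two administrations. In practice an ICC is often preferred over a plain Pearson r, but the logic is the same.

```python
import numpy as np

# Hypothetical scores: the same 8 subjects take the same test
# on two occasions under identical conditions (made-up numbers).
time1 = np.array([44, 50, 38, 47, 52, 41, 46, 49])
time2 = np.array([45, 49, 40, 46, 53, 42, 45, 50])

# Test-retest reliability is summarized by how well the two
# administrations agree; values near 1 indicate a stable instrument.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest correlation: {r:.2f}")
```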
Internal Consistency
A type of Reliability
- Generally used with questionnaires, written exams, and interviews (more for qualitative research)
- Use correlations among all items in the scale
- Want to see some relationship among the items on an exam, interview, etc., since they should all measure the same attribute
- Reliability coefficient: correlations among the items
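One common way to summarize these inter-item correlations is Cronbach's alpha (the lecture only says "correlations," so take this as an illustrative sketch with made-up questionnaire responses).

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: rows = respondents, columns = items on the same scale."""
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical 5-item questionnaire answered by 6 people (made-up data).
responses = np.array([
    [4, 5, 4, 4, 5],
    [2, 2, 3, 2, 2],
    [5, 5, 4, 5, 5],
    [3, 3, 3, 4, 3],
    [1, 2, 1, 1, 2],
    [4, 4, 5, 4, 4],
])
print(f"Cronbach's alpha: {cronbach_alpha(responses):.2f}")
```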
Criterion-Related Validity
- Most practical and objective approach to validity testing
- Ability of one test to predict results on an external criterion
- High correlation indicates the test is valid based on the external criterion
- External criterion must be valid, reliable, independent, and free from bias (make sure the "gold standard" really is a gold standard before you use it this way)
- May also be called the reference standard or gold standard
- Can be tested using concurrent or predictive validity
ICC
- Reliability coefficient: intraclass correlation coefficient (ICC)
- The reliability coefficient for Rater Reliability tests (both Intra- and Inter- rater)
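The lecture doesn't specify which ICC model is used; as an illustration, here is one common form, the Shrout-Fleiss ICC(2,1) for absolute agreement, computed on a hypothetical ratings matrix (rows = subjects, columns = raters).

```python
import numpy as np

def icc_2_1(x: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, single rater, absolute agreement.
    x: rows = subjects, columns = raters."""
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # subjects
    ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # raters
    ss_err = ((x - x.mean(axis=1, keepdims=True)
                 - x.mean(axis=0, keepdims=True) + grand) ** 2).sum()
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical goniometer readings: 5 subjects each measured by 3 raters.
ratings = np.array([
    [110, 112, 109],
    [ 95,  94,  96],
    [120, 118, 121],
    [102, 104, 103],
    [ 88,  90,  89],
])
print(f"ICC(2,1): {icc_2_1(ratings):.2f}")
```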
Relationship between validity and reliability
- Validity implies that a measurement is relatively free from error
- Inherently means that a valid measurement is also reliable
- A test can be reliable, but not valid
- A test cannot be valid, but not reliable
What is the validity counterpart to internal reliability?
Content Validity
Random Errors
Random errors: Due to chance and can affect scores in unpredictable ways
- Decrease random errors= increase reliability
- Reliability focuses on amount of random error a measurement has
- Example: fatigue
Three main Types of Reliability
- Test-retest reliability: Stability of the measuring instrument
- Rater reliability: Stability of the human observer
- Inter-rater versus Intra-rater
- Internal consistency: extent to which items measure various aspects of the same characteristic and nothing extraneous
- mostly used with questionnaires
Inter-rater reliability
Inter-rater reliability: variation between two or more raters who measure the same subject
- Best if all raters measure a response during one trial
- Ensure blinding of other assessors
- Reliability coefficient: intraclass correlation coefficient (ICC)
Three Sources of Measurement Error
- Individual taking the measurements
- Called tester or rater reliability
- Measuring instrument introduces error
- Variability of the measured characteristic
Validity: Convergence and Discrimination
Convergent validity: two measures believed to reflect the same underlying phenomenon will have similar results or correlate highly
- Implies that the theoretical context behind the construct will be supported when the test is administered to different groups in different places at different times
Discriminant validity: different results (low correlations) are expected from measures that are believed to assess different characteristics (i.e., two measures of different things should not correlate highly)
Construct validity is related to both convergent and discriminant (divergent) validity.
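A toy sketch with simulated data (all names and numbers are hypothetical): two measures assumed to tap the same construct should correlate highly (convergent evidence), while a measure of an unrelated construct should not (discriminant evidence).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two balance measures should tap the same construct,
# while a grip-strength measure should reflect something different.
true_balance   = rng.normal(50, 10, size=40)
balance_test_a = true_balance + rng.normal(0, 3, size=40)
balance_test_b = true_balance + rng.normal(0, 3, size=40)
grip_strength  = rng.normal(30, 8, size=40)   # unrelated construct

# Convergent validity: same-construct measures correlate highly.
print("A vs B   :", round(np.corrcoef(balance_test_a, balance_test_b)[0, 1], 2))
# Discriminant validity: different-construct measures correlate weakly.
print("A vs grip:", round(np.corrcoef(balance_test_a, grip_strength)[0, 1], 2))
```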
Four Types of Measurement Validity
- Face validity
- Content validity
- Criterion-related validity
- Concurrent validity
- Predictive validity
- Construct validity
Systematic Errors
Systematic Errors: predictable errors of measurement
- Consistently overestimates or underestimates the true score
- Constant and biased
- More of a problem with validity than reliability
(Systematic error is a reliable error: for example, an uncalibrated scale is off by the same amount each time you use it. It causes problems with validity but not reliability.)
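A quick simulation of the miscalibrated-scale example (made-up numbers): adding a constant bias leaves repeated readings in near-perfect agreement (reliability intact), even though every reading is off by the same amount (validity compromised).

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical true body weights (kg) for 30 people.
true_weight = rng.normal(70, 12, size=30)
bias = 2.5   # systematic error: the scale always reads 2.5 kg heavy

# Two weigh-ins on the miscalibrated scale, with only tiny random error.
day1 = true_weight + bias + rng.normal(0, 0.2, size=30)
day2 = true_weight + bias + rng.normal(0, 0.2, size=30)

# Reliability is fine: repeated readings agree almost perfectly.
print("day1 vs day2 correlation:", round(np.corrcoef(day1, day2)[0, 1], 3))
# Validity is not: every reading is about 2.5 kg above the true score.
print("mean offset from truth:", round((day1 - true_weight).mean(), 2), "kg")
```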
Predictive Validity
- Establishes that the outcome of a test can be used to predict future score or outcome
- e.g., GPA used to predict success in PT school, or the Berg balance test used to predict falls
- Criterion and target test are tested at different times
Target test = the new test whose validity has not yet been established
Responsiveness to Change
- Responsiveness: ability of an instrument to detect minimal change over time
- Used to assess the effectiveness of interventions
- Minimal clinically important difference (MCID): smallest difference in a measured variable that signifies an important difference in a subject’s outcome
- Statistical versus clinically meaningful change
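A small sketch of the statistical-versus-clinical distinction, using hypothetical pre/post outcome scores and an assumed MCID of 5 points (the actual MCID depends on the instrument and the population).

```python
import numpy as np

# Hypothetical pre/post outcome scores for 8 patients; MCID of 5 points
# is an assumed value for illustration only.
pre  = np.array([30, 42, 25, 38, 45, 33, 40, 28])
post = np.array([36, 44, 33, 41, 47, 40, 45, 31])
MCID = 5

change = post - pre
# Group-level (statistical) change vs. how many patients reached a
# clinically meaningful improvement.
print("mean change:", change.mean())
print("patients at/above MCID:", int((change >= MCID).sum()), "of", len(change))
```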