8.1 Reliability and Validity Flashcards
Reliability
- Ability of an instrument to measure the attributes of a variable or construct consistently
- Consistently good and able to be trusted
(CONSISTENCY)
Validity
- Extent to which an instrument measures the attributes of a concept accurately
(ACCURACY)
Example
Valid and reliable means all arrows point at the middle of the dartboard
Reliable but not valid means the arrows are still clustered together, but away from the middle of the dartboard
Measurement Error
- In the real world, virtually all scores contain some error of measurement
- Actual scores are always different from hypothetical true scores.
Variability (Differences)
- Differences in scores that are attributable to error
Random Chance Errors
- Affects RELIABILITY
- Errors that are difficult to control at the time of testing; they are unsystematic in nature, occur as a result of state or transient characteristics, and are often beyond the awareness and control of the researcher.
Example
- Subjects are tired or hungry
- The environment is noisy
Systematic (Constant) Errors
- Affects Validity
- Errors attributable to stable characteristics in a study that might bias behavior or cause incorrect instrument calibration
Example
- Scale is not calibrated correctly and adds 2 pounds to all subject weights.
Validity
- The greater the validity of an instrument (it measures what it is intended to measure), the more confidence you can have that it will answer a research question or hypothesis
- Validity is rarely reported in articles.
Statistical Articles
- Statistical procedures are used to establish reliability
- Validity is established with use of panel of experts or examination of current literature
Examples
- Usually articles will just state that an instrument is valid and reliable for use in a given subject population
- An exception is when researchers are performing initial psychometric testing on a newly developed instrument. More detail is required here.
Categories of Validity
Content (Face) Validity
Criterion Related Validity
Construct Validity
Video
- You can have reliability without validity, but you cannot have validity without reliability
Content Validity
- Used when you want to know if a sample truly reflects an entire universe of possibilities on a certain topic.
Example
“I want to design a test to measure my students’ ability in statistics”
- You want to make sure the test covers all the content of the topic it is supposed to measure
Criterion Validity
- Used when you want to assess whether a test reflects a set of abilities in a current or future setting.
- The degree to which a subject’s performance on an instrument and the subject’s actual behavior are related
Concurrent Criterion Validity
- Does my test accurately assess the current situation?
(If my professor covered 95% of the course topics, do I score 95% on the final exam?)
Predictive Validity
- Does this test accurately predict future results?
(If you get a 95% on the final exam, will you do well on future exams?)
Construct Validity
Construct - Group of interrelated variables that we care about measuring.
Construct Validity
- Degree to which our scale measures the construct it claims to measure.
- AM I REALLY MEASURING WHAT I AM AIMING TO MEASURE
- Does my scale correlate with actual outcomes
- STRONGEST WAY TO MEASURE VALIDITY
Reliability
- If the same or comparable instruments were used on different occasions to measure a set of behaviors, would similar results be expected?
Attributes of Reliability
Stability
- Does the instrument produce the same results with repeated testing
Homogeneity (Internal Consistency)
- Do all the items on the instrument measure the same concept, variable or characteristic
Equivalence
- Does the instrument produce the same results when equivalent or parallel instruments or procedures are used?
Categories of Reliability
Test/Re-Test Reliability
- Is the test reliable over time? Correlation between how people score the 1st time they take the test and the 2nd time they take the test. (If scores differ greatly between the 1st and 2nd test, the instrument is not reliable.) See the sketch below.
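A minimal sketch of how test-retest reliability could be checked, assuming hypothetical scores for the same subjects at two points in time (the data and variable names are illustrative, not from the flashcards):

```python
import numpy as np

# Hypothetical scores for the same 6 subjects at time 1 and time 2
time1 = np.array([72, 85, 90, 64, 78, 88])
time2 = np.array([70, 83, 92, 66, 75, 90])

# Test-retest reliability: Pearson correlation between the two administrations
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest correlation: {r:.2f}")  # close to 1.0 suggests stability over time
```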
Parallel Forms Reliability
- Examine the equivalence between 2 forms of the same test. Give someone Form A and later give Form B. These are 2 different forms (unlike test-retest reliability, which repeats the same form) and you see how similar the results are between the 2.
Interrater Reliability
- Used when you want to know how much 2 raters agree in their judgment of an outcome of interest. During observation, you have multiple people observing the same thing in case something is missed by just 1 person. Do the people who are observing agree about what is happening? A worked sketch follows.
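A minimal sketch of how interrater agreement could be quantified, assuming two hypothetical observers rating the same events; percent agreement and Cohen's kappa (which corrects agreement for chance) are shown as illustrative statistics:

```python
import numpy as np

# Hypothetical judgments from two observers rating the same 8 events
# (1 = behavior observed, 0 = not observed)
rater_a = np.array([1, 0, 1, 1, 0, 1, 0, 1])
rater_b = np.array([1, 0, 1, 0, 0, 1, 0, 1])

# Simple percent agreement: share of events where the raters match
p_o = np.mean(rater_a == rater_b)

# Cohen's kappa: agreement corrected for the agreement expected by chance
p_yes = rater_a.mean() * rater_b.mean()
p_no = (1 - rater_a.mean()) * (1 - rater_b.mean())
p_e = p_yes + p_no
kappa = (p_o - p_e) / (1 - p_e)

print(f"Percent agreement: {p_o:.2f}, Cohen's kappa: {kappa:.2f}")
```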
Internal Consistency
- Whether items on a test or scale are consistent with each other (they measure 1 and only 1 thing). Measures the reliability of a scale.
Example
- An anxiety test whose questions measure both anxiety and depression does not have good internal consistency. You want all questions to measure anxiety.
Reliability Coefficient
- Measures the consistency of scoring
- The reliability coefficient must be greater than 0.70 to be considered reliable and above 0.90 to be considered acceptable for a clinical instrument.
Cronbach’s alpha is used to calculate the reliability coefficient (a sketch of the calculation follows)
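A minimal sketch of the Cronbach's alpha calculation, assuming a hypothetical 4-item scale answered by 5 subjects (the cronbach_alpha helper and the data are illustrative, not a standard library function):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an item-score matrix (rows = subjects, columns = items)."""
    k = items.shape[1]                           # number of items
    item_vars = items.var(axis=0, ddof=1).sum()  # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)    # variance of subjects' total scores
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical responses of 5 subjects to a 4-item anxiety scale (1-5 Likert)
scores = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
])
print(f"Cronbach's alpha: {cronbach_alpha(scores):.2f}")  # > 0.70 suggests a reliable scale
```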
Questions
- Simply stating that a tool is valid is not considered reporting validity data for an assessment tool
- Reliability, however, can be seen in an article when the Cronbach’s alpha coefficient of the tool is reported (example: 0.86)