Quanti - Reliability and Validity Flashcards
Trustworthiness strategies in quantitative research
(trustworthiness criterion - corresponding quantitative test)
truth value - internal validity
applicability - external validity
consistency - reliability
neutrality - objectivity
What is measurement error?
Distortion in measurement related to the effects of extraneous factors.
What is the obtained score and how do you calculate it?
Obtained score = True score ± Error
Obtained score: an actual data value for a participant (e.g. participant’s score on an anxiety scale)
True score: the score that would be obtained with an infallible measure (the component attributable to the variable being measured)
Error: caused by extraneous factors that distort measurement
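The decomposition above can be sketched in a short simulation (all numbers hypothetical): each participant has a true score, the instrument adds random error, and the obtained score is their sum.

```python
import numpy as np

rng = np.random.default_rng(0)

# True scores: what an infallible anxiety measure would report.
true_scores = rng.normal(loc=50, scale=10, size=1000)
# Error: random distortion from extraneous factors, independent of the true score.
error = rng.normal(loc=0, scale=4, size=1000)
# Obtained score = True score + Error
obtained = true_scores + error

# Because the error is independent of the true score, the variances add
# (approximately, in a finite sample):
print(obtained.var(), true_scores.var() + error.var())
```

This also illustrates why error inflates the spread of obtained scores beyond the spread of true scores.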
What are the common factors that lead to error of measurement?
Situational contaminants
Response-set biases
Transitory personal factors
Error of measurement - situational contaminants
Scores can be affected by the conditions under which they are produced.
- the friendliness of researchers
- the location of the data gathering
- environmental factors:
e.g. temperature, lighting, time of day
Error of measurement - response-set biases
Potential problems in self-report measures, particularly in psychological scales: a temporary reaction to a situational demand.
- Social desirability bias, e.g. when public disclosure is expected
- Acquiescence response, e.g. under time pressure
- Extreme responses, e.g. selecting the extreme answer options almost exclusively
What is social desirability bias (SDB)?
In self-reports, people will often report inaccurately on sensitive topics in order to present themselves in the best possible light
What is an acquiescence response?
A tendency to agree ("yea-sayers") or disagree ("nay-sayers") with all statements, regardless of their content.
Error of measurement - transitory personal factors
A person’s score can be influenced by such temporary personal states as fatigue, hunger, anxiety, or mood.
In some cases, such factors directly affect the measurement, e.g. when anxiety affects a pulse rate measurement.
How do we reduce the error of measurement?
Train interviewers thoroughly so that they do not inadvertently introduce errors.
Collect data at a similar place and time.
Ensure that questionnaires are anonymous.
Ensure that participants are given enough time.
Ensure that participants are mentally and physically ready for the assessment.
The interviewer should check whether participants are in an unusual mood or state.
What is reliability?
How CONSISTENTLY a data collection instrument measures the variable that it is supposed to measure.
The consistency with which an instrument measures the target attribute.
What is validity?
Whether the data collection instrument measures the variable that it is supposed to measure.
Using an anxiety scale to measure pain would be INVALID.
What are the 3 aspects of reliability?
Stability
Internal consistency
Equivalence
What is stability?
The extent to which scores are similar on two separate administrations of an instrument.
Assessed through the TEST-RETEST RELIABILITY procedure: the measurement is repeated and the two sets of values are compared.
What is the reliability coefficient (r)?
A numeric index that quantifies an instrument’s reliability can be computed.
<0.7 - unsatisfactory
0.7-0.8 - acceptable
>0.8 - desirable
What is the most popular statistical procedure used?
Intraclass Correlation Coefficient (ICC)
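As a concrete illustration, an ICC for test-retest data can be computed from a two-way ANOVA decomposition. This is a minimal sketch of the standard ICC(2,1) (absolute agreement) formula, not a validated statistical routine; the example scores are hypothetical.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1), absolute agreement: x is an (n subjects, k occasions) matrix."""
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between occasions
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols                 # residual
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical test-retest scores: rows are participants, columns occasions.
scores = [[10, 11], [14, 13], [20, 21], [8, 8], [16, 17]]
print(round(icc_2_1(scores), 2))
```

In practice a statistics package would be used; this shows where the number comes from.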
What is internal consistency?
- The extent that all the subparts of the instrument measure the same trait.
- Appropriate for most multi-item instruments.
- Evaluated by administering instrument on one occasion.
- The most widely used reliability approach
How do we evaluate internal consistency?
Cronbach’s alpha
range from 0.00-1.00
0.7-0.9 - acceptable
too low - items are measuring different traits
too high - redundancy of items
What is Cronbach’s alpha and what does it indicate?
Alpha indicates how well a group of items together measure the trait of interest (internal consistency).
If all items on a test measure the same underlying dimension, then the items will be highly correlated with all other items.
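The alpha formula can be sketched directly from item-level data. This is a minimal illustration with hypothetical Likert responses; real analyses normally use a stats package.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n respondents, k items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()     # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)       # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

# Hypothetical: 5 respondents answering a 3-item scale (1-5 Likert).
scores = [[4, 5, 4], [2, 2, 3], [5, 4, 5], [1, 2, 1], [3, 3, 3]]
print(round(cronbach_alpha(scores), 2))
```

When items are highly correlated, the total-score variance dominates the summed item variances and alpha approaches 1.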
What is equivalence?
Concerns the degree to which two or more independent observers or coders agree about the scoring on an instrument.
Assessed by comparing observations or ratings of two or more observers.
A high level of agreement between the raters indicates a good equivalence of the instrument.
How is equivalence assessed?
Assessed through the inter-rater (interobserver) reliability procedure: two or more trained observers/coders watch an event simultaneously and independently record data according to the instrument's instructions.
An index of agreement is calculated.
* Cohen’s Kappa (κ) is used to measure inter-rater reliability for categorical outcomes (κ ≥ 0.6 is generally considered acceptable)
* Intraclass Correlation Coefficient (ICC) is used to measure inter-rater reliability for continuous measures (≥ 0.7)
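The kappa calculation can be sketched from two raters' codes: observed agreement corrected for the agreement expected by chance. The ratings below are hypothetical.

```python
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    ra = np.asarray(rater_a)
    rb = np.asarray(rater_b)
    categories = np.union1d(ra, rb)
    p_o = np.mean(ra == rb)                                 # observed agreement
    p_e = sum(np.mean(ra == c) * np.mean(rb == c)           # chance agreement
              for c in categories)
    return (p_o - p_e) / (1 - p_e)

# Two observers independently coding the same 10 events (categories 0/1):
a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
b = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]
print(round(cohens_kappa(a, b), 2))
```

Here the raters agree on 9 of 10 events, but kappa is lower than 0.9 because some agreement would occur by chance alone.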
What are major aspects of validity?
Content validity
Criterion-related validity
What is content validity?
Concerns the degree to which an instrument has an appropriate sample of items for the construct being measured.
Adequacy of content of the instrument in providing full coverage of the concepts of interest.
How do we assess content validity of an instrument?
An instrument’s content validity is necessarily based on judgment by an expert panel.
Have experts rate items on a four-point scale:
1 = not relevant
2 = somewhat relevant
3 = relevant
4 = very relevant
A formal content validity index (CVI) is computed from the experts’ ratings of each item’s relevance.
The CVI for the instrument is the proportion of items rated either 3 or 4. A CVI of 0.90 or higher indicates good content validity.
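Following the definition above (proportion of items rated 3 or 4), the CVI is a one-line calculation; the item ratings here are hypothetical.

```python
def cvi(ratings):
    """Content validity index: ratings is one relevance rating (1-4) per item."""
    relevant = [r for r in ratings if r >= 3]   # items rated 3 or 4
    return len(relevant) / len(ratings)

print(cvi([4, 3, 4, 2, 4, 3, 4, 4, 3, 4]))  # 9 of 10 items rated 3 or 4 -> 0.9
```

Note that published CVI procedures often also average agreement across multiple experts per item; this sketch follows the simpler definition given above.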
What is criterion-related validity?
The extent to which an instrument corresponds to some external criterion of the variable of interest.
External criterion:
* A gold standard or well-established valid measure for the variable of interest.
What is concurrent criterion-related validity?
The instrument and a criterion measure capture the same behaviour at the same time.
Administer the tested instrument together with the criterion measure.
e.g. If tested instrument: Cardiac Depression Scale (CDS)
Use Beck Depression Scale (as an external criterion) to measure depression level of subjects
Concurrent validity = the correlation between the CDS and the BDS; correlation coefficients (r) of 0.70 or higher are desirable.
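The concurrent-validity check above is just a correlation between the two sets of scores administered together; the scores below are hypothetical.

```python
import numpy as np

# Hypothetical paired scores from the same subjects at the same time:
cds = [12, 25, 31, 18, 40, 22, 35, 15]        # tested instrument
criterion = [10, 27, 30, 20, 38, 21, 33, 17]  # established criterion measure

r = np.corrcoef(cds, criterion)[0, 1]  # Pearson correlation
print(round(r, 2))  # r >= 0.70 would support concurrent validity
```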
What is predictive criterion-related validity?
To predict subjects’ responses in the future
Criterion measure is used to assess subjects’ response in the future.
e.g. If tested instrument: Risk for malnutrition development
step 1: rate subjects on the tested instrument
step 2: follow-up subjects for the development of malnutrition
step 3: predictive validity is the correlation between instrument score and development of malnutrition (BMI)
step 4: correlation coefficients (r) of 0.70 or higher are desirable
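The steps above can be sketched with hypothetical data: baseline risk scores are correlated with a later binary outcome. With a binary criterion, Pearson's r reduces to the point-biserial correlation.

```python
import numpy as np

# Hypothetical data: baseline risk scores and follow-up outcome
# (1 = developed malnutrition, 0 = did not).
risk_score = [2, 3, 1, 5, 6, 4, 7, 8]
developed = [0, 0, 0, 1, 1, 0, 1, 1]

# Pearson r with a binary variable = point-biserial correlation.
r = np.corrcoef(risk_score, developed)[0, 1]
print(round(r, 2))
```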
What is the difference between predictive and concurrent validity?
Difference lies in the timing of obtaining measurements on a criterion.
Predictive validity: obtains measurements in the future (future criterion)
Concurrent validity: obtains measurements at the same time (present criterion)
What is internal validity?
The extent to which observed effects can be attributed to the independent variable rather than to extraneous factors.
What is external validity?
Generalisability of the research findings