Ch. 12: Critical Review of Tests Flashcards
Questions to consider: Broad issues
1. What is the original purpose of the test?
2. What is the specific goal of administering the test?
3. Is the test being used in the manner in which it was intended?
4. For the purpose for which it was intended?
5. In the population for which it was initially designed and validated?
Questions to consider: More granular
1. Reliability
2. Construct validity of the test
3. Structural validity
4. Discriminability
5. Difficulty of questions
Impact on test scores and indices of performance (statistics)
Measurement statistics: mean, standard deviation, correlation, effect size, sensitivity, specificity; other statistics: Cronbach’s alpha, Kappa, Eigenvalues, Factor loadings
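Two of the statistics listed above can be sketched directly. The following is a minimal illustration, not from the chapter; the function names and the data used below are invented:

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Internal consistency: item_scores is one list per item,
    aligned across the same examinees."""
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(variance(i) for i in item_scores)
    return (k / (k - 1)) * (1 - item_var / variance(totals))

def sensitivity_specificity(results):
    """results: (test_positive, truly_disordered) pairs.
    Sensitivity = true positives / all disordered;
    specificity = true negatives / all non-disordered."""
    tp = sum(t and d for t, d in results)
    fn = sum((not t) and d for t, d in results)
    tn = sum((not t) and (not d) for t, d in results)
    fp = sum(t and (not d) for t, d in results)
    return tp / (tp + fn), tn / (tn + fp)
```

Alpha reflects how consistently the items hang together; sensitivity and specificity describe how well a diagnostic cut-off separates disordered from non-disordered examinees.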
When selecting a test for clinical use
- Purpose of the assessment tool is identified
- Tester qualifications are explicitly stated
- Testing procedures are well explained
- Adequate standardization size
- Clearly defined standardization sample
- Evidence of item analysis
- Measures of central tendency
- Convergent validity
- Predictive validity
- Test–retest reliability
When selecting a test for clinical use: Purpose of the assessment tool is identified
1. Diagnose the presence or absence of a disorder?
2. Determine the severity level of a known disorder?
3. Establish treatment goals and/or objectives?
Clinicians need to be aware that an assessment tool might purport to serve a specific purpose but offer no data to substantiate the validity of using the test for that purpose. If information related to the purpose of a test is not provided, the validity of the information collected using that tool may well be compromised.
When selecting a test for clinical use: Tester qualifications are explicitly stated
- Essential to the validity of a test, as any data collected cannot be considered valid if the test is administered and/or interpreted by an unqualified individual (e.g., does the examiner know exactly what to say when a test-taker responds differently than expected?).
When selecting a test for clinical use: Testing procedures are well explained
1. The assessment tool should be administered in a way that matches the presentation of the test to those in the standardization sample.
2. Any differences in how standardized assessment tools are administered yield scores that cannot be reliably compared to the normative sample.
3. The quality of the data collected can be compromised, rendering test scores unusable for the purpose(s) they are intended to fulfill.
When selecting a test for clinical use: Adequate standardization size
1. Test scores that are compared to larger groups are more stable, and thus can be used more dependably in the clinical decision-making process.
2. Smaller sample sizes can also indicate a less representative sample: with a small group in the standardization pool, it becomes doubtful that all possible subgroups of children (e.g., ethnicity, socioeconomic status) have been included in a satisfactory manner, rendering the assessment tool unusable in many clinical settings.
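The stability point can be made concrete with the standard error of the mean, which shrinks with the square root of the norm-group size (the numbers below are illustrative only, not from the chapter):

```python
from math import sqrt

def standard_error(sd, n):
    """Standard error of the mean: the uncertainty in a group's
    mean score shrinks as the sample size grows."""
    return sd / sqrt(n)

# Same score spread (sd = 15, typical of standard-score scales),
# progressively larger norm groups:
for n in (25, 100, 400, 1600):
    print(n, standard_error(15, n))
```

Quadrupling the norm group halves the standard error, which is why scores compared against larger standardization samples are more dependable.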
When selecting a test for clinical use: Clearly defined standardization sample
- Provide the following information relative to the normative sample: geographic representation, socioeconomic status, and the language status of those in the normative group (typical vs. atypical language skills).
- Information about the sample relative to the diagnostic purposes of the test (i.e., how many people in the sample meet the diagnosis of interest?)
Why would it be a bad idea to administer a test to a person who was not represented in the normative sample?
The resulting scores may not mean anything for the person being tested, because the norms were derived from a different group.
When selecting a test for clinical use: Evidence of item analysis
1. Item analysis is used to maximize both the reliability and the quality of the questions included.
2. It involves looking at the content of individual questions and screening items for inclusion in the assessment tool.
3. It ensures that tests target the skills they purport to measure.
4. The factor structure should support the theory of the construct.
5. Using an assessment tool that fails to report item-analysis data could lead to clinical judgments being made on the basis of poorly constructed test questions.
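Two common item-analysis statistics can be sketched briefly: item difficulty (proportion correct) and the corrected item-total correlation. The 0/1 response data and function names below are invented for illustration:

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Pearson correlation, written out for clarity."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

def item_difficulty(item):
    """Proportion correct; values near 0 or 1 mean too hard / too easy."""
    return sum(item) / len(item)

def corrected_item_total(item, items):
    """Correlation of one item with the total of the *other* items;
    low or negative values flag poorly constructed questions."""
    rest_totals = [sum(vals) for vals in
                   zip(*(i for i in items if i is not item))]
    return pearson(item, rest_totals)
```

Items that nearly everyone passes (or fails), or that correlate poorly with the rest of the test, are candidates for removal during test construction.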
When selecting a test for clinical use: Measures of central tendency
1. Mean and standard deviation of all subtest scores for all groups of the normative sample.
2. These measures are the basis for other scores that are derived for comparison of performance.
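The derivation of comparison scores from the normative mean and SD is simple to sketch. The example assumes a common standard-score metric (mean 100, SD 15); the raw-score values are made up:

```python
def standard_score(raw, norm_mean, norm_sd, scale_mean=100, scale_sd=15):
    """Place a raw score on a standard-score scale using the norm
    group's mean and SD (compute a z-score, then rescale it)."""
    z = (raw - norm_mean) / norm_sd
    return scale_mean + scale_sd * z

# A raw score of 34 against a norm mean of 40 (SD 6) sits
# one SD below average:
print(standard_score(34, 40, 6))   # 85.0
```

This is why the normative mean and SD must be reported for every group in the sample: without them, no derived comparison score can be computed.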
When selecting a test for clinical use: Convergent validity
1. Evidence demonstrating a correlation between results obtained from the test in question and from other, similar assessment tools in indicating the presence or absence of the disorder.
2. Results from a given assessment tool are more likely to be valid if a tool that assesses a similar construct has yielded analogous results.
When selecting a test for clinical use: Predictive validity
1. Provide evidence that performance on a given test is predictive of performance observed in a more functional setting, through direct observation or a gold-standard interview.
2. An absence of predictive validity leads to uncertainty as to how assessment tools and real-life tasks can be compared.
3. Further, decisions related to intervention planning could be compromised by a lack of reliability in test scores collected from such instruments.
When selecting a test for clinical use: Test–retest reliability
1. Ensures that scores attained on a given test are stable over time.
2. The time interval should be chosen based on the construct: is the construct supposed to be stable over a week, a month, a year?
3. When relevant, inter-examiner reliability ensures that test scores do not fluctuate when different clinicians administer the test battery.
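Both reliability checks reduce to familiar statistics: a correlation between time-1 and time-2 scores for test-retest, and Cohen's kappa for inter-examiner agreement on categorical decisions. A sketch of kappa with invented ratings:

```python
def cohen_kappa(ratings_a, ratings_b):
    """Agreement between two examiners, corrected for the agreement
    expected by chance alone."""
    n = len(ratings_a)
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    categories = set(ratings_a) | set(ratings_b)
    p_chance = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Two clinicians score the same four examinees pass/fail:
clinician_1 = ["pass", "pass", "fail", "fail"]
clinician_2 = ["pass", "pass", "fail", "pass"]
print(cohen_kappa(clinician_1, clinician_2))   # 0.5
```

Kappa of 1.0 is perfect agreement and 0 is chance-level; the 0.5 here reflects one disagreement out of four cases after the chance correction.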