Chapter 6: Psychometrics, Test Design, and Essential Statistics Flashcards
When performance on a test is not normally distributed but there is some variance, what is the best way to interpret the test result using normative data?
A) z score
B) T score
C) cut-off
D) percentile
D - percentile
Standardized scores that assume a normal distribution are not appropriate when the normative data are not normally distributed, as occurs when most people do relatively well on the test. Thus, interpreting either a z or a T score would be inappropriate.
Although a cut-score could be used, dichotomizing the sample into intact or impaired loses some of the critical measurement meaning of the score, especially when scores fall very close to the cut-off.
Thus, use of a percentile distribution is most appropriate b/c this at least tells the clinician the proportion of the sample that did as well or worse than the patient of interest.
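A minimal sketch of the percentile idea, assuming a small made-up normative sample on a 30-point test where most people score near ceiling. The function name and the data are hypothetical; the calculation simply counts the share of the sample scoring at or below the patient.

```python
def percentile_rank(score, normative_scores):
    """Percent of the normative sample scoring at or below `score`."""
    at_or_below = sum(1 for s in normative_scores if s <= score)
    return 100.0 * at_or_below / len(normative_scores)

# Hypothetical, negatively skewed normative sample (most people near ceiling).
norms = [30, 30, 30, 29, 30, 28, 30, 29, 30, 27,
         30, 29, 30, 30, 25, 30, 29, 30, 28, 30]

# A patient scoring 27 did as well as or better than only 2 of 20 people.
patient_pr = percentile_rank(27, norms)
print(patient_pr)  # 10.0
```

No normality assumption is needed here, which is why the percentile remains interpretable when a z or T score would not be.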
An individual is given a battery of tests with at least three tests in each of five cognitive domains. He performs in the mildly impaired range on one test in each of two separate cognitive domains. How do you interpret this pattern of performance?
A) The pt is clearly impaired in two important cognitive domains; I diagnose him accordingly and provide treatment recommendations in my report.
B) The pt is essentially intact in almost all cognitive domains; I make no diagnosis and clarify in my report that no treatment is deemed necessary.
C) The pt may be impaired in one or more domain; I need more tests to be sure and will send a request for that in a report to the insurance company.
D) This may be due to normal variability. Unless a disorder is otherwise indicated by history, I make no diagnosis but comment on the variability in my report.
D - This may be due to normal variability.
Unless a disorder is otherwise indicated by history, I make no diagnosis but comment on the variability in my report.
On Test 1, an individual obtains a T score of 65. On Test 2, she obtains a standard score of 115. On Test 3, the individual scores in the 67th percentile. On Test 4, the individual’s score is equivalent to a z score of 0.5. Which of the following is the correct ordering of these scores, from lowest to highest?
A) Test 4, 3, 2, 1
B) Test 3, 4, 2, 1
C) Test 4, 2, 3, 1
D) Test 3, 2, 4, 1
B - Test 3, 4, 2, 1
Converting each score to one common metric allows them to be compared.
Percentiles: Test 1 = 93rd, Test 2 = 84th, Test 3 = 67th, Test 4 = 69th
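The conversion can be sketched as follows, assuming normal distributions, the usual T-score metric (mean 50, SD 10), and that Test 2 is reported on a mean-100, SD-15 metric, which is what the 84th percentile in the answer implies. The variable names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, SD 1

def to_z(score, mean, sd):
    """Express a score in SD units from its scale's mean."""
    return (score - mean) / sd

z_scores = {
    "Test 1": to_z(65, 50, 10),          # T score
    "Test 2": to_z(115, 100, 15),        # assumed mean-100, SD-15 metric
    "Test 3": STD_NORMAL.inv_cdf(0.67),  # 67th percentile
    "Test 4": 0.5,                       # already a z score
}

percentiles = {name: round(100 * STD_NORMAL.cdf(z)) for name, z in z_scores.items()}
ordering = sorted(z_scores, key=z_scores.get)  # lowest to highest

print(percentiles)
print(ordering)
```

Sorting on the common metric gives Test 3, Test 4, Test 2, Test 1, matching answer B.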
The concept of regression to the mean is best expressed by which of the following?
A) The apple doesn’t fall all that far from the tree.
B) Highly intelligent ppl have even smarter children.
C) Ppl of superior intelligence are likely to have high average children.
D) Ppl who are low average are at high risk of having impaired children.
C - People of superior intelligence are likely to have high average children
Answers B & D - imply regression AWAY from the mean
Answer A - indicates no systematic expected change in predictions with repeated measurement.
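As a rough illustration, regression to the mean can be written as a simple linear prediction in which the predicted score is pulled toward the population mean in proportion to the correlation. The parent-child correlation of 0.5 below is a hypothetical value chosen for the example, not a figure from the chapter.

```python
def predicted_score(parent_score, correlation, mean=100.0):
    """Predicted child score: the mean plus r times the parent's deviation."""
    return mean + correlation * (parent_score - mean)

# Hypothetical correlation of 0.5 between parent and child scores.
superior_parent = predicted_score(125, 0.5)     # 112.5, closer to 100 than 125
low_average_parent = predicted_score(85, 0.5)   # 92.5, also closer to 100

print(superior_parent, low_average_parent)
```

In both directions the prediction moves toward the mean, never away from it, which is why answers B and D are wrong.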
You desire to calculate a confidence interval around an obtained test score to determine how confident you are that the obtained score reflects the person’s true ability in this domain. For the most appropriate estimate of the confidence interval, you should use the following in your calculation:
A) Validity coefficient for the test.
B) Standard error of the mean.
C) Standard error of the estimate.
D) Standard error of the measurement.
C - Standard error of the estimate
As stated in the section on SEM/SEE, the SEE is based on the obtained score, requires no knowledge of the true score, and includes extra consideration of the reliability of the test.
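For reference, a sketch of the two quantities being contrasted, assuming the classical test theory formulas commonly given for them (SEM = SD × sqrt(1 − r) and SEE = SD × sqrt(r × (1 − r))). These formulas and the example values are assumptions for illustration; they are not quoted from the chapter.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SD * sqrt(1 - r_xx)."""
    return sd * math.sqrt(1.0 - reliability)

def see(sd, reliability):
    """Standard error of estimation: SD * sqrt(r_xx * (1 - r_xx))."""
    return sd * math.sqrt(reliability * (1.0 - reliability))

def true_score_interval(obtained, mean, sd, reliability, z=1.96):
    """Interval centered on the estimated true score, using the SEE."""
    estimated_true = mean + reliability * (obtained - mean)
    half_width = z * see(sd, reliability)
    return estimated_true - half_width, estimated_true + half_width

# Hypothetical test: mean 100, SD 15, reliability 0.84, obtained score 80.
low, high = true_score_interval(obtained=80, mean=100, sd=15, reliability=0.84)
print(round(sem(15, 0.84), 2), round(see(15, 0.84), 2), (round(low, 1), round(high, 1)))
```

Under these assumptions the interval is centered on the estimated true score (83.2 here) rather than on the obtained score, and the SEE is always somewhat smaller than the SEM.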
You have created a new test that you want to use clinically, but its psychometric properties are unknown. You want to know about the incremental validity of the test, and you have knowledge of the base rate of the condition that the test was designed to detect. To calculate the incremental validity if you are equally worried about false positives and false negatives, you need to know the test’s _____.
A) overall hit rate
B) sensitivity
C) specificity
D) positive predictive value
A - Overall hit rate
Although positive predictive value could be used to calculate incremental validity, it is only useful if we are interested in our test’s incremental ability to make a positive diagnosis, not as an indicator of overall diagnostic accuracy.
Thus, the overall hit rate is the best choice in the situation b/c we are interested in both yes and no decisions based on the test (both PPV and NPV).
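One way to sketch this, assuming incremental validity is taken as the gain in overall accuracy over simply predicting the more common outcome from the base rate alone. That definition and the example numbers are assumptions for illustration.

```python
def overall_hit_rate(base_rate, sensitivity, specificity):
    """Proportion of all cases classified correctly (true pos. + true neg.)."""
    true_positives = base_rate * sensitivity
    true_negatives = (1.0 - base_rate) * specificity
    return true_positives + true_negatives

def incremental_validity(base_rate, sensitivity, specificity):
    """Gain in accuracy over always guessing the more common outcome."""
    base_rate_accuracy = max(base_rate, 1.0 - base_rate)
    return overall_hit_rate(base_rate, sensitivity, specificity) - base_rate_accuracy

# Hypothetical test: base rate 20%, sensitivity 0.80, specificity 0.90.
hit = overall_hit_rate(0.20, 0.80, 0.90)       # about 0.88
gain = incremental_validity(0.20, 0.80, 0.90)  # about 0.08
print(round(hit, 2), round(gain, 2))
```

Because the hit rate counts both correct "yes" and correct "no" decisions, it weights false positives and false negatives equally, as the question requires.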
Which of the following will NOT affect the reliability of an intra-individual difference score (i.e., the reliability of difference in performance between two tests within one individual)?
A) reliability of test one
B) correlation between the tests
C) variance of the distribution of the difference scores
D) actual difference between the two scores
D - Actual difference between the two scores
The reliability of the tests, the association between them, and the error variance are all included in the formula required to calculate the reliable change interval. The actual difference between the scores is only a point of reference that is compared to the calculated interval.
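A sketch using the standard classical test theory formula for the reliability of a difference score. The formula is supplied here as an assumption (it is not quoted from the chapter), and the input values are hypothetical. Note that the observed difference between the two scores is not an input anywhere.

```python
def difference_score_reliability(r_xx, r_yy, r_xy, sd_x, sd_y):
    """Reliability of D = X - Y under classical test theory.

    Numerator: true-score variance of the difference.
    Denominator: observed variance of the difference scores.
    """
    covariance_term = 2.0 * r_xy * sd_x * sd_y
    true_variance = (sd_x ** 2) * r_xx + (sd_y ** 2) * r_yy - covariance_term
    observed_variance = sd_x ** 2 + sd_y ** 2 - covariance_term
    return true_variance / observed_variance

# Hypothetical tests: reliabilities 0.90 and 0.80, intercorrelation 0.60, SDs of 10.
r_dd = difference_score_reliability(0.90, 0.80, 0.60, 10.0, 10.0)
print(round(r_dd, 3))  # 0.625
```

Raising the correlation between the tests lowers the reliability of the difference, while the size of the particular difference observed plays no role.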
Which of the following is not a well-validated clinical use of regression?
A) prediction of premorbid ability level
B) prediction of membership in a clinical group
C) prediction of performance in one domain based on performance in others
D) prediction of future test performance based on past test performance
C - prediction of performance in one domain based on performances in others.
Answer C is INCORRECT b/c performance within each domain is believed to be relatively independent from other domains and there is little basis for prediction of performance in one domain on the basis of others.
Two individuals are administered the same test. Person 1 scores in the 48th percentile; Person 2 scores in the 93rd %ile. It is later found that there was an error in scoring of the test on these two administrations only, and 3 points are then added to each person’s score. Given this information, which of the following is true?
A) Both percentile ranks will increase by the same amount.
B) Person 1’s percentile rank will increase more than Person 2’s
C) Person 2’s percentile rank will increase more than Person 1’s
D) Neither percentile rank will change
B - Person 1’s percentile rank will increase more than Person 2’s
Both individuals’ scores will change in reference to the normative group b/c of this addition. However, b/c of the assumption of a normal distribution, score differences in the middle of the distribution of percentiles are exaggerated compared to those at the extremes. Thus, changing a raw score by 3 points will have a larger influence on the percentile ranking close to the middle of the distribution.
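The effect can be checked numerically. The sketch below assumes a normal normative distribution with a hypothetical mean of 50 and SD of 10 (the card does not give these), converts each starting percentile back to a raw score, adds 3 points, and re-reads the percentile.

```python
from statistics import NormalDist

# Hypothetical normative distribution; the mean and SD are illustrative only.
norms = NormalDist(mu=50, sigma=10)

def percentile_gain(start_percentile, points_added):
    """Change in percentile rank after adding raw-score points."""
    raw = norms.inv_cdf(start_percentile / 100)
    new_percentile = 100 * norms.cdf(raw + points_added)
    return new_percentile - start_percentile

gain_person_1 = percentile_gain(48, 3)  # near the middle of the distribution
gain_person_2 = percentile_gain(93, 3)  # out in the upper tail
print(round(gain_person_1, 1), round(gain_person_2, 1))
```

With these assumed norms, Person 1 gains roughly 12 percentile points and Person 2 roughly 3, because far more of the normative sample is packed into a 3-point band near the mean than in the tail.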
Assuming a normal distribution, how many people would score between 600 and 900 on a standardized test with a mean of 750 and a standard deviation of 150 (N = 1000)?
A) 840
B) 680
C) 640
D) 720
B - 680
600 falls 1 SD below the mean; 900 falls 1 SD above the mean.
68% of a normally distributed sample falls between +/- 1 SD from the mean.
Thus, in a sample of 1000, 680 (68%) fall +/-1 SD from the mean.
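The same figure can be obtained from the normal cumulative distribution function. This is a quick check using Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

test_scores = NormalDist(mu=750, sigma=150)

# Proportion of the distribution between 600 (-1 SD) and 900 (+1 SD).
proportion = test_scores.cdf(900) - test_scores.cdf(600)
expected_count = 1000 * proportion

print(round(proportion, 4), round(expected_count))  # 0.6827 683
```

The exact expectation is about 683 people; the 68% rule of thumb rounds this to 680, which is the closest answer choice.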
TRUE or FALSE
A finding of one or more impaired scores in a relatively large battery is RELATIVELY COMMON in normative samples without neurological impairment.
TRUE
Unless the findings fit a profile that is consistent with an impaired domain or expected impairment based on medical history/presumed etiology (e.g., variability across scores in ADHD), the findings should not be OVER-INTERPRETED but considered in this light and discussed as possible normal variance in the interpretation section of the report.
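A back-of-the-envelope sketch of why this happens, assuming the tests are independent and that each has the same chance of producing an "impaired" score in a healthy person. Both assumptions are simplifications (real battery scores are correlated), and the 5% rate and 15-test battery are hypothetical.

```python
def prob_at_least_one_impaired(n_tests, p_impaired_per_test):
    """P(at least one low score) if tests were independent."""
    return 1.0 - (1.0 - p_impaired_per_test) ** n_tests

# Hypothetical battery: 15 tests, each flagging 5% of healthy people.
p_any = prob_at_least_one_impaired(15, 0.05)
print(round(p_any, 2))  # 0.54
```

Under these assumptions more than half of healthy examinees would show at least one impaired score, so an isolated low score carries little diagnostic weight on its own.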
DEFINITION:
Alpha
The probability of type I error in making a decision about the tenability of a null hypothesis (“False Positive”)
A measure of a test’s reliability (coefficient alpha) that reflects the internal consistency of the items.
DEFINITION:
Alternate forms
aka parallel tests
Tests constructed to be similar in content, high in reliability, and equivalent.
DEFINITION:
Bayes’ theorem
Probability/statistics theorem employed in decision analysis to allow the posterior probability of an event to be calculated.
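A small worked example of the theorem in a diagnostic setting: the posterior probability of a condition given a positive test. The function name and the numbers are hypothetical.

```python
def posterior_given_positive(base_rate, sensitivity, specificity):
    """P(condition | positive test) via Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)

# Hypothetical values: base rate 10%, sensitivity 0.90, specificity 0.80.
posterior = posterior_given_positive(0.10, 0.90, 0.80)
print(round(posterior, 3))  # 0.333
```

Even with a fairly accurate test, the posterior probability here is only about one in three because the condition is uncommon to begin with.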
DEFINITION:
Beta
The probability of making a Type II error in statistical hypothesis testing (“False Negative”)
DEFINITION:
Central Limit Theorem
If n independent variates have finite variances, then standard expression of their sum will be normally distributed (as n approaches infinity)
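A small simulation can illustrate the idea. The sketch below sums strongly skewed exponential variates and checks what fraction of standardized sums falls within 1 SD of the mean, which should approach the normal value of about 0.68 as more terms are summed. The choice of distribution, sample sizes, and seed are all arbitrary assumptions.

```python
import random

def fraction_within_one_sd(n_terms, n_samples=10_000, seed=0):
    """Share of standardized sums of Exponential(1) variates within +/-1 SD."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        total = sum(rng.expovariate(1.0) for _ in range(n_terms))
        # Exponential(1) has mean 1 and variance 1, so the sum has
        # mean n_terms and SD sqrt(n_terms).
        z = (total - n_terms) / n_terms ** 0.5
        if abs(z) <= 1.0:
            hits += 1
    return hits / n_samples

single = fraction_within_one_sd(1)   # skewed: well above 0.68
summed = fraction_within_one_sd(50)  # close to the normal value of 0.68
print(single, summed)
```

A single exponential variate puts roughly 86% of its mass within 1 SD of its mean, whereas the sum of 50 behaves much more like a normal distribution.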
DEFINITION:
Conditional Probability
The probability of an event or outcome, given that a different event has occurred.
Based on Bayes’ Theorem
DEFINITION:
Confidence Interval
Interval around a statistic (e.g., observed test score, sample mean), usually expressed in SD units or percentages, that reflects the expected sample-to-sample variability
DEFINITION:
Content Validity
Degree to which scores on a measure capture all the aspects of a dimension of interest. Can be demonstrated by parallel validity.
DEFINITION:
Construct Validity
Degree to which scores on a measure support inferences about a dimension of interest. Can be demonstrated via factor analysis or other methods that illustrate convergent and discriminant validity.
DEFINITION:
Descriptive Statistics
Show the main features of the data involving the central tendency, variability, and the shape of the summarized data points
DEFINITION:
Discriminant Analysis
The process of utilizing a score profile to determine whether an individual belongs to one group (i.e., a specific diagnosis) or another (i.e., no diagnosis or a different diagnosis).
Can also be used to describe differences between two or more groups on a set of measures (DESCRIPTIVE discriminant analysis) or to classify subjects into groups on the basis of a set of measures (PREDICTIVE DA)
DEFINITION:
Ecological Validity
The degree to which a measure predicts behavior in everyday situations;
a form of external validity
DEFINITION:
External Validity
Degree to which results from a particular test or measure can be generalized to situations or related to information beyond the test itself (correlation of a measure with another measure of some independent criterion)
DEFINITION:
False Negative
(aka Type II error or beta error)
Error that occurs when a test incorrectly indicates the absence of a particular trait or condition when the trait or condition actually exists.
Funny Ex: Telling an obviously pregnant woman she isn’t pregnant
DEFINITION:
False Positive
(aka type I error or alpha error) Error that occurs when a test incorrectly indicates the presence of a trait or condition when none genuinely exists
Funny Ex: Telling a man he is pregnant
DEFINITION:
Inferential Statistics
Methods used to reach conclusions that extend beyond the immediate data alone to extend to wider samples and conditions
DEFINITION:
Internal consistency
Estimate of the reliability of a measure or score based on the average correlation among items within a test.
The size of the coefficient depends on both the average correlation among the items AND the number of items; represented by coefficient alpha.
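The dependence on both quantities is easy to see in the standardized form of coefficient alpha, which is computed from the number of items and the average inter-item correlation. The sketch below assumes that standardized form; the example values are hypothetical.

```python
def standardized_alpha(n_items, mean_inter_item_r):
    """Standardized coefficient alpha: k * r / (1 + (k - 1) * r)."""
    k, r = n_items, mean_inter_item_r
    return (k * r) / (1.0 + (k - 1) * r)

# Same average inter-item correlation (0.30), different test lengths.
alpha_10_items = standardized_alpha(10, 0.30)  # about 0.81
alpha_20_items = standardized_alpha(20, 0.30)  # about 0.90
print(round(alpha_10_items, 2), round(alpha_20_items, 2))
```

Doubling the number of items raises alpha even though the items are no more strongly related to one another, so a high alpha does not by itself show that the items are highly intercorrelated.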
DEFINITION:
Item characteristic curve
(aka item response function) Shows probability of a correct response as a function of the level of overall performance of the person.
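One common way to model such a curve is a logistic function of ability. The sketch below assumes a two-parameter logistic form with a discrimination parameter and a difficulty parameter; the card does not specify a model, so the form and the parameter values are illustrative.

```python
import math

def item_characteristic_curve(ability, discrimination=1.0, difficulty=0.0):
    """Probability of a correct response under a two-parameter logistic model."""
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

# Probability of a correct response at low, average, and high ability
# for a hypothetical item of average difficulty.
probabilities = [item_characteristic_curve(theta) for theta in (-2.0, 0.0, 2.0)]
print([round(p, 2) for p in probabilities])  # [0.12, 0.5, 0.88]
```

When ability equals the item's difficulty the probability of a correct response is 0.5, and it rises smoothly as ability increases.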