Utility Flashcards
According to the article, what’s wrong with traditional indexes like specificity, sensitivity, PPV, and NPV?
They vary with cutoff scores, base rates, and costs/benefits within a particular context or assessment.
What do the article’s authors encourage readers to do?
Use methods from signal detection theory, like ROC analysis, to evaluate utility.
What happens when cutoff scores change? Which of the following are affected? (Selection Ratio, Base Rate, Sensitivity, Specificity, Positive Predictive Power, Negative Predictive Power)
Only base rate does not change.
What metrics are directly affected by the base rate?
PPV and NPV
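A minimal sketch of why PPV and NPV move with the base rate even when sensitivity and specificity stay fixed. The function name and the example values (0.90 sensitivity and specificity) are hypothetical, chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) from sensitivity, specificity, and base rate."""
    # Express each decision outcome as a proportion of everyone tested.
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)
    ppv = true_pos / (true_pos + false_pos)   # P(has condition | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no condition | negative test)
    return ppv, npv

# Same test, three different (made-up) base rates.
for base_rate in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these made-up numbers, PPV climbs from roughly 0.08 at a 1% base rate to 0.90 at a 50% base rate, although the test itself has not changed.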
What is the ROC-based index independent of?
1) base rate / prevalence
2) cutoff score
3) values of the 4 decision-making outcomes (i.e., TN TP FN FP)
How is ROC analysis used?
Look at the area under the ROC curve (AUC). It shows the discriminative power of an assessment independent of the cutoff value.
What does it mean when AUC = 1?
What about when AUC = 0.5 or less than 0.5?
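When AUC = 1, the test separates the two groups perfectly: at least one cutoff gives 100% sensitivity and 100% specificity.
When AUC = 0.5, it’s no better than guessing.
When AUC < 0.5, the test does worse than guessing because its scores point in the wrong direction; reversing the decision rule would give an AUC above 0.5.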
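One way to see these three cases is to compute AUC as the probability that a randomly chosen person with the condition scores higher than a randomly chosen person without it (ties count as half). This is a small sketch with made-up scores; the function name is hypothetical.

```python
def auc(pos_scores, neg_scores):
    """AUC as P(positive case outscores negative case); ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

print(auc([7, 8, 9], [1, 2, 3]))  # groups fully separated
print(auc([1, 2, 3], [1, 2, 3]))  # groups indistinguishable
print(auc([1, 2, 3], [7, 8, 9]))  # scores point the wrong way
```

No cutoff appears anywhere in the calculation, which is why AUC does not depend on the cutoff score.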
Test Utility
Usefulness or practical value of testing to improve efficiency
3 factors for utility of a test (what makes it have high/low utility?)
1) psychometric soundness (validity + reliability)
2) costs
3) benefits
Does utility require measuring something?
No, there can be utility without measurement
Is utility an empirical or conceptual endeavour?
Empirical
How are costs complicated?
Costs are hard to calculate because some are nonmonetary (e.g., what is the cost of not administering a test and possibly failing to diagnose a disease?).
What general procedure determines costs and benefits of a test?
Utility analysis
Expectancy data
One way of doing utility analyses. Uses predictions of outcomes to judge the benefits of a test.
What does a Taylor-Russell table accomplish?
A type of expectancy data.
Helps to see how much using a test will improve selection over existing method
Naylor-Shine tables
Give the difference in mean criterion scores between the selected and unselected groups, to show what a test adds (the increase in average score).
What is a Taylor-Russell table based on?
Selection ratio, validity estimate (the test’s validity coefficient), and base rate
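A Taylor-Russell table entry is the proportion of selected people expected to succeed, given the test's validity, the selection ratio, and the base rate. The sketch below approximates such an entry numerically. It assumes standardized, bivariate-normal predictor and criterion scores and a validity strictly between -1 and 1; the function name, the integration range, and the step count are illustrative choices, not part of any published table.

```python
from statistics import NormalDist

def taylor_russell(validity, selection_ratio, base_rate, steps=4000):
    """Approximate P(success | selected) under a bivariate-normal model."""
    z = NormalDist()
    x_cut = z.inv_cdf(1 - selection_ratio)  # predictor cutoff as a z-score
    y_cut = z.inv_cdf(1 - base_rate)        # criterion cutoff as a z-score
    resid_sd = (1 - validity ** 2) ** 0.5   # SD of criterion given predictor
    upper = 8.0                             # effectively +infinity for z-scores
    width = (upper - x_cut) / steps
    total = 0.0
    for i in range(steps):
        x = x_cut + (i + 0.5) * width       # midpoint rule
        p_success = 1 - z.cdf((y_cut - validity * x) / resid_sd)
        total += z.pdf(x) * p_success * width
    return total / selection_ratio          # condition on being selected

# Hypothetical scenario: base rate 0.50, selection ratio 0.30.
for r in (0.0, 0.3, 0.6):
    print(f"validity {r:.1f}: {taylor_russell(r, 0.30, 0.50):.2f}")
```

With zero validity the result equals the base rate, so the test adds nothing over the existing method; higher validity or a lower selection ratio raises the success rate among those selected.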
What formula is used when assessing costs and benefits of a test?
Brogden-Cronbach-Gleser formula.
Estimates utility gain based on dollar amount.
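As commonly presented, the formula is: utility gain = (N)(T)(r_xy)(SD_y)(mean Z of selectees) - (N)(C), where N is the number of people, T is average tenure, r_xy is the validity coefficient, SD_y is the standard deviation of job performance in dollars, and C is the cost of testing per applicant. The sketch below is one hedged way to code it; the parameter names are hypothetical, the number selected is kept separate from the number tested as an assumption, and all figures are invented.

```python
def bcg_utility_gain(n_selected, tenure_years, validity, sd_y_dollars,
                     mean_z_selected, n_tested, cost_per_applicant):
    """Estimated dollar gain from using a test for selection."""
    # Benefit term: people hired x tenure x validity x SD of performance
    # in dollars x mean standardized test score of those hired.
    benefit = (n_selected * tenure_years * validity
               * sd_y_dollars * mean_z_selected)
    # Cost term: everyone tested x cost of testing one applicant.
    cost = n_tested * cost_per_applicant
    return benefit - cost

# Invented example: hire 10 of 100 applicants, 2-year tenure, validity 0.40,
# SD of performance $10,000, selectees average z = 1.0, $50 per test.
gain = bcg_utility_gain(10, 2, 0.40, 10_000, 1.0, 100, 50)
print(f"Estimated utility gain: ${gain:,.0f}")
```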
Does a higher selection ratio probably mean more or fewer false positives?
More false positives, fewer false negatives.
Why is the pool of job applicants a practical consideration when determining utility of a test?
Utility analyses often assume an unlimited supply of people who qualify according to the test; in practice the pool of applicants is limited.
Why is the fact that some people won’t accept job offers a practical consideration for utility of a test?
Utility will be overestimated if this is not considered, because some people will decline job offers. Estimates may need to be adjusted downward, by as much as 80%, to be more realistic.
Alternative name for relative cut score
Norm-referenced cut score
Fixed vs. Relative cut scores
A fixed cut score is a specific number; a relative cut score is set with reference to the score distribution (e.g., a percentile).
Multiple hurdle or multistage selection process
What are multiple cut scores, and how do they differ?
Multiple hurdle: a testtaker must meet the cut score on each of several predictors, one stage at a time, to move on.
Multiple cut scores can also refer to several cut points on the same predictor (e.g., A, B, C, D grades each have their own cut score).
Compensatory model of selection
What does this compare with?
A high score on one predictor can balance out a low score on another.
Compared to a multiple hurdle/multistage process.
What statistical tool is used for the compensatory model of selection?
Multiple regression
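To make the contrast concrete, here is a small sketch of both decision rules applied to the same applicant. The weights, intercept, and cut scores are invented; in practice the weights would come from a multiple regression fitted to real data.

```python
def passes_multiple_hurdle(scores, cut_scores):
    """Every predictor must meet its own cut score."""
    return all(s >= c for s, c in zip(scores, cut_scores))

def passes_compensatory(scores, weights, intercept, composite_cut):
    """A weighted composite (regression-style) must meet one cut score."""
    predicted = intercept + sum(w * s for w, s in zip(weights, scores))
    return predicted >= composite_cut

# Hypothetical applicant: strong on predictor 1, weak on predictor 2.
applicant = [90, 40]
print(passes_multiple_hurdle(applicant, [50, 50]))          # False
print(passes_compensatory(applicant, [0.5, 0.5], 0.0, 60))  # True
```

The applicant fails the hurdle model because of the low second score, but passes the compensatory model because the high first score makes up for it.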
Angoff Method
A method to determine cut scores.
Averages experts’ judgments of how a minimally competent testtaker would be expected to perform on each item.
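A minimal sketch of the arithmetic, assuming each judge rates every item with the probability that a minimally competent testtaker answers it correctly. The ratings are made up, and the spread of judge totals is included only as a rough signal of disagreement.

```python
from statistics import mean, stdev

def angoff_cut_score(ratings):
    """ratings[j][i] = judge j's probability estimate for item i."""
    # Each judge's implied cut score is the sum of their item ratings.
    judge_totals = [sum(judge) for judge in ratings]
    # The Angoff cut score is the average of those totals across judges;
    # a large spread suggests the judges disagree.
    return mean(judge_totals), stdev(judge_totals)

# Three hypothetical judges rating a four-item test.
ratings = [
    [0.8, 0.6, 0.5, 0.9],
    [0.7, 0.5, 0.6, 0.8],
    [0.9, 0.7, 0.4, 0.9],
]
cut, spread = angoff_cut_score(ratings)
print(f"cut score = {cut:.2f} items correct, SD across judges = {spread:.2f}")
```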
What can show that Angoff method might not be best?
What are its weaknesses?
Low interrater reliability (if the experts disagree).
Lack of data-driven techniques, more subjective.
Known Groups Method
(alt name)
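Method of contrasting groups. A way to determine cut scores.
Collect scores from groups known to possess and known not to possess the trait or ability of interest, then set the cut score at the point that best separates the two groups.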
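One hedged way to operationalize "the point that best separates the groups" is to pick the score that misclassifies the fewest people. That criterion is an assumption made for this sketch, not the only option, and the scores are invented.

```python
def contrasting_groups_cut(has_trait_scores, lacks_trait_scores):
    """Return (cut score, misclassified count) with the fewest errors."""
    candidates = sorted(set(has_trait_scores) | set(lacks_trait_scores))
    best_cut, best_errors = None, None
    for cut in candidates:
        # People with the trait who fall below the cut are missed.
        false_neg = sum(s < cut for s in has_trait_scores)
        # People without the trait who reach the cut are wrongly passed.
        false_pos = sum(s >= cut for s in lacks_trait_scores)
        errors = false_neg + false_pos
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut, best_errors

has_trait = [62, 70, 75, 81, 90]     # hypothetical "known to possess" group
lacks_trait = [40, 48, 55, 60, 66]   # hypothetical "known not to possess" group
print(contrasting_groups_cut(has_trait, lacks_trait))
```

Swapping in differently chosen groups changes the resulting cut score, which is the practical weakness of the method.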
What’s a problem with the known groups method?
The choice of groups is arbitrary, and the cut score depends on which groups are chosen.
IRT-based methods
A way to determine cutoff scores.
Each item has an assigned level of difficulty. To pass, testtakers must correctly answer items at or above a certain difficulty level.
Item-mapping method
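A way to set a cut score using an IRT-based method.
Items are arranged in a histogram, with each column holding items of similar difficulty. Judges decide whether a minimally competent testtaker would answer the items in a column correctly at least half the time; that difficulty level becomes the cut score.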
Bookmark Method
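A way to set a cut score using an IRT-based method.
Items are presented in a booklet in order of increasing difficulty. Experts place a ‘bookmark’ between the two pages that separate testtakers with the minimum required ability from those without it; that point is the cutoff.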
Drawbacks of IRT-based methods
They rely on expert opinion, can show floor/ceiling effects, and leave open the optimal number of items in the booklet (for the bookmark method).
Method of predictive yield
Thorndike’s method of setting cut scores.
Needs the number of positions to be filled, projections of the likelihood of offer acceptance, and the distribution of applicant scores.
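A simplified sketch of how those three inputs can combine into a cut score: estimate how many offers are needed to fill the positions, then set the cut at the score of the last applicant who would receive an offer. This is an illustration under a single, constant acceptance rate, not Thorndike's full procedure; the names and numbers are hypothetical.

```python
import math

def predictive_yield_cut(applicant_scores, positions, acceptance_rate):
    """Return (cut score, offers needed) to fill the open positions."""
    # If only some offers are accepted, more offers than positions are needed.
    offers_needed = math.ceil(positions / acceptance_rate)
    ranked = sorted(applicant_scores, reverse=True)
    if offers_needed > len(ranked):
        raise ValueError("not enough applicants to fill the positions")
    # The cut score is the score of the lowest-ranked person who gets an offer.
    return ranked[offers_needed - 1], offers_needed

scores = [55, 60, 62, 68, 70, 74, 79, 83, 88, 91]  # hypothetical applicants
print(predictive_yield_cut(scores, positions=3, acceptance_rate=0.5))
```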
Discriminant Analysis
Examines the relationship between identified variables (such as test scores) and two naturally occurring groups. Also called discriminant function analysis.