14 - Bias Flashcards
Bias
Systematic measurement error
Predictive vs Test Structure Bias
Predictive: differences between groups in criterion prediction
- Focuses on total score of the measure (external criterion)
- The steeper the slope (regression), the better the prediction
Test Structure: differences in internal test characteristics between groups
- Focuses on the test itself (internal criterion)
- Can be empirical or theoretical
Prediction Scatterplot
- Predictor (test score) is on x axis
- Moving the cutoff on the predictor (a vertical line) changes the selection rate
- Criterion (job performance) is on y axis
- Moving the cutoff on the criterion (a horizontal line) changes the base rate
- Regression Line (line of best fit)
- Good = steep slope, ideally through the origin (0,0), with few false positives (FP) and false negatives (FN)
- Bad = shallow slope, with about as many data points in the false quadrants (FP, FN) as in the true quadrants (TP, TN)
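The quadrant logic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `quadrant_summary`, the cutoff values, and the data are made up, and "selected" is assumed to mean scoring at or above the predictor cutoff.

```python
def quadrant_summary(scores, performance, predictor_cutoff, criterion_cutoff):
    """Count the four quadrants of a prediction scatterplot (hypothetical helper)."""
    tp = fp = fn = tn = 0
    for x, y in zip(scores, performance):
        selected = x >= predictor_cutoff      # right of the vertical cutoff
        successful = y >= criterion_cutoff    # above the horizontal cutoff
        if selected and successful:
            tp += 1
        elif selected:
            fp += 1
        elif successful:
            fn += 1
        else:
            tn += 1
    n = tp + fp + fn + tn
    return {
        "TP": tp, "FP": fp, "FN": fn, "TN": tn,
        "selection_rate": (tp + fp) / n,   # proportion selected on the predictor
        "base_rate": (tp + fn) / n,        # proportion successful on the criterion
    }

# Made-up illustrative data: test scores (x) and job performance (y)
scores = [1, 2, 3, 4, 5, 6, 7, 8]
performance = [2, 6, 4, 3, 6, 4, 8, 7]
summary = quadrant_summary(scores, performance, predictor_cutoff=5, criterion_cutoff=5)
```

Moving `predictor_cutoff` changes only `selection_rate`; moving `criterion_cutoff` changes only `base_rate`.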
Predictive Bias: Different Slopes
- Different slopes = differential validity
- Meaning the test has different predictive validity for different groups: it is a better predictor for the group with the steeper slope
- Correction: for big slope differences, use a different measure for the minority group; for small slope differences, use within-group norming
Predictive Bias: Different Intercepts
- Different intercepts = systematic over- or under-estimation of group performance
- Meaning, the same test score leads to different predictions for groups
- Correction: add bonus points to the minority group
*NOTE: over- and under-estimation is counterintuitive in this graph! Underestimation = y intercept greater than 0.
Predictive Bias: Different Slopes AND Intercepts
- Poor differential validity (the test does a poor job of predicting outcomes for one group; SLOPE) plus over- or under-estimation of one or both groups (INTERCEPT)
- Correction: use a different test altogether
Empirical Approaches to Test Structure Bias
- Item x group tests (ANOVA): examine whether between-group differences on the overall score match between-group comparisons on smaller item sets
- Used to rule out that items operate in different ways for different groups
- IRT (difficulty/severity) -> Differential item functioning (between groups); items function differently between groups
- If ICC shows group differences = construct validity variance between groups = BIAS
- Difficulty: location of the inflection point (midpoint of the sigmoid curve) along the x axis
- Discrimination: slope; steeper = more discriminating (how well the item distinguishes between those higher/lower on the construct)
- Confirmatory Factor Analysis: examines whether the factor structure of underlying variables is consistent across groups (tests of measurement invariance)
- SEM: multiple predictors, multiple outcomes; identifies # of latent variables
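To make the difficulty and discrimination parameters concrete, here is a minimal sketch of an item characteristic curve. It assumes a two-parameter logistic (2PL) model, which the notes do not name explicitly; the function name `icc_2pl` and the item parameters are hypothetical.

```python
import math

def icc_2pl(theta, a, b):
    """Probability of passing/endorsing an item under an assumed 2PL model.

    theta: person's level on the construct (x axis)
    a: discrimination (slope at the inflection point)
    b: difficulty (x-axis location of the inflection point)
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical parameters for the same item estimated in two groups
group_a = {"a": 1.5, "b": 0.0}
group_b = {"a": 1.5, "b": 1.0}   # same discrimination, higher difficulty

theta = 0.0                       # two people at the same construct level
p_a = icc_2pl(theta, **group_a)   # exactly 0.5, because theta equals b
p_b = icc_2pl(theta, **group_b)
dif_gap = p_a - p_b               # nonzero gap at equal theta suggests DIF
```

If the two curves coincide, people at the same construct level have the same chance of passing the item regardless of group; a gap at equal `theta` is what differential item functioning looks like.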
Theoretical Approach to Test Structure Bias
- Face Validity: judged by a layperson
- Content Validity: judged by an expert
- A construct may include content facets in one group, but may include different facets in another.
Fairness
ACCURACY =/= FAIRNESS!
Adverse Impact: rejecting members of one group at a higher rate than another; the selection rate for group B cannot be less than 80% of the selection rate for the most-selected group A (four-fifths rule)
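The 80% check can be worked through as simple arithmetic. The sketch below uses made-up applicant counts and a hypothetical helper name, `adverse_impact_ratio`.

```python
def adverse_impact_ratio(selected_by_group, applicants_by_group):
    """Each group's selection rate divided by the highest group's selection rate."""
    rates = {g: selected_by_group[g] / applicants_by_group[g]
             for g in applicants_by_group}
    highest = max(rates.values())
    return {g: rate / highest for g, rate in rates.items()}

# Made-up counts: group A selects 60 of 100, group B selects 20 of 50
applicants = {"A": 100, "B": 50}
selected = {"A": 60, "B": 20}

ratios = adverse_impact_ratio(selected, applicants)
flagged = [g for g, ratio in ratios.items() if ratio < 0.8]  # below the 80% threshold
```

With these counts, group B's selection rate (0.40) is about 67% of group A's (0.60), so group B is flagged.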
Operationalizing fairness:
- Equal outcomes: equal selection rate
- Equal opportunity: equal sensitivity (classification errors)
- Equal odds: equal sensitivity and specificity
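The three definitions can disagree on the same data. The following sketch uses made-up selection decisions and outcomes for two groups, with a hypothetical helper `group_rates`; exact equality is used only because the toy numbers are clean, and real data would need a tolerance.

```python
def group_rates(selected, successful):
    """Selection rate, sensitivity, and specificity for one group."""
    pairs = list(zip(selected, successful))
    tp = sum(1 for s, y in pairs if s and y)
    fp = sum(1 for s, y in pairs if s and not y)
    fn = sum(1 for s, y in pairs if not s and y)
    tn = sum(1 for s, y in pairs if not s and not y)
    return {
        "selection_rate": (tp + fp) / len(pairs),
        "sensitivity": tp / (tp + fn),   # successful people who were selected
        "specificity": tn / (tn + fp),   # unsuccessful people who were rejected
    }

# Made-up data: was each person selected, and did they succeed on the criterion?
a = group_rates(selected=[True, True, False, False],
                successful=[True, False, True, False])
b = group_rates(selected=[True, False, False, False],
                successful=[True, True, False, False])

equal_outcomes = a["selection_rate"] == b["selection_rate"]
equal_opportunity = a["sensitivity"] == b["sensitivity"]
equal_odds = equal_opportunity and a["specificity"] == b["specificity"]
```

In this toy case equal opportunity holds (both sensitivities are 0.5) while equal outcomes and equal odds do not.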
Score Adjustments to Correct for Bias
- Bonus Points: add points to particular group; used to correct for intercept predictive bias
- If groups differ in SD, BP may not correct the bias
- Within-Group Norming: corrects for slope predictive bias
- Group norms for relative functioning, common norms for absolute functioning
- Separate Cutoffs (BP): per group
- Top-Down Selection From Different Lists (WGN) : take best from each group
- Banding (BP): band width is equivalent to the standard error of measurement; give minority preference within the band
- Banding with BP: BP first, then bands
- Sliding Band (BP): select all minority members in a band and only the top-scoring majority members; then slide to the next band and repeat
- Separate Tests (WGN): option for different slopes
- Item elimination based on groups (WGN): if large group differences
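Three of the adjustments above (bonus points, within-group norming, banding) can be sketched as small functions. Everything here is illustrative: the helper names, the data, and the band width are made up, and within-group norming is assumed to mean converting scores to z-scores within each group.

```python
from statistics import mean, pstdev

def add_bonus_points(scores, groups, bonus_group, bonus):
    """Bonus points: add a constant to every score in one group."""
    return [s + bonus if g == bonus_group else s for s, g in zip(scores, groups)]

def within_group_z(scores, groups):
    """Within-group norming: express each score relative to its own group."""
    stats = {}
    for g in set(groups):
        vals = [s for s, grp in zip(scores, groups) if grp == g]
        stats[g] = (mean(vals), pstdev(vals))
    return [(s - stats[g][0]) / stats[g][1] for s, g in zip(scores, groups)]

def top_band(scores, band_width):
    """Banding: indices of everyone within band_width of the top score."""
    top = max(scores)
    return [i for i, s in enumerate(scores) if s >= top - band_width]

# Made-up scores: group B averages 10 points lower than group A
scores = [50, 60, 70, 40, 50, 60]
groups = ["A"] * 3 + ["B"] * 3

adjusted = add_bonus_points(scores, groups, bonus_group="B", bonus=10)
z = within_group_z(scores, groups)
band = top_band(scores, band_width=10)   # band_width stands in for the standard error of measurement
```

Everyone inside `band` is treated as tied, which is what leaves room to give preference within the band.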
Non-score Adjustment Techniques to Correct for Bias
- Use multiple predictors/tests
- Change the criterion
- Remove biased items
- Resolve biased items (retain each item but alter parameters for different groups)
- Use alternative modes of testing
- Use work records
- Increase time limit
- Use motivation sets (face validity)
- Use instructional sets (access to test prep)