Validity Flashcards
Def.: Validity
Degree to which the inferences we derive about job performance from some predictor are accurate (Schmitt & Sinha, 2011); the degree to which evidence and theory support the interpretation of test scores entailed by proposed uses of tests (Sackett et al., 2012)
Def: Predictive inference
Scores on the predictor measure of interest can be used to draw inferences about an individual’s future job behavior or other criterion of interest (Sackett et al., 2012)
We build a validity argument based on both descriptive and predictive inferences (think Binning and Barrett, 1989; Guion, 2002)
Def: Validation
Validation is a process. It should be thought of as a form of hypothesis testing aimed at collecting the best, most informative evidence possible. We should take a “preponderance of evidence” approach rather than treating validation models as a checklist. This is the Unitarian view, as opposed to the Trinitarian view; we should not let the administrative practice of the Trinitarian view limit us, because it diminishes the quality of the evidence we can gather.
Pieces of validity evidence from the standards
- Criterion-related: relationship between predictor scores and a criterion measure (e.g., job performance)
- Content: how well the content of the measure samples a content domain
- Convergent/discriminant validity (construct): the measure correlates with measures of the same construct and not with measures of distinct constructs
- Validity generalization evidence (MA)
- Test taker processes during test
- Correlational/factor analyses
- Item analyses/internal structure
- Consequences of decisions based on tests (controversial, though; Sackett et al. reject this view)
Importance
Validity is typically considered the most important issue in psychological testing because it determines the meaning placed on test scores and modes of measurement.
Trends
Issues in how we now conceptualize validity today (Sackett et al., 2012):
Issue 1: Validity as predictor-criterion relations vs. broader conceptualizations
Issue 2: Validity as an inference vs. validity as a test
Issue 3: Types of Validity Evidence vs. Types of Validity
When a multifaceted set of claims is made about inferences that can be drawn from the test, support for each claim is needed. E.g.: representative content sampling calls for content validity evidence (documentation, usually w/ SMEs); multidimensionality calls for factor-analytic evidence (one type of construct validity); prediction of task performance calls for criterion-related validity evidence.
Issue 4: Validity as an inference about a test score vs. validity as a strategy for establishing job relatedness
Need to differentiate b/w settings in which types of validity evidence are being used to draw inferences about the meaning of a test score rather than to draw predictive inference
Issue 5: Validity limited to inferences about individuals vs. including broader consequences of test score use
Diff. researchers have diff. opinions but the Standards reject the view that consequences of test say anything about validity (and so do Sackett et al., 2012)
Issue 6: The predictive inference vs. the evidence for it
Content Validity
• Definition
o Demonstration that test items are a representative sample of the behaviors exhibited in some performance domain
• Factors
o Heavily dependent upon a job analysis that details job tasks and KSAOs and how those tasks (or very similar tasks) and KSAOs are reflected in the tests used to make employment decisions
• Steps
1. The job performance domain must be carefully specified
2. The objectives of the test use must be clearly formulated
3. The method of sampling item content from the performance domain must be adequate
o You traditionally want to use SMEs and make sure there is strong agreement among them
• 3 reasons why we should use content validation:
o Acceptability
o legal defensibility
o transparency
Criterion-related Validity
• A statistically significant relationship between a predictor and a job performance measure that is sufficiently strong (effect size)
o You have to make sure that you are measuring the criterion correctly
• Techniques
o Correlation
o Incremental validity
o Differential prediction
Test for differences between the groups’ regression lines (intercepts and slopes)
Test for a significant interaction between the group and the predictor
DIF
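The differential prediction test above (comparing groups' regression lines via a group-by-predictor interaction) can be sketched with moderated regression. This is an illustrative sketch with simulated, hypothetical data, not a full significance-testing workflow; the coefficient on the group dummy reflects intercept differences and the interaction coefficient reflects slope differences.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: predictor scores and a criterion for two groups.
n = 200
group = rng.integers(0, 2, n)                     # 0/1 group membership dummy
x = rng.normal(0, 1, n)                           # predictor (e.g., test score)
y = 0.5 * x + 0.3 * group + rng.normal(0, 1, n)   # criterion (simulated)

# Moderated regression: y = b0 + b1*x + b2*group + b3*(x*group)
# b2 indexes intercept differences; b3 (the interaction) indexes slope differences.
X = np.column_stack([np.ones(n), x, group, x * group])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(dict(zip(["b0", "slope", "group_intercept_diff", "interaction_slope_diff"], b.round(2))))
```

In practice you would test b2 and b3 for significance (e.g., hierarchical regression steps) rather than just inspecting the estimates.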
• Ways of establishing the link (Sackett et al., 2012)
o Local criterion-related validation studies
o Validity generalization
o Transportability studies
o Synthetic validation
• Report both observed and corrected validity coefficients (corrected for criterion unreliability and range restriction)
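The observed-vs.-corrected distinction can be sketched with the standard corrections: disattenuation for criterion unreliability and the classic correction for direct range restriction (Thorndike Case II). The numeric inputs below are hypothetical, chosen only to illustrate the arithmetic.

```python
import math

def correct_for_attenuation(r_obs, ryy):
    """Disattenuate an observed validity for criterion unreliability."""
    return r_obs / math.sqrt(ryy)

def correct_for_range_restriction(r_obs, u):
    """Thorndike Case II correction for direct range restriction.
    u = SD(unrestricted applicant pool) / SD(restricted incumbent sample)."""
    return (r_obs * u) / math.sqrt(1 + r_obs**2 * (u**2 - 1))

r_obs = 0.30   # observed validity (hypothetical)
ryy = 0.70     # criterion reliability (hypothetical)
u = 1.5        # SD ratio (hypothetical)

r_corrected = correct_for_range_restriction(correct_for_attenuation(r_obs, ryy), u)
print(round(r_obs, 2), round(r_corrected, 2))  # observed 0.30 grows to ~0.50
```

Reporting both values makes clear how much of the corrected coefficient rests on the correction assumptions.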
Threats to validity
Self-selection (Kuncel & Klieger, 2007)
Applicant motivation (Schmitt & Ryan, 1992)
Issues w/ concurrent vs. predictive designs (Sussman & Robertson, 1986): the population of potential employees is of interest, but restriction of range is a problem; there is no way to “fix” a non-representative sample (if low scorers are eliminated from getting the job, you can’t compare them to the high scorers; it’s like deleting the control group). The right way of doing things is not always possible because of time, money, and sample constraints, and both concurrent and predictive designs have problems. Do the best we can, and use multiple methods for validation when possible!
Publication bias (McDaniel et al., 2006)
Disadvantages of Failsafe N
It assumes the correlation in hidden studies is zero, rather than considering that some studies could have an effect in the reverse direction, or an effect that is small but not zero.
This approach focuses solely on statistical significance, as opposed to effect sizes.
It does not help you estimate the population coefficient in the absence of bias.
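Rosenthal's failsafe N itself is simple arithmetic: the number of unpublished null-result studies needed to push the combined one-tailed p above .05. A minimal sketch, using hypothetical z statistics:

```python
def failsafe_n(z_values, z_crit=1.645):
    """Rosenthal's failsafe N. Assumes the hidden studies average z = 0,
    which is exactly the first limitation noted above."""
    k = len(z_values)
    sum_z = sum(z_values)
    return (sum_z**2) / (z_crit**2) - k

# Hypothetical z statistics from k = 5 published studies
zs = [2.1, 1.8, 2.5, 1.6, 2.0]
print(round(failsafe_n(zs), 1))  # ~32 hidden null studies would be needed
```

Note the formula works entirely in z (significance) units, which is why it says nothing about effect sizes or the bias-free population coefficient.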
Trim and Fill
Assumes that sample estimates vary around a population figure. Also assumes that correlations based on large samples have smaller confidence intervals (standard errors), and should thus be closer to the population correlation than correlations from small samples.
Correlations from small samples will often overestimate or underestimate the population correlation.
Trim and fill treats funnel-plot asymmetry as evidence of publication bias.
Themes common across attitudes
- Measurement of job attitudes
• Typically measured with surveys (self-report)
- Nomological network of job attitudes
• Issue of construct validity! What are we actually measuring??
• Why is this important?
o 1. They can’t be directly observed, so it’s important that the measure actually taps what we want to assess
o 2. Because there are numerous job attitudes in the literature, it is important that each attitude be distinguishable/unique from other attitudes (discriminant validity)
- Criterion-related validity
• A measure’s ability to predict a criterion (a certain outcome: performance, motivation, turnover, etc.)
- Role of meta-analysis in attitudinal study
• Statistically combining estimates of the relationships among variables from numerous studies to obtain a more accurate estimate of the true relationship (weighting each estimate by sample size, and possibly correcting for unreliability)
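The sample-size weighting described above is the "bare-bones" first step of a meta-analysis. A minimal sketch with hypothetical (r, n) pairs from four studies:

```python
# Hypothetical study results: (observed correlation, sample size) pairs.
studies = [(0.25, 120), (0.35, 80), (0.15, 300), (0.30, 60)]

# Sample-size-weighted mean correlation: larger studies count more,
# since their estimates have smaller standard errors.
total_n = sum(n for _, n in studies)
r_bar = sum(r * n for r, n in studies) / total_n
print(round(r_bar, 3))
```

A fuller analysis would also correct each r for measurement unreliability (dividing by the square root of the product of the reliabilities) before pooling.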
“I’m having a problem with such and such…” May be related to a specific attitude