Test Construction Flashcards
Item Analysis
Determine which items to retain in final test
Item Difficulty Index (p)
Ranges from 0 to 1; a moderate difficulty level of 0.5 is generally preferred
p = number of examinees answering the item correctly / total number of examinees
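A minimal sketch of the difficulty index, assuming each response is coded 1 (correct) or 0 (incorrect). The function name `item_difficulty` and the sample data are illustrative, not from the cards.

```python
def item_difficulty(responses):
    """Return p: the proportion of examinees who answered the item correctly."""
    if not responses:
        raise ValueError("need at least one examinee")
    return sum(1 for r in responses if r) / len(responses)

# Hypothetical data: one entry per examinee, 1 = correct, 0 = incorrect.
item_scores = [1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
print(round(item_difficulty(item_scores), 2))  # 0.6
```

Values near 1 indicate an easy item and values near 0 a hard one.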
Item Discrimination (D) (ranges from -1 to 1)
D = % of examinees in the upper-scoring group who answered the item correctly - % of examinees in the lower-scoring group who answered it correctly
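A short sketch of D, assuming the upper and lower scoring groups have already been formed and D is expressed as a difference of proportions. The function name and the counts are hypothetical.

```python
def item_discrimination(upper_correct, upper_n, lower_correct, lower_n):
    """Return D: proportion correct in the upper group minus the lower group."""
    if upper_n <= 0 or lower_n <= 0:
        raise ValueError("both groups need at least one examinee")
    return upper_correct / upper_n - lower_correct / lower_n

# Hypothetical counts: 18 of 20 correct in the upper group, 8 of 20 in the lower.
d = item_discrimination(18, 20, 8, 20)
print(round(d, 2))  # 0.5
```

A negative D means the lower-scoring group outperformed the upper-scoring group on that item.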
Reliability Coefficient (RC)
- Estimates a test's reliability (the proportion of score variability that is true score variability)
- Ranges from 0 to 1
- An RC of .91 means 91% of score variability is due to true score variability and 9% is due to measurement error
Methods for Estimating Reliability
- Test-retest (coefficient of stability)
- Alternate forms (coefficient of equivalence)
- Split-half (internal consistency reliability)
- Spearman-Brown is used with split-half to estimate the reliability of the full-length test
- Coefficient alpha (internal consistency based on inter-item consistency, not two halves)
- Kuder-Richardson: substitute for coefficient alpha when items are scored dichotomously
- Inter-rater (when the test is scored subjectively)
- Coefficient of concordance (inter-rater reliability when ratings are ranks)
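As an illustration of the internal consistency methods above, here is a sketch of coefficient alpha using its standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores). The formula itself is not spelled out on the cards; the function name and the response matrix are hypothetical. With 0/1 items and population variances, this computation gives the same value as Kuder-Richardson Formula 20.

```python
from statistics import pvariance

def coefficient_alpha(scores):
    """scores: one row per examinee, one column per item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across examinees.
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the examinees' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical dichotomous responses: 5 examinees x 4 items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(coefficient_alpha(responses), 2))  # 0.8
```

The sketch does not guard against a total-score variance of zero, which would occur if every examinee earned the same total.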
Spearman-Brown Formula
- Estimates the effect of lengthening or shortening a test on its reliability coefficient
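A sketch of the Spearman-Brown prophecy formula in its standard form, new reliability = k x r / (1 + (k - 1) x r), where k is the factor by which test length changes. The function and parameter names are illustrative, and the reliabilities are made-up values.

```python
def spearman_brown(reliability, length_factor):
    """Estimate reliability after changing test length by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Split-half correction: the half-test correlation of .60 is stepped up to full length (k = 2).
print(round(spearman_brown(0.60, 2), 2))    # 0.75

# Shortening: cutting a test with reliability .80 to half its length (k = 0.5).
print(round(spearman_brown(0.80, 0.5), 2))  # 0.67
```

Lengthening raises the estimate and shortening lowers it, assuming the added or removed items are comparable to the existing ones.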
Standard Error of Measurement (SEM)
- Indicates how closely an individual's obtained score reflects his/her true score
- SEM = standard deviation of test scores x square root of (1 - reliability coefficient)
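A worked sketch of the SEM formula. The standard deviation of 15 is an assumed value for illustration; the reliability of .91 reuses the example from the Reliability Coefficient card.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: SD = 15, reliability coefficient = .91.
print(round(standard_error_of_measurement(15, 0.91), 2))  # 4.5
```

A perfectly reliable test (coefficient of 1) gives an SEM of 0, and a coefficient of 0 gives an SEM equal to the standard deviation.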
Validity
A test is valid when it accurately measures what it is designed to measure
Content Validity
When a test will be used to measure one or more content/behavior domains
Construct Validity
When a test will be used to measure a hypothetical trait (construct), e.g., achievement, intelligence, or mechanical aptitude
Criterion-related Validity
When a test will be used to estimate or predict performance on another measure
Construct Validity
- Convergent
- Divergent
Convergent - high correlations w/ measures that assess the same construct
Divergent - low correlations w/ measures of unrelated characteristics (also called discriminant validity)
Multitrait-Multimethod Matrix (MTMM)
Assesses convergent and discriminant validity
- Large monotrait-heteromethod coefficients: convergent validity
- Small heterotrait-monomethod and heterotrait-heteromethod coefficients: discriminant validity
Factor Analysis
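- Used to assess construct validity
- Results are presented in a factor matrix
- Factor loading: the correlation between a test and a factor; squaring it gives the shared variability
- A test with a 0.5 correlation with Factor 1: 25% of variability in test scores is explained by Factor 1 (0.5 squared = 0.25)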
Orthogonal Factors (unrelated)
- Communality is calculated by summing the squared factor loadings
- Factor 1 = .50, Factor 2 = .20: communality = .25 + .04 = .29
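The arithmetic on the factor analysis cards can be checked with a short sketch, assuming orthogonal factors so that squared loadings can simply be added. The function names are illustrative.

```python
def shared_variance(loading):
    """Proportion of test variability explained by one factor (squared loading)."""
    return loading ** 2

def communality(loadings):
    """Sum of squared loadings across orthogonal factors."""
    return sum(shared_variance(x) for x in loadings)

# A loading of .50 on Factor 1 explains 25% of the test's variability.
print(shared_variance(0.50))                # 0.25

# Loadings of .50 and .20: .25 + .04 = .29
print(round(communality([0.50, 0.20]), 2))  # 0.29
```

Summing the raw loadings instead would give .70, which does not match the .29 in the example.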