Test Worthiness, Pt I Flashcards
Test Worthiness
Four Cornerstones
Validity
Reliability
Cross-Cultural Fairness
Practicality
Test Worthiness
Correlation Coefficient
Correlation
Statistical expression of the Relationship between two sets of scores (or variables)
Test Worthiness
Correlation Coefficient
Positive Correlation
Increase in one variable accompanied by an increase in the other variable
“Direct” relationship
Test Worthiness
Correlation Coefficient
Negative Correlation
Increase in one variable accompanied by a decrease in the other variable
“Inverse” relationship
Test Worthiness
Correlation Coefficient
Correlation coefficient (r)
A number between -1 and +1 that indicates Direction and Strength of the relationship
As “r” approaches +1, strength increases in a direct and positive way
As “r” approaches -1, strength increases in an inverse and negative way
As “r” approaches 0, the relationship is weak or nonexistent (at zero)
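A minimal sketch of computing r (assuming Python with NumPy; the paired scores are made up for illustration):

import numpy as np

# Hypothetical paired scores for five test takers
test_scores = np.array([10, 12, 14, 16, 18])
gpa = np.array([2.1, 2.5, 3.0, 3.2, 3.9])

# np.corrcoef returns a 2x2 correlation matrix; r is the off-diagonal entry
r = np.corrcoef(test_scores, gpa)[0, 1]
print(f"r = {r:.2f}")  # near +1 here: a strong, direct relationship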
Test Worthiness
Correlation, cont’d
The closer r is to +1 or -1, the stronger the correlation
Graph, Class 1, slide 5
+1.0 is a PERFECT positive correlation
-1.0 is a PERFECT negative correlation
Reliability
Accuracy or Consistency of test scores
Would one score the same if they took the test over, and over, and over again?
Classical Test Theory
Assumes a Priori that any measurement of a human personality characteristic will be inaccurate to some degree
Charles Spearman (1904)
Observation = True Score + Error
X = T + E
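A small simulation of X = T + E (a sketch in Python with NumPy; the true score and error spread are invented for illustration):

import numpy as np

rng = np.random.default_rng(0)

T = 50.0                      # true score, fixed for one person
E = rng.normal(0, 3, size=5)  # random error on each administration
X = T + E                     # observed scores: X = T + E

print(X)         # each observation misses T by some error
print(X.mean())  # the average of repeated measurements approaches T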
Sources of Measurement Error
Item Selection
Test Administration
Test Scoring
Systematic and unsystematic measurement error
Systematic and Random Error
Systematic Error
Impacts All People Who Complete an Instrument (such as misspelled words or a poorly conceived sampling of the behavior represented by the instrument’s questions)
Systematic and Random Error
Unsystematic Errors
Involve factors that affect Individual Expression of a Trait
Item Response Theory
Item Response Function
Relationship between Latent Trait and Probability of Correct Response
Usual standard score range -3 to +3
Item difficulty parameter
Item discrimination parameter
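A sketch of an item response function under the two-parameter logistic model (Python/NumPy; the parameter values are illustrative), where b is the item difficulty and a the item discrimination:

import numpy as np

def irf(theta, a=1.0, b=0.0):
    # Probability of a correct response given latent trait theta
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)    # the usual standard score range
print(irf(theta, a=1.5, b=0.5))  # probability rises as theta passes b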
Item Response Theory
Invariance in IRT
Individual trait level can be estimated from any set of items
IRFs do not depend on the population of examinees
Rasch Scale
Based on Item Response Theory; describes the relationship between the Test Taker’s Probability of success on an item and the latent trait (e.g., the ability)
Test taker’s ability vs. item difficulty (both will vary)
The items are used to define the measure’s scale
Goal: model success from the difference between the Person’s ability and the Item difficulty
Test-taker receives multiple items that match their ability
Protects against the ceiling effect
See graph on Class 1, slide 18
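In the Rasch model the discrimination is fixed at 1, so the probability of success depends only on the gap between ability and difficulty. A sketch (Python; the values are illustrative):

import math

def rasch_p(ability, difficulty):
    # Rasch model: success probability from ability minus difficulty
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

print(rasch_p(1.0, 1.0))  # 0.50 when ability exactly matches difficulty
print(rasch_p(2.0, 1.0))  # above 0.50 when ability exceeds difficulty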
Rasch Scale & Discriminatory Power
When an item measures a construct (has a good fit), the levels of the item will co-vary with the trait
When an item does not measure a construct, the levels of the item will not co-vary with the trait
Four ways to determine Reliability
Internal Consistency
  A. Split-Half or Odd-Even
  B. Coefficient Alpha
  C. Kuder-Richardson
Test-Retest
Alternate, Parallel, or Equivalent Forms
Inter-rater reliability
Internal Consistency
Reliability within the test, rather than using multiple administrations
Internal Consistency
3 Types
Split-Half or Odd-Even
Cronbach’s Coefficient Alpha
Kuder-Richardson
Internal Consistency
Split-Half or Odd-Even Reliability
Correlate one half of the test with the other half for all who took the test
The correlation = the split half reliability estimate
The Spearman-Brown coefficient corrects for the shortened test length (each half underestimates the reliability of the full-length test)
Internal Consistency
Spearman-Brown Formula
See Class 1, Slide 23
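The slide itself is not reproduced here, but the standard split-half form of the formula is r_full = 2 r_half / (1 + r_half). A sketch of an odd-even split with the correction applied (Python/NumPy; the item scores are made up):

import numpy as np

# Hypothetical item scores: rows are test takers, columns are items
items = np.array([[1, 2, 1, 2, 2, 1],
                  [3, 3, 4, 3, 3, 4],
                  [2, 2, 3, 2, 2, 3],
                  [4, 5, 4, 5, 5, 4]])

odd_half = items[:, 0::2].sum(axis=1)   # odd-numbered items
even_half = items[:, 1::2].sum(axis=1)  # even-numbered items

r_half = np.corrcoef(odd_half, even_half)[0, 1]
r_full = (2 * r_half) / (1 + r_half)    # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, corrected = {r_full:.2f}")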
Internal Consistency
Cronbach’s Coefficient Alpha
Developed by Lee Cronbach in 1951
A formula for estimating the mean of all possible Split-Half Coefficients using items that have Three or more response possibilities or anchor definitions
Report the reliability coefficient for the total and/or each scale or subtest
Basics of Cronbach’s Coefficient Alpha
Cronbach’s alpha reliability coefficient normally ranges between 0 and 1
The closer the alpha coefficient is to 1.0, the greater the internal consistency of the scale items
Standardized Item Alpha: Alpha coefficient when all scale items have been standardized (made into z scores).
This coefficient is used only when the individual scale items are not scaled the same
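A sketch of the computation, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores) (Python/NumPy; the responses are made up):

import numpy as np

def cronbach_alpha(items):
    # items: rows = respondents, columns = scale items
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

responses = np.array([[1, 2, 2],
                      [3, 3, 4],
                      [4, 5, 4],
                      [2, 2, 3]])
print(f"alpha = {cronbach_alpha(responses):.2f}")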
Internal Consistency
Kuder-Richardson
(KR-20) (KR-21)
Variation on alpha formula used with dichotomous data
An estimate of the mean of all possible split-half coefficients
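A sketch of KR-20 (Python/NumPy; the 0/1 answers are made up). It is the alpha formula with each item’s variance replaced by p x q, the proportions passing and failing the item:

import numpy as np

def kr20(items):
    # items: rows = test takers, columns = dichotomous (0/1) items
    k = items.shape[1]
    p = items.mean(axis=0)                     # proportion correct per item
    q = 1 - p
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

answers = np.array([[1, 1, 0, 1],
                    [0, 1, 0, 0],
                    [1, 1, 1, 1],
                    [0, 0, 0, 1]])
print(f"KR-20 = {kr20(answers):.2f}")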
Test-Retest Reliability
Give the same test Two or More Times to the Same Group of People, then correlate the scores.
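A sketch (Python/NumPy; the scores are made up): the test-retest estimate is just the correlation between the two administrations.

import numpy as np

first = np.array([88, 72, 95, 60, 81])   # scores at time 1
second = np.array([85, 74, 93, 63, 80])  # same people at time 2

r_tt = np.corrcoef(first, second)[0, 1]  # test-retest reliability estimate
print(f"test-retest r = {r_tt:.2f}")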