Test Worthiness, Pt I Flashcards
Test Worthiness
Four Cornerstones
Validity
Reliability
Cross-Cultural Fairness
Practicality
Test Worthiness
Correlation Coefficient
Correlation
Statistical expression of the Relationship between two sets of scores (or variables)
Test Worthiness
Correlation Coefficient
Positive Correlation
Increase in one variable accompanied by an increase in the other variable
“Direct” relationship
Test Worthiness
Correlation Coefficient
Negative Correlation
Increase in one variable accompanied by a decrease in the other variable
“Inverse” relationship
Test Worthiness
Correlation Coefficient
Correlation coefficient (r)
A number between -1 and +1 that indicates Direction and Strength of the relationship
As “r” approaches +1, strength increases in a direct and positive way
As “r” approaches -1, strength increases in an inverse and negative way
As “r” approaches 0, the relationship is weak or nonexistent (at zero)
Test Worthiness
Correlation, cont’d
The closer to 1 or -1 the stronger the correlation
Graph, Class 1, slide 5
+1.0 is a PERFECT positive correlation
-1.0 is a PERFECT negative correlation
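A minimal Python sketch of computing r (the score pairs below are invented for illustration):

```python
import numpy as np

# Two sets of scores (made-up data, not from the slides)
test_scores = np.array([70, 75, 80, 85, 90])
gpa = np.array([2.8, 3.0, 3.3, 3.5, 3.9])

# Pearson correlation coefficient r: direction and strength of the
# linear relationship, always between -1 and +1
r = np.corrcoef(test_scores, gpa)[0, 1]
print(f"r = {r:.2f}")  # near +1 here: a strong, direct relationship
```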
Reliability
Accuracy or Consistency of test scores
Would one score the same if they took the test over, and over, and over again?
Classical Test Theory
Assumes a priori that any measurement of a human personality characteristic will be inaccurate to some degree
Charles Spearman (1904)
Observation = True Score + Error
X = T + E
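A small simulation makes X = T + E concrete; the true score and error spread below are assumed values for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Classical test theory: each observed score X is the (unknowable)
# true score T plus measurement error E
T = 100                                   # one person's true score (assumed)
E = rng.normal(loc=0, scale=5, size=10)   # random error over 10 administrations
X = T + E                                 # observed scores: X = T + E

print(X.round(1))           # observations scatter around the true score
print(round(X.mean(), 1))   # the mean of repeated observations approaches T
```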
Sources of Measurement Error
Item Selection
Test Administration
Test Scoring
Systematic and unsystematic measurement error
Systematic and Random Error
Systematic Error
Impacts All People Who Complete an Instrument (such as misspelled words or a poorly conceived sampling of the behavior represented by the instrument’s questions).
Systematic and Random Error
Unsystematic Errors
Involve factors that affect Individual Expression of a Trait
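A toy contrast of the two error types (all numbers invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
true_scores = rng.normal(100, 15, size=5)   # five examinees' true trait levels

systematic = -4.0                           # shifts everyone, e.g., a misspelled item
unsystematic = rng.normal(0, 3, size=5)     # differs per person: fatigue, guessing, mood

observed = true_scores + systematic + unsystematic
print((observed - true_scores).round(1))    # each entry is about -4, plus personal noise
```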
Item Response Theory
Item Response Function
Relationship between Latent Trait and Probability of Correct Response
Usual standard score range -3 to +3
Item difficulty parameter
Item discrimination parameter
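The slides name a difficulty parameter (b) and a discrimination parameter (a); a common way they enter the item response function is the two-parameter logistic form, sketched here with illustrative values:

```python
import numpy as np

def irf_2pl(theta, a, b):
    """Two-parameter logistic item response function: probability of a
    correct response given latent trait theta, discrimination a, difficulty b."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

theta = np.linspace(-3, 3, 7)   # the usual standard-score range for the trait
print(irf_2pl(theta, a=1.5, b=0.0).round(2))
# probability climbs from near 0 to near 1 as theta passes the difficulty b
```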
Item Response Theory
Invariance in IRT
Individual trait level can be estimated from any set of items
IRFs do not depend on the population of examinees
Rasch Scale
Based on Item Response Theory; models the relationship between the Test Taker’s Probability of success on an item and the latent trait (e.g., ability)
Test taker’s ability vs. item difficulty (both will vary)
The items are used to define the measure’s scale
Goal: model the probability of success from the difference (Person’s ability - Item difficulty)
Test-taker receives multiple items that match their ability
Protects against the ceiling effect
See graph on Class 1, slide 18
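In the Rasch model the probability of success depends only on the difference between ability and difficulty, both on the same logit scale; a minimal sketch with illustrative values:

```python
import math

def rasch_p(ability, difficulty):
    """Rasch model: probability of success on an item depends only on
    the difference (ability - difficulty), on a shared logit scale."""
    logit = ability - difficulty
    return 1.0 / (1.0 + math.exp(-logit))

print(round(rasch_p(1.0, 1.0), 2))  # 0.5: ability equals item difficulty
print(round(rasch_p(2.0, 1.0), 2))  # ~0.73: ability one logit above difficulty
```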
Rasch Scale & Discriminatory Power
When an item measures a construct (has a good fit), the levels of the item will co-vary with the trait
When an item does not measure a construct, the levels of the item will not co-vary with the trait
Four ways to determine Reliability
1. Internal Consistency
   A. Split-Half or Odd-Even
   B. Coefficient Alpha
   C. Kuder-Richardson
2. Test-Retest
3. Alternate, Parallel, or Equivalent Forms
4. Inter-rater Reliability
Internal Consistency
Reliability within the test, rather than using multiple administrations
Internal Consistency
3 Types
Split-Half or Odd-Even
Cronbach’s Coefficient Alpha
Kuder-Richardson
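Of the three, Coefficient Alpha is the easiest to show in code; a minimal sketch with made-up responses (examinees in rows, items in columns):

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for an (examinees x items) score matrix:
    k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Made-up responses: 4 examinees x 3 items
scores = [[3, 4, 3],
          [2, 2, 3],
          [4, 5, 4],
          [1, 2, 1]]
print(round(cronbach_alpha(scores), 2))  # high alpha: items vary together
```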