M&D3 Week 1 Flashcards
Reliability
the consistency of a measurement repeated within a person or in a sample
test score
X = T + E
T (true score) is not known, E (error) is only estimated (X = the obtained score)
Accidental measurement errors (unpredictable)
Something you can’t predict beforehand, e.g., time of day or a headache
Systematic measurement errors (constant)
Something you can predict beforehand, happens systematically
Confidence interval formula
CI = X +/- z * SE
X = score; z = standardized coefficient, SE = standard error
z = 0.99
the probability is 68% that the true score lies within the interval (lowest confidence, narrowest interval)
z = 1.96
the probability is 95% that the true score lies within the interval (medium confidence, medium-width interval)
z = 2.58
the probability is 99% that the true score lies within the interval (highest confidence, widest interval)
Confidence intervals indicate…
the limits within which a certain possible score may be assumed to be true
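A minimal sketch of the confidence interval formula CI = X +/- z * SE in Python. The function name and the example values (an obtained score of 100 and an SE of 4.5) are hypothetical and chosen only for illustration.

```python
def confidence_interval(score, se, z=1.96):
    """Return (lower, upper) limits of CI = X +/- z * SE."""
    margin = z * se
    return (score - margin, score + margin)


# Hypothetical obtained score X = 100 with SE = 4.5
ci_68 = confidence_interval(100.0, 4.5, z=0.99)
ci_95 = confidence_interval(100.0, 4.5, z=1.96)
ci_99 = confidence_interval(100.0, 4.5, z=2.58)

for label, (low, high) in (("68%", ci_68), ("95%", ci_95), ("99%", ci_99)):
    print(f"{label} CI: {low:.2f} to {high:.2f}")
```

A larger z gives a wider interval: more confidence that the true score is inside, at the cost of a less precise statement.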
Standard error (SE)
Standard deviation of raw scores around true scores (SD of measurement errors)
SE formula
SE = σ * √(1 - rxx)
The higher rxx
the lower SE
The lower σ
the lower SE
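The SE formula and the two rules above can be checked with a short sketch. The function name and the numbers (σ = 15, rxx = .91) are hypothetical example values, not taken from a real test.

```python
import math


def standard_error(sd, reliability):
    """SE = sd * sqrt(1 - rxx), with rxx the reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability (rxx) must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with sd = 15 and rxx = .91
se = standard_error(15.0, 0.91)
print(round(se, 2))

# Higher rxx -> lower SE; lower sd -> lower SE
print(standard_error(15.0, 0.96) < se)
print(standard_error(10.0, 0.91) < se)
```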
Test-retest
The same test is administered at least twice (correlation coefficients between the administrations)
Parallel (or alternative) versions
Coefficients between the different versions
Internal consistency measures (split-half, Cronbach’s alpha, KR-20)
Test used one time! KR-20 is for dichotomous items (only two options)
Inter-rater coefficient: intraclass correlation (ICC)
Consistency between the ratings given by different raters (e.g., the members of a specific team or group)
Alpha coefficient is based on
- A single measurement of a test
- (Co)variances of the items
- The number of items – how many items do you have
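A minimal sketch of how the alpha coefficient could be computed from a single administration, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The item covariances enter through the variance of the total scores. The function name and the small data sets are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """scores: one row per respondent, each row holding k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*scores))  # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: 4 respondents, 2 items
data = [[2.0, 3.0], [4.0, 3.0], [3.0, 5.0], [5.0, 5.0]]
print(round(cronbach_alpha(data), 3))
```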
insufficient alpha coefficient
rxx < .80
Validity
The extent to which a test measures what it should measure
Face validity
How the test seems externally (to laymen/test takers)
Construct validity
To what extent is the test a good measurement of the underlying theoretical concept?
Convergent validity
To what extent is the test correlated with other measures measuring the same concept (should have a high positive or negative correlation)
Divergent validity
To what extent is the test correlated with other measures measuring a different concept (should have no correlation)
Criterion validity, diagnostic validity
How much the test predicts a concrete criterion, how much it has value in diagnosing
Content validity
Does the test cover the domain of knowledge, skills, behavior that we’re supposed to measure?
Internal structure
the structure of the test alone
- Number of dimensions (factors)
- Score differences between groups (e.g., high versus low extraversion)
External validity
how the questionnaire correlates with other measures
Multitrait-multimethod matrix used to test
for convergent and divergent validity
Convergent validity
Hetero-method (more than one method)
Mono-trait (the same trait measured by different methods)
Divergent validity
Across two or more methods
Hetero-method (more than one method)
Hetero-trait (different traits)
Method variance
Within a single method
mono-method
hetero-trait
FOUR RULES FOR TESTING CONSTRUCT VALIDITY BASED ON MTMM MATRIX
- Convergent validity (average) > 0
- Convergent validity (average) > divergent validity (average)
- Convergent validity (average) > method variance (average)
- Method variance and divergent validity show approximately the same pattern of correlations in the matrix
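The first three rules compare averages, so they can be sketched in a few lines. This is only an illustration under assumptions: the function name and the correlations are hypothetical, correlations are averaged as entered (no absolute values), and the fourth rule is left out because it requires inspecting the pattern of the matrix.

```python
from statistics import mean


def check_mtmm_rules(convergent, divergent, method):
    """Check rules 1-3 on lists of correlations taken from an MTMM matrix.

    convergent: mono-trait, hetero-method correlations
    divergent:  hetero-trait, hetero-method correlations
    method:     hetero-trait, mono-method correlations (method variance)
    """
    conv, div, meth = mean(convergent), mean(divergent), mean(method)
    return {
        "convergent_above_zero": conv > 0,
        "convergent_above_divergent": conv > div,
        "convergent_above_method_variance": conv > meth,
    }


# Hypothetical correlations
result = check_mtmm_rules(
    convergent=[0.55, 0.60, 0.50],
    divergent=[0.10, 0.15, 0.05],
    method=[0.30, 0.25, 0.35],
)
print(result)
```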
Immediate criteria
work sample in an Assessment Centre
Low delay
first client satisfaction
High delay
annual evaluation
Operationalization
changing the conceptual ‘thing’ into something measurable
Incremental = increased validity
how much one predictor adds to the prediction on top of another, e.g., how much a personality measure adds beyond a cognitive capacity test
COTAN standards in personnel selection, a validity coefficient of … is seen as sufficient
.40
Cohen (1977) standards (effect sizes):
r = .10 low effect size
r = .30 medium
r = .50 high
Possible problems with values that are too high (applies to other values too, such as correlation coefficients)
- Incorrect conceptualizations of your measure
- Selection of instrument to test for criterion validity is not correct
The correction for attenuation formula
assumes there is an ideal value of validity: it estimates the validity that would be found if the test and the criterion were perfectly reliable, i.e., the maximum attainable validity
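A minimal sketch of the correction, assuming the standard (Spearman) form r_corrected = r_xy / √(rxx * ryy), where r_xy is the observed validity coefficient and rxx and ryy are the reliabilities of the test and the criterion. The function name and the example values are hypothetical.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the validity if test and criterion were perfectly reliable."""
    if not (0.0 < r_xx <= 1.0 and 0.0 < r_yy <= 1.0):
        raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical values: observed validity .40, rxx = .80, ryy = .50
corrected = correct_for_attenuation(0.40, 0.80, 0.50)
print(round(corrected, 2))
```

With perfectly reliable instruments (rxx = ryy = 1) the corrected value equals the observed one; the lower the reliabilities, the larger the correction.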