WK 2 Norms and Reliability Flashcards
Norms and Reliability
Describe the main premise of classical test theory (CTT) and how it relates to reliability.
CTT says that every person’s observed score is made up of a true score (the person’s actual level of the trait) plus error. For a population, the total variance is the true variance + the error variance. Reliability is the proportion of the total variance that is true variance, i.e. true variance divided by total variance. The larger the share of true variance, the higher the reliability, but note that we can only ever estimate the true variance.
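A minimal sketch of the variance ratio described above, in Python. The function name and the example variances are made up for illustration; in practice the true variance is never observed and has to be estimated.

```python
def reliability(true_variance: float, error_variance: float) -> float:
    """Proportion of total variance that is true variance (CTT)."""
    total_variance = true_variance + error_variance  # total = true + error
    if total_variance <= 0:
        raise ValueError("total variance must be positive")
    return true_variance / total_variance


# Hypothetical values: true variance 8, error variance 2.
print(reliability(8.0, 2.0))  # 0.8
```

With no error variance the ratio is 1; as error variance grows, reliability falls towards 0.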
Describe measurement error.
Measurement error is the part of an observed score that does not reflect the true score; the variance it produces is known as error variance. It is made up of both systematic error (predictable and constant) and random error (unpredictable noise that is unrelated to the trait). Random error is less of a problem because it should balance out over many measurements and leave the mean largely unchanged. Systematic error is more of a problem because it pushes scores in one direction, but if you know what is causing it, you can adjust your numbers accordingly.
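A small illustration of the two error types, as a sketch with made-up numbers. The random errors are hand-picked to sum to exactly zero; real random error only balances out approximately, over many observations.

```python
from statistics import mean

true_scores = [10, 12, 14, 16, 18]   # hypothetical true scores
random_error = [2, -1, 0, -2, 1]     # hypothetical noise, chosen to sum to zero
systematic_error = 3                 # hypothetical constant bias

with_random = [t + e for t, e in zip(true_scores, random_error)]
with_systematic = [t + systematic_error for t in true_scores]

print(mean(true_scores))      # 14
print(mean(with_random))      # 14: the random errors cancel out
print(mean(with_systematic))  # 17: every score is shifted by the bias

# A known systematic error can simply be subtracted again.
corrected = [x - systematic_error for x in with_systematic]
print(corrected == true_scores)  # True
```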
List common sources of measurement error
- Test Construction
- Test Administration
- Test Scoring and Interpretation
- Sampling Error
- Methodological Errors
Describe Test Construction error
Variation due to differences between items on the same test or between tests
Describe Test Administration error
Variation due to testing environment
(test-taker: anxiety, stress, drugs, sleep, physical discomfort)
(Examiner: appearance, demeanour)
Describe Test Scoring and Interpretation error
Variation due to scoring and interpretation, e.g. scoring a video for a mother’s warm behaviours towards an aggressive child
Describe Sampling Error
Variation due to the representativeness of the sample, e.g. the sample does not represent the population because it includes only educated people
Describe Methodological Errors
Variation due to poor training, unstandardised administration, unclear questions, biased questions
What is the difference between CTT and IRT?
CTT assumes just two components to measurement (true score and error) and that all items are equally able to measure the target in question.
IRT examines each item specifically. It is very powerful for understanding how well an item captures a latent trait, and it can reveal different levels of the latent trait being examined.
IRT incorporates considerations of item difficulty and discrimination. Can you describe what they mean in the context of IRT?
Difficulty refers to how hard an item is to complete, solve or comprehend
Discrimination refers to the degree to which an item differentiates between high and low levels of the construct. E.g. if the slope of the item’s curve is steep, the item is good at discriminating between different levels
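To make the two parameters concrete, here is a sketch that assumes the two-parameter logistic (2PL) model, one common IRT model; the flashcards do not say which model is meant. The function name and parameter values are illustrative only.

```python
import math


def prob_correct(theta: float, difficulty: float, discrimination: float) -> float:
    """2PL item curve: P = 1 / (1 + exp(-a * (theta - b))).

    theta = person's level of the latent trait,
    b = item difficulty, a = item discrimination (slope).
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


# When the trait level equals the item difficulty, the probability is 0.5.
print(prob_correct(1.0, 1.0, 2.0))

# Compare a steep item (a = 2.5) with a flat item (a = 0.5), both with b = 0,
# for a low-trait person (theta = -1) and a high-trait person (theta = 1).
low, high = -1.0, 1.0
steep_gap = prob_correct(high, 0.0, 2.5) - prob_correct(low, 0.0, 2.5)
flat_gap = prob_correct(high, 0.0, 0.5) - prob_correct(low, 0.0, 0.5)
print(steep_gap > flat_gap)  # the steep item separates the two people more
```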
List the common estimates of reliability.
- Test-retest reliability
- Parallel and Alternate Forms Reliability
- Internal consistency reliability (split-half, inter item correlation, Cronbach’s alpha)
- Inter-rater/ inter-scorer reliability
Describe Test-retest reliability
Estimate of reliability over time/ the consistency of a test over time
How? Correlate pairs of scores from the same people, on the same test, at different time points
Good for? Stable variables e.g. personality
Bad? Estimates tend to decrease as the time between testings increases
Not good for fluctuating variables e.g. mood
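The "How?" step can be sketched as a Pearson correlation between the two testing occasions. The scores are invented, and the helper `pearson_r` is written out here rather than taken from a library.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary to compute a correlation")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five people at time 1 and time 2.
time_1 = [20, 25, 30, 35, 40]
time_2 = [22, 24, 31, 33, 41]

test_retest = pearson_r(time_1, time_2)
print(round(test_retest, 2))
```

Each position in the two lists must belong to the same person, otherwise the pairs are meaningless.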
Describe Parallel and Alternate Forms Reliability
If the MEANS and VARIANCES of both versions of a test are equal, the forms are PARALLEL
If not, they are ALTERNATE
How? Correlate the scores of the same people measured by the different forms
E.g. to test whether cognitive function improves over time, use two different versions of the Montreal Cognitive Assessment (MoCA), so the patient can’t use answers from the first version to help them in the second
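A rough sketch of the parallel/alternate distinction and the forms correlation. The function names, the data and the exact-equality check (within a tiny tolerance) are assumptions for illustration; with real data the means and variances would be compared statistically rather than exactly.

```python
from math import isclose, sqrt
from statistics import mean, variance


def forms_label(form_a, form_b, tol=1e-9):
    """'parallel' if means and variances match (within tol), else 'alternate'."""
    same_mean = isclose(mean(form_a), mean(form_b), abs_tol=tol)
    same_variance = isclose(variance(form_a), variance(form_b), abs_tol=tol)
    return "parallel" if same_mean and same_variance else "alternate"


def forms_reliability(form_a, form_b):
    """Pearson correlation between the same people's scores on two forms."""
    mean_a, mean_b = mean(form_a), mean(form_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(form_a, form_b))
    ss_a = sum((a - mean_a) ** 2 for a in form_a)
    ss_b = sum((b - mean_b) ** 2 for b in form_b)
    return cov / sqrt(ss_a * ss_b)


# Hypothetical scores of the same five people on three forms.
form_a = [10, 12, 14, 16, 18]
form_b = [12, 10, 14, 18, 16]   # same mean and variance as form_a
form_c = [11, 15, 14, 20, 25]   # different mean and variance

print(forms_label(form_a, form_b))        # parallel
print(forms_label(form_a, form_c))        # alternate
print(forms_reliability(form_a, form_b))  # 0.8
```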
Describe split-half (internal consistency)
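How? Correlate equivalent halves of the one test with each other, then generalise the half-test reliability to the full-test internal consistency reliability using the Spearman-Brown formula. By changing the ‘n’ in the formula (the factor by which the test length changes; n = 2 when going from a half to the full test), you can predict how lengthening or shortening the test would change its reliability.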
S-B predicted reliability = (n × half-correlation) / (1 + (n − 1) × half-correlation)
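The Spearman-Brown formula as a short Python sketch. The function name and the example correlation of 0.6 are illustrative; `n` is the factor by which the test length changes, so `n = 2` steps a half-test correlation up to the full test.

```python
def spearman_brown(r: float, n: float = 2.0) -> float:
    """Predicted reliability = (n * r) / (1 + (n - 1) * r)."""
    denominator = 1.0 + (n - 1.0) * r
    if denominator == 0:
        raise ValueError("formula is undefined for this combination of r and n")
    return (n * r) / denominator


half_correlation = 0.6  # hypothetical correlation between the two halves

full_test = spearman_brown(half_correlation, n=2)   # approximately 0.75
tripled = spearman_brown(half_correlation, n=3)     # longer test, higher estimate
print(round(full_test, 2), round(tripled, 2))
```

With `n = 1` the function returns the correlation unchanged, which is a quick sanity check.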
Describe inter-item consistency/ correlation (internal consistency)
The degree of relatedness of the items on a test, i.e. its HOMOGENEITY. Basically, you take the average of the inter-item correlations
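Averaging the inter-item correlations can be sketched as follows. The data layout (one list of scores per item, same people in the same order) and the function names are assumptions for illustration, and the responses are invented.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def mean_inter_item_correlation(items):
    """Average Pearson correlation over every pair of items."""
    pairs = list(combinations(items, 2))
    if not pairs:
        raise ValueError("need at least two items")
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)


# Hypothetical responses of five people to three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(mean_inter_item_correlation(items), 2))
```

A higher average suggests the items are more homogeneous, i.e. they appear to be measuring the same thing.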