Al psychometrics notes Flashcards
What are the 2 processes that are part of test standardization?
1. Uniform administration and scoring procedures
2. Development of test norms
What does reliability refer to?
A test’s consistency
What does reliability provide no information on?
What is being measured
What does classical test theory propound?
That an obtained test score (X) is composed of two additive and independent components:
- True score (T): actual status on the attribute
- Error (E): random
What is the ideal (but unobtainable) formula for reliability?
True variance/observed variance
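Written out in symbols, the classical test theory decomposition and the ideal reliability ratio look roughly like this. The notation (sigma squared for variance, r_xx for the reliability coefficient) is an assumption added here for illustration; the notes give the formula only in words.

```latex
% Obtained score = true score + random error
X = T + E

% Because T and E are assumed independent, their variances add
\sigma_X^2 = \sigma_T^2 + \sigma_E^2

% Reliability = true variance / observed variance
r_{xx} = \frac{\sigma_T^2}{\sigma_X^2} = 1 - \frac{\sigma_E^2}{\sigma_X^2}
```

The ratio is unobtainable in practice because true scores are never observed directly, which is why reliability has to be estimated from the consistency of obtained scores.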
What do reliability estimates assume?
1. Variability that is consistent is true variance
2. Variability that is inconsistent is random error
What is the range of a reliability coefficient?
0.0-1.0
What does a reliability coefficient of 0.0 indicate?
That all variability obtained in a test’s scores is attributable to measurement error
What does a reliability coefficient of 1.0 indicate?
That all variability obtained in a test’s scores reflects true score variability
What is the difference between the reliability coefficient and other correlation coefficients?
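It is never squared: it is interpreted directly as a proportion of variance (e.g., a reliability of .80 means 80% of the variability in obtained scores reflects true scores)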
What does the reliability coefficient estimate?
The proportion of variability in obtained test scores that reflects true scores
What are the 5 main types of reliability?
- Test-retest reliability
- Alternate-forms reliability
- Split-half reliability
- Coefficient Alpha
- Inter-rater reliability
What is test-retest reliability?
The test is given to the same group twice, and then the two sets of scores are correlated
What is the coefficient given from test-retest reliability?
A coefficient of stability (tests the degree of stability over time)
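A minimal sketch of how a coefficient of stability could be computed, assuming the usual Pearson correlation is used to correlate the two sets of scores. The function name `pearson_r` and the score lists are hypothetical and made up for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five examinees at two administrations
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 14, 19, 21]

# Coefficient of stability: correlation between the two administrations
stability = pearson_r(time1, time2)
print(round(stability, 2))
```

The closer the coefficient is to 1.0, the more consistently examinees keep their relative standing from one administration to the next.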
What is the source of measurement error in test-retest reliability?
Time sampling error
Random factors that occur between the two test administrations, such as fluctuations in examinees' states (e.g., anxiety)
What kind of tests is test-retest reliability most suitable for?
Aptitude tests - a stable characteristic
What kind of tests is test-retest reliability least suitable for?
Tests of mood - fluctuates over time
What do you do in alternate-forms reliability? What does it indicate?
2 equivalent tests are administered to the same group, and then the two sets of scores are correlated
It indicates the consistency of responding to different item samples
What is the coefficient derived from alternate forms reliability?
Coefficient of equivalence
In alternate-forms reliability, when the forms are administered at different times the test also measures consistency over time - what is the reliability coefficient derived?
Coefficient of equivalence and stability
What kind of error is associated with alternate-forms reliability?
Content sampling
The interaction between different examinees' knowledge and the different content assessed by the items in each form (e.g., Form A matches one examinee's knowledge better than Form B does)
Alternate-form reliability is a rigorous form of reliability, but what is the problem with it?
It is difficult to develop truly equivalent forms
When is alternate-form reliability inappropriate?
When the attribute is likely to fluctuate over time
In what two ways are split-half reliability and coefficient alpha similar?
1. Both involve administering a test once to a single group
2. Both yield a reliability coefficient called a “coefficient of internal consistency”
How is split-half reliability conducted?
The test is split into halves so that each examinee has two scores. The scores on the two halves are then correlated.
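The procedure can be sketched as follows, assuming an odd-even split (the notes do not say how the halves are formed, so this choice is an assumption) and a Pearson correlation between the two half scores. The function names and the response matrix are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_r(item_scores):
    """Correlate odd-item and even-item half scores.

    item_scores holds one list of item scores per examinee.
    """
    # Items 1, 3, 5, ... (indices 0, 2, 4, ...) form the first half
    odd_totals = [sum(row[0::2]) for row in item_scores]
    # Items 2, 4, 6, ... (indices 1, 3, 5, ...) form the second half
    even_totals = [sum(row[1::2]) for row in item_scores]
    return pearson_r(odd_totals, even_totals)


# Hypothetical right/wrong (1/0) responses: 5 examinees, 6 items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

half_r = split_half_r(responses)
print(round(half_r, 2))
```

Note that `half_r` is the correlation between two half-length tests, not the reliability of the full-length test.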
What is a problem with split half reliability?
It yields a coefficient based on only half the test length (remember that reliability decreases as the length of a test decreases).
A problem with split-half reliability is that it is derived from only half a test, what does this mean?
It underestimates true reliability
Split-half reliability underestimates true reliability, how is this corrected?
Using the Spearman-Brown prophecy formula
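As a small worked sketch, the general Spearman-Brown formula predicts the reliability of a test whose length is multiplied by a factor n: r_new = n * r / (1 + (n - 1) * r). For a split-half correction n = 2, since the full test is twice as long as each half. The function name and the example value of .70 are hypothetical.

```python
def spearman_brown(r, length_factor=2.0):
    """Predicted reliability when test length is multiplied by length_factor.

    With the default factor of 2 this is the split-half correction:
    2r / (1 + r).
    """
    return (length_factor * r) / (1 + (length_factor - 1) * r)


# Hypothetical split-half correlation of .70
corrected = spearman_brown(0.70)
print(round(corrected, 2))  # 0.82
```

The corrected value is higher than the half-test correlation, which is the sense in which the uncorrected split-half coefficient underestimates reliability.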
What is the full name for the coefficient alpha?
Cronbach’s coefficient alpha
How is the Cronbach’s coefficient alpha derived?
The test is administered to one group of examinees at a single time point. The formula then determines the average inter-item consistency. The result is equivalent to the average reliability obtained from all possible split halves of the test.
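A minimal sketch of the computation, assuming the standard variance-based form of the formula: alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the response matrix are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha from one list of item scores per examinee."""
    k = len(item_scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees (sample variance)
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of examinees' total scores (same variance definition)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical right/wrong (1/0) responses: 5 examinees, 6 items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))  # 0.792
```

Only a single administration is needed, because the estimate is built entirely from how the items vary together within that one set of responses.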
The coefficient alpha is conservative, and consequently can be considered a X of the test’s reliability
lower bound estimate