Chapter 2 Principles of Language Testing Flashcards
Authenticity
The degree of correspondence between the characteristics of a given language test task and the features of a target language task. An authentic test uses language that is as natural as possible, meaningful topics, and real-world tasks
Washback Effect
The effect of testing on teaching and learning, including the effects of an assessment on teaching and learning prior to the assessment itself
Practicality
whether a test can feasibly be implemented in a specific context, given the time, cost, and resources available
Reliability
The stability and dependability of a test: whether it measures individual competence in a consistent and dependable manner
Classical Test Theory (Reliability)
X (observed score) = T (true score) + E (error)
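A minimal sketch of the formula, using hypothetical numbers: one learner with an assumed true score of 80 sits the same test four times, and each sitting carries a different error. The error values are invented for illustration and chosen to sum to zero, reflecting the assumption that error averages out over repeated measurement.

```python
def observed_score(true_score: float, error: float) -> float:
    """X = T + E: the observed score is the true score plus error."""
    return true_score + error

# Hypothetical values, for illustration only.
true_score = 80.0
errors = [-3.0, 2.0, 1.0, 0.0]  # e.g. fatigue, a lucky guess, a noisy room, none

# One observed score per sitting.
observed = [observed_score(true_score, e) for e in errors]

# Because these errors sum to zero, the mean observed score equals the true score.
mean_observed = sum(observed) / len(observed)

print(observed)
print(mean_observed)
```

No single sitting is guaranteed to show the true score; only the observed score X is ever seen directly.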
Student-Related Reliability
The most common learner-related issue in reliability is caused by temporary illness, fatigue, a bad day, anxiety, and other physical or psychological factors, which may make an observed score deviate from one’s true score
Test Administration Reliability
Unreliability may also result from the conditions in which the test is administered (room lighting, classroom conditions)
Rater Reliability
Human error, subjectivity, and bias may enter into the scoring process
Inter-rater reliability
Consistency of scoring across two or more scorers. Unreliability occurs when the scorers yield inconsistent scores of the same test, possibly through failure to follow the scoring criteria, inattention, or even preconceived biases
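One simple way to check consistency between two scorers is to count how often they agree. The sketch below assumes two hypothetical raters scoring the same six essays on a 1 to 5 scale; the scores and the function name `agreement_rate` are invented for illustration, and agreement rates are only one of several possible measures.

```python
def agreement_rate(scores_a, scores_b, tolerance=0):
    """Share of tests on which two raters' scores differ by at most `tolerance` points."""
    if len(scores_a) != len(scores_b):
        raise ValueError("Both raters must score the same set of tests")
    matches = sum(1 for a, b in zip(scores_a, scores_b) if abs(a - b) <= tolerance)
    return matches / len(scores_a)

# Hypothetical scores from two raters on the same six essays.
rater_a = [4, 3, 5, 2, 4, 3]
rater_b = [4, 3, 4, 2, 5, 3]

exact = agreement_rate(rater_a, rater_b)                  # identical scores only
adjacent = agreement_rate(rater_a, rater_b, tolerance=1)  # within one point

print(exact, adjacent)
```

Here the raters give identical scores on four of the six essays and are never more than one point apart.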
Intra-rater reliability
Consistency of a single scorer across tests. Unreliability is a common occurrence for classroom teachers because of unclear scoring criteria, fatigue, bias toward particular good and bad students, or simple carelessness
Test-Retest reliability
A reliability estimate that is based on testing the same examinees twice with the same test and then correlating the results
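The correlation step can be sketched as follows, assuming the Pearson correlation coefficient is the statistic used and that five hypothetical examinees took the same test twice. The scores and the helper name `pearson_r` are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five examinees on two administrations.
first_administration = [70, 82, 65, 90, 78]
second_administration = [72, 80, 68, 88, 79]

r = pearson_r(first_administration, second_administration)
print(round(r, 3))
```

A coefficient close to 1 suggests that examinees kept roughly the same relative standing across the two administrations; a low coefficient suggests the scores are unstable.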
Validity
extent to which inferences made from assessment results are appropriate, meaningful, and useful in terms of the purpose of the assessment
Face validity
the degree to which a test looks right and appears to measure the knowledge or abilities it claims to measure, based on the subjective judgment of the examinees who take it
Content Validity
A conceptual or non-statistical validity based on a systematic analysis of the test content to determine whether it includes an adequate sample of the target domain to be measured
Criterion-related validity
the extent to which the “criterion” of the test has actually been reached, typically shown by comparing test results with another measure of the same criterion