Ch. 5 Justifying the Use of Language Assessments (AUA) Flashcards
What are some consequences that stakeholders may encounter from use of language assessments?
Stakeholders may face either beneficial or detrimental consequences. Stakeholders: test developer, test user (decision maker), and the individuals, programs, institutions, or organizations affected by the intended consequences. Beneficial: intended consequences that follow from useful, valid decisions. Detrimental: usually unintended consequences (no one sets out to cause harm), such as negative washback, invalid decisions, and false positives and false negatives.
What are the different elements of language assessment use?
- Consequences
- Decisions
- Interpretations about test taker’s ability
- Procedures (observing and recording performance)
- Inferential links (Performance → Consequences)
(Series of claims backed by warrants for all of the above)
Discuss accountability issues in language assessment.
Test developers and users need to demonstrate to stakeholders that the intended uses of their assessments are justified.
(Would not want to use an assessment inappropriately and cause any negative consequences that impact lives)
If a test is accused of being unfair or biased…
the AUA needs to include claims and warrants that explain which items were included in or excluded from the test, and why.
To whom do we need to be accountable?
stakeholders, individuals, institutions directly affected by test use
test takers, teachers, parents, schools, employers, etc.
Justification of Assessment Use
- Articulate the specific claims & warrants in AUA that support the links between consequences and performance
- Collect relevant evidence (backing) to support AUA claims and warrants
Why?
- Guides development process to go smoothly, and ensures quality control
- Developers and decision makers are held accountable to those who will be affected by the decisions from performance.
What practical reasoning underlies the AUA?
Toulmin’s terminology (2003)
Data
Claim
Warrant (implicit)
Warrant backing
Rebuttal
Rebuttal backing
Counterclaim
AUA: Describe the Data, Claims, Warrants, Rebuttals, and Backing
- Data - information on which a claim is based (test taker’s performance)
- Claim - inferences made on the basis of the data: the outcome of the assessment and the qualities of that outcome (score, interpretations)
- Warrants - explicit statements that elaborate claims
- Rebuttals - challenge or reject the claims
- Backing - support for warrants or rebuttals
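As a study aid, the relationships among these five terms can be sketched as a small data structure. This is only an illustrative model, not part of the AUA framework itself: the class names, field names, the `unbacked` helper, and the example statements are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Warrant:
    statement: str                                    # explicit statement elaborating the claim
    backing: List[str] = field(default_factory=list)  # evidence supporting the warrant


@dataclass
class Rebuttal:
    statement: str                                    # challenge to (or rejection of) the claim
    backing: List[str] = field(default_factory=list)  # evidence supporting the rebuttal


@dataclass
class Claim:
    data: str                                         # information the claim is based on
    statement: str                                    # inference made on the basis of the data
    warrants: List[Warrant] = field(default_factory=list)
    rebuttals: List[Rebuttal] = field(default_factory=list)

    def unbacked(self) -> List[str]:
        """Return the warrant statements for which no backing has been collected."""
        return [w.statement for w in self.warrants if not w.backing]


# Hypothetical example: one claim with two warrants, only one of them backed.
claim = Claim(
    data="Test taker's performance on a speaking task",
    statement="The score reflects the test taker's speaking ability",
    warrants=[
        Warrant("Raters were trained on the scoring rubric", backing=["rater training log"]),
        Warrant("Tasks correspond to the TLU domain"),
    ],
)
print(claim.unbacked())
```

Listing the warrants that still lack backing mirrors the second step of justification: collecting evidence for each warrant in the AUA.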
What are the four general claims and warrants in an AUA?
Claim 1 (Intended consequences): The consequences of using an assessment and of the decisions made are beneficial to all stakeholders.
Claim 2 (Decisions): The decisions that are made on the basis of the assessment-based interpretations are value-sensitive and equitable.
Claim 3 (Interpretations): The interpretations about the ability to be assessed are (a) meaningful, (b) impartial, (c) generalizable, (d) relevant, (e) sufficient.
Claim 4 (Assessment record): Assessment records are consistent across different assessment tasks, different aspects of the assessment procedure, and across different groups of test takers.
For Claim 1 (Intended Consequences), discuss the consequences to different stakeholders (i.e. test takers, teachers and organizations/society)
For test takers, these tests affect (a) the way they prepare for and take the exam, (b) feedback they receive (interpretations, permanent records, relevance), (c) decisions made based on these results (low stakes to high stakes)
For teachers, washback is a concern (e.g., teaching to the test, which can be good or bad), as are decisions about resources for programs.
For education systems and society in general, washback is a concern as well, because assessment use can shape the direction of education and the opportunities open to young people.
For Claim 2 (Decisions), discuss values sensitivity and equitability.
Values sensitivity is the degree to which the use of the assessment and the resulting decisions take into consideration existing educational values and legal requirements.
Claim 2 needs to include warrants specifically about these values and legal requirements of the region.
Equitability refers to the degree that different test takers who possess equivalent levels of ability have equal chances of being classified into the same group.
Claim 2 needs to explicitly state that potential sources of bias were considered and eliminated.
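One rough way to picture equitability is to compare how often test takers at the same ability level receive the same classification in different groups. The sketch below is an illustration only: the function name, the record layout, and the data are hypothetical, and it assumes ability level is known independently of the test, which is rarely true in practice.

```python
from collections import defaultdict


def pass_rates_by_group(records, ability_level):
    """Compute the pass rate per group among test takers at one ability level.

    records: iterable of (group, ability_level, passed) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [number passed, total]
    for group, level, passed in records:
        if level == ability_level:
            counts[group][1] += 1
            if passed:
                counts[group][0] += 1
    return {group: passed / total for group, (passed, total) in counts.items()}


# Hypothetical records: (group, ability level, passed?)
records = [
    ("A", "intermediate", True),
    ("A", "intermediate", True),
    ("A", "intermediate", False),
    ("A", "intermediate", True),
    ("B", "intermediate", True),
    ("B", "intermediate", False),
    ("B", "intermediate", False),
    ("B", "intermediate", True),
    ("B", "advanced", True),
]
rates = pass_rates_by_group(records, "intermediate")
print(rates)  # {'A': 0.75, 'B': 0.5}
```

A gap like this would not prove bias on its own, but it is the kind of evidence that could back a rebuttal to the equitability warrant and prompt a closer look.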
For Claim 3 (Interpretations), discuss what is meant by meaningfulness, impartiality, generalizability, relevance and sufficiency.
Meaningfulness: Does the interpretation provide stakeholders with information about the ability to be assessed, and is it conveyed in terms they can understand and relate to?
(Test taker’s response on test must engage the ability in question, so the warrants should show how the test results are meaningful in a given situation)
Impartiality: Free from bias. Format, content, familiarity, disabilities, equal access
(Warrants should show that interpretations are impartial)
Generalizability: The assessed language items correspond to the target language use (TLU) domain in question.
(Warrants should show that the test taker’s responses to items are representative of how the test taker would respond in the corresponding TLU domain)
Relevance: How well does the interpretation provide the information decision makers need to make the correct decision?
(Warrants should discuss the needs of the test user and the decision maker and how they relate to the inferences and the test itself)
Sufficiency: Does the interpretation provide enough information for the decision maker? This is an extension of relevance and depends on knowing the decision maker’s comfort zone for information.
(Warrants should emphasize that the test developer needs to work with test users and decision makers to understand their comfort zone, that is, what counts as a sufficient interpretation for the decision to be made)
For Claim 4 (Assessment records), discuss consistency across different assessment tasks, procedures and groups of test takers.
Consistency of scores: If the same test were administered to the same group of individuals on two different occasions, or in two different settings, their scores should be essentially the same. (Same information about ability across different procedures, tasks, times, raters, etc.)
(Reliability is important for Claim 4)
(Warrants would show how the test and interpretations are reliable)
Take into consideration rebuttals and unintended consequences, such as false positives and false negatives.
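Consistency of scores across two occasions is often summarized with a correlation between the two sets of scores (a test-retest estimate). The sketch below uses the Pearson correlation as one common choice; the flashcards do not name a specific statistic, and the function name and scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2 scores)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five test takers on two occasions
first_occasion = [70, 82, 65, 90, 75]
second_occasion = [72, 80, 66, 91, 73]
r = pearson(first_occasion, second_occasion)
print(round(r, 3))
```

A value close to 1 would count as backing for a consistency warrant, while a low value would support a rebuttal. A correlation only shows that test takers are ranked similarly on both occasions, so it is one piece of evidence rather than a complete reliability argument.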
Where does the issue of fairness and bias fall into the picture of AUA? (Which parts of which claims?)
Equitability of treatment: (source - access to test, administration, recording results)
Claim 4 (Assessment record) - Consistency warrants
Claim 3 (Interpretations) - Impartiality warrants
Claim 2 (Decisions) - Equitability warrants
Claim 1 (Consequences) - Benefit warrants
Absence of bias: (source - format, content, inconsistent scoring)
Claim 3 - Impartiality warrants
Claim 4 - Consistency warrants
Fair use: (are individuals fully informed about how the decisions will be made and used?)
Claim 3 - Sufficiency warrant
Claim 3 - Relevance warrant
Claim 2 - Equitability warrant
Claim 1 - Benefit warrant