Assessment Flashcards

Question 1

Q

Practicality

Answer

A

Time-efficient

Not excessively expensive

Question 2

Q

★ Reliability

Answer

A

No errors in scoring
Consistent and dependable: a reliable test should yield similar results.
Subjectivity

Question 3

Q

Inter-rater reliability

Answer

A

Two Ts evaluate by using the same rating scale.
Failure stems from lack of scoring criteria.

Subjectivity of the raters

Subjectivity doesn’t enter into the scoring process.

Question 4

Q

Why intra-rater reliability is violated, and the solution

Answer

A

Violation of such reliability can occur in cases of unclear scoring criteria, fatigue, or bias.

*solution: careful specification of an analytic scoring instrument can increase both inter- and intra-rater reliability.

Question 5

Q

test reliability

Answer

A

items that have more than one correct answer

Question 6

Q

student-related reliability

Answer

A

temporary illness, fatigue

Question 7

Q

Validity

Answer

A

Test measures exactly what it is supposed to measure.

Question 8

Q

Authenticity

Answer

A

lg is natural, contextualized items,
includes meaningful, relevant, interesting topics
simulates real-world tasks
provides some thematic organization to items through episodes.
e.g.) reading passages selected from real-world sources that test-takers are likely to encounter /
listening comprehension sections feature natural lg with hesitations, white noise, and interruptions.

Topics and situations are interesting and relevant to my life.
Tasks replicate, or clearly approximate, real-world tasks.

Question 9

Q

Washback

Answer

A

formative
Give learners feedback that enhances their lg development.
How test influences both teaching and learning

Ts can provide information that washes back to Ss in the form of useful diagnoses of strengths and weaknesses.

I expected the teacher to go over the test and give “advice” on what I should focus on in the near future.

No “feedback or comments” from the teacher were given.

Question 10

Q

How to increase washback?

Answer

A

to comment generously and specifically on test performance.
“comments and feedback”
Letter grades and numerical scores give no information of intrinsic interest to the S.

Formative tests, by definition, provide washback in the form of information to the learner on progress and goals.

Informal assessment: T provides interactive feedback -> washback increases
Formal assessment: T provides information on Ss’ progress toward goals -> washback increases

Question 11

Q

Criterion validity: definition and examples of its two types

Answer

A

Validity is measured by comparing a new test with an existing test: the extent to which the criterion of the test has actually been reached.

1) Predictive validity: e.g.) placement tests, admissions assessment batteries, achievement tests designed to determine Ss’ readiness to move on to another unit.
2) Concurrent validity: e.g.) high score -> actual proficiency in the lg.

Question 12

Q

Formative test

Answer

A

Formative tests, by definition, provide washback in the form of information to the learner on progress and goals.

Evaluating Ss in the process of forming their competencies and skills.
The delivery (by the T) and internalization

All kinds of informal assessment are formative.

Gather information on the developmental “process” of their speaking process
Assess their performance regularly

Question 13

Q

Summative test

Answer

A

Measure what Ss have grasped at the end of a course or unit of instruction.
* Evaluate only product not process

Summative test fails to provide crucial info.
(cf. a formative test provides information)

One major test at the end of semester

Question 14

Q

Norm-referenced

Answer

A

Purpose: to place test-takers along a mathematical continuum in rank order
Primary concern: practicality, reliability, validity

Such tests must have fixed, predetermined responses.

Use the test results to award scholarships to the top 10%.

Question 15

Q

Criterion-referenced

Answer

A

The test is criterion-referenced, assessing the extent to which the students achieved the goals of the class.
Primary concern: authenticity, washback
(The goal is to use the ability in real life. That is, authenticity is the degree of match between the test and real life; washback concerns the feedback side.)

Give test-takers feedback in the form of grades.
The distribution of Ss’ scores across a continuum may be of little concern as long as the instrument assesses appropriate objectives.

The Ss who get over 10 out of 16 will pass the conversation course.
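
The difference between the two kinds of decision can be sketched in a few lines. This is only an illustration: the function names and the sample scores are hypothetical, while the cut score (over 10 out of 16) and the top 10% rule come from the cards. Ties at the award boundary are not handled.

```python
import math

def criterion_referenced_pass(scores, cut_score=10):
    """Return the Ss whose score is over the cut score (criterion-referenced)."""
    return {name for name, score in scores.items() if score > cut_score}

def norm_referenced_top(scores, fraction=0.10):
    """Return the top fraction of Ss by rank order (norm-referenced)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_awards = max(1, math.ceil(len(ranked) * fraction))
    return set(ranked[:n_awards])

# Hypothetical scores out of 16 for ten Ss (not from the source).
scores = {"S1": 15, "S2": 12, "S3": 11, "S4": 10, "S5": 9,
          "S6": 14, "S7": 8, "S8": 13, "S9": 7, "S10": 16}

passed = criterion_referenced_pass(scores)  # depends only on the cut score
awarded = norm_referenced_top(scores)       # depends on everyone else's scores
```

In the criterion-referenced case any number of Ss can pass; in the norm-referenced case the number of awards is fixed by the rank order, however high or low the scores are.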

Question 16

Q

Test administration reliability

Answer

A

Classroom conditions for the test are equal for all students.

ex) aural comprehension test -> street noise

Question 17

Q

Content validity

Answer

A

1) The tests assess real course objectives, direct testing
2) It requires test-takers to perform the behavior that is being measured.

Items focus on previously practiced in-class reading skills.

Question 18

Q

Construct validity

Answer

A

e.g.) conducting an oral interview
major components of oral proficiency: pronunciation, fluency, grammatical accuracy, vocab use, socio-linguistic appropriateness

e.g.) a simple written vocab quiz, covering the content of a recent unit -> have Ss correctly define a set of words.
However, if the objective is communicative use of words, the writing of definitions certainly fails to match a construct of communicative lg use.

Question 19

Q

Face validity

Answer

A

Whether the test looks as if it is measuring what it is supposed to measure.

Tests that relate to their course work./ familiar task/ directions are clear

The printing was too small. I had to read five pages in one hour.

Lots of tasks were unfamiliar
I’ve never done those kinds of tasks in class.
material that she had not dealt with in class
It seemed like a writing test rather than a listening test.

The exam “looks like” one that high school Ss normally take.

Question 20

Q

needs analysis (needs assessment)

Answer

A

process of assessing the needs of Ss

Before designing a course, it is necessary to make decisions about what will be taught and how it will be taught.

survey and interview

Info about what my Ss needed to learn or change, their learning styles, interests, proficiency levels, etc.

Based on the info, I decided on the course objectives, contents and activities.

Question 21

Q

a proficiency test/ standardized test

Answer

A

not linked to any particular textbook or specific course of study. (not limited to single skill in the lg. Rather, it tests “overall proficiency”.)

Summative and norm-referenced: provide results in the form of a single score, measure performance against a norm (w/ equated scores and percentile rank)

Does not provide diagnostic feedback

Question 22

Q

summative feedback

Answer

A

Ss will receive a total score for the reading section

Question 23

Q

constructed-response item

Question 24

Q

Item Response Distribution

Answer

A

a certain wrong alternative was chosen by a greater number of high group students than low group students.
more students chose the wrong alternative than those who chose the correct answer.
A certain wrong alternative did not work as a distracter.
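
A minimal sketch of how such a response distribution might be tabulated and checked. The function names, the four-option layout, and the sample responses are hypothetical; treating an option chosen by no one as a distractor that "did not work" is an assumed reading of the card.

```python
from collections import Counter

def response_distribution(high_group, low_group, options="ABCD"):
    """Count how many Ss in each group chose each option."""
    high_counts = Counter(high_group)
    low_counts = Counter(low_group)
    return {opt: (high_counts[opt], low_counts[opt]) for opt in options}

def flag_problems(distribution, key):
    """Flag wrong alternatives that show one of the three problems on the card."""
    flags = []
    key_total = sum(distribution[key])
    for opt, (high, low) in distribution.items():
        if opt == key:
            continue
        if high > low:
            flags.append(f"{opt}: chosen by more high-group than low-group Ss")
        if high + low > key_total:
            flags.append(f"{opt}: chosen more often than the key")
        if high + low == 0:
            flags.append(f"{opt}: chosen by no one, so it does not work as a distractor")
    return flags

# Hypothetical responses to one item whose key is "A" (ten Ss per group).
high = list("AAACCCCAAB")
low = list("AABBBCDDBB")
dist = response_distribution(high, low)
flags = flag_problems(dist, key="A")
```

Here option C attracts more high-group than low-group Ss, so it is the only alternative flagged for review.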

Question 25

Q

the reliability of the test ***

Answer

A

Item 18 deteriorates the internal consistency of the test.

When low-ability group Ss answer correctly more often than high-ability group Ss

Question 26

Q

Item Facility

Answer

A

Item Difficulty
The extent to which an item is easy or difficult for the proposed group of test-takers
Shows the proportion of students who chose the correct answer

Mr. Park divided the number of Ss who correctly answered a particular item by the total number of Ss who took the test.
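
The calculation described above can be written as a short function. This is a sketch: the function name and the sample numbers are hypothetical.

```python
def item_facility(n_correct, n_total):
    """Proportion of test-takers who answered the item correctly (0.0 to 1.0)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_correct <= n_total:
        raise ValueError("n_correct must be between 0 and n_total")
    return n_correct / n_total

# Hypothetical item: 13 of 20 Ss answered correctly.
if_value = item_facility(13, 20)
```

A value near 1.0 means the item was very easy for this group, and a value near 0.0 means it was very difficult.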

Question 27

Q

Item Discrimination

Answer

A

The extent to which an item differentiates btw high- and low-ability test-takers

Item 20 shows the highest discrimination among the five items.

Item 2 does not distinguish the upper level Ss from the lower level Ss.

e.g.) On some item, strong and weak Ss got the same score -> the item has poor ID, because it didn’t discriminate btw the two groups. INTERNAL CONSISTENCY

many upper group students incorrectly chose option C. (Item 2 does not distinguish the upper level Ss from the lower level Ss. )
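
A sketch of one common way to compute ID, the upper-lower index: the difference in correct answers between the two groups divided by the size of one group. The formula is not stated on the card and is assumed here, as are the function name, the equal group sizes, and the sample numbers.

```python
def item_discrimination(high_correct, low_correct, group_size):
    """Upper-lower discrimination index for two equal-sized comparison groups.

    Ranges from -1.0 to 1.0; 0.0 means the item does not separate the groups.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return (high_correct - low_correct) / group_size

# Hypothetical items, ten Ss in each group.
good_item = item_discrimination(7, 2, 10)      # high group clearly does better
flat_item = item_discrimination(5, 5, 10)      # both groups score the same
reversed_item = item_discrimination(2, 6, 10)  # low group does better
```

A negative value corresponds to the case where the low group outperforms the high group on the item, which works against the internal consistency of the test.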

Question 28

Q

Distractor

Answer

A

no one from the upper group and lower group chose option B.

Distractor a and b seem to be fulfilling their function of attracting some attention from lower-ability Ss.

Question 29

Q

portfolios

Answer

A

collections of Ss’ work

useful for assessing student performance: 1. Ss have ownership over the process of learning, 2. Portfolios allow T to pay attention to Ss’ progress as well as achievement.

Question 30

Q

Alternatives

Answer

A

portfolios
conference
Journals
self-assessment/ peer-assessment
observation

Question 31

Q

Alternatives

Answer

A

portfolios
performance-based assessment
conference
Journals
self-assessment/ peer-assessment
observation

Question 32

Q

performance-based assessment

Answer

A

The T observes the performance

The task is evaluated through “direct observation” by the T.

Question 33

Q

performance-based assessment

Answer

A

The T observes the performance
The task is evaluated through “direct observation” by the T.

The task calls for the integration of language skills.

Question 34

Q

analytic rating scales

Answer

A

provides diagnostic information

Question 35

Q

holistic rating scales (holistic scoring method)

Question 36

Q

discrete point test

Answer

A

assessing one point at a time
On the assumption that lg can be broken down into component parts and that those parts can be tested successfully.

e.g.) grammar and vocab items in multiple choice format./ Large scale standardized entrance

Question 37

Q

Types of integrative test and what integrative tests emphasize

Answer

A

Cloze test
Dictation

emphasizing communication and authenticity / communicative competence

Question 38

Q

Cloze test types / characteristics

Answer

A

Fixed-ratio cloze: Every nth word is deleted in a text

Rational-deletion cloze: Words are deleted in a text on a rational basis (eg. prepositions, sentence connectors) to assess specified grammatical or rhetorical categories.
Rational deletion gives more washback, expectancy grammar (ability to predict the next item)

Characteristics) integrative + indirect testing that measures reading ability.
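
Both deletion procedures can be sketched as follows. The function names, the blank marker, and the sample sentences are hypothetical. Words are split on whitespace only, so punctuation attached to a deleted word disappears with it.

```python
def fixed_ratio_cloze(text, n=7):
    """Delete every nth word; return the gapped text and the deleted words."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = "_____"
    return " ".join(words), answers

def rational_deletion_cloze(text, targets):
    """Delete only words from a chosen category (e.g. prepositions)."""
    words = text.split()
    answers = []
    for i, word in enumerate(words):
        if word.lower() in targets:
            answers.append(word)
            words[i] = "_____"
    return " ".join(words), answers

fixed_text, fixed_answers = fixed_ratio_cloze(
    "one two three four five six seven eight nine ten", n=3)
rational_text, rational_answers = rational_deletion_cloze(
    "The cat sat on the mat in the hall", targets={"on", "in"})
```

The fixed-ratio version needs no judgment from the test writer, while the rational version requires choosing the target category in advance.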

Question 39

Q

Rational deletion cloze

Answer

A

specific content words are chosen to be deleted
-> more washback, expectancy grammar (ability to predict the next item.)

scoring is more difficult in a rational deletion cloze than in a C-test.

Question 40

Q

Cloze test scoring method types: acceptable word method

Answer

A

a scoring method that accepts a suitable, grammatically and rhetorically acceptable word that fits the blank in the original text.
(high face validity)

Question 41

Q

C-test definition and characteristics

Answer

A

The second half of every other word is deleted
it has a higher scoring reliability
/ lower validity
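
A minimal sketch of the deletion rule. The function name and the sample sentence are hypothetical. Two details are assumptions rather than part of the card: for words with an odd number of letters the larger half is deleted, and one-letter words are left intact.

```python
def c_test(text, start=1):
    """Delete the second half of every other word, beginning at index `start`."""
    words = text.split()
    answers = []
    for i in range(start, len(words), 2):
        word = words[i]
        if len(word) < 2:
            continue  # assumption: one-letter words are left intact
        keep = len(word) // 2  # assumption: odd-length words lose the larger half
        answers.append(word[keep:])
        words[i] = word[:keep] + "_" * (len(word) - keep)
    return " ".join(words), answers

gapped, answers = c_test("Language testing measures communicative ability")
```

Because the first half of each word is given, there is usually only one possible completion, which is consistent with the higher scoring reliability noted on the card.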

Question 42

Q

Cloze test definition / which competence Ss use / types

Answer

A

an integrative measure not only of reading ability but of other lg abilities

Ss use linguistic competence (formal schemata) / background experience (content schemata) / strategic competence

Fixed-ratio deletion
Rational deletion

Question 43

Q

Cloze test scoring method types: exact word method

Answer

A

a scoring method that is limited to accepting the same word found in the original text
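
The exact word method and the acceptable word method can be compared with one small scoring function. This is only a sketch: the function name, the case-insensitive matching, and the sample answers are hypothetical, and in practice the list of acceptable alternatives is a judgment the T has to make.

```python
def score_cloze(responses, exact_answers, acceptable=None):
    """Score a cloze test, one point per blank.

    With `acceptable` omitted, only the original word counts (exact word method).
    With `acceptable` given as {blank_index: [other words]}, those words also
    count (acceptable word method).
    """
    score = 0
    for i, (response, exact) in enumerate(zip(responses, exact_answers)):
        allowed = {exact.lower()}
        if acceptable is not None:
            allowed |= {word.lower() for word in acceptable.get(i, ())}
        if response.strip().lower() in allowed:
            score += 1
    return score

# Hypothetical answer key and one S's responses.
exact_answers = ["big", "quickly", "on"]
responses = ["large", "quickly", "on"]

exact_score = score_cloze(responses, exact_answers)
acceptable_score = score_cloze(responses, exact_answers,
                               acceptable={0: ["large", "huge"]})
```

The same S scores lower under the exact word method, even though "large" fits the blank.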

Question 44

Q

dictation

Answer

A

It taps into grammatical and discourse competence

Question 45

Q

Subjective testing

Answer

A

Low reliability/ high validity
Constructed response items

e.g.) open-ended response*

Question 46

Q

Objective testing

Answer

A

It has predetermined fixed responses
High test reliability, Low validity
Selected response

e.g.) T/F, multiple choice items

Question 47

Q

Direct testing

Answer

A

It involves the test-taker in actually performing the target task.
High content validity

e.g.) Oral presentation, to test performance directly

Question 48

Q

Indirect testing

Answer

A

Learners are not performing the task itself but rather a task that is related in some way

Question 49

Q

Achievement tests

Answer

A

Limited to particular material and are offered after a course has focused on the objectives in question
Determine whether the course objectives have been met by the end of a given period of instruction

Summative: administered at the end of a lesson, unit, or term of study
Formative: when offering feedback about the quality of a learner’s performance

Question 50

Q

Placement tests

Answer

A

to place a student into a particular level of a lg curriculum or school
Diagnostic
Formative (correct/incorrect responses provide Ts with useful information on what may or may not be emphasized in the weeks to come)

Question 51

Q

Diagnostic tests

Answer

A

To diagnose aspects of a lg that a S needs to develop or that a course should include
-> Should elicit info on what Ss need to work on in the future. Therefore, a diagnostic test will typically offer more detailed, subcategorized information on the learner.

Question 52

Q

Constructed response items

Answer

A

A type of test item or task that requires test-takers to respond to a series of open-ended questions by writing, speaking, or doing something rather than choosing answers from a ready-made list.

Question 53

Q

computer adaptive testing

Answer

A

computer testing software that adjusts the questions depending on Ss’ performance on previous test items.
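
The adjusting behavior can be illustrated with a deliberately simplified sketch: move up one difficulty level after a correct response and down one level after an incorrect one. Operational adaptive tests typically rely on statistical item models, which are not shown here. The function name, the item bank, and the simulated S are all hypothetical.

```python
def run_adaptive_test(item_bank, answer_fn, n_items=4):
    """Administer items, adjusting difficulty after each response.

    item_bank: {difficulty_level: [item, ...]}
    answer_fn: callable that takes an item and returns True if answered correctly
    Returns a list of (level, item, correct) tuples.
    """
    levels = sorted(item_bank)
    pos = len(levels) // 2  # start in the middle of the difficulty range
    used = set()
    history = []
    for _ in range(n_items):
        level = levels[pos]
        remaining = [item for item in item_bank[level] if item not in used]
        if not remaining:
            break  # no unused items left at this level
        item = remaining[0]
        used.add(item)
        correct = answer_fn(item)
        history.append((level, item, correct))
        if correct:
            pos = min(pos + 1, len(levels) - 1)  # harder item next
        else:
            pos = max(pos - 1, 0)                # easier item next
    return history

# Hypothetical bank with three levels, and a simulated S who misses the hardest items.
bank = {1: ["e1", "e2"], 2: ["m1", "m2"], 3: ["h1", "h2"]}
history = run_adaptive_test(bank, lambda item: not item.startswith("h"))
levels_seen = [level for level, _, _ in history]
```

The simulated S alternates between levels 2 and 3, which suggests an ability somewhere between those two levels.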

Question 54

Q

Alternative tests (Performance-based assessment)

Answer

A

it requires Ss to perform, create, produce or do s/t.
use real-world contexts.
focus on process as well as products
tap into higher level thinking and problem-solving skills
provide info about both strengths and weaknesses of Ss
involve “an integration of lg skills”

Question 55

Q

Performance-based assessment: points of caution for the T

Answer

A

state the overall goal of the performance
specify the objectives (criteria) of the performance in detail
prepare Ss for performance in stepwise progress
use a reliable evaluation form, checklist.
treat performances as opportunities for giving feedback and provide that feedback systematically
if possible, utilize self- and peer- assessment judiciously.

Question 56

Q

Rubrics

Answer

A

validity ↑, reliability ↑
A rubric is a device used to evaluate open-ended, oral and written responses of learners
- usually composed of a set of criteria or competencies, each with descriptions of levels of expectation
- some rubrics involve scaling

Question 57

Q

Rubric-based assessment

Answer

A

Not only were rubrics beneficial for teachers, but Ss were also able to better focus their efforts, produce work of higher quality, earn better grades, and feel less anxious about assignments.

Pros) rubrics provide points for Ss to focus on and goals to pursue
Cons) simplicity (marking a few points on a chart and considering our job done!) may mask the depth and breadth of a S’s attainment.

Question 58

Q

Portfolios

Answer

A

a purposeful collection of Ss’ work that demonstrates their efforts, progress and achievements.

Advantages) foster intrinsic motivation, responsibility and ownership

promote S-T interaction w/ the T as a facilitator
facilitate critical thinking, self-assessment and revision process
offer opportunities for collaborative work w/peers

Question 59

Q

Points of caution for portfolios

Answer

A

-State objectives clearly
-Give guidelines on what materials to include
(a sample portfolio from a previous S can help stimulate some thoughts on what to include)
-Communicate assessment criteria to Ss. (self-assessment: formative)
-Provide positive, washback-giving final assessments
e.g.) a holistic scoring scale ranging from 1 to 6.
narrative evaluation of perceived strengths and weaknesses by the T

Question 60

Q

Journals

Answer

A

the most formative of all the alternatives in assessment
CONTENT VALIDITY ↑, WASHBACK ↑ ↑

a log of one’s thoughts, feelings, reactions, assessments, ideas, or progress toward goals, usually written w/ little attention to structure, form, or correctness.
“written conversation between T and Ss”

Question 61

Q

Dialogue journals

Answer

A

They imply an interaction between the T and the S through dialogues or responses
Advantages) practice in writing fluently, using writing as a thinking process, emphasizing a student’s own voice, affording a unique opportunity for a teacher to offer various kinds of feedback
* T becomes better acquainted with their Ss in terms of both their learning progress and their affective states
: meet Ss’ individual needs

Disadvantages) It’s difficult to set up criteria for evaluation

Cautions) Provide optimal feedback in your responses.
- cheerleading feedback, instructional feedback, in which you suggest strategies or materials, reality-check feedback -> help Ss set more realistic expectations for their lg abilities

Question 62

Q

self-assessment

/peer-assessment

Answer

A

autonomy, develop motivation

/ cooperative learning

Question 63

Q

Observation

Answer

A

observe Ss in the classroom
assess Ss w/o their awareness
naturalness of their linguistic performance is maximized
Can take the form of recording, checklist, rating scales

Question 64

Q

Holistic scoring

Answer

A

an approach that uses a “single general scale” to give a global rating for a test-taker’s lg production

Pros) fast evaluation
Cons) no diagnostic info is available (no washback potential), raters need to be extensively trained to use the scale accurately

Answer 63

A

An approach that separately rates a number of predetermined aspects (e.g. grammar, content, organization) of a test-taker’s lg production (e.g. writing)
=> enabling learners to home in on weaknesses and capitalize on strengths

PRACTICALITY ↓, in that more time is required for T to attend to details but ultimately Ss receive more information about their writing

Answer 64

A

e.g.) persuasive writing -> score by focusing only on the persuasive aspect

It allows both writer and evaluator to focus on function

Answer 65

A

Practicality ↑: time-saving scoring procedures, Reliability ↑: pre-determined correct responses

Multiple choice items are all receptive, or selective response items in that the test-taker chooses from a set of responses.

STEM: the body of the item that presents a stimulus
Options/ Alternatives - KEY

Answer 66

A

1) Design each item to measure a single objective.
e.g.) if WH-questions are the objective, measure only that
+) Do not provide inadvertent (unintentional) clues

2) State both stem and options as simply and directly as possible - remove needless redundancy from options and stem
3) Make certain that the intended answer is clearly the only correct one (Only one correct answer)

Past exam item) make sure the distractors are the same grammatical class as the key
/ make sure the key cannot be selected based on Ss' world knowledge.