Assessment Flashcards by Ranj Remedios

Differentiate measurement, assessment, and evaluation

measurement = assigning numerical values

assessment = collecting information

evaluation = making judgments

Diagnostic exams and achievement test results from the previous year.

Is this assessment before, during, or after instruction?

Before

Formative assessments

Is this assessment before, during, or after instruction?

During

Summative assessment

Is this assessment before, during, or after instruction?

After

Prof. Remedios wants to use the traditional paper-and-pen test for the PSYNTRO final examinations because it will be easy for him to administer to his students and, later, to grade objectively.

However, what disadvantages should he be mindful of?

Prep time is lengthy (seating arrangement, printing of test papers, developing items and answer key)
Students may cheat

Instead of a traditional exam, Prof. Remedios wants his PERDEV students to do a performance in groups for their final examinations. He believes the preparation for it will be easy on his end, and students cannot cheat their way out of it.

However, what disadvantages should he be mindful of?

They may be graded subjectively if rubrics aren’t used
Watching the performances will take too much class time

For their final exams, Prof. Remedios asks his GEUSELF students to submit a portfolio so that he can see the growth and development in their self-understanding.

What are the disadvantages of this assessment though?

Making a portfolio is time-consuming
Rating their portfolios may be subjective if rubrics aren’t used.

The Table of Specifications is a blueprint for selecting appropriate items.

What are its 4 parts?

Weight/Time Frame
Content Outline/Topics
Learning Competencies
No. of Items per Topic

In the TOS, what formula do you use to calculate a topic’s number of items per learning competency?

Ex: How many items for Remembering?

Items = Weight * Percentage of Learning Competency * Total Number of Test Items
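
As a rough worked example of the formula above, the sketch below plugs in hypothetical numbers (a topic weighted at 20% of a 50-item test, with 30% of that topic allotted to Remembering). The function name and the figures are assumptions for illustration, not part of the deck.

```python
def items_for_competency(topic_weight, competency_share, total_items):
    """Items = Weight * Percentage of Learning Competency * Total Number of Test Items."""
    return topic_weight * competency_share * total_items

# Hypothetical values: topic weight 20%, Remembering share 30%, 50-item test.
raw_items = items_for_competency(0.20, 0.30, 50)

# The raw result is rarely a whole number in practice, so it is rounded here.
remembering_items = round(raw_items)
print(remembering_items)
```

Rounding each cell can make the column totals drift from the planned test length, so the rounded counts usually need a final manual check.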

True or False

The longer the test, the less reliable it is.

False. Longer = more reliable
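
One common way to put a number on this is the Spearman-Brown prophecy formula, which the deck does not name. The sketch below assumes the added items are of the same quality as the existing ones; the function name and sample values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the test is made length_factor times as long.

    Assumes the new items behave like the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical test with reliability 0.60.
doubled = spearman_brown(0.60, 2)    # test made twice as long
halved = spearman_brown(0.60, 0.5)   # test cut in half

print(doubled > 0.60, halved < 0.60)
```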

True or False

In creating a test layout, it is recommended to weave difficult questions in between easy ones.

False. Arrange items from easy to difficult.

True or False

Items of the same type must be grouped together.

True

In scoring a test, using [blank] is easier for a large group of examinees.

separate answer sheets

In scoring a test, using a [blank] key covers all parts of the test except for the exact areas where the answers are written.

punched

In scoring a test, an [blank] key is placed on top of the answer sheet and makes the correct answer visible.

Overlay

In scoring a test, a [blank] key lists the correct answers so that they line up with the answer spaces along the side of the test paper.

Strip

True/False; Yes/No; Fact/Opinion

These are examples of what type of test format?

binary choice

In providing a binary choice item, what disadvantage do you have to be mindful of?

There is a 50-50 probability of selecting the correct answer.

What are the two types of multiple choice questions?

Choose the correct answer
Choose the best answer

What are the three parts of a multiple choice item?

Stem (the question or prompt)
Answer
Distractors

True or False

It is recommended that the stem of a multiple choice item be written in the positive form.

True

Matching type items are best to use when measuring the student’s ability to identify the relationship between a set of similar items.

What is one disadvantage of this item?

being able to guess correct answers through the process of elimination

In creating a matching type item, what do you put in Column A and Column B?

Column A: premises (usually in a logical order or alphabetized)
Column B: responses

What are the three Selective Type test formats?

Binary Choice
Multiple Choice
Matching Type

What are the two Constructed Response test formats?

Short Answer
Essay

Enumerate the four primary elements in Avatar: The Last Airbender.

This is an example of what test format?

Short Answer - Constructed Response

What are the two types of Essay items and how are they different from each other?

Restricted response (specifies length and scope of the essay)
Extended response (no length or scope)

What is the key difference between a holistic and analytic rubric?

Analytic rubrics score each criterion separately (often with a weight per criterion); holistic rubrics give one overall score.

What type of rubric is best for formative assessments?

Analytic rubric

What is the difference between reliability and validity?

Reliability = consistency of scores across conditions
Validity = whether the test measures what it intends to measure

True or False

A valid test is always reliable.

True

True or False

A reliable test is always valid.

False

The reliability of a test is typically determined by using what coefficient?

Correlation coefficient (R)

A test is considered reliable if the correlation coefficient is not less than what score?

0.85

If a test shows a correlation coefficient (R) between 0.80 - 1.00, how can this be interpreted?

Very high relationship

If a test shows a correlation coefficient (R) between 0.60 - 0.79, how can this be interpreted?

High relationship

If a test shows a correlation coefficient (R) between 0.40 - 0.59, how can this be interpreted?

Substantial/marked relationship

If a test shows a correlation coefficient (R) between 0.20 - 0.39, how can this be interpreted?

Low relationship

If a test shows a correlation coefficient (R) between 0.00 - 0.19, how can this be interpreted?

Negligible relationship
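
The five bands above, together with the 0.85 reliability cutoff from the earlier card, can be collected into a small lookup. This is only a sketch: the function names are hypothetical, the sign of the coefficient is ignored on the assumption that the bands describe strength, and values that fall between two printed bands (such as 0.795) are assigned to the lower band.

```python
def interpret_r(r):
    """Map a correlation coefficient to the deck's verbal labels."""
    strength = abs(r)  # assumption: only the strength matters, not the direction
    if strength >= 0.80:
        return "Very high relationship"
    if strength >= 0.60:
        return "High relationship"
    if strength >= 0.40:
        return "Substantial/marked relationship"
    if strength >= 0.20:
        return "Low relationship"
    return "Negligible relationship"

def is_reliable(r, cutoff=0.85):
    """The deck treats a test as reliable when the coefficient is not less than 0.85."""
    return r >= cutoff

print(interpret_r(0.72), is_reliable(0.72))
```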

If the same test is given at different times, what kind of reliability is being tested?

Test-Retest Reliability

If a different version of the test is given at a different time, what kind of reliability is being tested?

Parallel Form / Equivalent Form / Alternate Form Reliability

If a test is split into two parts to check if each part shows consistency, what kind of reliability is being tested?

Split Half Reliability
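
A minimal sketch of a split-half check, assuming an odd/even item split and the Spearman-Brown correction to estimate full-length reliability. Neither choice is specified in the deck, and the score matrix below is made up for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def split_half_reliability(item_scores):
    """item_scores holds one row per student, one 0/1 entry per item."""
    first_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(first_half, second_half)
    # Spearman-Brown correction: step the half-test correlation up to full length.
    return (2 * r_half) / (1 + r_half)

# Hypothetical results: 5 students, 6 items (1 = correct, 0 = wrong).
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(split_half_reliability(scores), 3))
```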

There are different ways to measure the internal consistency of a test. Given the measures below, provide the item type that best suits it.
1. Kuder-Richardson
2. Cronbach’s Alpha
3. Interitem Correlation
4. Item Total Correlation

1. Kuder-Richardson = binary choice
2. Cronbach’s Alpha = affective assessments
3. Interitem Correlation = item # vs item #
4. Item Total Correlation = item # vs. total score
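
To make the first two measures concrete, here is a hedged sketch of KR-20 and Cronbach's alpha using their standard textbook formulas, with population variances throughout (an assumption; some texts use sample variance). On 0/1 items the two give the same value, which is why Kuder-Richardson is paired with binary choice. The data is invented.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores holds one row per student, one numeric entry per item."""
    k = len(scores[0])
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

def kr20(scores):
    """KR-20 for items scored 0 (wrong) or 1 (correct)."""
    n, k = len(scores), len(scores[0])
    # p = proportion answering each item correctly; p * (1 - p) is its variance.
    proportions = [sum(row[i] for row in scores) / n for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    pq_sum = sum(p * (1 - p) for p in proportions)
    return (k / (k - 1)) * (1 - pq_sum / total_variance)

# Hypothetical results: 5 students, 6 binary items.
scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(round(kr20(scores), 3), round(cronbach_alpha(scores), 3))
```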

If Prof. Remedios wants to measure his exam's temporal stability and consistency of responses, what reliability test should he use?

Parallel Form / Equivalent Form / Alternate Form Reliability

If Prof. Remedios wants to measure his exam's temporal stability, what reliability test should he use?

Test-Retest Reliability

Your students tell you that your exam was VERY EASY. What range of scores should your exam display on the difficulty index?

0.76 - 1.00

Your students tell you that your exam was AVERAGE. What range of scores should your exam display on the difficulty index?

0.25 - 0.75

Your students tell you that your exam was VERY DIFFICULT. What range of scores should your exam display on the difficulty index?

0.00 - 0.24
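
The three ranges above can be applied with a short helper. This sketch assumes the usual definition of the difficulty index as the proportion of examinees who answer an item correctly, and it rounds to two decimals so that every value lands in one of the deck's ranges. Function names and counts are hypothetical.

```python
def difficulty_index(num_correct, num_examinees):
    """Proportion of examinees who answered the item correctly."""
    return num_correct / num_examinees

def interpret_difficulty(p):
    """Label a difficulty index using the deck's three ranges."""
    p = round(p, 2)  # the ranges are stated to two decimal places
    if p >= 0.76:
        return "very easy"
    if p >= 0.25:
        return "average"
    return "very difficult"

# Hypothetical item results for a class of 40.
for correct in (36, 20, 6):
    p = difficulty_index(correct, 40)
    print(correct, p, interpret_difficulty(p))
```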

This type of validity is done by examining the physical appearance of the test

Face validity

This type of validity determines if the test content covers a representative sample of the behavior domain to be measured.

Content Validity

The best way to establish the validity of a cognitive test is to use?

Content Validity

How is content validity conducted?

Through consultation with experts

This type of validity is used when hiring job applicants, selecting students for admission to college (entrance exams), or assigning military personnel to occupational training programs.

Criterion-prediction validity

This type of validity checks whether different tests that are supposed to measure the same thing give similar results.

Convergent Validity

This type of validity checks whether tests that are supposed to measure different things give different results.

Divergent Validity

This type of validity refers to how well a test or measurement tool actually measures the concept or psychological construct it is intended to measure.

Construct Validity

What is the key difference between Interval and Ratio scales?

Ratio has an absolute zero (e.g. weight, distance, age), while Interval has no absolute zero (e.g. temperature in Celsius, IQ, dates)

If the scores in your class are positively skewed, what could that mean?

Most of your students' scores are BELOW the mean; there are some extremely high scores. In other words, most of the class scored low, with a few exceptions.

If the scores in your class are negatively skewed, what could that mean?

Most of your students' scores are ABOVE the mean; there are some extremely low scores. In other words, most of the class scored high, with a few exceptions.
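
A quick way to see both cases is to compare the mean with the median and compute a skewness coefficient. The sketch below uses the moment-based (Fisher-Pearson) coefficient, which is one of several definitions, on two invented score sets.

```python
from statistics import mean, median, pstdev

def skewness(scores):
    """Moment-based skewness: positive means a tail of high scores, negative a tail of low scores."""
    m, s, n = mean(scores), pstdev(scores), len(scores)
    return sum((x - m) ** 3 for x in scores) / (n * s ** 3)

# Hypothetical classes: one with a single very high score, one with a single very low score.
positively_skewed = [40, 42, 45, 45, 48, 50, 95]
negatively_skewed = [5, 50, 52, 55, 55, 58, 60]

for scores in (positively_skewed, negatively_skewed):
    # With a high outlier the mean is pulled above the median, so most scores sit below the mean.
    print(round(mean(scores), 2), median(scores), skewness(scores) > 0)
```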

This measures the spread of scores.

Standard deviation

If your students all have the same score in an exam, the standard deviation is?

Zero

If half your students have extremely low scores and the other half have extremely high scores, then the standard deviation must be [adjective]

High

If your students have consistently average scores in your exam (with a few scoring either extremely low and/or high), then the standard deviation must be [adjective].

Low
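
The three standard deviation cards can be checked numerically. The score lists below are invented, and the population standard deviation is used on the assumption that the class is the whole group of interest.

```python
from statistics import pstdev

# Hypothetical score sets matching the three cards above.
all_same = [80, 80, 80, 80, 80, 80]          # everyone has the same score
split_class = [10, 10, 10, 90, 90, 90]       # half very low, half very high
mostly_average = [70, 72, 75, 75, 78, 80]    # scores clustered near the middle

sd_same = pstdev(all_same)
sd_split = pstdev(split_class)
sd_average = pstdev(mostly_average)

print(sd_same, sd_split, round(sd_average, 2))
```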

What is the difference between contextualized and decontextualized assessments?

decontextualized = assesses memorization of facts and processes
contextualized = assesses understanding

UBD, OBE, and OBTL agree that the first step in the instructional process is?

Identifying and clarifying learning outcomes

In UBD, the highest level of understanding is?

Students' awareness of what they don't understand (metacognition)

In UBD, the lowest level of understanding is?

Providing explanations

[blank] are competencies that are transferrable between jobs

Transversal competencies

These are assessments that ask students to perform real-world tasks.

Authentic assessments

The three feasible methods of assessing learning in the affective domain are?

teacher observation
student self-report
peer rating

If you want to assess students' reactions to concepts in terms of bipolar scales defined with contrasting adjectives at each end, [blank] is the most appropriate tool to use.

Semantic differential

Assessment Flashcards

(71 cards)