Assessment Flashcards

1
Q

Differentiate measurement, assessment, and evaluation

A

measurement = assigning numerical values

assessment = collecting information

evaluation = making judgments

2
Q

Diagnostic exams and achievement test results from the previous year.

Is this assessment before, during, or after instruction?

A

Before

3
Q

Formative assessments

Is this assessment before, during, or after instruction?

A

During

4
Q

Summative assessment

Is this assessment before, during, or after instruction?

A

After

5
Q

Prof. Remedios wants to use the traditional paper-and-pen test for the PSYNTRO final examinations because it will be easy for him to administer to his students and, later, to grade objectively.

However, what disadvantages should he be mindful of?

A
  1. Prep time is lengthy (seating arrangement, printing of test papers, developing items and answer key)
  2. Students may cheat
6
Q

Instead of a traditional exam, Prof. Remedios wants his PERDEV students to do a performance in groups for their final examinations. He believes the preparation for it will be easy on his end, and students cannot cheat their way out of it.

However, what disadvantages should he be mindful of?

A
  1. They may be graded subjectively if rubrics aren’t used
  2. Watching the performances will take too much class time
7
Q

For their final exams, Prof. Remedios asks his GEUSELF students to submit a portfolio so that he can see the growth and development in their self-understanding.

What are the disadvantages of this assessment though?

A
  1. Making a portfolio is time-consuming
  2. Rating their portfolios may be subjective if rubrics aren’t used.
8
Q

The Table of Specifications is a blueprint for selecting appropriate items.

What are its 4 parts?

A
  1. Weight/Time Frame
  2. Content Outline/Topics
  3. Learning Competencies
  4. No. of Items per Topic
9
Q

In the TOS, what formula do you use to calculate a topic’s number of items per learning competency?

Ex: How many items for Remembering?

A

Items = Weight * Percentage of Learning Competency * Total Number of Test Items
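
A worked example may make the formula concrete. The sketch below uses hypothetical figures (a topic weighted at 20% of a 50-item test, with 30% of that topic allotted to Remembering) and assumes the result is rounded to a whole item. Neither the numbers nor the rounding rule comes from the cards.

```python
def items_for_competency(weight, competency_share, total_items):
    """Items = weight x percentage of learning competency x total test items."""
    return weight * competency_share * total_items


# Hypothetical inputs, for illustration only.
topic_weight = 0.20        # topic covers 20% of the test
remembering_share = 0.30   # 30% of the topic's items target Remembering
total_test_items = 50

raw_items = items_for_competency(topic_weight, remembering_share, total_test_items)
remembering_items = round(raw_items)  # assumption: round to the nearest whole item

print(remembering_items)
```

In practice the rounded counts across all topics and competencies may need small adjustments so that they still add up to the total number of test items.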

10
Q

True or False

The longer the test, the less reliable it is.

A

False. Longer = more reliable

11
Q

True or False

In creating a test layout, it is recommended to weave difficult questions in between easy ones.

A

False. Arrange items from easy to difficult.

12
Q

True or False

Items of the same type must be grouped together.

A

True

13
Q

In scoring a test, using [blank] is easier for a large group of examinees.

A

separate answer sheets

14
Q

In scoring a test, using a [blank] key covers all parts of the test except for the exact areas where the answers are written.

A

punched

15
Q

In scoring a test, an [blank] key is placed on top of the answer sheet and makes the correct answer visible.

A

Overlay

16
Q

In scoring a test, a [blank] key lists the correct answers so that they line up with the answer spaces along the side of the test paper.

A

Strip

17
Q

True/False; Yes/No; Fact/Opinion

These are examples of what type of test format?

A

binary choice

18
Q

In providing a binary choice item, what disadvantage do you have to be mindful of?

A

There is a 50-50 probability of selecting the correct answer.

19
Q

What are the two types of multiple choice questions?

A
  1. Choose the correct answer
  2. Choose the best answer
20
Q

What are the three parts of a multiple choice item?

A
  1. Stem (the question or prompt)
  2. Answer
  3. Distractors
21
Q

True or False

It is recommended that the stem of a multiple choice item be written in the positive form.

A

True

22
Q

Matching type items are best to use when measuring the student’s ability to identify the relationship between a set of similar items.

What is one disadvantage of this item?

A

being able to guess correct answers through the process of elimination

23
Q

In creating a matching type item, what do you put in Column A and Column B?

A

Column A: premises (usually in a logical order or alphabetized)
Column B: responses

24
Q

What are the three Selective Type test formats?

A

Binary Choice
Multiple Choice
Matching Type

25
Q

What are the two Constructed Response test formats?

A

Short Answer
Essay

26
Q

Enumerate the four primary elements in Avatar: The Last Airbender

This is an example of what test format?

A

Short Answer - Constructed Response

27
Q

What are the two types of Essay items and how are they different from each other?

A

Restricted response (specifies length and scope of the essay)

Extended response (no restriction on length or scope)

28
Q

What is the key difference between a holistic and analytic rubric?

A

Analytic rubrics score each criterion separately (often with a weight per criterion); holistic rubrics give one overall score for the whole work.

29
Q

What type of rubric is best for formative assessments?

A

Analytic rubric

30
Q

What is the difference between reliability and validity?

A

Reliability = consistency of scores across conditions

Validity = whether the test measures what it is intended to measure

31
Q

True or False

A valid test is always reliable.

A

True

32
Q

True or False

A reliable test is always valid.

A

False

33
Q

The reliability of a test is typically determined by using what coefficient?

A

Correlation coefficient (R)

34
Q

A test is considered reliable if the correlation coefficient is not less than what score?

A

0.85

35
Q

If a test shows a correlation coefficient (R) between 0.80 and 1.00, how can this be interpreted?

A

Very high relationship

36
Q

If a test shows a correlation coefficient (R) between 0.60 and 0.79, how can this be interpreted?

A

High relationship

37
Q

If a test shows a correlation coefficient (R) between 0.40 and 0.59, how can this be interpreted?

A

Substantial/marked relationship

38
Q

If a test shows a correlation coefficient (R) between 0.20 and 0.39, how can this be interpreted?

A

Low relationship

39
Q

If a test shows a correlation coefficient (R) between 0.00 and 0.19, how can this be interpreted?

A

Negligible relationship
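
The five interpretation bands from the cards above can be collected into one lookup. This is a minimal sketch: the function name is hypothetical, and treating negative coefficients by their absolute size and closing the small gaps between bands (for example 0.795) with lower-bound thresholds are assumptions, not rules stated in the cards.

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal labels used in the cards."""
    size = abs(r)  # assumption: the labels describe strength, not direction
    if size >= 0.80:
        return "Very high relationship"
    if size >= 0.60:
        return "High relationship"
    if size >= 0.40:
        return "Substantial/marked relationship"
    if size >= 0.20:
        return "Low relationship"
    return "Negligible relationship"


for value in (0.92, 0.65, 0.45, 0.30, 0.05):
    print(value, interpret_r(value))
```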

40
Q

If the same test is given at different times, what kind of reliability is being tested?

A

Test-Retest Reliability
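
As an illustration, test-retest reliability can be estimated by correlating the scores from the two administrations. The sketch below computes a Pearson correlation by hand. The scores are made up, and using Pearson's r specifically is an assumption (the cards only say "correlation coefficient").

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / sqrt(ss_x * ss_y)


# Hypothetical scores of the same five students on the same test, given twice.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]

r = pearson_r(first_administration, second_administration)
print(round(r, 2))
```

A value near 1 suggests the students kept roughly the same rank order across the two sittings.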

41
Q

If a different version of the test is given at a different time, what kind of reliability is being tested?

A

Parallel Form / Equivalent Form / Alternate Form Reliability

42
Q

If a test is split into two parts to check if each part shows consistency, what kind of reliability is being tested?

A

Split Half Reliability

43
Q

There are different ways to measure the internal consistency of a test. Given the measures below, provide the item type that best suits it.

  1. Kuder-Richardson
  2. Cronbach’s Alpha
  3. Interitem Correlation
  4. Item Total Correlation
A
  1. Kuder-Richardson = binary choice
  2. Cronbach’s Alpha = affective assessments
  3. Interitem Correlation = item # vs item #
  4. Item Total Correlation = item # vs. total score
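
For reference, here is a minimal sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response data are hypothetical. With right/wrong (0/1) items and population variances, the same calculation gives the Kuder-Richardson 20 value, which is why Kuder-Richardson is paired with binary items.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """scores: one row per student, one column per item."""
    k = len(scores[0])  # number of items
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 students x 4 items scored 1 (right) or 0 (wrong).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```
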
44
Q

If Prof. Remedios wants to measure his exam’s temporal stability and consistency of responses, what reliability test should he use?

A

Parallel Form / Equivalent Form / Alternate Form Reliability

45
Q

If Prof. Remedios wants to measure his exam’s temporal stability, what reliability test should he use?

A

Test-Retest Reliability

46
Q

Your students tell you that your exam was VERY EASY.

What range of scores should your exam display on the difficulty index?

A

0.76 - 1.00

47
Q

Your students tell you that your exam was AVERAGE.

What range of scores should your exam display on the difficulty index?

A

0.25 - 0.75

48
Q

Your students tell you that your exam was VERY DIFFICULT.

What range of scores should your exam display on the difficulty index?

A

0.00 - 0.24
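
The three ranges above can be applied with a short helper. This sketch assumes the usual definition of the difficulty index as the proportion of examinees who answered correctly, which the cards do not spell out. The function names and the handling of values that fall between the listed ranges (for example 0.755) are also assumptions.

```python
def difficulty_index(num_correct, num_examinees):
    """Proportion of examinees who answered correctly (assumed definition)."""
    return num_correct / num_examinees


def classify_difficulty(p):
    """Label a difficulty index using the ranges from the cards."""
    if p >= 0.76:
        return "very easy"
    if p >= 0.25:
        return "average"
    return "very difficult"


# Hypothetical counts out of 20 examinees.
for correct in (18, 10, 3):
    p = difficulty_index(correct, 20)
    print(correct, p, classify_difficulty(p))
```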

49
Q

This type of validity is done by examining the physical appearance of the test

A

Face validity

50
Q

This type of validity determines if the test content covers a representative sample of the behavior domain to be measured.

A

Content Validity

51
Q

The best way to establish the validity of a cognitive test is to use?

A

Content Validity

52
Q

How is content validity conducted?

A

Through consultation with experts

53
Q

This type of validity is used when hiring job applicants, selecting students for admission to college (entrance exams), or assigning military personnel to occupational training programs.

A

Criterion-prediction validity

54
Q

This type of validity checks whether different tests that are supposed to measure the same thing give similar results.

A

Convergent Validity

55
Q

This type of validity checks whether tests that are supposed to measure different things give different results.

A

Divergent Validity

56
Q

This type of validity refers to how well a test or measurement tool actually measures the concept or psychological construct it is intended to measure.

A

Construct Validity

57
Q

What is the key difference between Interval and Ratio scales?

A

Ratio has an absolute zero (e.g. weight, distance, age), while Interval has no absolute zero (e.g. temperature, IQ, dates)

58
Q

If the scores in your class are positively skewed, what could that mean?

A

Most of your students’ scores are BELOW the mean; there are some extremely high scores.

Aka most of the class scored low, except for a few

59
Q

If the scores in your class are negatively skewed, what could that mean?

A

Most of your students’ scores are ABOVE the mean; there are some extremely low scores.

Aka most of the class scored high, except for a few
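
A quick check with made-up scores shows why skew pulls the mean away from most of the class. In the sketch below, the helper name and both score lists are hypothetical. A few extreme scores drag the mean toward the tail, so most scores land on the other side of it.

```python
from statistics import mean, median


def share_below_mean(scores):
    """Fraction of scores that fall below the class mean."""
    avg = mean(scores)
    return sum(1 for s in scores if s < avg) / len(scores)


# Hypothetical classes of eight students each.
positively_skewed = [40, 42, 45, 45, 48, 50, 95, 98]  # a few very high scores
negatively_skewed = [5, 8, 60, 62, 65, 65, 68, 70]    # a few very low scores

for label, scores in (("positive", positively_skewed),
                      ("negative", negatively_skewed)):
    print(label, mean(scores), median(scores), share_below_mean(scores))
```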

59
Q

This measures the spread of scores.

A

Standard deviation

60
Q

If your students all have the same score in an exam, the standard deviation is?

A

Zero

61
Q

If half your students have extremely low scores and the other half have extremely high scores, then the standard deviation must be [adjective]

A

High

62
Q

If your students have consistently average scores in your exam (with only a few scoring extremely low or extremely high), then the standard deviation must be [adjective].

A

Low
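
The three standard deviation cards can be checked numerically. The sketch below uses hypothetical score lists and the population standard deviation (`statistics.pstdev`); whether a class should be treated as a population or a sample is an assumption here.

```python
from statistics import pstdev

# Hypothetical score sets matching the three scenarios in the cards.
all_same = [80, 80, 80, 80, 80, 80]            # everyone has the same score
split_class = [10, 10, 10, 95, 95, 95]         # half very low, half very high
mostly_average = [50, 74, 75, 75, 75, 76, 100]  # clustered, with two outliers

sd_same = pstdev(all_same)
sd_split = pstdev(split_class)
sd_average = pstdev(mostly_average)

print(sd_same, sd_split, round(sd_average, 2))
```

The identical scores give a spread of zero, the split class gives the largest spread, and the clustered class falls in between but much closer to zero.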

63
Q

What is the difference between contextualized and decontextualized assessments?

A

decontextualized = assesses memorization of facts and processes

contextualized = assesses understanding

64
Q

UBD, OBE, and OBTL agree that the first step in the instructional process is?

A

Identifying and clarifying learning outcomes

65
Q

In UBD, the highest level of understanding is?

A

Students’ awareness of what they don’t understand (metacognition)

66
Q

In UBD, the lowest level of understanding is?

A

Providing explanations

67
Q

[blank] are competencies that are transferrable between jobs

A

Transversal competencies

68
Q

These are assessments that ask students to perform real-world tasks.

A

Authentic assessments

69
Q

The three feasible methods of assessing learning in the affective domain are?

A

teacher observation
student self-report
peer rating

70
Q

If you want to assess students’ reactions to concepts in terms of bipolar scales defined with contrasting adjectives at each end, [blank] is the most appropriate tool to use.

A

Semantic differential