Exam 2 Flashcards
Item Formats
This distinction usually refers to how items are scored (in either an objective or a subjective manner).
Objective: a high level of agreement on whether the item has been answered correctly or in the keyed direction.
-e.g. Multiple Choice, T/F, and lists
Subjective: where much disagreement might exist.
-e.g. Essays, Individual Emotions
Selected Response Item
A more direct approach to classifying items. Requires an examinee to select a response from available alternatives (e.g. MC, T/F, and Matching). Less work for the examinee, broader content coverage.
Weaknesses:
- difficult and time consuming to develop
- does not assess all abilities
- people can guess/cheat
Strengths:
- can include more items; flexible
- can be scored in an efficient, objective, and reliable manner
- decreases the influence of certain construct-irrelevant factors that can impact test scores (e.g. writing ability on a test measuring scientific knowledge)
Constructed Response Item
Requires examinees to create or construct a response (e.g. Fill-in-the-blank, Short Answer, and Essay). Examinees provide their own answer instead of picking one from a list.
Weaknesses:
- difficult to score reliably
- vulnerable to outside factors (e.g. poor writers, second-language examinees)
- vulnerable to feigning
- fewer items can be included
Strengths:
- eliminates guessing
- can assess higher-order thinking
- easier to write
General Writing Item Guidelines
1) Provide clear guidelines.
2) Present the question, problem, or task in as clear and straightforward a manner as possible.
3) Develop items and tasks that can be scored in a decisive manner.
4) Avoid inadvertent cues to answer.
5) Arrange the items in a systematic way.
6) Ensure that individual items are contained on one page.
7) Tailor the items to the target population.
8) Minimize the impact of construct-irrelevant factors.
9) Avoid using the exact phrasing from study materials.
10) Avoid using biased or offensive language.
11) Use a print format that is clear and easy to read.
12) Determine how many items to include (consider time available, age of examinees, item types - MC, T/F, or essay - and the type, purpose, and scope of the test).
Maximum Performance Tests
The best policy is to write twice as many questions as you will actually use.
Selected Response Items:
- MC
- T/F
- Matching
Constructed Response:
- Essay
- Short Answer
- Fill in the blank
Multiple Choice Items
The most popular of the selected-response items largely because they can be used in a variety of content areas and can assess both simple and complex objectives.
Guidelines:
- Use a format that makes the item as clear as possible.
- The item stem should contain all the information necessary to understand the problem or question.
- Provide between three and five alternatives.
- Keep the alt. brief and arrange them in an order that promotes efficient scanning.
- Avoid negatively stated stems.
- Make sure only one alt. is correct or clearly represents the best answer.
- All alt. should be grammatically correct relative to the stem.
- All distracters should appear plausible.
- Use alt. positions in a random manner for the correct answer.
- Minimize the use of “none of the above” and avoid using “all of the above.”
- Limit the use of “always” and “never” in the alt.
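Some of these guidelines (randomizing the key's position, keeping a fixed set of alternatives) can be automated when assembling a test. A minimal Python sketch — the `shuffle_alternatives` helper and the item content are hypothetical, not from any particular testing library:

```python
import random

def shuffle_alternatives(stem, correct, distractors, rng=None):
    """Randomly place the correct answer among the distractors;
    returns the shuffled alternatives and the index of the key."""
    rng = rng or random.Random()
    alternatives = [correct] + list(distractors)
    rng.shuffle(alternatives)
    return alternatives, alternatives.index(correct)

# Hypothetical item content:
options, key = shuffle_alternatives(
    "Which item format is scored most objectively?",
    "Multiple choice",
    ["Essay", "Short answer", "Oral exam"],
)
```

Because the key's position comes from a shuffle rather than the item writer's habit, answer-position patterns cannot become an inadvertent cue.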
True/False Items
A term used to refer to a broader class of items; also referred to as binary items, two-option items, or alternate-choice items.
Guidelines:
- Include only one idea in each item.
- Avoid specific determiners (never, always, usually); examinees can argue that such items are trying to trick them.
- Keep true and false statements approximately the same length.
Matching Items
Items placed in the left column are called "premises," and those in the right column are called "responses."
Guidelines:
- Use homogeneous material (e.g. same chapter/topic/idea); otherwise the matches will be too easy.
- Provide more responses than premises (e.g. an extra response that has no match).
- Keep the lists short.
- Keep responses brief and in alphabetical order (makes answers easier to find, e.g. "Validity" is found under "V").
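The matching-item guidelines above (extra unmatched responses, alphabetized response list) can be sketched in Python; the helper name and item content below are illustrative only:

```python
def build_matching_item(pairs, extra_responses):
    """Assemble a matching item: premises keep their given order,
    responses (plus unmatched extras) are alphabetized."""
    premises = [premise for premise, _ in pairs]
    responses = sorted([response for _, response in pairs] + list(extra_responses))
    return premises, responses

# Hypothetical content: two premise/response pairs plus one unmatched extra,
# so the last match cannot be made by simple elimination.
premises, responses = build_matching_item(
    [("Consistency of scores", "Reliability"),
     ("Accuracy of score interpretations", "Validity")],
    ["Standardization"],
)
```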
Essay Items
A test item that poses a question or problem for the examinee to respond to in an open-ended written format.
Guidelines:
- Clearly specified assessment task
- Prefer restricted-response items (i.e. shorter, focused essay questions rather than broad topics).
- Develop a comprehensive rubric for grading (otherwise "knowledge dumping" may be rewarded).
- Limit essays to objectives that cannot be measured using selected-response items.
Short-Answer Items
Requires the examinee to supply a word, phrase, number, or symbol as the answer.
Guidelines:
- Require a short, very limited answer.
- Ensure there is only one correct response.
- Use a direct format (asking a question) rather than an incomplete sentence (fill in the blank).
- If using blanks, place them at or near the end of the sentence.
- Incomplete sentences should use only one blank.
- For quantitative answers, specify the precision expected to avoid losing points because of rounding error.
- Develop a scoring rubric for each item.
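A per-item rubric for short answers can be as simple as a lookup of acceptable responses. A minimal sketch, assuming a case-insensitive exact match and invented rubric content:

```python
def score_short_answer(response, acceptable):
    """Score a short-answer response against a rubric of acceptable
    answers (case-insensitive, whitespace-trimmed exact match)."""
    return acceptable.get(response.strip().lower(), 0)

# Hypothetical rubric: two acceptable one-word answers, worth 1 point each.
rubric = {"reliability": 1, "consistency": 1}
points = score_short_answer(" Reliability ", rubric)  # -> 1
```

Listing every acceptable answer up front is one way to satisfy the "ensure there is only one correct response" guideline: if the rubric needs many entries, the item probably needs tightening.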
Typical Response Tests
The assessment of feelings, thoughts, self-talk, and other covert behaviors is best accomplished by self-report, such as personality and attitude scales. No right or wrong answer, just trying to find what’s typical.
Items:
- T/F
- Rating - frequency (always/never/sometimes, daily/weekly/monthly)
- Likert - degree of agreement, for example assessing attitudes (agree/neutral/disagree)
Guidelines:
- Items focus on thoughts, feelings, and behaviors - not facts.
- Limited to a single thought, feeling, or behavior.
- Avoid statements that nearly everyone will answer in the same way.
- Include items that are worded in both +/- directions.
- Use an appropriate number of options.
- Weigh the benefits of using an odd or even number of options.
- For rating and Likert, clearly label options.
- Minimize the use of specific determiners.
- For young children, structure the scale as an interview.
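Including items worded in both positive and negative directions implies reverse-scoring at scoring time. A minimal sketch, assuming a 5-point scale and hypothetical ratings:

```python
def score_likert(ratings, reverse_items, n_options=5):
    """Sum Likert ratings (1..n_options), reverse-scoring negatively
    worded items so higher totals always mean 'more' of the trait."""
    total = 0
    for i, rating in enumerate(ratings):
        if i in reverse_items:
            rating = (n_options + 1) - rating  # e.g. 5 -> 1 on a 5-point scale
        total += rating
    return total

# Three hypothetical items; the item at index 1 is negatively worded.
score = score_likert([4, 2, 5], reverse_items={1})  # 4 + (6 - 2) + 5 = 13
```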
Phase I: Test Conceptualization
- Conduct a Review of Literature and Develop a Statement of Need for the Test
- Describe the Proposed Uses and Interpretations of Results from the Test
- Determine Who Will Use the Test and Why
- Develop Conceptual and Operational Definitions of Constructs You Intend to Measure
- Determine Whether Measures of Dissimulation Are Needed and, If So, What Kind
Phase II: Specification of Test Structure and Format
- Designate the Age Range Appropriate for the Measure
- Determine and Describe the Testing Format
- Describe the Structure of the Test
- Develop a Table of Specifications (TOS - a blueprint of the content of the test)
- Determine and Describe the Item Formats and Write Instructions for Administration and Scoring (open-ended or fixed/close-ended)
- Develop an Explanation of Methods for Item Development, Tryout, and Final Item Selection
Phase III: Planning Standardization and Psychometric Studies
- Age range appropriate for this measure
- Testing format (e.g. individual or group; print or computerized); who will complete the test (e.g. the examiner, the examinee, or some other informant)
- The structure of the test (e.g. subscales, composite scores, etc.) and how the items and subscales will be organized
- Written table of specifications (TOS - blueprint to the content of the test)
- Item formats and summary of instructions for administration and scoring:
a) Indicate the likely number of items required for each subtest or scale.
b) Indicate the type of medium required for each test (e.g. verbal cue, visual cue, physically manipulated objects or puzzle parts, etc.).
- Written explanation of methods for item development (how items will be determined – will you need content experts to help write or review items?), tryout, and final item selection
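The TOS "blueprint" mentioned above can be represented as a simple grid of content areas by cognitive levels, with each cell holding a planned item count; the areas and counts below are invented for illustration:

```python
# Hypothetical TOS: rows are content areas, columns are cognitive levels,
# and each cell holds the planned number of items for that combination.
tos = {
    "Item formats": {"Knowledge": 4, "Application": 2},
    "Reliability":  {"Knowledge": 3, "Application": 3},
}

# Row/column totals make it easy to check the planned test length and balance.
total_items = sum(n for row in tos.values() for n in row.values())  # 12 planned items
```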
Phase IV: Plan Implementation
- Reevaluate the Test Content and Structure
- Prepare the Test Manual
- Submit a Test Proposal
Aptitude vs. Achievement Tests
Aptitude tests measure the cognitive abilities that individuals accumulate over their lifetime.
Achievement tests are designed to assess students’ knowledge or skills in a content domain in which they have received instruction.