Exam 2 Flashcards

1
Q

Item Formats

A

This distinction usually refers to how items are scored (in either an objective or a subjective manner).

Objective: high level of agreement on whether the item has been answered correctly or in the keyed direction.
-e.g. Multiple Choice, T/F, and lists

Subjective: where much disagreement might exist.
-e.g. Essays, Individual Emotions

2
Q

Selected Response Item

A

A more direct approach to classifying items: requires an examinee to select a response from available alternatives (e.g. MC, T/F, and Matching). Less work for the examinee and broader content coverage.

Weaknesses:

  • difficult and time-consuming to develop
  • does not assess all abilities
  • people can guess/cheat

Strengths:

  • can include more items; flexible
  • can be scored in an efficient, objective, and reliable manner (see the scoring sketch below)
  • decrease the influence of certain construct-irrelevant factors that can impact test scores (e.g. writing ability on a test measuring scientific knowledge)
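
A minimal sketch of how that kind of objective scoring can be automated, assuming a hypothetical answer key and response set (not from the source):

```python
# Objective scoring: compare each response to a fixed answer key.
# The key and responses here are hypothetical examples.
answer_key = {1: "B", 2: "D", 3: "A", 4: "C"}
responses = {1: "B", 2: "C", 3: "A", 4: "C"}

# An item counts as correct only if it matches the keyed answer exactly,
# so any two scorers (human or machine) will agree on the total.
score = sum(1 for item, keyed in answer_key.items()
            if responses.get(item) == keyed)
print(f"{score}/{len(answer_key)} correct")  # 3/4 correct
```

Because the key fully determines the score, inter-scorer agreement is guaranteed, which is what makes the format “objective.”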
3
Q

Constructed Response Item

A

Requires examinees to create or construct a response (e.g. Fill-in-the-blank, Short Answer, and Essay). Examinees provide their own answer instead of just picking one from a list.

Weaknesses:

  • difficult to score reliably
  • vulnerable to outside factors (e.g. poor writers, second-language examinees)
  • vulnerable to feigning
  • fewer items can be included

Strengths:

  • eliminates guessing
  • assesses higher-order thinking
  • easier to write
4
Q

General Writing Item Guidelines

A

1) Provide clear directions.
2) Present the question, problem, or task in as clear and straightforward a manner as possible.
3) Develop items and tasks that can be scored in a decisive manner.
4) Avoid inadvertent cues to answer.
5) Arrange the items in a systematic way.
6) Ensure that individual items are contained on one page.
7) Tailor the items to the target population.
8) Minimize the impact of construct-irrelevant factors.
9) Avoid using the exact phrasing from study materials.
10) Avoid using biased or offensive language.
11) Use a print format that is clear and easy to read.
12) Determine how many items to include (consider time availability, age of examinees, types of items - mc or t/f or essay, type and purpose of the test, and scope of the test).

5
Q

Maximum Performance Tests

A

A good rule of thumb is to write twice as many questions as you will actually use.

Selected Response Items:

  1. MC
  2. T/F
  3. Matching

Constructed Response:

  1. Essay
  2. Short Answer
  3. Fill in the blank
6
Q

Multiple Choice Items

A

The most popular of the selected-response items largely because they can be used in a variety of content areas and can assess both simple and complex objectives.

Guidelines:

  • Use a format that makes the item as clear as possible.
  • The item stem should contain all the information necessary to understand the problem or question.
  • Provide between three and five alternatives.
  • Keep the alternatives brief and arrange them in an order that promotes efficient scanning.
  • Avoid negatively stated stems.
  • Make sure only one alternative is correct or clearly represents the best answer.
  • All alternatives should be grammatically consistent with the stem.
  • All distracters should appear plausible.
  • Vary the position of the correct answer randomly across items.
  • Minimize the use of “none of the above” and avoid using “all of the above.”
  • Limit the use of “always” and “never” in the alternatives.
7
Q

True/False Items

A

Used to refer to a broader class of items, also referred to as binary items, two-option items, or alternate-choice items.

Guidelines:

  • Include only one idea in each item.
  • Avoid specific determiners: never, always, usually (examinees can argue these items are trying to trick them).
  • Keep true and false statements about the same length.
8
Q

Matching Items

A

Items placed in the left column are called “premises,” and those on the right are called “responses.”

Guidelines:

  • Use homogeneous material; otherwise matches are too easy (e.g. keep items from the same chapter/topic/idea together).
  • Include more responses than premises (i.e. extra responses that have no match).
  • Keep the lists short.
  • Keep responses brief and in alphabetical order (makes answers easier to find, e.g. “Validity” would be found under “V”).
9
Q

Essay Items

A

A test item that poses a question or problem for the examinee to respond to in an open-ended written format.

Guidelines:

  • Clearly specify the assessment task.
  • Prefer restricted-response items (i.e. several shorter essay questions rather than one broad topic).
  • Develop a comprehensive rubric for grading (otherwise “knowledge dumping” may go unchecked).
  • Limit essay items to objectives that cannot be measured using selected-response items.
10
Q

Short-Answer Items

A

Requires the examinee to supply a word, phrase, number, or symbol as the answer.

Guidelines:

  • Require a short, very limited answer.
  • Ensure there is one correct response.
  • Prefer the direct format (asking a question) over an incomplete sentence (fill in the blank).
  • If using blanks, place them at the beginning of the sentence.
  • Incomplete sentences should use only one blank.
  • For quantitative answers, specify the precision required so points aren’t lost to rounding error.
  • Prepare a scoring rubric for each item.
11
Q

Typical Response Tests

A

The assessment of feelings, thoughts, self-talk, and other covert behaviors is best accomplished by self-report, such as personality and attitude scales. There are no right or wrong answers; the goal is to find what is typical.

Items:

  1. T/F
  2. Rating - frequency (always/never/sometimes, daily/weekly/monthly)
  3. Likert - degree of agreement, for example assessing attitudes (agree/neutral/disagree)

Guidelines:

  • Items focus on thoughts, feelings, and behaviors - not facts.
  • Limited to a single thought, feeling, or behavior.
  • Avoid statements that nearly everyone will answer in the same way.
  • Include items that are worded in both positive and negative directions (reverse-keyed; see the scoring sketch after this list).
  • Use an appropriate number of options.
  • Weigh the benefits of using an odd or even number of options.
  • For rating and Likert, clearly label options.
  • Minimize the use of specific determiners.
  • For young children, structure the scale as an interview.
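
A minimal sketch of the reverse-keying mentioned in the guidelines, assuming a 5-point Likert scale; the item names and responses are hypothetical:

```python
# Reverse-key negatively worded items on a 5-point Likert scale,
# then sum to a total scale score. All item content is hypothetical.
MAX_OPTION = 5  # 1 = strongly disagree ... 5 = strongly agree

responses = {"item1": 4, "item2": 2, "item3": 5}
negatively_worded = {"item2"}  # items keyed in the negative direction

def keyed_value(item, value):
    # (MAX + 1) - value flips the scale: 2 -> 4, 5 -> 1, etc.
    return (MAX_OPTION + 1) - value if item in negatively_worded else value

total = sum(keyed_value(item, value) for item, value in responses.items())
print(total)  # 4 + 4 + 5 = 13
```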
12
Q

Phase I: Test Conceptualization

A
  1. Conduct a Review of Literature and Develop a Statement of Need for the Test
  2. Describe the Proposed Uses and Interpretations of Results from the Test
  3. Determine Who Will Use the Test and Why
  4. Develop Conceptual and Operational Definitions of Constructs You Intend to Measure
  5. Determine Whether Measures of Dissimulation Are Needed and, If So, What Kind
13
Q

Phase II: Specification of Test Structure and Format

A
  1. Designate the Age Range Appropriate for the Measure
  2. Determine and Describe the Testing Format
  3. Describe the Structure of the Test
  4. Develop a Table of Specifications (TOS) - definitions/explain how to do something
  5. Determine and Describe the Item Formats and Write Instructions for Administration and Scoring (open-ended or fixed/close-ended)
  6. Develop an Explanation of Methods for Item Development, Tryout, and Final Item Selection
14
Q

Phase III: Planning Standardization and Psychometric Studies

A
  1. Age range appropriate for this measure
  2. Testing format (i.e. individual or group; print or computerized); who will complete the test (i.e. the examiner, the examinee, or some other informant)
  3. The structure of the test (i.e. subscales, composite scores, etc.) and how the items and subscales will be organized
  4. Written table of specifications (TOS - blueprint to the content of the test; a minimal sketch follows this list)
  5. Item formats and summary of instructions for administration and scoring:
     a) Indicate the likely number of items required for each subtest or scale
     b) Indicate the type of medium required for each test (i.e. verbal cue, visual cue, physically manipulated objects or puzzle parts, etc.)
  6. Written explanation of methods for item development (how items will be determined – will you need content experts to help write or review items?), tryout, and final item selection
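
One way to picture the TOS blueprint from item 4: a grid crossing content areas with cognitive levels, where each cell holds the planned number of items. A minimal sketch; the content areas, levels, and counts are hypothetical:

```python
# Hypothetical table of specifications (TOS): content areas crossed
# with cognitive levels, with the planned item count in each cell.
tos = {
    "Reliability":  {"Knowledge": 4, "Application": 3},
    "Validity":     {"Knowledge": 5, "Application": 4},
    "Item Writing": {"Knowledge": 3, "Application": 5},
}

# The cell counts add up to the planned length of the whole test.
total_items = sum(sum(levels.values()) for levels in tos.values())
print(f"Planned test length: {total_items} items")  # 24 items
```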
15
Q

Phase IV: Plan Implementation

A
  1. Reevaluate the Test Content and Structure
  2. Prepare the Test Manual
  3. Submit a Test Proposal
16
Q

Aptitude vs. Achievement Tests

A

Aptitude tests measure the cognitive abilities that individuals accumulate over their lifetimes.

Achievement tests are designed to assess students’ knowledge or skills in a content domain in which they have received instruction.

17
Q

Aptitude and Intelligence Tests in School and Clinical Settings

A
  1. Providing alternative measures of cognitive abilities that reflect information not captured by standard achievement tests or school grades.
  2. Helping educators tailor instructions to meet a student’s unique pattern of cognitive strengths and weaknesses.
  3. Assessing how well students are prepared to profit from school experiences.
  4. Identifying clients who aren’t doing well in school because of a possible learning disability or other cognitive disorders.
  5. Identifying students for gifted and talented programs.
  6. Providing a baseline against which other client characteristics may be compared.
  7. Helping guide students and parents with educational and vocational planning.
18
Q

Aptitude-Achievement Discrepancies

A
  • Comparing a client’s performance on an aptitude test with his or her performance on an achievement test, typically to help determine whether a specific learning disability is present (see the sketch below).
  • Such discrepancies can usually be attributed simply to measurement error, differences in the content covered, and variations in student attitude and motivation on the different tests.
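
A minimal sketch of the comparison, assuming both tests report standard scores on a common scale (mean 100, SD 15); the 1 SD criterion used here is illustrative only, since actual discrepancy criteria vary:

```python
# Hypothetical aptitude-achievement discrepancy check on a common
# standard-score scale (mean 100, SD 15). The 1 SD criterion is
# illustrative; real criteria vary by setting.
SD = 15
CRITERION = 1 * SD  # flag gaps of one standard deviation or more

aptitude, achievement = 110, 88
discrepancy = aptitude - achievement  # 22 points

print(discrepancy >= CRITERION)  # True: 22 >= 15, so the gap is flagged
```

As the second bullet notes, a flagged gap still has to be weighed against measurement error and the other factors listed above.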
19
Q

Response to Intervention (RTI)

A
  • A new assessment strategy to identify students with specific learning disabilities in a timely manner, rather than waiting for them to fail before providing assistance.
  • Students first receive regular instruction. Those not progressing then receive something more intensive from a teacher. Those who still do not respond either qualify for special education or for a special education evaluation (see the sketch below).
  • Benefits:
    1) Provides help to struggling students sooner
    2) RTI may result in fewer students receiving special education services.
  • Caveats:
    1) Wide variability in how the RTI process is defined and applied
    2) Problems assessing whether an individual has “responded” to an intervention.
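
The tiered flow described above can be sketched as a simple decision function; the step labels and boolean checks are a simplification for illustration, not a formal RTI specification:

```python
# Simplified sketch of the tiered RTI flow described in this card.
def next_step(progressing, responded_to_intervention=None):
    """Return the next step for a student in a simplified RTI model."""
    if progressing:
        return "continue regular instruction"        # doing fine
    if responded_to_intervention is None:
        return "provide more intensive instruction"  # first escalation
    if responded_to_intervention:
        return "continue or fade the intervention"
    return "refer for special education evaluation"  # still not responding

print(next_step(True))
print(next_step(False))
print(next_step(False, responded_to_intervention=False))
```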
20
Q

Diagnosing Intellectual Disabilities/Mental Retardation

A
  • Most common in a school setting.
  • Diagnosis requires:
    1) Performance on an individually administered test of intelligence 2 or more standard deviations below the population mean (see the arithmetic sketch below)
    2) Significant deficits in adaptive behavior (self-help skills, activities of daily living, ability to communicate with others, etc.)
    3) Evidence that these deficits in functioning emerged during the developmental period (before 18 years old).
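
The first criterion is plain arithmetic on the usual IQ scale (mean 100, SD 15): two standard deviations below the mean works out to a score of 70 or below. A minimal sketch:

```python
# IQ criterion: performance 2 or more standard deviations below the
# population mean. On the standard IQ scale (mean 100, SD 15),
# the cutoff is 100 - 2 * 15 = 70.
MEAN, SD = 100, 15
cutoff = MEAN - 2 * SD  # 70

def meets_iq_criterion(fsiq):
    return fsiq <= cutoff

print(meets_iq_criterion(68))  # True
print(meets_iq_criterion(85))  # False
```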
21
Q

Group Aptitude/Intelligence Tests

A
  • Administered either in groups or individually.
  • Efficient.
  • Tests need to be administered by someone professionally trained.
  • SB5 (Stanford-Binet, 5th Edition): Designed for 2-85 years of age.
    • 5 Factor Indexes:
      1) Fluid reasoning
      2) Knowledge
      3) Quantitative reasoning
      4) Visual-spatial processing
      5) Working memory
    • 3 IQ’s: Verbal, Non-Verbal, and Full Scale (allows FSIQs higher than 160)
22
Q

Individual Aptitude/Intelligence Tests

A

Wechsler Intelligence Scale for Children - 4th Edition (WISC-IV)

  • Most widely used in clinical and school settings with children.
  • Revised every 10-12 years.
  • Takes approx. 2-3 hours to administer/score.
  • Administered by professionals.
  • Used for ages 6-16
  • 4 Index Scores (combined to form the FSIQ):
    1) Verbal Comprehension Index (VCI)
    2) Perceptual Reasoning Index (PRI)
    3) Working memory Index (WMI)
    4) Processing speed Index (PSI)
  • Includes 15 subtests.
23
Q

Selecting Aptitude/Intelligence Tests

A

Consider factors such as how the information will be used and how much time is available for testing.
- Tests should be chosen to provide information that answers important questions.

24
Q

Understanding the Report of an Intellectual Assessment

A

Section 1: Review of all the data gathered as a result of the administration and scoring of the intelligence test. Includes brief background info on the client and several behavioral observations.
Section 2: Provides some caveats regarding proper administration and use of the results of the intellectual assessment. This clues the reader in to the assumptions that underlie the interpretation of the results that follow later in the report.
Section 3: Provides a narrative summary of client’s scores on the intellectual assessment and provides norm-referenced interpretations.
Section 4: Pattern of client’s intellectual development is discussed.
Section 5: Deals with feedback and recommendations. Provides a general understanding of the implications of the findings.

Reports of the assessment of a client will include a thorough assessment of intellectual functioning and mental status, of personality and behavior that may affect functioning, and of specialized areas of cognitive abilities.

25
Q

Observations

A
  • A fundamental way of finding out how the world around us works.
26
Q

Uses for Observations

A

Purposes for assessing infants and young children:

1) Identify developmental delay.
2) Diagnose presence and extent of developmental problems.
3) Assess specific abilities and skills.
4) Determine appropriate intervention strategies.

Quantitative (scale to measure):

  • Must be systematic.
  • “Structured” observation schedule: checklist, tally, what time and where

Qualitative (purest form of observation):
- “Unstructured” observation: an ‘interpretive’ or ‘critical’ perspective where the focus is on understanding the meanings that participants, in the contexts observed, attribute to events and actions

27
Q

Principles of Good Screening

A
  • Remain objective by not altering the environment (be nonjudgmental, be aware of culture shock, accept that mistakes happen).
  • Understand what’s going on and who interacts with whom.
  • Grasp how people work and behave.
  • Plan carefully.
  • Avoid distortions from other distractions/ideas.
28
Q

Disadvantages and Advantages

A

Disadvantages:

  • Not practical
  • Observer bias and observer effects (the person may change how they act because they are aware of being observed)
  • Time-consuming
  • Unreliable

Advantages:

  • Gives direct information
  • Flexibility in diversity and applicability
  • Provision of permanent record allowing further analysis across time
  • Effectively complements other approaches, enhancing the quality of evidence available to the researcher and helping establish validity
29
Q

Threats to Reliability and Validity

A
  • Expectancy Effect: knowing the aim and hypothesis
  • Observer Omission: personal bias, missed something
  • Selective Attention: consciously or unconsciously focus on one thing more than the other
  • Faulty Memory/Attention Deficit
  • Recency Effect: remembers what’s more recent
  • Halo Effect: first impressions strongly guide later observations.
  • Central Tendency: avoids extreme rating (i.e. picks all neutral)
  • Observer Drift: observer starts to redefine the observational variables, to the extent that the data no longer reflects the original definitions of the observed units
  • Reactivity effects: when the presence/behaviour of the researcher might alter the participants’ observed behavior.
  • Counter-transference: the observer’s own judgements about the phenomenon and the people observed can affect the observations; distancing oneself from the phenomenon under investigation can help.
30
Q

Threats to Normative Development

A
  • Genetics
  • Vision
  • Hearing
  • Iron deficiency
  • Lead (too much; detected via lead screening)
31
Q

Types of Assessments for Children

A

1) Parental Report
2) Observation with limited number of activities
3) Professionally administered developmental tool

32
Q

Difficulties of Assessing Infants and Young Children

A

They’re fussy, have short attention spans and limited communication/language skills; the testing environment is limiting; they don’t yet have the full cognitive abilities needed for comparison; and they don’t show pride and other emotions.

33
Q

What is an essay item and examples?

A

A test item that poses a question or problem for the examinee to give an answer in an open-ended written format.

Example: What is the difference between objective and subjective formats?

34
Q

Which is not an example of a constructed-response item?

A

C. Multiple Choice

35
Q

Which one of the following is NOT classified as a constructed response item?

A

B. True-False Items

36
Q

List four steps of PHASE III of test development:

A
  1. Specify sampling plan for standardization.
  2. Determine choice of scaling methods and rationale.
  3. List components of test.
  4. Briefly outline validity studies to be performed and their rationale.
37
Q

Describe what objective and subjective test items are and the differences between the two. Then give examples for both.

A

Objective test items include a question or problem plus answer options to choose from; the question or problem should have only one correct answer. Examples include multiple choice, true/false, and matching.

Subjective test items ask a question or pose a problem that requires the examinee to provide his or her own answer. Examples include essay items and short-answer items.

One difference is that objective test items are time-consuming to develop, while subjective test items are easier to develop. Another difference is that while objective test items are easier and faster to score, subjective items eliminate guessing and require higher-order thinking.

38
Q

Select the answer choice that is not a requirement for a test proposal submission:

A

A. Empirical research

39
Q

List three commonly used item formats on typical response tests.

A
  1. T/F
  2. Rating scale
  3. Likert scale
40
Q

What are three things to avoid when developing general items?

A
  1. Avoid inadvertent cues to answer
  2. Avoid using exact phrasing from study materials
  3. Avoid using biased or offensive language
41
Q

What is “dissimulation?”

A

Concealment of one’s thoughts, feelings, or character.