Chapter 13: Testing in Schools Flashcards

1
Q

What are three things that preschool assessments determine?

A

Readiness of the child to enter school

Identification/diagnosis of conditions that may present special educational challenges

Assessment of the child’s abilities

2
Q

What are the three objectives of preschool assessments?

A
  1. Screening of children at risk
  2. Diagnostic assessment to determine the presence or absence of a particular condition, often for the purpose of establishing eligibility for placement in special programs, as well as to formulate intervention and treatment recommendations
  3. Program evaluation – where the test results are used to document and evaluate specific programs
3
Q

Review brief history

A

Public Law (PL) 94-142 – mandated the professional evaluation of children age 3 and older suspected of having physical or mental disabilities in order to determine their special educational needs (mid-1970s)

PL 99-457 – obligation extended downward to birth (1986)
* Starting with the 1990–1991 school year, all disabled children ages 3 to 5 were to be provided with a free, appropriate education.

PL 105-17 – gave greater attention to diversity issues (1997)
* Infants and toddlers with disabilities must receive services in the home or other natural settings, with services continuing into preschool programs

In 1999, ADHD was officially listed under “Other Health Impaired” as a disabling condition that can qualify a child for special services

4
Q

What are some issues with testing preschoolers?

A
  • Language and conceptual skills are emerging but are not yet advanced enough to be assessed using traditional tests.
  • The attention span of a preschooler is short.
  • Motivation in the child may vary from one test session to the next.
5
Q

Curriculum Based Assessment

A
  • Observe and record a student’s performance on a set of activities
  • There are several different ways to accomplish this
  • Some methods take a general approach
    o Determine the learning components of a learning construct
    o Select a wide variety of tasks or items to assess the learning components
    o Example:
       Spelling is the learning construct
       Select words from the list of words students are expected to learn during the course of the school year
       Assess how well the student has mastered the skill
  • Some methods take a more specific approach
    o Determine whether a student has attained proficiency with one particular aspect of the curriculum
    o Break down the global learning outcomes into a set of specific subskills
    o Example:
       Spelling is the learning construct
       Select the specific skill of words ending with a silent e
       Ask the students to spell words that fall within the specific skill
       Assess how well the student has mastered the specific skill

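The specific-approach steps above can be sketched as a small scoring routine. This is a minimal illustration, not an actual CBA instrument; the word list, function name, and mastery criterion are all hypothetical:

```python
# Curriculum-based assessment sketch: score a spelling probe against the
# specific subskill "words ending with a silent e". The word list and the
# mastery criterion are hypothetical.

SILENT_E_WORDS = ["bake", "hope", "ride", "cute", "note"]

def mastery_rate(responses):
    """responses maps each target word to the student's spelling attempt;
    returns the proportion of subskill words spelled correctly."""
    correct = sum(1 for w in SILENT_E_WORDS
                  if responses.get(w, "").strip().lower() == w)
    return correct / len(SILENT_E_WORDS)
```

A student who spells four of the five probe words correctly scores 0.8 on this subskill, which the teacher can compare against a chosen mastery criterion (e.g., 80%).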
6
Q

5 Assessment Approaches

A
  • Interview parents and teachers (most widely used)
  • Behavioral observation (most valuable)
  • Rating scales completed by the parent and/or teacher (quick and inexpensive)
  • Projective techniques (limited use with young children)
  • Traditional tests normed on children of the same age
7
Q

Psychometric tests

A

Should be used for classification and placement decisions, but not to determine a child’s level of cognitive development and ability

  1. Piagetian-Based Scales
    Measure a child’s cognitive level in accord with Piaget’s stages of development
  2. Comprehensive Developmental Assessment Tools
    Checklists based on normal child development
  3. Process-Oriented Assessment Approaches
    Assume that identification of cognitive strategies is necessary to understand cognitive performance
8
Q

Other Issues

A
  • How equivalent are the instruments?
  • Reliability is low
  • Need other test forms for children with special requirements
  • Assess “readiness” in terms of social and emotional skills
9
Q

California Achievement Tests (CAT)

A

Assessment in the Primary Grades

  • Determine which specific skills have been mastered and which have not
  • Compare students’ performance with that of a national sample

Interrelationship of subtests
* High intercorrelations between subtests (.50 – .80)

Reliability is satisfactory
Validity work has focused only on content validity
Fall and Spring norms are provided

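The subtest intercorrelations cited above (.50 – .80) are Pearson correlations between pairs of subtest scores. A minimal sketch of the computation, using entirely hypothetical subtest data:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores of five students on two subtests
reading = [42, 50, 55, 61, 72]
mathematics = [40, 48, 60, 59, 70]
r = pearson_r(reading, mathematics)
```

With these made-up numbers r comes out above .96, which would count as a very high intercorrelation.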
10
Q

California Achievement Tests (CAT): Locator Tests

A

Use of locator tests – the student first takes a short, 20-item version of the vocabulary and mathematics items to estimate the student’s level; the full test can then be administered at that level
Scores are comparable across grades

To minimize boredom or discouragement, the child’s performance on the locator test is used as a guideline for which level of the full test to administer
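The routing logic of a locator test can be sketched as a simple cut-score table. The cut points below are hypothetical, purely to illustrate the idea, not actual CAT values:

```python
# Locator-test routing sketch: a score on a short 20-item locator test is
# mapped to the level of the full battery to administer.
# Cut points are hypothetical.

def pick_level(locator_score):
    if not 0 <= locator_score <= 20:
        raise ValueError("locator score must be between 0 and 20")
    if locator_score <= 7:
        return "Level 1"
    if locator_score <= 14:
        return "Level 2"
    return "Level 3"
```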

11
Q

California Achievement Tests (CAT): Lake Wobegon Effect

A

Schools report that their students are scoring above average
* Schools are comparing results with old norms
* Norms rise over time because students learn more

12
Q

HIGH SCHOOL: Social Competence

A
  • Development of the scale
  • Reliability was satisfactory
  • Validity
  • Compared scores with popularity ratings

Cavell and Kelley (1994) developed a self-report measure of social competence for adolescents
Students described situations that did not go well, yielding 157 problem situations
Students rated the situations on a 5-point Likert scale for frequency and difficulty
Factor analysis yielded 7 labels:
1. Keep Friends (friend shares your secret)
2. Problem Behavior (want to drink alcohol)
3. Siblings (embarrassed you)
4. School (mean teachers)
5. Parents (nosy)
6. Work (dislike but need it)
7. Make Friends (peers dislike)
Each is scored on frequency and difficulty

13
Q

Tests of General Educational Development (GED): five types of tests

A

High school equivalency test
Five tests
1. Writing skills
2. Social studies
3. Science
4. Reading skills
5. Mathematics

14
Q

Tests of General Educational Development (GED)
Reliability and Validity

A

Reliability is satisfactory
Content validity is built in

Concurrent validity
* The test is also given to high school students, which shows it is more stringent
* Fairly high intercorrelations between tests

Predictive validity
* Difficult to assess
* Graduates do report increases in pay, acceptance into training programs, and other benefits

15
Q

National Assessment of Educational Progress (NAEP)

A

Designed to measure the distribution of proficiencies in national student populations
Covers a wide range of school subjects
Given on a variable timetable

The NAEP is a Congressionally mandated survey of American students’ educational achievement; it was first conducted in 1969, annually through 1980, and biennially since then.

The goal of the NAEP is to estimate educational achievement, and changes in that achievement over time, for American students of specific ages, genders, and demographic characteristics.

Subject areas covered include reading, mathematics, writing, science, social studies, music, and computer competence.

16
Q

Essay versus Multiple choice

A

Looking at AP exams, results showed that scores on the essay questions were less reliable and did not correlate highly with scores on the multiple-choice questions

From a psychometric and a practical point of view, multiple-choice items are preferable because:
1. easy to score by machine
2. do not involve the measurement error created by subjective scoring
3. are more reliable and more amenable to statistical analyses
4. well-written items can assess the more complicated and desirable aspects of cognitive functioning.

The K-R 20 reliabilities for the multiple-choice section of the American History exam were .90 and .89
The correlations between the multiple-choice and essay sections were .48 and .53
In other words, scores on the essay sections are less reliable and do not correlate highly with the scores on the multiple-choice sections.
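The K-R 20 coefficient quoted above can be computed directly from 0/1 item scores. A self-contained sketch with hypothetical response data:

```python
# Kuder-Richardson Formula 20 (KR-20): internal-consistency reliability for
# tests scored dichotomously (1 = correct, 0 = incorrect).
# KR-20 = (k / (k - 1)) * (1 - sum(p*q) / variance of total scores)

def kr20(item_scores):
    """item_scores: one row of 0/1 item scores per examinee."""
    k = len(item_scores[0])          # number of items
    n = len(item_scores)             # number of examinees
    # sum of p*q across items, where p = proportion passing the item
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_scores) / n
        pq_sum += p * (1 - p)
    # population variance of total scores
    totals = [sum(row) for row in item_scores]
    mean = sum(totals) / n
    variance = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / variance)

# Hypothetical responses of five examinees to a four-item test
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(responses)  # 0.8 for these made-up data
```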

17
Q

Scholastic Aptitude Test (SAT)

A

Two content areas
* Verbal (antonyms, analogies, sentence completion, and reading comprehension)
* Quantitative (unidimensional)
Outcome oriented test – total score based on correct answers
Always being revised
Test sophistication
* Items and directions are easy to read
* Teach everyone test taking strategies

18
Q

Scholastic Aptitude Test (SAT): Gender gap

A
  • Men obtain higher scores than women
  • Women obtain higher freshman year college GPAs
  • Men typically do better on the quantitative section
  • Women typically do better on the verbal section
19
Q

Scholastic Aptitude Test (SAT): Minority bias

A

Problem with false negatives
High school GPA by itself is a better predictor than SAT scores
A regression equation developed for Mexican-Americans has been used to assess predictive validity
Is the SAT redundant with high school GPA or high school rank?
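Claims about predictive validity come down to regressing the criterion (freshman GPA) on a predictor (SAT score or high school GPA) and comparing how well each fits. A minimal ordinary-least-squares sketch; all numbers are hypothetical:

```python
# Simple least-squares regression: predict freshman GPA from one predictor.
# Data are hypothetical, for illustration only.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

sat = [900, 1000, 1100, 1200, 1300]    # hypothetical SAT totals
gpa = [2.4, 2.6, 2.9, 3.1, 3.5]        # hypothetical freshman GPAs
intercept, slope = fit_line(sat, gpa)
predicted = intercept + slope * 1150   # predicted GPA for an SAT of 1150
```

In a redundancy analysis, the question is whether adding SAT scores to a regression that already contains high school GPA improves prediction enough to matter.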

20
Q

Scholastic Aptitude Test (SAT)

A

Coaching – can affect the validity of a test if it changes scores
* Research has not found much evidence of coaching effects
Criterion problem – are first-year grades what we want to predict?
Reliability is satisfactory
Validity generalization may be used
Is the SAT score a measure of family income?
Is the SAT fair?

21
Q

Graduate Record Examination

A

Global measures of verbal, quantitative, and analytical reasoning abilities
Develop new items and add them as trial questions in administered tests
Reliability is satisfactory
Studies report low levels of validity
Subject tests do a better job in some departments
The GRE does an okay job predicting graduate GPA, but undergraduate GPA does a better job
Criterion problem – how do you operationally define graduate school success?
Range restriction
* Some studies have found higher validity coefficients when correlating the GRE scores with graduate school performance when the GRE wasn’t used to select applicants
Results have been mixed when using the GRE as a predictor of success in psychology
Is graduate school GPA an appropriate criterion?
We need to define what we are trying to predict
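The range-restriction point can be illustrated with Thorndike's Case II correction, which estimates what the validity coefficient would be in the full applicant pool when only a restricted (selected) sample is observed. The numbers below are hypothetical:

```python
import math

# Thorndike Case II correction for direct range restriction on the predictor:
# r_corrected = r*k / sqrt(1 - r^2 + r^2 * k^2),
# where k = SD(unrestricted) / SD(restricted).

def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    k = sd_unrestricted / sd_restricted
    return r * k / math.sqrt(1 - r ** 2 + (r ** 2) * (k ** 2))

r_admitted = 0.30  # hypothetical GRE-GPA correlation among admitted students
corrected = correct_for_range_restriction(r_admitted, 80.0, 120.0)
```

Because admitted students show a narrower spread of GRE scores than the applicant pool, the corrected coefficient (about .43 here) is larger than the observed one, consistent with the higher validities reported where the GRE was not used to select applicants.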

22
Q

Tests for Licensure and Certification

A

Licensure – government gives permission to an individual to engage in an occupation
* In order to obtain a license, an individual must meet a minimal level of competency
* Usually there are rules about what a licensed practitioner may do
Certification – an individual meets the qualifications set by a credentialing agency
* May use a designated title

Tests can be developed nationally or locally
Formats
* Multiple choice
* Work samples
Purpose
* To protect the public’s welfare and safety
* To assess a minimal level of competency

Validity
* Face validity is usually built in
* Criterion validity is more difficult because there is a diverse set of criteria

23
Q

Tests for Licensure and Certification: Cutoff scores

A
  • Have to determine what the minimal level of competency is
  • Should be consistent with a job analysis

Methods
* Human resources planning approach – takes into account the following information to determine how many applicants are needed:
  * Projected personnel needs
  * Past history of the proportion of offers accepted
  * Distribution of applicant test scores
  * The cutoff is then based on applicants’ test scores
* Criterion-referenced methods – experts provide judgments:
  * Angoff method – judges estimate the minimum raw score needed for passing
  * Ebel procedure – judges rate the relative importance of each item
  * Nedelsky method – judges identify the distractors that a “minimally competent” person would recognize as incorrect
  * Contrasted-groups method
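The Angoff method lends itself to a short computation: each judge estimates, item by item, the probability that a minimally competent examinee would answer correctly, and the cutoff is the average across judges of the summed probabilities. A sketch with hypothetical ratings:

```python
# Angoff method sketch: per-item probabilities that a "minimally competent"
# examinee answers correctly, one list per judge. Ratings are hypothetical.

def angoff_cutoff(ratings_by_judge):
    """Average, across judges, of each judge's summed item probabilities."""
    per_judge = [sum(probs) for probs in ratings_by_judge]
    return sum(per_judge) / len(per_judge)

ratings = [
    [0.9, 0.7, 0.5, 0.8],  # judge A
    [0.8, 0.6, 0.6, 0.9],  # judge B
]
cutoff = angoff_cutoff(ratings)  # minimum raw score for passing (2.9 here)
```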