Exam 2 Flashcards
Designed to measure the typical behavior and characteristics of examinees
Typical response test
Item in which all scorers agree on the score for the item
Objective item
Item in which disagreement may exist between scorers
Subjective item
Example of subjective items
Short answer, essays
Sample of objective items
Multiple-choice, true/false, matching
Type of test that measures knowledge and skills in an area in which instruction has been provided
Achievement test
A type of test that measures cognitive abilities and skills that are accumulated as a result of overall life experience
Aptitude test
Type of test in which performance reflects differences in the speed of performance
Speed test
Type of test in which performance reflects the difficulty of the items the examinee is able to answer correctly
Power test
Type of personality test that uses items that are not influenced by the subjective judgment of the person scoring the test
Objective personality test
Type of test that involves the presentation of ambiguous material that elicits an almost infinite range of responses
Projective personality test
Types of maximum performance tests
Achievement/aptitude
Objective/subjective
Speed/power
Types of typical response test
Objective/projective
Item requires examinee to select a response from multiple alternatives
Selected response
Item requires examinee to create or construct a response
Constructed response
Strengths of selected response
You can typically include a relatively large number of selected-response items in your test
They can be scored in an efficient, objective, and reliable manner
They are flexible and can be used to assess a wide range of abilities
They can reduce the influence of certain construct-irrelevant factors
Limitations associated with the use of selected response items
They are relatively difficult to write
They are not able to assess all abilities
They are subject to random guessing
Strengths of constructed response items
Compared to selected response items they are easier to write
They are well suited for assessing higher order cognitive abilities and complex task performance
They eliminate random guessing
Limitations of constructed response items
They require more time to answer than selected response items, so fewer items can be included in a test
They are more difficult to score in a reliable manner
They are vulnerable to feigning
They are vulnerable to the influence of construct-irrelevant factors
12 Item writing guidelines
Provide clear directions
Use clear, easy to understand language
Develop items that can be scored in a decisive manner
Avoid cues to answers
Arrange items in a systematic manner
Try to contain similar items on the same page
Tailor items to target population
Minimize construct-irrelevant factors
Avoid using exact phrasing from textbook
Avoid biased language
Make things easy to read
Determine how many items to include
5 Types of items included in maximum performance tests
Multiple choice
True/False
Matching
Essay
Short answer
Type of multiple choice test when there is only one right answer
Correct answer test
Type of multiple choice test when there is more than one correct answer and the objective is to identify the best answer
Best answer format
Incorrect answers
distractors
What is the most popular type of selected response test
Multiple Choice
11 Guidelines for developing multiple choice items
Make item as clear as possible
Stem should contain all info necessary to understand problem
3-5 alternatives
Keep alternatives brief
Avoid negatively stated stems
Make sure there is only one correct choice
The alternatives should be grammatically correct with the stem
Distractors should appear plausible
Don’t make the same alternative the correct answer every time
Minimize the use of “none of the above”. Avoid using “all of the above”
Limit the use of “always” or “never” in the alternatives
What is the second most popular selected-response format
True-false
What is another name for true/false
Binary items
What are the 4 guidelines for developing True/false items
Avoid double barreled items
Avoid specific determiners/qualifiers that cue the answer
Make true and false statements the same length
Include equal number of true and false statements
Items that contain two columns of words and phrases.
Matching items
In matching items it is the column on the left for which the examinee seeks a match
premises
In matching it is the column on the right that the examinee uses to find a match
responses
6 Guidelines for matching items
Limit matching items to homogeneous material
In the directions, indicate the basis for matching premises with responses
Include more responses than premises
Indicate if responses could be used more than once or not at all
Keep list brief
Ensure that responses are brief and follow a logical order
Poses a question or problem for the examinee to respond to in an open-ended written format
Essay item
Structured items that specify the form and scope of a response
Restricted response
Item in which there is no limit or form on the scope of response
Extended response
What is the benefit of restricted response items
Can be answered in a timely manner and are easier to score
What is the benefit of extended response items
Gives examinees more freedom in constructing a response
4 essay item guidelines
Clearly specify the assessment task
Use a larger number of restricted-response items in place of a smaller number of extended-response items
Develop and use a scoring rubric
Limit essay items to objectives that cannot be measured with selected response
Items that require the examinee to supply a word, phrase, number, or symbol in response to a direct question
Short answer item
Items written as incomplete sentences
Completion items
7 Guidelines for short answer items
The response should be as short as possible
Make sure there is only one correct response
Direct-question format is preferable to the incomplete-sentence format
When using the incomplete-sentence format, it is best to have only one blank space, generally near the end of a sentence
Make sure that the blanks provide adequate space for response
For questions requiring quantitative answers, indicate the degree of precision expected
Create a scoring rubric and use it
Typical response item formats
True/false
Rating scale
Likert scale
Scale that focuses on frequency
Rating scale
Scale that focuses on degree of agreement
Likert scale
The percentage of test takers who correctly answer the item
Item difficulty index
What letter represents item difficulty index
P
What does P=
number of people who got the item correct over the number of total test takers
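The calculation on this card can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name and the 0/1 score list are hypothetical.

```python
def item_difficulty(responses):
    """Return p, the proportion of examinees who answered the item correctly.

    `responses` holds one score per examinee: 1 = correct, 0 = incorrect.
    """
    if not responses:
        raise ValueError("need at least one response")
    return sum(responses) / len(responses)


# Hypothetical data: 10 examinees, 7 of whom answered correctly.
scores = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
print(item_difficulty(scores))  # 0.7
```

A higher p means more examinees got the item right, so the item is easier.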
What is the range of the difficulty index
0.0-1.0
The percentage of examinees that responded to an item in a given manner
percent endorsement
Do easier items have higher or lower values
higher
To maximize variability and reliability items should be at what item difficulty index
.40-.60
On a mastery test what is the typical item difficulty index
.90
Why do mastery tests have such high difficulty indexes
because they are usually pass/fail and they indicate the extreme upper level of knowledge
Indicator of how well an item can differentiate among test takers who differ on the construct measured by the test
Item discrimination
What are the indexes of item discrimination
Discrimination index
Item-total correlation
What will the item discrimination likely indicate on a speed test
Where the item was placed in the exam because only the fastest examinees can reach the items at the end of the test
Index of the difference in performance between two groups
Discrimination Index
How do you calculate the discrimination index
Take the difficulty index of the item for each group, then subtract one group from the other
What letter represents the discrimination index
D
What is considered an excellent discrimination index
.40 or larger
What is considered a good discrimination index
.30-.39
What is considered a fair discrimination index
.11-.29
What is considered a poor discrimination index
0.00-.10
What does a negative discrimination index indicate
That an item was miskeyed or has a major flaw
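The D calculation and the rating bands from the cards above can be sketched as follows. The function names and sample data are hypothetical. Rounding D to two decimals before rating is an assumption made here so that the bands (.40 and up, .30-.39, .11-.29, 0.00-.10) leave no gaps.

```python
def discrimination_index(upper, lower):
    """D = p(upper group) - p(lower group), where p is proportion correct.

    Each argument is a list of 0/1 item scores for one group.
    """
    p_upper = sum(upper) / len(upper)
    p_lower = sum(lower) / len(lower)
    return p_upper - p_lower


def rate_discrimination(d):
    """Label D using the bands listed on the cards."""
    d = round(d, 2)  # assumption: rate on the two-decimal value
    if d < 0:
        return "negative: check for a miskeyed or flawed item"
    if d >= 0.40:
        return "excellent"
    if d >= 0.30:
        return "good"
    if d >= 0.11:
        return "fair"
    return "poor"


# Hypothetical item: 8 of 10 high scorers and 4 of 10 low scorers got it right.
upper_group = [1] * 8 + [0] * 2
lower_group = [1] * 4 + [0] * 6
d = discrimination_index(upper_group, lower_group)
print(round(d, 2), rate_discrimination(d))
```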
How do you calculate the discrimination index for mastery testing
Test two groups: one group receives instruction and the other does not. The same formula is used to compare the two groups
Correlation of the performance on the whole test to one item
Item-total correlation
Total number of items answered correctly including the item being looked at
Unadjusted item-total correlation
Total number of items answered correctly, omitting the item being examined
Adjusted item-total correlation
How is item-total correlation calculated
with point-biserial correlation
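A sketch of the adjusted and unadjusted item-total correlations follows. It relies on the fact that the point-biserial correlation is the Pearson correlation when one variable is scored 0/1. The function names and the small score matrix are hypothetical, and the sketch assumes both the item and the totals show some variation (otherwise the denominator is zero).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; equals the point-biserial when x is 0/1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlation(score_matrix, item, adjusted=True):
    """Correlate one item with the total score.

    score_matrix: one row per examinee, one 0/1 column per item.
    adjusted=True omits the item itself from each examinee's total.
    """
    item_scores = [row[item] for row in score_matrix]
    totals = [sum(row) - (row[item] if adjusted else 0) for row in score_matrix]
    return pearson(item_scores, totals)


# Hypothetical 5 examinees x 4 items.
matrix = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(item_total_correlation(matrix, 1, adjusted=False))
print(item_total_correlation(matrix, 1, adjusted=True))
```

In this example the unadjusted value comes out larger than the adjusted one, because the item is correlated with a total that already contains it.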
How does the item-total correlation indicate if an item measures the same construct as the test
The larger the item-total correlation, the more evidence that an item measures the same construct as the test
Does item-total correlation always indicate that your test measures what it intends to measure
No, it just shows that the item and the test measure the same thing
Examines how many people in the top and bottom groups selected each option on a multiple choice exam
Distractor analysis
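A distractor analysis is essentially a tally of option choices by group. The sketch below uses hypothetical data, a hypothetical function name, and assumes four options labeled A to D.

```python
from collections import Counter


def distractor_analysis(upper_choices, lower_choices, options="ABCD"):
    """Count how many examinees in each group selected each option.

    Returns {option: (count in top group, count in bottom group)}.
    """
    upper, lower = Counter(upper_choices), Counter(lower_choices)
    return {opt: (upper[opt], lower[opt]) for opt in options}


# Hypothetical item keyed "B": 10 top scorers and 10 bottom scorers.
top = list("BBBBABBCBB")
bottom = list("BACDBACDAB")
for option, (n_top, n_bottom) in distractor_analysis(top, bottom).items():
    print(option, n_top, n_bottom)
```

A distractor that nobody in either group selects is probably not plausible, and one that draws more top scorers than bottom scorers deserves a second look.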
Responses to items on a test that are accounted for by latent traits
Item response theory
A graph with ability on the X-axis and the probability of a correct response on the y-axis
Item Characteristic curve (ICC)
One parameter IRT model: items differ only by one parameter, difficulty
Rasch model
What do the lines look like on the Rasch model
They have the same slope
Items differ in both difficulty and discrimination
Two parameter model
What do the lines look like in a two parameter model
slopes are different
IRT model in which, even if the respondent has no “ability”, there is still a chance that they’ll get the item correct
Three-parameter model
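The three models can be compared with one logistic function. The formula and the parameter names a (discrimination), b (difficulty), and c (guessing) follow the conventional IRT notation, which the cards themselves do not spell out, so treat this as an illustrative sketch rather than the course's own definition.

```python
from math import exp


def icc(theta, b, a=1.0, c=0.0):
    """Probability of a correct response at ability level theta.

    b: difficulty, a: discrimination (slope), c: guessing (lower asymptote).
    One-parameter (Rasch): only b varies between items (a fixed, c = 0).
    Two-parameter: a and b vary (c = 0).
    Three-parameter: a, b, and c all vary.
    """
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))


# When ability equals difficulty and there is no guessing, the chance is 50%.
print(icc(theta=0.0, b=0.0))  # 0.5

# With a guessing parameter, even very low ability keeps a floor near c.
print(icc(theta=-10.0, b=0.0, a=1.0, c=0.2))

# A larger a gives a steeper curve around b, i.e. better discrimination.
print(icc(0.5, b=0.0, a=2.0) - icc(-0.5, b=0.0, a=2.0))
print(icc(0.5, b=0.0, a=0.5) - icc(-0.5, b=0.0, a=0.5))
```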
When two different groups respond differently to the same item. They have different slopes
Biased items
What do steep slopes on the ICC indicate
better discrimination between different abilities
A test that ensures that testing conditions are as nearly the same as possible for all students
Standardized test
A test that assesses knowledge or skill in a content domain in which the participant has received instruction
Achievement test
What was A Nation at Risk
A study linking the success of a country to the success of the public education system
What did A Nation at Risk say about US students
They performed worse than students in other countries
What happened with No Child Left Behind
States were required to develop standards
What happened with Race to the Top
Top-performing states were awarded funding
Test that can be administered to more than one examinee at a time
Group administered test
What are the three pros of group administered tests
Efficient: large sample, short time
Typically can be scored objectively
Uniform testing conditions
What are the three cons of group administered tests
Limited qualitative behavioral observation
Lack of flexibility
Items restrict type of responses
Comprehensive batteries designed to assess achievement in multiple academic areas such as, but not limited to, reading, language arts, math, science and social studies
Commercial standardized achievement tests
Who are the three publishing companies that are most likely to produce commercial standardized achievement tests
CTB McGraw-Hill
Pearson assessment
Riverside publishing
What tests does CTB produce
California achievement tests
Terranova CTBS
Terranova second edition
What two tests does Pearson assessment publish
Stanford achievement test series
Metropolitan tests of achievement
What two tests does Riverside Publishing produce
Iowa tests of basic skills
Iowa tests of educational development
Test designed for use with students from kindergarten through grade 12 and is described as a traditional achievement battery
California achievement test – fifth edition (CAT/5)
Test designed for use with students from kindergarten to grade 12 and was published in 1997. It combines selected response and constructed response items that allow students to respond in a variety of formats
Terranova CTBS
A comprehensive modular achievement battery designed for use with students from kindergarten to grade 12 and contains year 2005 normative data
Terranova the second edition
Test for use with students from kindergarten to grade 12 and has year 2007 normative data
Stanford achievement test series
Test that can be used with students from kindergarten through grade 12. It assesses content in reading, mathematics, language, science, and social science. It is untimed and can be administered with the Otis-Lennon School Ability Test
Metropolitan test of achievement
Test designed for use with students from kindergarten through grade 8 and, as the name suggests, is designed to provide a thorough assessment of basic academic skills
Iowa test of basic skills
Intended for use with students from grades nine through 12 and was published in 2001. Is designed to measure the long-term goals of secondary education
Iowa tests of educational development
Supplement to the standardized tests that typically only employ selected response questions
Diagnostic constructed response and performance assessments
Provides a larger number of items for each specific learning objective
Diagnostic achievement tests
What do proponents say about high-stakes testing
It increases expectations and equals fair judgment
What do critics say about high-stakes testing
Neglect critical thinking and problem-solving, and teachers teach to the test
What are a state's choices for standardized testing
They can choose either commercial off-the-shelf tests or develop their own battery or use a combination of both
What are the six best practices to prepare students for tests
Do not teach to the test
Teach generic test taking skills
Use practice forms of the test
Develop class assignments that prep students for format of standardized test
Emphasize content of test
Present material using different formats
Achievement test that is administered to one student at a time and provides a more thorough assessment of skills and wider variety of item formats
Individual achievement test
Five types of individual achievement test
Wechsler individual achievement test
Woodcock – Johnson III test of achievement
Wide range achievement test 4
Gray oral reading test (fourth edition)
KeyMath
Individual achievement test that looks at reading comprehension, mathematics, written language, and oral language
Wechsler individual achievement test
Individual achievement test that covers broad reading, oral language, broad math, math calculation skills, broad written language, written expression
Woodcock – Johnson III tests of achievement
Individual achievement test that covers word reading, reading comprehension, spelling, arithmetic
Wide range achievement test 4
Individual achievement test that measures oral reading skills and is often used to diagnose reading difficulties
Gray oral reading test
Individual achievement test that measures mathematics skills in the area of basic concepts, operations, and applications
KeyMath
What are the five principles a teacher uses to develop a test
Specify educational objectives
Develop test blueprint
Determine how scores will be interpreted
Select item formats
Determine how students will be graded
What type of test is the most popular and widely used aptitude test in psychology
Intelligence tests
Abilities such as problem-solving, abstract reasoning, ability to acquire knowledge
Intelligence
How is intelligence represented
Through intelligence quotients
What are contemporary intelligence tests a reliable indicator of
Academic success
What is the most frequently used measure of intelligence
Wechsler Bellevue
What was the first test to combine verbal and nonverbal abilities on the same test
Wechsler Bellevue
Who is responsible for administering, scoring, and interpreting intelligence tests
Teachers, school counselors, and psychologists
Compares a client's performance on an aptitude test with their performance on an achievement test
Aptitude – achievement discrepancy analysis
Achievement score is higher than aptitude score
Overachiever
What accounts for someone being an overachiever
Studied hard, worked really hard
Why is being an overachiever problematic
Because they have great knowledge in this one domain, but they might not be able to generalize it to other things
Achievement score is lower than aptitude score
Underachiever
What might account for someone being an underachiever
Environmental factors, illness, poor teaching, ADHD
What do critics of ability achievement comparison say
Discrepancies can be attributed to measurement error, differences in content covered, and students' attitudes
An alternative to aptitude-achievement discrepancy analysis (AADA) when looking for students who are not doing well in the subject
Response to intervention
What are the five steps of response to intervention
Students are provided with effective teaching
Progress is monitored
Students who do not respond get something else from the teacher
Progress is monitored
Students who do not respond qualify for special education
What is a pro of response to intervention
Provides help to struggling students sooner, before they start to fail. It distinguishes between students who have learning difficulties and students who have poor instruction
What are the three requirements of diagnosing an intellectual disability
Performance on an intelligence test must be two standard deviations below the mean
Significant deficits in adaptive behavior including self help skills, daily living and communication
Evidence that these deficiencies in function occurred during the developmental period (before age 18)
What is the IQ range for mild intellectual disability
55 to 70
What is the IQ range for moderate, severe or profound intellectual difficulty
Below 55
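The cutoff arithmetic behind these cards can be sketched as below. The mean of 100 and standard deviation of 15 are assumptions (a common convention for IQ scales, not stated on the cards), and the function names are hypothetical. The IQ score alone is not a diagnosis: the adaptive behavior and age-of-onset requirements above still apply.

```python
def iq_cutoff(mean=100.0, sd=15.0, n_sd=2.0):
    """Score that lies n_sd standard deviations below the mean.

    mean=100 and sd=15 are assumed values, not taken from the cards.
    """
    return mean - n_sd * sd


def iq_band(iq):
    """Map an IQ score onto the ranges listed on the cards."""
    if iq > 70:
        return "above the two-standard-deviation cutoff"
    if iq >= 55:
        return "mild"
    return "moderate, severe, or profound"


print(iq_cutoff())   # 70.0
print(iq_band(62))   # mild
print(iq_band(48))   # moderate, severe, or profound
```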
What age are these difficulties typically diagnosed
Before the age of five or six
Test designed to assess the upper limits of the examinee’s knowledge
Maximum performance test
List of all 7 of the group tests
Test of cognitive skills, second edition (TCS/2)
Otis – Lennon school ability test, eighth edition (OLSAT-8)
Personnel and vocational assessment
Primary test of cognitive skills (PTCS)
InView
Cognitive abilities test (COGAT)
College admission tests
Covers verbal, nonverbal, memory abilities: verbal reasoning, memory, sequences, and analogies. Age range from 2 to 12
TCS/2
Test of cognitive skills
Test that has four subtests: verbal, spatial, memory, and concepts. No reading or number knowledge required. Covers kindergarten to first grade
Primary test of cognitive skills
Group test that covers verbal reasoning, nonverbal reasoning, and quantitative reasoning. New version of the TCS. Age range 2 to 12
InView
Test that covers verbal and nonverbal processes. For use with grades K through 12
Otis-Lennon school ability test
Test that measures verbal, quantitative, and nonverbal reasoning ability. Tests change based on grade level. For use with grades K-12
Cognitive abilities test
Group test for use in personnel and vocational testing. Entirely verbal. Measures vocabulary development and reasoning skills. For use with ages 18 and over
MAT
Group test for use in college admission. Covers critical reading, math, and writing. For use with grades nine through 12
College admission tests
What are the four individual aptitude tests
Wechsler Intelligence Scale for Children
Stanford-Binet Intelligence scales
Woodcock-Johnson III Test of Cognitive Abilities
Reynolds Intellectual Assessment Scales
Most popular individual test of intellectual ability for children.
Wechsler Intelligence Scale for Children
The first intelligence test to gain widespread acceptance. Has the expanded IQ scale that allows the calculation of IQs higher than 160
Stanford Binet
Test based on Cattell-Horn-Carroll theory of cognitive abilities
Woodcock Johnson
Has the ability to obtain a reliable, valid measure of intellectual ability that incorporates both verbal and nonverbal abilities in a relatively brief period
Reynolds Intellectual Assessment scales
4 principles of selecting aptitude tests
How information will be used
Time available for testing
Population that will be tested
Psychometric properties of the test
A test that attempts to measure the typical behavior and characteristics of examinees
Typical response
Test that involves the presentation of unstructured or ambiguous stimuli that allows almost infinite responses
Projective personality test
Test responses that misrepresent a person’s true characteristics
Response sets and dissimulation
When a person responds in either a negative or positive manner
Response set
When a person purposefully misrepresents themselves
Dissimulation
Transient emotional states that fluctuate over time
State
Stable internal characteristic that is manifested as a tendency for an individual to behave in a particular manner
Trait
Items that are included to help combat and identify response sets and dissimulation
Validity scales
What are three examples of validity scales
F index
L Index
V index
Index composed of items that are infrequently endorsed
F index
Index composed of items that are infrequently endorsed and that also identify individuals with a social desirability bias
L index
Index that includes nonsensical items. If examinees select these items, they may be careless during test taking, or it might indicate a learning difficulty
V index
Develop items based on their apparent relevance to the construct measured
Content rational approach
Limitation of the content rational approach
Examinees can easily manipulate the results to present themselves in a specific way
A process in which a large pool of items is administered to two groups one typically a clinical group composed of individuals with a specific diagnosis and the other a control or normal group representative of the general population
Empirical criterion keying
What is an example of an empirical criterion keying
Minnesota Multiphasic Personality Inventory (MMPI)
Statistical approach that evaluates the presence and structure of latent constructs existing among a set of variables
Factor analysis
Five factor model (big five)
Neuroticism, extroversion, openness, agreeableness, conscientiousness
Number of objective personality scales that have been developed based on a specific theory of personality
Theoretical approach
What are two examples of theoretical approach to personality tests
Myers-Briggs and Millon clinical multiaxial inventory
Individual's preference to focus on the inner world of thoughts and ideas
Introversion
Preference to focus on the external world of people and things
Extroversion
Preference to focus on what can be perceived by the five senses
Sensing
Preference for basing decisions on a logical analysis of the facts
Thinking
Preference for basing decisions on personal values and situational factors
Feeling
Preference for structure and decisiveness
Judging
Preference for flexibility and adaptability
Perceiving
I or E
Introversion or Extroversion
S or N
Sensing or Intuition
T or F
Thinking or Feeling
J or P
Judging or perceiving
Most popular self report measure among school psychologists. High scores reflect some sort of pathology or abnormality
Self-report of personality
What does the self-report personality measure
Inattention/hyperactivity, internalizing problems, school problems, personal adjustment
Four examples of projective tests
Projective drawings
Sentence completion tests
Apperception tests
Inkblot techniques
Three types of projective drawings
Draw a person
House tree person
Kinetic family drawing
Client is given a blank sheet of paper and asked to draw a whole person. The figure in the drawing is thought to represent the self
Draw a person
Client is asked to draw a house, tree, and person of each gender. Thought to tap into home life
House tree person
Client is asked to draw everyone in their family. Designed to tap into the client's view of family and interactions
Kinetic family drawing
Examinee is given the beginning of a sentence and asked to finish it
Sentence completion
Client is shown a picture and asked to make up a story about it
Apperception test
What is the most widely used apperception test
Thematic apperception test
Examinee presented with an ambiguous inkblot and asked to interpret it in some manner
Inkblot