Test 2: Chapters 6, 7, and 9 Flashcards

1
Q

What three things must you consider before you begin writing test items?

A
  1. Type of test (multiple choice, essay, etc.)
  2. The responses you want (The type of test you write depends on the responses you want)
  3. The objectives of the test
2
Q

List the test item writing techniques that are recommended in the ch. 6 outline.

A
  1. Define clearly what you want to measure
    • read a great deal of theory to know what you want to measure, and write questions as specifically as possible
  2. Generate an item pool
    • Write 3 or 4 similar items then select the one that is the most specific and captures the measure
  3. Make sure the reading level is appropriate
  4. Mix positively and negatively worded items to guard against an “acquiescence response set” (agreeing with items regardless of their content)
  5. Always take into account cultural, racial, ethnic, and gender differences
  6. Constantly reevaluate tests because they can lose reliability and validity over time.
3
Q

What are things that you should avoid when writing test items?

A
  1. Avoid redundancy
    • Generating an item pool prevents this
  2. Avoid lengthy items
    • These can be both confusing and misleading
  3. Avoid “double-barreled” items
    • Do not put two or more questions in the same item
4
Q

What are the 6 different types of item formats?

A
  1. The dichotomous format
  2. The polytomous format
  3. The Likert format
  4. The category format
  5. Checklists
  6. Q-Sorts
5
Q

What is the dichotomous format?

Is it used often?

A

2 alternatives for each item (true/false)

*Not used as often due to its limitations

6
Q

What are pros and cons of the dichotomous format?

A

Pros:

  • easy to construct
  • easy to administer
  • easy to score
  • easy to paraphrase lines out of a textbook or lecture notes

Cons:

  • requires absolute judgment (only one answer is right), even though both alternatives may sometimes have some truth to them
  • a 50% chance of guessing correctly, so a correct answer does not show mastery
  • encourages memorization rather than true comprehension of the material
7
Q

What is the polytomous format?

*Is it used often?

A

More than two alternatives; only one alternative is correct (multiple choice tests)
*very popular in the educational setting

8
Q

What are pros and cons of the polytomous format?

A

Pros:

  • easy to score
  • can cover a large amount of material in a short time
  • correct answers are less likely to be the result of chance than with dichotomous items (ex. a 25% chance with four alternatives rather than a 50% chance)
  • good distractors (incorrect choices) can increase the reliability of a test

Cons:

  • ineffective distractors decrease reliability and validity

9
Q

When is it okay to guess if a correction for guessing formula is used?

A

When you have narrowed your responses to two choices.
*Tests that use a correction-for-guessing formula deduct an extra penalty for each incorrect answer. Therefore, a question that is left blank loses fewer points than a question that is answered incorrectly.
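
The card does not give the formula itself. A minimal sketch, assuming the common textbook version (corrected score = right answers minus wrong answers divided by the number of choices minus one), shows why guessing starts to pay off once some choices are ruled out. The function names are made up for illustration.

```python
def corrected_score(right: int, wrong: int, n_choices: int) -> float:
    """Assumed formula: R - W / (n - 1). Blank items are not counted."""
    if n_choices < 2:
        raise ValueError("n_choices must be at least 2")
    return right - wrong / (n_choices - 1)


def expected_gain_from_guess(n_remaining: int, n_choices: int) -> float:
    """Average change in corrected score when guessing among n_remaining options."""
    p_right = 1 / n_remaining          # chance the guess is correct
    penalty = 1 / (n_choices - 1)      # points lost for a wrong answer
    return p_right * 1 - (1 - p_right) * penalty


# 30 right, 12 wrong, 8 blank on a four-choice test
print(corrected_score(30, 12, 4))
# Blind guess among all four choices vs. guess after narrowing to two
print(expected_gain_from_guess(4, 4))
print(expected_gain_from_guess(2, 4))
```

Under this assumed formula a blind guess gains nothing on average, while a guess between two remaining choices has a positive expected gain.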

10
Q

How would you assess reliability and validity of an essay exam?

A
  1. Inter-rater agreement: lets the test maker know if their questions are subjective
  2. Correlate the essay with other tests for validity
    • Note: When writing essay questions, give clear instructions, clearly express the grading criteria, and ask a peer to review your questions
11
Q

Describe the Likert format.

*This format is typically used in what types of scales?

A
  • Test-taker evaluates agreement or disagreement
  • Five alternatives: 1. strongly disagree 2. disagree 3. Neutral 4. agree 5. strongly agree

*This format is often used for personality and attitude scales.

12
Q

What is one pro and one criticism of the Likert format?

A

Pro: very easy to subject to factor analysis (find item groups that cluster together)

Criticism: Some believe that parametric statistics should not be used for this format because the data are ordinal and not at the interval level.

13
Q

Describe the category format. Provide an example.

A

The category format usually involves 10-point rating scales.

  • 1 is usually low and 10 is high
  • the scale does not have to run from 1 to 10

Ex. On a scale of 1 to 10, rate your level of pain.

14
Q

Ratings change depending on _________.

A

Context.

ex. I would get a 1 if compared to Michael Jordan and a 5 if compared to Cici in basketball.

15
Q

How can you improve discriminability in tests that use the category format?

A
  • as a test administrator, give people an idea of what 1 means and what 10 means (ex. show them a film)
  • as a test taker, be more invested in what you are rating
16
Q

What is the visual analogue scale? Is it used very often?

A

In Visual Analogue Scales, test-takers are asked to place a mark on a line to rate something.
It is NOT used very often.

17
Q

What is a checklist? What are the cons of the checklist format? Is this format used very often?

A

Checklist: people select adjectives from a list.

Cons:

  • people may define adjectives differently
  • people may differ depending on the context
  • there are usually only two adjectives to choose from (ex. brave vs afraid, shy vs outgoing)

*Adjective checklists are popular in personality measurement, but checklists are falling out of favor.

18
Q

What is a Q-sort? What do scored items look like on a graph? Which item responses are of interest to test administrators?

A
  • With Q-sorts, test-takers are given statements, and are instructed to place each statement in 1 of 9 piles.
  • Items look like a bell-shaped curve; most items fall in the middle categories (4 and 5)
  • Test administrators are interested in item responses that fall in the extreme categories (1 & 2, 8 & 9).
19
Q

What is a primary difference between checklists and Q-sorts?

A

Q-sorts discriminate; checklists do not discriminate (Q-sorts increase the number of categories)

20
Q

What is item analysis?

A

How you evaluate your items (ex. item difficulty, discriminability, etc.)

21
Q

Item difficulty depends on what two factors?

A
  1. The use of the test
  2. The types of items

22
Q

What is considered the optimal difficulty for a test item?

Should this level of difficulty apply to all test items? Why or why not?

A

.625 (62.5% of test takers answer the item correctly)
No. To increase validity (separating those who studied and understand the material from those who did not), test items should cover a range of difficulty levels.
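
The .625 figure matches a four-choice item under a common rule of thumb: the optimal difficulty lies halfway between the chance success rate and 100%. The sketch below assumes that rule; the card does not state it, and the function name is made up for illustration.

```python
def optimal_difficulty(n_choices: int) -> float:
    """Halfway between chance (1 / n_choices) and a perfect 1.0 (assumed rule)."""
    if n_choices < 2:
        raise ValueError("n_choices must be at least 2")
    chance = 1 / n_choices
    return (1.0 + chance) / 2


print(optimal_difficulty(4))  # four-choice multiple choice
print(optimal_difficulty(2))  # true/false
```

Under this assumption the target for true/false items would be higher (.75), because chance alone already yields 50% correct.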

23
Q

A difficulty range between ___ and ___ discriminates between students.

A

.30 and .70 (.30 = harder items, .70 = easier items)

24
Q

The statements, “If I want to raise self-esteem, I may make an easier test.” and “If I am selecting medical school candidates, then the items are going to be harder.” are examples of __________ (which need to be considered when deciding item difficulty.)

A

Human Factors

25
Q

What effect will adding a few easier items have on a test?

A

A few easier items may reduce test anxiety and increase reliability and validity.
*This is especially true when the easier items are placed at the beginning of the test.

26
Q

What would a test maker look at when determining the discriminability of test items?

A

They would look at the relationship between the test item and the whole test.

27
Q

What method compares those who have done well with those who have done poorly?

A

The Extreme Group Method

28
Q

What is the Discrimination Index?

A

The difference between the proportion of people in the top group and the proportion in the bottom group who answered the item correctly.
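
One common way to compute a discrimination index with the extreme group method is to subtract the proportion correct in the lowest-scoring group from the proportion correct in the highest-scoring group. The sketch below assumes top and bottom thirds; the cutoff, the function name, and the sample data are assumptions, not something the card specifies.

```python
def discrimination_index(total_scores, item_correct, fraction=1 / 3):
    """Proportion correct in the top group minus proportion correct in the bottom group.

    total_scores: each person's score on the whole test
    item_correct: 1 if that person got this item right, else 0
    fraction: share of people placed in each extreme group (assumed: one third)
    """
    n = len(total_scores)
    if n != len(item_correct):
        raise ValueError("both lists must have one entry per person")
    k = max(1, round(n * fraction))
    if 2 * k > n:
        raise ValueError("extreme groups would overlap")
    order = sorted(range(n), key=lambda i: total_scores[i])  # lowest scores first
    bottom, top = order[:k], order[-k:]
    p_top = sum(item_correct[i] for i in top) / k
    p_bottom = sum(item_correct[i] for i in bottom) / k
    return p_top - p_bottom


totals = [10, 9, 8, 5, 5, 4, 2, 1, 0]
item = [1, 1, 1, 1, 0, 1, 0, 0, 1]
print(discrimination_index(totals, item))
```

A value near 1 means high scorers pass the item far more often than low scorers; a value near 0 or below suggests the item does not separate the groups.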

29
Q

For what is the Point Biserial Method used?

When do you NOT want to use this method?

A

The point biserial method is used to correlate one item with the whole test.
Do not use the point biserial method if the test has too few items.
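
As a rough illustration, the point biserial correlation between a right/wrong item and the total score can be computed from the mean totals of those who passed and failed the item. This is a minimal sketch using the standard formula; the function name and sample data are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(item_correct, total_scores):
    """r_pb = (M_pass - M_fail) / SD_total * sqrt(p * (1 - p))."""
    passed = [t for c, t in zip(item_correct, total_scores) if c == 1]
    failed = [t for c, t in zip(item_correct, total_scores) if c == 0]
    if not passed or not failed:
        raise ValueError("the item needs at least one pass and one fail")
    sd = pstdev(total_scores)  # population standard deviation of total scores
    if sd == 0:
        raise ValueError("total scores have no variance")
    p = len(passed) / len(total_scores)  # proportion who got the item right
    return (mean(passed) - mean(failed)) / sd * sqrt(p * (1 - p))


# People with higher totals got the item right: strong positive correlation
print(point_biserial([1, 1, 0, 0], [4, 3, 2, 1]))
# People with lower totals got the item right: negative correlation
print(point_biserial([0, 0, 1, 1], [4, 3, 2, 1]))
```

In a short test the item itself makes up a large share of the total score, which inflates this correlation; that is one way to read the warning about tests with too few items.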

30
Q

What might a negative or low correlation between a test item and the whole test tell you about what you should do with the test item?

A

You should eliminate or change the item.

31
Q

What is the primary purpose of the Item response theory?

A

Efficiency

32
Q

What does a negative correlation between a test item and the whole test indicate about that test item?
What does a low correlation between a test item and the whole test indicate about the test item?

A

Negative Correlation = the question was bad.

Low correlation = only students who studied really hard correctly answered the question OR it was a bad question

33
Q

Can you link uncommon measures (ex. SAT and ACT)?

Why or why not?

A

No. They have different psychometric properties and linking them is not meaningful.

34
Q

What are statistical transformation formulas used for? Do they work?

A

They are supposed to make linkages between uncommon measures. They are not helpful.

35
Q

What are criterion referenced tests?

A

They are tests where the test maker defines the criterion of knowledge for an individual person to master the material (sets a clear set of objectives for the student to pass); the person’s performance is NOT to be compared to other people.

36
Q

For what are criterion referenced tests typically used in the school setting?

A

They are popular in individualized instruction programs (Ex. IEP)

37
Q

What is the antimode?

A

It is the criterion through which the tester decides what constitutes passing.

  • Note: you can’t just identify goals; you have to determine how to help the person meet those goals
38
Q

What are some limitations of Item Analysis?

A
  1. Although it can tell how to separate students, it can’t give information on how to learn information
  2. While it assesses learning, it doesn’t diagnose mistakes
  3. It often leads teachers to the phenomenon of “Teaching to the Test”
39
Q

Aside from random error, what are potential sources of error in test administration that are better able to be controlled?

A
  1. testing situation
  2. tester characteristics
  3. test-taker characteristics
40
Q

What is one function of the standardized approach?

A

It is considered to reduce bias (error).

41
Q

Generally speaking, what are the two main influences of the relationship between the examiner and test taker on test scores?

A
  1. Rapport with the test taker can affect results
  2. The test administrator’s performance expectations may influence test takers

42
Q

Does the race of the tester influence scores?

A

No. The impact of tester race on IQ tests is non-significant (for both individual and group intelligence tests.)

43
Q

What factors have been associated with the score discrepancies in African American test takers?

A
  1. African American children earned higher scores under thematic conditions
  2. African American children performed better on the Otis-Lennon School Ability Test if they could both read and listen to test items. (*This points to a validity problem because the test is measuring reading skill rather than ability alone.)
44
Q

Generally speaking, what factors aid in minimizing racial bias?

A

More standardized and specific testing procedures.

45
Q

Does research indicate if test scores are influenced by the use of paraprofessional interviewers as opposed to psychologists?

A

Yes. Score discrepancies may occur when paraprofessionals administer tests.

46
Q

Is test translation recommended for administering a test to an individual who has difficulties with the English language? Why or why not?

A

No. It threatens the reliability and validity of the test.

47
Q

If a test taker knows more than one language, and versions of the test are available in different languages, which version should you give to them?

A

The version that is written in the language that the test taker is most comfortable with.

48
Q

In order to gain competence in administering the WAIS-R, graduate students should administer the test at least _____ times.

A

10

49
Q

What are Expectancy Effects?

A

Test scores may be affected by what test administrators expect—Rosenthal effects (“maze bright” and “maze dull” rats)

  • Usually “subtle” communication between experimenter and subject
  • Expectancy can affect IQ scoring
50
Q

What types of variables form the bases through which we judge others?

A

Cognitive and Interpersonal Variables (ex. one study suggested that “warm” test takers were given higher scores.)

51
Q

ch. 9

A

begin

52
Q

According to Alfred Binet, Intelligence encompasses what three components?

A
  1. Assumes a definite direction (Intelligent people can figure out goals.)
  2. Able to adapt and adjust one’s self to reach a desired goal
  3. Has the capacity for self-criticism or evaluation — “autocriticism”
53
Q

How did Spearman define intelligence?

A

Ability to find relationships among things

Ex. “If I go to class, pay attention, and study, I will do well in this class.”

54
Q

According to Freeman, intelligence encompasses what three components?

A
  1. The ability to adapt to one’s WHOLE environment, not just parts of it
    • Ex. Sheldon from “The Big Bang Theory” isn’t intelligent according to Freeman because he lacks social skills and the ability to function in society.
  2. The capacity to learn
  3. The ability to think in the abstract
55
Q

How does Das define intelligence?

A

Intelligence is determined by whether or not a person can carefully plan and organize their behavior with a goal in mind.

56
Q

What is Gardner’s definition of intelligence?

A

The ability to solve “real-life” problems when they occur.

You may have a genius IQ, but if you can’t call a plumber when your toilet is clogged, you aren’t intelligent.

57
Q

According to Sternberg, intelligence requires mental activities involved in _______.

A

“mental activities involved in purposive adaptation to, shaping of, and selection of real-world environments relevant to one’s life.”

In other words, adapting to, modifying, and choosing environments that will support life goals

58
Q

Anderson’s definition of intelligence uses a two dimensional model. Describe these two dimensions.

A
  1. people demonstrate individual differences in how fast they process information
  2. people differ in their executive functioning, which is an inhibitory process
59
Q

Recent theories of intelligence also include what dimensions?

A

Memory and personality dimensions

60
Q

What are the three research traditions used to study human intelligence?

A
  1. The psychometric approach
  2. The Information Processing Approach
  3. The Cognitive Tradition
61
Q

What research tradition looks at how we learn and solve problems?

A

Information Processing Approach

62
Q

What research tradition focuses on how people adjust to situations and demands that they encounter in the “real world?”

A

Cognitive Tradition

63
Q

What research tradition thoroughly reviews the different properties of a test “through an evaluation of its correlates and underlying dimensions” (reliability and validity)?

A

The Psychometric Approach

64
Q

Currently, testing relies heavily on which tradition?

A

The psychometric approach

65
Q

Do our social and educational backgrounds have an effect on our IQ scores?

A

Yes. There is a correlation between SES and IQ on all standardized tests.

66
Q

Where did intelligence testing and special education originate?

A

France. The French Minister and Alfred Binet accomplished this task together.