Psychological Measurement Exam 4 Flashcards

1
Q

a statistic indicating the proportion of test takers who responded correctly to an item.

A

item-difficulty index

2
Q

If 80% got an item correct, then the item-difficulty index is ____.

A

.8

3
Q

The larger the item-difficulty index, the _____ the item.

A

easier

4
Q

You want the optimal item difficulty to be halfway between the probability of guessing correctly by chance and 1.00. Ex. For a 5-option multiple-choice item, the probability of guessing correctly is .20, so the optimal item difficulty is ____.

A

.60
.20+1.00=1.20
1.20/2=.60
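
A minimal sketch of this midpoint rule in Python (the helper name optimal_difficulty is mine, not from the cards):

```python
def optimal_difficulty(chance_level: float) -> float:
    """Optimal item difficulty: halfway between the probability of
    guessing correctly by chance and a perfect score of 1.00."""
    return (chance_level + 1.00) / 2

print(optimal_difficulty(0.20))  # 5-option multiple choice -> 0.6
print(optimal_difficulty(0.50))  # true/false -> 0.75
```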

5
Q

provides an indication of the internal consistency of a test

A

item-reliability index

6
Q

the higher the _____, the greater the test's internal consistency.

A

item-reliability index

7
Q

What are the 3 test construction approaches?

A
  1. Rational approach
  2. Empirical approach
  3. Rational with empirical refinement approach
8
Q

Making up statements about a personality trait to tap every aspect of it.

Ex. I am depressed once a month
I am depressed a couple of times a month
I am depressed once a week
I am depressed a couple of times a week
I am depressed every day

This approach is easy to construct but also easy to fake.

A

Rational approach to test construction

9
Q

Uses 2 criterion groups: 1 normal and 1 that exhibits the trait you want to tap into. Come up with an item pool of questions and give it to both groups. Determine which questions the groups answer in statistically significantly different ways.

Hard to fake because the questions seem random and test takers don't know what they're being tested for.

Limitation - p

A

Empirical approach within test construction

10
Q

Use the rational approach to come up with questions, then run tests with 2 criterion groups and keep the items that distinguish between them.

A

Rational with empirical refinement approach within test construction

11
Q

What are the 5 steps in test development?

A
  1. Test conceptualization
  2. Test construction
  3. Test tryout
  4. Item analysis
  5. Test revision
12
Q

Coming up with an idea that a test ought to be designed to measure [fill in the blank] in [such and such] way.

A

Test conceptualization

13
Q

Preliminary research surrounding the creation of a prototype of the test.

A

Pilot work

14
Q

Process of setting rules for assigning numbers in measurement.

A

Scaling

15
Q

Grouping of words, statements, or symbols on which judgements of the strength of a particular trait, attitude, or emotion are indicated by the test taker.

A

Rating scale

16
Q

Summative rating scale.
Presents the test taker with five alternative responses.
Ex. Never, rarely, sometimes, usually, always
(each response is assigned a value indicating degree of agreement)

A

Likert scale
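
A small Python sketch of summative (Likert) scoring; the 1-5 value coding shown is an assumed mapping, not one prescribed by the cards:

```python
# Assumed coding: each of the five alternatives maps to a value 1-5.
LIKERT_VALUES = {"never": 1, "rarely": 2, "sometimes": 3,
                 "usually": 4, "always": 5}

def likert_total(responses):
    """Summative scoring: the scale score is the sum of item values."""
    return sum(LIKERT_VALUES[r.lower()] for r in responses)

print(likert_total(["Never", "Sometimes", "Always"]))  # 1 + 3 + 5 = 9
```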

17
Q

Test takers are presented with pairs of stimuli (2 photos, 2 objects, 2 statements) which they are asked to compare. They must select one stimulus according to some rule (e.g., the one they agree with more).

A

Method of paired comparisons

18
Q

Printed cards, drawings, photos, objects, etc. are presented for evaluation.

A

Sorting tasks

19
Q

Compare a stimulus with others (ex. rank them).

A

Comparative scaling

20
Q

Stimuli are placed in 1 of 2 or more categories.

A

Categorical scaling

21
Q

Yields ordinal-level measures.
Ex.
1. All people should have the right to decide whether they wish to end their lives.
2. People who are terminally ill and in pain should have the option to have a doctor assist them in ending their lives.
3. People should have the option to sign away the use of artificial life support equipment before they become seriously ill.
4. People have the right to a comfortable life.
All who agree with 1 also agree with 2, 3, 4, etc.

A

Guttman scale
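
A sketch of the cumulative Guttman property in Python, assuming items are ordered from strongest (1) to weakest (4) as in the example above (the function name is mine):

```python
def follows_guttman_pattern(agreements):
    """In a perfect Guttman scale, agreeing with a stronger statement
    implies agreeing with every weaker one, so once the first True
    appears, every later (weaker) item should also be True."""
    first = next((i for i, a in enumerate(agreements) if a), None)
    return first is None or all(agreements[first:])

print(follows_guttman_pattern([False, False, True, True]))  # True
print(follows_guttman_pattern([True, False, True, True]))   # False
```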

22
Q

Reservoir or well from which items will or will not be drawn for the final version of the test.

A

Item pool

23
Q

Form, plan, structure, arrangement, and layout of individual test items.

A

Item format

24
Q

Requires test takers to select a response from a set of alternative responses.

A

Selected-response format

25

Q

Requires test takers to supply or to create the correct answer, not select it.

A

Constructed-response format

26

Q

test taker is presented with two columns: premises on the left and responses on the right, and must determine which response is best associated with which premise.

A

Matching item

27

Q

Multiple-choice item that contains only 2 possible responses. Ex: True/False

A

Binary-choice item

28

Q

Requires the examinee to provide a word or phrase that completes a sentence.

A

Completion item

29

Q

requires the test taker to respond with a word, term, sentence, or paragraph.

A

Short-answer item

30

Q

test item that requires the test taker to respond to a question by writing a composition, typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation.

A

Essay item

31

Q

large and easily accessible collection of test questions often used by teachers.

A

Item bank

32

Q

interactive, computer-administered test-taking process wherein the items presented to the test taker are based in part on the test taker's performance on previous items.

A

Computerized adaptive testing (CAT)
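
A toy sketch of the adaptive idea: present a harder item after a correct answer and an easier one after a miss. Real CAT systems select items with IRT models; this simplified branching rule and the names in it are assumptions for illustration only:

```python
def adaptive_test(pools, answer_fn, n_items=5):
    """pools: lists of items, ordered easy -> hard by pool index.
    answer_fn(item) -> bool collects (or simulates) a response."""
    level = len(pools) // 2            # start at medium difficulty
    results = []
    for _ in range(n_items):
        item = pools[level].pop()      # next unused item at this level
        correct = answer_fn(item)
        results.append((item, correct))
        if correct:                    # branch upward after a correct answer
            level = min(level + 1, len(pools) - 1)
        else:                          # branch downward after a miss
            level = max(level - 1, 0)
    return results
```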
33

Q

diminished utility of an assessment tool for distinguishing test takers at the low end of the ability, trait, or other attribute being measured.

A

Floor effect

34

Q

diminished utility of an assessment tool for distinguishing test takers at the high end of the ability, trait, or other attribute being measured.

A

Ceiling effect

35

Q

ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items.

A

Item branching

36

Q

Most commonly cumulative: the higher the score, the higher the test taker stands on the ability, trait, or other characteristic that the test purports to measure.

A

Scoring items

37

Q

test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way.

A

Class/category scoring

38

Q

comparing a test taker's score on one scale within a test to another scale within that same test.

A

Ipsative scoring
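
One common way to make ipsative comparisons concrete is to express each scale relative to the same person's own average; this centering step is an illustrative assumption, not a method stated in the card:

```python
def ipsative_scores(scale_scores):
    """Compare scales within one test taker by centering each scale
    score on that person's own mean across all scales."""
    mean = sum(scale_scores.values()) / len(scale_scores)
    return {scale: score - mean for scale, score in scale_scores.items()}

print(ipsative_scores({"dominance": 18, "affiliation": 12, "autonomy": 15}))
# {'dominance': 3.0, 'affiliation': -3.0, 'autonomy': 0.0}
```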
39

Q

the test is tried out on people who are similar in critical respects to the people for whom the test was designed.

A

Test tryout

40

Q

different types of statistical scrutiny that the test data can potentially undergo after the test tryout.

A

Item analysis

41

Q

statistic indicating the proportion of test takers who responded correctly to an item.
Ex. If 80% got an item correct, then the item-difficulty index is .8.
The larger the item-difficulty index, the easier the item.
For a true/false item, the optimal item difficulty is .75:
.50 + 1.00 = 1.50
1.50 / 2 = .75
For a 5-option multiple-choice item, the optimal item difficulty is .60:
.20 + 1.00 = 1.20
1.20 / 2 = .60

A

Item-difficulty index
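
A minimal sketch computing the index from scored responses (1 = correct, 0 = incorrect); the function name is mine:

```python
def item_difficulty(item_scores):
    """Item-difficulty index: the proportion of test takers who
    answered the item correctly."""
    return sum(item_scores) / len(item_scores)

# 8 of 10 test takers answered correctly -> .8 (a fairly easy item)
print(item_difficulty([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))  # 0.8
```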
42

Q

provides an indication of the internal consistency of a test; the higher the index, the greater the test's internal consistency.
Item-reliability index = Sx(rxt), where Sx is the item-score standard deviation and rxt is the correlation between the item score and the total test score.

A

Item-reliability index

43

Q

statistic designed to provide an indication of the degree to which a test is measuring what it purports to measure; the higher the item-validity index, the greater the test's criterion-related validity.
Item-validity index = Sx(rxc), where Sx is the item-score standard deviation and rxc is the correlation between the item score and the criterion score.

A

Item-validity index
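
Both indices share the same form: the item-score standard deviation multiplied by a correlation. A sketch in Python (statistics.correlation requires Python 3.10+; using the population SD is a choice this sketch makes, not one the cards specify):

```python
from statistics import correlation, pstdev  # correlation: Python 3.10+

def item_reliability_index(item_scores, total_scores):
    """Sx * rxt: item-score SD times the item-total correlation."""
    return pstdev(item_scores) * correlation(item_scores, total_scores)

def item_validity_index(item_scores, criterion_scores):
    """Sx * rxc: item-score SD times the item-criterion correlation."""
    return pstdev(item_scores) * correlation(item_scores, criterion_scores)
```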
44

Q

a measure of discrimination, symbolized by a lowercase italic d. Compares performance on a particular item with performance in the upper and lower regions of a distribution of continuous test scores: it is a measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering it correctly. Ranges from -1 to +1; the higher d is, the more the item discriminates.

A

Item-discrimination index

45

Q

If the distribution of scores is normal, the optimal boundaries of the lower and upper groups used to compute the _________ are the bottom 27% and top 27% of the distribution of scores.

A

Item-discrimination index

46

Q

Formula: d = (passing scores in the top group - passing scores in the bottom group) / n, where n is the number of test takers in each group.

A

Item-discrimination index
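
A sketch of d using the top and bottom 27% of the total-score distribution; the data layout and names are illustrative assumptions:

```python
def item_discrimination(total_scores, item_correct, frac=0.27):
    """d = (U - L) / n: U and L are the numbers of test takers passing
    the item in the upper and lower 27% groups; n is the group size."""
    ranked = sorted(zip(total_scores, item_correct))
    n = max(1, int(len(ranked) * frac))
    lower = sum(item for _, item in ranked[:n])   # passes in bottom group
    upper = sum(item for _, item in ranked[-n:])  # passes in top group
    return (upper - lower) / n                    # ranges from -1 to +1
```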
47

Q

a graphic representation of item difficulty and discrimination. The extent to which an item discriminates high- from low-scoring examinees is apparent from the slope of the curve: the steeper the slope, the greater the item discrimination.

A

Item characteristic curves
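
As an illustration, under a two-parameter logistic IRT model (a standard model, though not named in the cards), the slope of the ICC is governed by the discrimination parameter a:

```python
import math

def icc(theta, a, b):
    """2PL item characteristic curve: probability of a correct response
    given ability theta, item discrimination a, and item difficulty b.
    Larger a -> steeper slope -> sharper discrimination."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# Two items of equal difficulty; the a=2.0 item separates low (-1) from
# high (+1) ability examinees far more sharply than the a=0.5 item.
for a in (0.5, 2.0):
    print(a, round(icc(-1, a, 0), 2), round(icc(1, a, 0), 2))
# 0.5 0.38 0.62
# 2.0 0.12 0.88
```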
48

Q

throw out items that are too easy or too hard and that don't discriminate. Keep those that do discriminate low from high scorers and demonstrate reliability. If more questions are needed, go through the items that were tossed out, pick out those that are repairable, and revise and rewrite them. Then redo the process, revising the standard conditions as well.

A

Test revision

49

Q

the revalidation of a test on a sample of test takers other than those on whom test performance was originally found to be a valid predictor of some criterion.

A

Cross validation

50

Q

the decrease in item validity that inevitably occurs after cross validation of findings (most likely because chance inflated the original validity estimates).

A

Validity shrinkage
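
A toy demonstration of shrinkage with simulated data: item weights fit by least squares on a derivation sample capitalize on chance, so the validity coefficient is typically lower in a fresh sample. Everything here (the use of numpy, the simulation itself) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def validity(weights, items, criterion):
    """Validity coefficient: correlation of weighted scores with criterion."""
    return np.corrcoef(items @ weights, criterion)[0, 1]

n, k = 100, 10                                   # test takers, items
items_a = rng.normal(size=(n, k))                # derivation sample
crit_a = 0.3 * items_a[:, 0] + rng.normal(size=n)
weights, *_ = np.linalg.lstsq(items_a, crit_a, rcond=None)

items_b = rng.normal(size=(n, k))                # fresh cross-validation sample
crit_b = 0.3 * items_b[:, 0] + rng.normal(size=n)

print(validity(weights, items_a, crit_a))        # inflated by chance fitting
print(validity(weights, items_b, crit_b))        # typically lower: shrinkage
```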
51

Q

as soon as a test is taken out of its original context, validity goes down.

A

Generalizability