Psychological Measurement Exam 4 Flashcards

1
Q

a statistic indicating the proportion of test takers who responded correctly to an item.

A

item-difficulty index

2
Q

If 80% got an item correct, then the item-difficulty index is ____.

A

.8

3
Q

The larger the item-difficulty index, the _____ the item.

A

easier

4
Q

You want the optimal item difficulty to be halfway between the probability of guessing correctly by chance and 1.00. Ex. For a 5-option multiple-choice item, the probability of guessing correctly is .20, so the optimal item difficulty is ____.

A

.60
.20+1.00=1.20
1.20/2=.60
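
A minimal sketch of this midpoint rule in Python (the helper name optimal_difficulty is mine, not from the cards):

```python
def optimal_difficulty(chance_level: float) -> float:
    """Optimal item difficulty: halfway between the probability of
    guessing correctly by chance and a perfect score of 1.00."""
    return (chance_level + 1.00) / 2

print(optimal_difficulty(0.20))  # 5-option multiple choice -> 0.6
print(optimal_difficulty(0.50))  # true/false -> 0.75
```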

5
Q

provides an indication of the internal consistency of a test

A

item-reliability index

6
Q

the higher the _____, the greater the test's internal consistency.

A

item-reliability index

7
Q

What are the 3 test construction approaches?

A
  1. Rational approach
  2. Empirical approach
  3. Rational with empirical refinement approach
8
Q

Making up statements about a personality trait to tap every aspect of it.

Ex. I am depressed once a month
I am depressed a couple of times a month
I am depressed once a week
I am depressed a couple of times a week
I am depressed every day

This approach is easy to construct but also easy to fake.

A

Rational approach to test construction

9
Q

Uses 2 criterion groups: 1 normal and 1 that exhibits the trait you want to tap into. Come up with an item pool of questions and give it to both groups. Determine which questions the groups answer in statistically significantly different ways.

Hard to fake because the questions seem random and test takers don't know what they're being tested for.

Limitation - p

A

Empirical approach within test construction

10
Q

Use the rational approach to come up with questions, then run tests with 2 criterion groups and keep the items that distinguish between them.

A

Rational with empirical refinement approach within test construction

11
Q

What are the 5 steps in test development?

A
  1. Test conceptualization
  2. Test construction
  3. Test tryout
  4. Item analysis
  5. Test revision
12
Q

Coming up with an idea that a test ought to be designed to measure [fill in the blank] in [such and such] way.

A

Test conceptualization

13
Q

Preliminary research surrounding the creation of a prototype of the test.

A

Pilot work

14
Q

Process of setting rules for assigning numbers in measurement.

A

Scaling

15
Q

Grouping of words, statements, or symbols on which judgements of the strength of a particular trait, attitude, or emotion are indicated by the test taker.

A

Rating scale

16
Q

Summative rating scale.
Presents the test taker with five alternative responses.
Ex. Never, rarely, sometimes, usually, always
(each response is assigned a value indicating degree of agreement)

A

Likert scale
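
A small Python sketch of summative (Likert) scoring; the 1-5 value coding shown is an assumed mapping, not one prescribed by the cards:

```python
# Assumed coding: each of the five alternatives maps to a value 1-5.
LIKERT_VALUES = {"never": 1, "rarely": 2, "sometimes": 3,
                 "usually": 4, "always": 5}

def likert_total(responses):
    """Summative scoring: the scale score is the sum of item values."""
    return sum(LIKERT_VALUES[r.lower()] for r in responses)

print(likert_total(["Never", "Sometimes", "Always"]))  # 1 + 3 + 5 = 9
```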

17
Q

Test takers are presented with pairs of stimuli (2 photos, 2 objects, 2 statements) which they are asked to compare. They must select one stimulus according to some rule (e.g., the one they agree with more).

A

Method of paired comparisons

18
Q

Printed cards, drawings, photos, objects, etc. are presented for evaluation.

A

Sorting tasks

19
Q

Compare a stimulus with others (ex. rank them).

A

Comparative scaling

20
Q

Stimuli are placed in 1 of 2 or more categories.

A

Categorical scaling

21
Q

Yields ordinal-level measures.
Ex.
1. All people should have the right to decide whether they wish to end their lives.
2. People who are terminally ill and in pain should have the option to have a doctor assist them in ending their lives.
3. People should have the option to sign away the use of artificial life support equipment before they become seriously ill.
4. People have the right to a comfortable life.
All who agree with 1 also agree with 2, 3, 4, etc.

A

Guttman scale
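
A sketch of the cumulative Guttman property in Python, assuming items are ordered from strongest (1) to weakest (4) as in the example above (the function name is mine):

```python
def follows_guttman_pattern(agreements):
    """In a perfect Guttman scale, agreeing with a stronger statement
    implies agreeing with every weaker one, so once the first True
    appears, every later (weaker) item should also be True."""
    first = next((i for i, a in enumerate(agreements) if a), None)
    return first is None or all(agreements[first:])

print(follows_guttman_pattern([False, False, True, True]))  # True
print(follows_guttman_pattern([True, False, True, True]))   # False
```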

22
Q

Reservoir or well from which items will or will not be drawn for the final version of the test.

A

Item pool

23
Q

Form, plan, structure, arrangement, and layout of individual test items.

A

Item format

24
Q

Requires test takers to select a response from a set of alternative responses.

A

Selected-response format

25

Q

Requires test takers to supply or to create the correct answer, not select it.

A

Constructed-response format

26

Q

test taker is presented with two columns: premises on the left and responses on the right, and must determine which response is best associated with which premise.

A

Matching item

27

Q

Multiple-choice item that contains only 2 possible responses. Ex: True/False

A

Binary-choice item

28

Q

Requires the examinee to provide a word or phrase that completes a sentence.

A

Completion item

29

Q

requires the test taker to respond with a word, term, sentence, or paragraph.

A

Short-answer item

30

Q

test item that requires the test taker to respond to a question by writing a composition, typically one that demonstrates recall of facts, understanding, analysis, and/or interpretation.

A

Essay item

31

Q

large and easily accessible collection of test questions often used by teachers.

A

Item bank

32

Q

interactive, computer-administered test-taking process wherein the items presented to the test taker are based in part on the test taker's performance on previous items.

A

Computerized adaptive testing (CAT)
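
A toy sketch of the adaptive idea: present a harder item after a correct answer and an easier one after a miss. Real CAT systems select items with IRT models; this simplified branching rule and the names in it are assumptions for illustration only:

```python
def adaptive_test(pools, answer_fn, n_items=5):
    """pools: lists of items, ordered easy -> hard by pool index.
    answer_fn(item) -> bool collects (or simulates) a response."""
    level = len(pools) // 2            # start at medium difficulty
    results = []
    for _ in range(n_items):
        item = pools[level].pop()      # next unused item at this level
        correct = answer_fn(item)
        results.append((item, correct))
        if correct:                    # branch upward after a correct answer
            level = min(level + 1, len(pools) - 1)
        else:                          # branch downward after a miss
            level = max(level - 1, 0)
    return results
```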
33

Q

diminished utility of an assessment tool for distinguishing test takers at the low end of the ability, trait, or other attribute being measured.

A

Floor effect

34

Q

diminished utility of an assessment tool for distinguishing test takers at the high end of the ability, trait, or other attribute being measured.

A

Ceiling effect

35

Q

ability of the computer to tailor the content and order of presentation of test items on the basis of responses to previous items.

A

Item branching

36

Q

Most commonly cumulative: the higher the score, the higher the test taker stands on the ability, trait, or other characteristic that the test purports to measure.

A

Scoring items

37

Q

test taker responses earn credit toward placement in a particular class or category with other test takers whose pattern of responses is presumably similar in some way.

A

Class/category scoring

38

Q

comparing a test taker's score on one scale within a test to another scale within that same test.

A

Ipsative scoring
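
One common way to make ipsative comparisons concrete is to express each scale relative to the same person's own average; this centering step is an illustrative assumption, not a method stated in the card:

```python
def ipsative_scores(scale_scores):
    """Compare scales within one test taker by centering each scale
    score on that person's own mean across all scales."""
    mean = sum(scale_scores.values()) / len(scale_scores)
    return {scale: score - mean for scale, score in scale_scores.items()}

print(ipsative_scores({"dominance": 18, "affiliation": 12, "autonomy": 15}))
# {'dominance': 3.0, 'affiliation': -3.0, 'autonomy': 0.0}
```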
39

Q

the test is tried out on people who are similar in critical respects to the people for whom the test was designed.

A

Test tryout

40

Q

different types of statistical scrutiny that the test data can potentially undergo after the test tryout.

A

Item analysis

41

Q

statistic indicating the proportion of test takers who responded correctly to an item.
Ex. If 80% got an item correct, then the item-difficulty index is .8.
The larger the item-difficulty index, the easier the item.
For a true/false item, the optimal item difficulty is .75:
.50 + 1.00 = 1.50
1.50 / 2 = .75
For a 5-option multiple-choice item, the optimal item difficulty is .60:
.20 + 1.00 = 1.20
1.20 / 2 = .60

A

Item-difficulty index
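
A minimal sketch computing the index from scored responses (1 = correct, 0 = incorrect); the function name is mine:

```python
def item_difficulty(item_scores):
    """Item-difficulty index: the proportion of test takers who
    answered the item correctly."""
    return sum(item_scores) / len(item_scores)

# 8 of 10 test takers answered correctly -> .8 (a fairly easy item)
print(item_difficulty([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))  # 0.8
```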
42

Q

provides an indication of the internal consistency of a test; the higher the index, the greater the test's internal consistency.
Item-reliability index = Sx(rxt), where Sx is the item-score standard deviation and rxt is the correlation between the item score and the total test score.

A

Item-reliability index

43

Q

statistic designed to provide an indication of the degree to which a test is measuring what it purports to measure; the higher the item-validity index, the greater the test's criterion-related validity.
Item-validity index = Sx(rxc), where Sx is the item-score standard deviation and rxc is the correlation between the item score and the criterion score.

A

Item-validity index
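
Both indices share the same form: the item-score standard deviation multiplied by a correlation. A sketch in Python (statistics.correlation requires Python 3.10+; using the population SD is a choice this sketch makes, not one the cards specify):

```python
from statistics import correlation, pstdev  # correlation: Python 3.10+

def item_reliability_index(item_scores, total_scores):
    """Sx * rxt: item-score SD times the item-total correlation."""
    return pstdev(item_scores) * correlation(item_scores, total_scores)

def item_validity_index(item_scores, criterion_scores):
    """Sx * rxc: item-score SD times the item-criterion correlation."""
    return pstdev(item_scores) * correlation(item_scores, criterion_scores)
```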
44

Q

a measure of discrimination, symbolized by a lowercase italic d. Compares performance on a particular item with performance in the upper and lower regions of a distribution of continuous test scores: it is a measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering it correctly. Ranges from -1 to +1; the higher d is, the more the item discriminates.

A

Item-discrimination index

45

Q

If the distribution of scores is normal, the optimal boundaries of the lower and upper groups used to compute the _________ are the bottom 27% and top 27% of the distribution of scores.

A

Item-discrimination index

46

Q

Formula: d = (passing scores in the top group - passing scores in the bottom group) / n, where n is the number of test takers in each group.

A

Item-discrimination index
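
A sketch of d using the top and bottom 27% of the total-score distribution; the data layout and names are illustrative assumptions:

```python
def item_discrimination(total_scores, item_correct, frac=0.27):
    """d = (U - L) / n: U and L are the numbers of test takers passing
    the item in the upper and lower 27% groups; n is the group size."""
    ranked = sorted(zip(total_scores, item_correct))
    n = max(1, int(len(ranked) * frac))
    lower = sum(item for _, item in ranked[:n])   # passes in bottom group
    upper = sum(item for _, item in ranked[-n:])  # passes in top group
    return (upper - lower) / n                    # ranges from -1 to +1
```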
47

Q

a graphic representation of item difficulty and discrimination. The extent to which an item discriminates high- from low-scoring examinees is apparent from the slope of the curve: the steeper the slope, the greater the item discrimination.

A

Item characteristic curves
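
As an illustration, under a two-parameter logistic IRT model (a standard model, though not named in the cards), the slope of the ICC is governed by the discrimination parameter a:

```python
import math

def icc(theta, a, b):
    """2PL item characteristic curve: probability of a correct response
    given ability theta, item discrimination a, and item difficulty b.
    Larger a -> steeper slope -> sharper discrimination."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# Two items of equal difficulty; the a=2.0 item separates low (-1) from
# high (+1) ability examinees far more sharply than the a=0.5 item.
for a in (0.5, 2.0):
    print(a, round(icc(-1, a, 0), 2), round(icc(1, a, 0), 2))
# 0.5 0.38 0.62
# 2.0 0.12 0.88
```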
48

Q

throw out items that are too easy or too hard and that don't discriminate. Keep those that do discriminate low from high scorers and demonstrate reliability. If more questions are needed, go through the items that were tossed out, pick out those that are repairable, and revise and rewrite them. Then redo the process, revising the standard conditions as well.

A

Test revision

49

Q

the revalidation of a test on a sample of test takers other than those on whom test performance was originally found to be a valid predictor of some criterion.

A

Cross validation

50

Q

the decrease in item validity that inevitably occurs after cross validation of findings (most likely because chance inflated the original validity estimates).

A

Validity shrinkage
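
A toy demonstration of shrinkage with simulated data: item weights fit by least squares on a derivation sample capitalize on chance, so the validity coefficient is typically lower in a fresh sample. Everything here (the use of numpy, the simulation itself) is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

def validity(weights, items, criterion):
    """Validity coefficient: correlation of weighted scores with criterion."""
    return np.corrcoef(items @ weights, criterion)[0, 1]

n, k = 100, 10                                   # test takers, items
items_a = rng.normal(size=(n, k))                # derivation sample
crit_a = 0.3 * items_a[:, 0] + rng.normal(size=n)
weights, *_ = np.linalg.lstsq(items_a, crit_a, rcond=None)

items_b = rng.normal(size=(n, k))                # fresh cross-validation sample
crit_b = 0.3 * items_b[:, 0] + rng.normal(size=n)

print(validity(weights, items_a, crit_a))        # inflated by chance fitting
print(validity(weights, items_b, crit_b))        # typically lower: shrinkage
```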
51

Q

as soon as a test is taken out of its original context, validity goes down.

A

Generalizability