Chapter 11 Flashcards
What is item analysis?
How developers evaluate the performance of each test item
The percentage of test takers who respond correctly.
What should P be?
3pts
Item difficulty
- P should be around 0.50
- P = number of persons who passed the item / number of persons who completed the item
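A minimal sketch of the P calculation above, assuming each response is coded 1 (correct) or 0 (incorrect). The function name and the sample data are hypothetical, chosen only for illustration.

```python
def item_difficulty(responses):
    """Return P, the proportion of test takers who answered the item correctly.

    `responses` holds one entry per test taker who completed the item:
    1 for a correct answer, 0 for an incorrect one (an assumed coding).
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# Hypothetical data: 10 test takers, 5 of whom answered correctly.
item = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
p = item_difficulty(item)
print(p)  # 0.5
```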
Discrimination index?
D=?
2pts
Compares the performance of those who obtained very high test scores (the upper group/U) with the performance of those who obtained very low test scores (the lower group/L) on each item
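D = U - L, where U and L are the proportions (or percentages) of the upper and lower groups who answered the item correctly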
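A minimal sketch of D = U - L, assuming U and L are taken as the proportion of each group that answered the item correctly and that responses are coded 1/0. How the upper and lower groups are formed (for example, top and bottom thirds of total scores) is not specified here, so the groups are passed in directly. All names and data are hypothetical.

```python
def discrimination_index(upper_responses, lower_responses):
    """Return D = U - L for one item.

    U and L are the proportions of the upper and lower scoring groups
    who answered the item correctly (1 = correct, 0 = incorrect).
    """
    if not upper_responses or not lower_responses:
        raise ValueError("both groups need at least one response")
    u = sum(upper_responses) / len(upper_responses)
    l = sum(lower_responses) / len(lower_responses)
    return u - l


# Hypothetical data: 3 of 4 high scorers and 1 of 4 low scorers got the item right.
upper = [1, 1, 1, 0]
lower = [0, 1, 0, 0]
d = discrimination_index(upper, lower)
print(d)  # 0.5
```

A high positive D means the item separates high scorers from low scorers; a negative D means low scorers did better on the item than high scorers.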
A measure of the strength and direction of the relation between the way test takers responded to one item and the way they responded to all of the items as a whole.
What type of quantitative item analysis is this?
What should the value be around? What is the worst case?
3pts
Item-total correlation
- Negative correlation is the worst
- Item-total correlation should be around 0.2 to 0.4
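A minimal sketch of an item-total correlation, assuming items are scored 1/0 and the total is the simple sum of all items (so each item is included in its own total, the uncorrected form). The Pearson helper, function names, and data are hypothetical and written with the standard library only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def item_total_correlations(score_matrix):
    """Correlate each item (column) with the total score of each test taker (row)."""
    totals = [sum(row) for row in score_matrix]
    n_items = len(score_matrix[0])
    return [
        pearson_r([row[j] for row in score_matrix], totals)
        for j in range(n_items)
    ]


# Hypothetical data: 6 test takers (rows) by 4 items (columns).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
r_values = item_total_correlations(scores)

# Items with a negative item-total correlation are the worst case.
negative_items = [j for j, r in enumerate(r_values) if r < 0]
print(negative_items)  # [3]
```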
Displays the correlation of each item with every other item.
Interitem correlation matrix
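A minimal sketch of an interitem correlation matrix, assuming 1/0 item scores and a plain Pearson correlation between every pair of items. The helper names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def interitem_matrix(score_matrix):
    """Return a square matrix where entry [i][j] is the correlation of item i with item j."""
    n_items = len(score_matrix[0])
    columns = [[row[j] for row in score_matrix] for j in range(n_items)]
    return [
        [pearson_r(columns[i], columns[j]) for j in range(n_items)]
        for i in range(n_items)
    ]


# Hypothetical data: 6 test takers (rows) by 3 items (columns).
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 0],
]
matrix = interitem_matrix(scores)
for row in matrix:
    print([round(r, 2) for r in row])
```

The matrix is symmetric with 1.0 on the diagonal, so only the entries above (or below) the diagonal need to be inspected.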
The result of correlating two dichotomous (having only two values) variables.
What type of coefficient?
Phi coefficients
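A minimal sketch of a phi coefficient for two dichotomous variables, using the standard 2x2 table formula phi = (ad - bc) / sqrt((a+b)(c+d)(a+c)(b+d)). The 1/0 coding, function name, and data are assumptions for illustration.

```python
from math import sqrt


def phi_coefficient(x, y):
    """Phi coefficient for two equal-length lists of 1/0 values."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)  # both 1
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)  # x only
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)  # y only
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)  # both 0
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a variable has no variation")
    return (a * d - b * c) / denom


# Hypothetical responses of 6 test takers to two items.
item_1 = [1, 1, 1, 0, 0, 0]
item_2 = [1, 1, 0, 0, 0, 1]
phi = phi_coefficient(item_1, item_2)
print(round(phi, 2))  # 0.33
```

On 1/0 data this value is numerically the same as the Pearson correlation of the two lists, which is why it is read the same way.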
Test scores used to sort individuals into two or more categories based on their scores on the criterion measure.
What type of test is this?
Empirically based tests
When an item is easier for one group than for another group.
What type of bias is this?
Item bias
The degree to which an immigrant or a minority member has adapted to a country’s mainstream culture.
Acculturation
Test developers often ask test takers to complete a questionnaire about how they viewed the test itself and how they answered the test questions.
What type of analysis is this?
Qualitative analysis
The scores on a test taken by different subgroups in the population (e.g., men, women) need to be interpreted differently because of some characteristic of the test not related to the construct being measured.
Measurement bias
What are the two types of measurement bias? How do differential validity and single-group validity differ?
Differential validity: when a test yields significantly different validity coefficients for subgroups
Single-group validity: when a test is valid for one group but not for another (e.g., valid for White test takers but not for Black test takers)
Dividing the number of persons who answered correctly by the total number of persons who responded to the question is a measure of an item’s ______.
A. discrimination index
B. phi coefficient
C. difficulty
D. bias
C. difficulty
What is the discrimination index?
A. a comparison of the scores of respondents by sex, race, or other personal characteristics
B. an index of how difficult each test item is
C. a comparison of high performer scores with low performer scores on each item
D. cumulative results from an item analysis yielding an overall score for the test
C. a comparison of high performer scores with low performer scores on each item
When test developers examine the discrimination indexes of each item, which one of the following outcomes do they consider most desirable?
A. low positive numbers
B. high positive numbers
C. average positive numbers
D. low negative numbers
B. high positive numbers
How are phi coefficients interpreted?
A. the same as discrimination coefficients
B. the same as reliability coefficients
C. the same as validity coefficients
D. the same as Pearson product moment correlations
D. the same as Pearson product moment correlations
Which one of the following provides important information for increasing the test’s internal consistency?
A. discrimination index
B. difficulty level
C. interitem correlation matrix
D. coefficient of multiple correlation
C. interitem correlation matrix
A test question has item bias when it ______.
A. is easier for one group than for another group
B. has a high discrimination index
C. does not correlate with other test items
D. does not correlate with the test’s raw score
A. is easier for one group than for another group
Which one of the following statements about validation studies is TRUE?
A. The validation study should take place in one or more situations that match the actual circumstances in which the test will be used.
B. The pilot test can also serve as the validation study.
C. Only a small sample (fewer than 30) of the target audience is necessary for the validation study.
D. Calculating reliability is done during the item analysis and not during the validation study.
A. The validation study should take place in one or more situations that match the actual circumstances in which the test will be used.
The main purpose of the validation study is to ______.
A. get the reactions of test takers and test users
B. gather data on the construct(s) that the test measures
C. confirm the test’s ability to yield meaningful and accurate results
D. comply with legal requirements that tests must have evidence of validity
C. confirm the test’s ability to yield meaningful and accurate results
What are cut scores?
A. scores that would have been higher had it not been for test bias
B. mean, median, and mode of the norm distribution
C. decision points for dividing test scores into pass/fail groupings
D. transformed scores, such as z and T scores
C. decision points for dividing test scores into pass/fail groupings
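A minimal sketch of applying a cut score, assuming a score at or above the cut counts as a pass. The cut score of 70, the function name, and the scores are hypothetical.

```python
def classify(scores, cut_score):
    """Label each score "pass" if it meets or exceeds the cut score, else "fail"."""
    return ["pass" if s >= cut_score else "fail" for s in scores]


# Hypothetical test scores and a hypothetical cut score of 70.
scores = [55, 70, 82, 69]
labels = classify(scores, cut_score=70)
print(labels)  # ['fail', 'pass', 'pass', 'fail']
```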