Lecture 3 - Item Analysis Flashcards

Question 1

Q

What is the objective of item analysis, and what problem does it face?

Answer

A

• The purpose of item analysis is to find items that form an internally consistent scale and to eliminate those items that do not.
• A scale is only useful if it is able to detect differences among individuals on the measured construct.
• A good item is one that contributes variance to the test score.

• Problem: There is no yardstick to evaluate whether the item variance is small or large…

Question 2

Q

What do we examine in a psychometric evaluation of items?

Answer

A

Item Means
Inter-item correlation
Corrected item-scale (or item-total) correlation
Coefficient alpha

Question 3

Q

What do extreme item means tell us?

Answer

A

• Extreme item means suggest that there may be a problem with the item. Either:
The item is worded too strongly or too weakly (hence, most respondents agree/disagree); OR
The event described always (or never) occurs for most respondents.

• Items with means close to the middle of the score range (i.e., around 4 on a 7-pt scale) are therefore ideal, though we would also want to retain a range of items with different means/distributions.

Question 4

Q

What is the item difficulty index?

Answer

A

In objective items (e.g., MCQs) with correct/incorrect responses, or in binary items generally (scored 1/0), the item mean is the proportion of respondents who answered the item correctly:

p = N_c/N

N = total number of respondents
N_c = number of respondents who answered correctly
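As a minimal sketch of the formula above, the difficulty index can be computed from a list of 1/0 item scores. The function name and the data are hypothetical, made up for illustration.

```python
def item_difficulty(responses):
    """Proportion of respondents who answered the item correctly (p = N_c / N).

    responses: list of 1 (correct) / 0 (incorrect) scores for one item.
    """
    n = len(responses)
    if n == 0:
        raise ValueError("need at least one response")
    n_correct = sum(1 for r in responses if r == 1)
    return n_correct / n


# Hypothetical data: 10 respondents, 7 of whom answered correctly.
item_scores = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p = item_difficulty(item_scores)
print(p)  # 0.7
```

A higher p means an easier item, since more respondents got it right.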

Question 5

Q

Why is it preferable to have items with p = 0.5?

Answer

A

Variance = p(1-p)
Variance is highest when p = 0.5

Even so, item difficulties across an entire test should be spread over the full range (i.e., include some easy and some difficult items) so that the test can also tell the strong test takers apart.
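The claim that variance peaks at p = 0.5 can be checked numerically. This is a small illustrative sketch; the function name is an assumption, not part of the lecture.

```python
def binary_item_variance(p):
    """Variance of a binary (1/0) item whose proportion correct is p."""
    return p * (1 - p)


# Variance rises towards p = 0.5 and falls symmetrically on either side.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(binary_item_variance(p), 2))
# 0.1 0.09
# 0.3 0.21
# 0.5 0.25
# 0.7 0.21
# 0.9 0.09
```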

Question 6

Q

• Why is inter-item correlation important?

Answer

A

Items with low variance correlate poorly with other items.
Items that correlate with each other contribute additional variance to the total scale. Why? Because the variance of a composite includes a covariance term that grows with the correlation:
s^2_composite = s^2_i + s^2_j + 2 r_ij s_i s_j
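The composite variance formula can be verified on a small example. The sketch below uses made-up scores for two items and a hand-written Pearson correlation helper; all names are hypothetical.

```python
from math import isclose, sqrt
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of 8 respondents on two items.
item_i = [2, 4, 5, 3, 6, 7, 4, 5]
item_j = [1, 3, 5, 4, 6, 6, 3, 5]
composite = [a + b for a, b in zip(item_i, item_j)]

s2_i, s2_j = variance(item_i), variance(item_j)
r_ij = pearson_r(item_i, item_j)

# s^2_composite = s^2_i + s^2_j + 2 r_ij s_i s_j
predicted = s2_i + s2_j + 2 * r_ij * sqrt(s2_i) * sqrt(s2_j)
print(isclose(variance(composite), predicted))  # True
```

The larger r_ij is, the larger the last term, so positively correlated items add more to the total-score variance than uncorrelated ones.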

• Clark & Watson (1995) argued that one should examine the range and distribution of all inter-item correlations because the mean inter-item correlation alone may be misleading.
- Recommended that all of the individual inter-item correlations fall within the range of .15 to .50, depending on the construct measured.
- If the construct is broad, mean inter-item correlation of about .15 to .20
- If the construct is narrow, mean inter-item correlation of about .40 to .50

If an inter-item correlation is greater than .70, item redundancy is a problem.
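One way to inspect the range and distribution of inter-item correlations is sketched below. It assumes item scores are stored as one list per item; the function names, data layout and data are hypothetical, and the .70 cutoff is the redundancy threshold quoted above.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def summarize_inter_item(items, redundancy_cutoff=0.70):
    """items: one score list per item (same respondents, same order)."""
    pairs = {
        (i, j): pearson_r(items[i], items[j])
        for i, j in combinations(range(len(items)), 2)
    }
    values = list(pairs.values())
    return {
        "pairs": pairs,
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        # Item pairs correlated above the cutoff may be redundant.
        "redundant": [pair for pair, r in pairs.items() if r > redundancy_cutoff],
    }


# Hypothetical responses of 6 people to 3 items on a 7-point scale.
items = [
    [2, 4, 5, 3, 6, 7],
    [1, 3, 5, 4, 6, 6],
    [4, 4, 3, 5, 5, 6],
]
summary = summarize_inter_item(items)
print(round(summary["mean"], 2), round(summary["min"], 2), round(summary["max"], 2))
print(summary["redundant"])
```

Reporting the minimum and maximum alongside the mean follows the point that the mean alone may hide very low or very high individual correlations.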

Question 7

Q

We want the corrected item-total correlation where the item being evaluated is correlated with the sum of the remaining items in the scale (excluding the item itself). Why?

Answer

A

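Including the item itself in the total would artificially inflate the item-total correlation, because the item would then be correlated partly with itself. The inflation is worst when there are very few items in the scale.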
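The difference between the uncorrected and corrected item-total correlation can be seen in a small sketch. The helper names and the three-item data set are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; assumes neither list is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total_correlations(items, index):
    """Return (uncorrected, corrected) item-total correlations for items[index].

    items: one score list per item (same respondents, same order).
    """
    item = items[index]
    n = len(item)
    total = [sum(col[k] for col in items) for k in range(n)]
    rest = [t - x for t, x in zip(total, item)]  # total excluding the item itself
    return pearson_r(item, total), pearson_r(item, rest)


# Hypothetical scores of 5 respondents on a 3-item scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]
for idx in range(len(items)):
    uncorrected, corrected = item_total_correlations(items, idx)
    print(idx, round(uncorrected, 2), round(corrected, 2))
```

With only three items, each item makes up a large share of the total, so the uncorrected value is noticeably higher than the corrected one.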

Question 8

Q

How high should the item-total correlation be for an item to be included in the scale?

Answer

A

There is no clear rule of thumb. Studies vary in the item-total correlation cutoff they use for item elimination, from .35 to .50.
Some studies decide in advance that the scale should have m items and then choose the m items with the largest item-scale correlations.
Item-total correlation is usually considered together with coefficient alpha.

Question 9

Q

What does coefficient alpha reflect?

Answer

A

• Coefficient alpha reflects internal consistency reliability.
• Alpha normally takes values from 0 to 1. A negative alpha means something is wrong (such as negative inter-item correlations).
• Common guidelines on how alpha value is judged:
DeVellis: minimally acceptable alpha: .70 (p.95)
Clark & Watson (1995): min. acceptable alpha: .80
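The cards do not state the formula, so the sketch below assumes the standard definition of coefficient alpha: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). The function name and data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """Coefficient alpha for a scale.

    items: one score list per item (same respondents, same order).
    Assumes at least 2 items and a total score that is not constant.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    total = [sum(col[r] for col in items) for r in range(n)]
    sum_item_var = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - sum_item_var / variance(total))


# Hypothetical scores of 5 respondents on a 3-item scale.
items = [
    [1, 2, 3, 4, 5],
    [2, 1, 4, 3, 5],
    [1, 3, 2, 5, 4],
]
print(round(cronbach_alpha(items), 2))  # 0.84
```

Under the guidelines quoted above, this hypothetical scale would clear both the .70 and the .80 threshold.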

Question 10

Q

What affects alpha?

Answer

A

Number of items in the scale
Interitem correlations
• With a longer scale, the resulting alpha has a narrower C.I. compared to alpha based on a shorter scale. This means that the precision of the alpha estimate increases with the number of items.
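How the two factors listed above drive alpha can be illustrated with the standardized alpha formula, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the mean inter-item correlation. This formula is not given in the cards; it is used here as an assumption, and it treats all items as standardized.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha from the number of items k and mean inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Same mean inter-item correlation, increasing scale length.
for k in (5, 10, 20):
    print(k, round(standardized_alpha(k, 0.30), 2))
# 5 0.68
# 10 0.81
# 20 0.9
```

Holding the mean inter-item correlation at .30, alpha climbs from about .68 to about .90 simply by lengthening the scale, which is why a high alpha on a long scale is not by itself evidence of strong inter-item correlations.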

Question 11

Q

• What is the trade-off of focusing on maximizing alpha?

Answer

A

Attenuation paradox: increasing the internal consistency of a test beyond a certain point will not enhance construct validity and may even come at the expense of validity.

Strongly correlated items are likely to be highly redundant and contribute little more construct information than any one item individually.

Question 12

Q

• How many items should one aim for in a scale?

Answer

A

Depends on the dimensionality and whether the construct has a broad/narrow domain.
For broad concepts, the maximum should be around 35 items.
Strive for a balance between reliability and brevity.
- In the case of research instrument, there is little need to strive for higher reliability once a .80 reliability is obtained (Clark & Watson, 1995; Netemeyer, Bearden, & Sharma, 2003).
The minimum is for each factor or dimension to include at least 3 items.

Question 13

Q

• What does it mean when there are negative item-scale correlations?

Answer

A

Check for errors (e.g., failure to reverse score).
The item may be poorly written (e.g., ambiguous).
The item may be inappropriate for the current sample of respondents.
If it is none of the above, go back to the conceptualization of the construct to consider whether the construct has been properly defined.
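For the first check, reverse scoring a negatively worded item is usually done by subtracting each score from (scale minimum + scale maximum). The sketch below assumes a 1-to-7 response scale; the function name and data are hypothetical.

```python
def reverse_score(x, scale_min=1, scale_max=7):
    """Reverse-score one response on a scale running from scale_min to scale_max."""
    if not scale_min <= x <= scale_max:
        raise ValueError("response outside the scale range")
    return scale_min + scale_max - x


# Hypothetical raw responses to a negatively worded item on a 7-point scale.
raw = [1, 2, 6, 7, 4]
print([reverse_score(x) for x in raw])  # [7, 6, 2, 1, 4]
```

If such an item is left unreversed, its correlation with the rest of the scale has the opposite sign, which is one common source of a negative item-scale correlation.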

Question 14

Q

What is a caveat regarding alpha?

Answer

A

• Reliability indices such as Cronbach’s alpha do not tell us anything about the dimensionality (homogeneity) of the scale.
• Homogeneity of a scale refers to whether the scale items assess a single underlying construct.
• Alpha assumes homogeneity and checks on whether the items in a scale are sufficiently inter-related (internal consistency).

Question 15

Q

Concept of Dimensionality
What is dimensionality concerned with?

Answer

A

• Dimensionality is concerned with homogeneity of items:
A set of items is homogeneous when responses to all the items are a function of the same psychological attribute.
Such a test is considered unidimensional because it appraises one and only one construct or trait.

• Establishing dimensionality of constructs is an important part of the scale development process.

Question 16

Q

How are scores calculated in a unidimensional test?

Answer

A

• In a unidimensional test, all the items are combined (through summation) to obtain a composite score.

Question 17

Q

How are scores calculated in a multidimensional test with correlated dimensions?

Answer

A

Items 1, 2, and 3 are homogeneous and are summed to form subscale A, which measures attribute A.
Similarly, subscale B consists of items 4, 5, and 6, which are summed to provide its own score.
Subscales A and B can be combined to form a total score, which measures a higher-order (or second-order) factor.
The second-order factor is interpreted in terms of the first-order factors (A and B) that load on it.
Each subscale score is evaluated independently for its psychometric properties.
The total test score is similarly evaluated for its psychometric properties.
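The scoring scheme described above can be sketched as follows. The item-to-subscale mapping mirrors the card (items 1 to 3 for A, items 4 to 6 for B); the function name and the responses are hypothetical.

```python
def score_subscales(responses, subscales):
    """Sum the responses belonging to each subscale.

    responses: one respondent's item scores, in item order.
    subscales: mapping from subscale name to zero-based item positions.
    """
    return {name: sum(responses[i] for i in idxs) for name, idxs in subscales.items()}


# Items 1-3 form subscale A, items 4-6 form subscale B (zero-based positions).
subscales = {"A": [0, 1, 2], "B": [3, 4, 5]}
responses = [5, 6, 4, 2, 3, 3]  # hypothetical answers of one respondent

scores = score_subscales(responses, subscales)
total = sum(scores.values())  # meaningful only when the dimensions are correlated
print(scores, total)  # {'A': 15, 'B': 8} 23
```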

Question 18

Q

How are scores calculated in a multidimensional test with uncorrelated dimensions?

Answer

A

• Attributes A and B are independent and do not reflect any higher-order factor.
• No total test score can be computed because it is meaningless to combine two dimensions that are unrelated to each other.

(18 cards)