Item analysis Flashcards
What is item analysis?
A general term used to describe a set of methods used to evaluate test items.
- Item analysis helps us to decide what items to include in our measure.
- The basic methods include item difficulty and item discriminability.
What are the different methods of conducting item analysis?
- Item difficulty
- Item discriminability
- Item characteristic curves (ICCs)
- Item Response Theory (IRT)
- Criterion-referenced tests.
What is item difficulty?
- The proportion of people who get a particular item correct.
- The higher the item difficulty value, the easier the item.
- The formula is p = (number of people who answered the item correctly) / (number of people taking the measure).
- It is also referred to as the facility index.
- Item difficulty ranges between zero and 1.
- Ideally we want p values that fall within the 0.3 to 0.7 range. Higher than 0.7 is too easy and lower than 0.3 is too difficult.
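The p formula and the 0.3 to 0.7 rule of thumb can be sketched in Python. This is a minimal illustration, not a standard library routine: the function names `item_difficulty` and `classify` and the small score matrix are made up for the example.

```python
def item_difficulty(responses):
    """Return p for each item: the proportion of test takers who got it correct.

    responses: one row per test taker, each row a list of 0/1 item scores.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_people for i in range(n_items)]


def classify(p, low=0.3, high=0.7):
    """Apply the 0.3 to 0.7 rule of thumb to a single p value."""
    if p > high:
        return "too easy"
    if p < low:
        return "too difficult"
    return "acceptable"


# Hypothetical data: 5 test takers, 3 items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]

p_values = item_difficulty(scores)
labels = [classify(p) for p in p_values]
for item_number, (p, label) in enumerate(zip(p_values, labels), start=1):
    print(f"Item {item_number}: p = {p:.2f} ({label})")
```

Because p is computed from whoever happened to take the test, running the same function on a different sample can give different values for the same item.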
What is facility and the facility index?
- An item with good item facility is one for which different respondents give different answers.
- The facility index gives an indication of the extent to which respondents answer an item in the same way.
What affects item difficulty?
- The format of the test
- The number of test items.
- Item difficulty is more applicable in settings where there is a clear correct and incorrect answer.
What is the optimum difficulty level (ODL)?
- Between 0.30 and 0.70
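- Calculate optimum difficulty: (1 + chance)/2, which is the same as chance + (1 - chance)/2. Essentially halfway between 100% getting the item correct and the level of success expected from guessing. E.g. for a four-option MCQ, chance = 0.25, so the ODL is 0.625.
- The ODL for the dichotomous (two-option) format is 0.75 (chance = 0.5), so it does not fall within the 0.3-0.7 range.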
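As a worked illustration of "halfway between chance and 100%", the sketch below computes the ODL from the number of answer options. It assumes every option is equally likely under guessing; the function name `optimum_difficulty` is invented for the example.

```python
def optimum_difficulty(n_options):
    """Midpoint between the chance success rate and 1.0 (everyone correct)."""
    chance = 1 / n_options          # assumes blind guessing among equal options
    return chance + (1 - chance) / 2


for n_options in (2, 4, 5):
    odl = optimum_difficulty(n_options)
    in_range = 0.3 <= odl <= 0.7
    print(f"{n_options} options: ODL = {odl:.3f}, within 0.3-0.7: {in_range}")
```

With two options the midpoint lands above 0.7, which is why the dichotomous format is the exception to the usual range.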
What are some key things to note about the ODL?
- We want most of the items to be around the ODL and few at the extremes of this range.
- The distribution of p-values should be approximately normal in MCQs.
- We need a range to discriminate between stronger and struggling students.
- The facility index of the item tells us nothing about its intrinsic characteristics. Its value is related to the sample. Different samples yield different results: item difficulty is sample dependent.
What are the exceptions for having items be within the ODL range?
- At times we need more difficult items (e.g. selection process)
- At times we need easier items (e.g. special education)
- At times we need to consider other factors (e.g. boost morale)
What is item discriminability?
- Assessment of item discriminability determines whether the people who have done well on particular items have also done well on the whole test.
- It can be assessed using different methods: the extreme groups method and the point-biserial method.
What is the discrimination index?
- Higher values indicate better discriminability.
- Good item discriminability is when people who do well on the test overall get the item correct and vice versa.
What is the extreme groups method?
- This method compares those who have done well with those who have done poorly on a test.
- Calculated by taking the proportion of people in the upper quartile who got the item correct and subtracting the proportion of people in the lower quartile who got the item correct; this difference is referred to as the discrimination index.
- d(i) = U/N(u) - L/N(l), where U and L are the numbers of people in the upper and lower groups who got item i correct, and N(u) and N(l) are the sizes of those groups.
- 0.4 is the baseline for item discriminability. An item with a value below 0.4 does not have good discriminability.
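The discrimination index can be sketched as a difference between two proportions. This is an illustrative sketch only: the function name `discrimination_index`, the `fraction` parameter, and the data are assumptions, and ties at the group boundary are simply broken by list order.

```python
def discrimination_index(item_scores, total_scores, fraction=0.25):
    """d = U/N(u) - L/N(l) using the top and bottom `fraction` of total scores.

    item_scores: 0/1 score on one item for each test taker.
    total_scores: total test score for each test taker, in the same order.
    """
    n = len(total_scores)
    k = max(1, int(n * fraction))                      # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_scores[i] for i in upper) / k   # U / N(u)
    p_lower = sum(item_scores[i] for i in lower) / k   # L / N(l)
    return p_upper - p_lower


# Hypothetical data for 8 test takers.
totals = [2, 3, 4, 5, 6, 7, 8, 9]
good_item = [0, 0, 0, 1, 0, 1, 1, 1]   # high scorers pass, low scorers fail
poor_item = [1, 0, 0, 0, 1, 0, 0, 1]   # unrelated to total score

d_good = discrimination_index(good_item, totals)
d_poor = discrimination_index(poor_item, totals)
print(f"good item: d = {d_good:.2f}, poor item: d = {d_poor:.2f}")
```

Under the 0.4 rule, the first item would be kept and the second would be flagged for review.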
What is the point-biserial method?
- It is also known as item-total correlation.
- Good items are those items for which students who pass the item do well on the overall test. And conversely, students who fail the items should do badly on the overall test.
- If the students who fail the item tend to do well on the overall test (and those who pass it tend to do badly), the item-total correlation will be negative.
- The rule here is also 0.4.
- Item-total correlations can also be used for Likert-type items.
- The closer the number is to one, the better (same for extreme groups).
What are the steps for calculating the point-biserial correlation?
- Find the mean score and SD for all test takers.
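- Find the mean test score for only those who got the item in question (e.g. item 1) correct.
- Subtract the total mean from this mean and divide by the SD.
- Multiply the result by the square root of p/(1 - p), where p is the proportion of test takers who got the item correct.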
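The calculation can be sketched in Python. This sketch assumes the common textbook form of the point-biserial: the mean total score of those who passed the item minus the overall mean, divided by the SD of all total scores, then scaled by sqrt(p/(1 - p)). The function name `point_biserial` and the data are hypothetical.

```python
import math
from statistics import mean, pstdev


def point_biserial(item_scores, total_scores):
    """Point-biserial (item-total) correlation for one 0/1 scored item."""
    m_total = mean(total_scores)
    sd_total = pstdev(total_scores)   # population SD of all total scores
    passed = [t for x, t in zip(item_scores, total_scores) if x == 1]
    m_passed = mean(passed)           # mean total of those who got the item correct
    p = len(passed) / len(item_scores)
    # Undefined when everyone passes or everyone fails the item (p = 1 or p = 0).
    return (m_passed - m_total) / sd_total * math.sqrt(p / (1 - p))


# Hypothetical data for 8 test takers.
totals = [2, 3, 4, 5, 6, 7, 8, 9]
item_1 = [0, 0, 0, 1, 0, 1, 1, 1]

r_pb = point_biserial(item_1, totals)
print(f"point-biserial for item 1: {r_pb:.3f}")
```

The result is the ordinary Pearson correlation between the 0/1 item score and the total score, so it can be checked against any standard correlation routine.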
What else do you need to know about the point-biserial correlation?
- Item correlations can also be used for Likert-type test items, category format. Good items here would be those that have a positive item total correlation.
- E.g. if an item on a questionnaire measuring schizophrenia symptoms has a high correlation with total scores on the overall questionnaire, then the item is good at measuring schizophrenia symptoms.
- Can use this as an indicator of whether or not to include an item in a test/questionnaire (include items with a higher correlation and exclude those with a lower one).
What is an item characteristic curve?
- The relationship between performance on an item and performance of the overall test tells us how well the item is tapping into what we want to measure.
- They are a graphical display of item functioning.
- The total test score is plotted on the x-axis.
- The proportion of people getting the item correct is plotted on the y-axis.
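The points of an empirical curve of this kind can be tabulated before plotting. The sketch below is one simple way to do it, assuming each distinct total score is its own x-axis point (real data would often be grouped into score bands); the function name `empirical_icc` and the data are invented for the example.

```python
from collections import defaultdict


def empirical_icc(item_scores, total_scores):
    """Return (total score, proportion correct on the item) pairs, sorted by score."""
    counts = defaultdict(lambda: [0, 0])   # total score -> [n_correct, n_people]
    for x, t in zip(item_scores, total_scores):
        counts[t][0] += x
        counts[t][1] += 1
    return [(t, correct / people) for t, (correct, people) in sorted(counts.items())]


# Hypothetical data for 8 test takers.
totals = [1, 1, 2, 2, 3, 3, 4, 4]
item = [0, 0, 0, 1, 1, 0, 1, 1]

curve = empirical_icc(item, totals)
for total, proportion in curve:
    print(f"total score {total}: proportion correct = {proportion:.2f}")
```

A curve that rises from left to right, as in this made-up data, is what a well-functioning item would be expected to show.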