Development of maximum performance tests Flashcards
What seven steps are involved in test development?
- The construct of interest
- The measurement mode
- The objectives of the test
- The population
- The conceptual framework: the theoretical framework on which the test is based
- The item response mode
- The administration mode
In what three ways can latent variables vary?
Scope
Content
Between educational and psychological variables
What three main objectives can tests have?
Description, diagnosis and decision making
What is the conceptual framework used for?
To make conceptual distinctions and organise ideas
Name as many item-writing guidelines as you can think of (19 mentioned)
- Focus on one relevant aspect
- Use independent item content (except for items that share a reading passage)
- Avoid overly specific and overly general content (leads to ambiguity)
- Avoid items that deliberately deceive
- Keep the vocabulary simple for the population
- Put items vertically
- Minimize reading time and avoid unnecessary information
- Use correct language
- Use non-sensitive language
- Use a clear stem and include central idea in the stem
- Word items positively
- Write three options, unless additional plausible distractors are easy to write
- One option must be unambiguously the right answer
- Place the options in alphabetical, logical or numerical order
- Vary the location of the correct options across the test
- Keep the options homogeneous in length, content and grammar
- Avoid "all of the above" as the last option
- Make distractors plausible
- Avoid giving clues to the right answer
Which responses have to be rated by raters?
Responses to free-response (constructed-response) items
What item-rating guidelines are there? (9)
- Rate responses anonymously
- Rate the responses to one item at a time (all the responses to one item should be rated before moving to the next item)
- Provide the rater with a frame of reference: raters should be given instructions, schemes or ideal responses
- Separate irrelevant aspects from the relevant performance
- Use more than one rater
- Re-rate the free responses
- Rate all the responses to an item on the same occasion
- Rearrange the order of the responses
- Read a sample of the responses before starting the rating
What is meant by pilot studies?
Studies to test the quality of the concept items
What is the coefficient kappa used for?
To assess the consistency of a rater's ratings across different occasions
What formula does the coefficient kappa use?
$\mathrm{Kappa} = \dfrac{O - E}{1 - E}, \quad E < 1$
E is the expected proportion of identical ratings and O is the observed proportion of identical ratings. E is calculated by multiplying the corresponding marginal proportions for each rating category and summing the products.
How is O calculated?
By adding the proportions in the diagonal cells of the rating table
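The kappa calculation can be illustrated with a small sketch. It assumes the ratings are summarised in a square table of counts, where rows are the categories given on the first rating and columns the categories given on the second rating. The function name `coefficient_kappa` and the example counts are hypothetical and chosen only for illustration.

```python
def coefficient_kappa(table):
    """Compute coefficient kappa from a square table of rating counts.

    table[i][j] is the number of responses placed in category i on the
    first rating and in category j on the second rating (assumed layout).
    """
    n = sum(sum(row) for row in table)
    k = len(table)

    # O: observed proportion of identical ratings (sum of the diagonal cells).
    observed = sum(table[i][i] for i in range(k)) / n

    # Marginal proportions of the rows and of the columns.
    row_props = [sum(table[i]) / n for i in range(k)]
    col_props = [sum(table[i][j] for i in range(k)) / n for j in range(k)]

    # E: expected proportion of identical ratings, i.e. the sum over
    # categories of (row marginal * column marginal).
    expected = sum(r * c for r, c in zip(row_props, col_props))

    if expected >= 1:
        raise ValueError("kappa is undefined when E = 1")
    return (observed - expected) / (1 - expected)


# Hypothetical example: 50 responses rated twice as pass/fail.
ratings = [[20, 5],
           [10, 15]]
kappa = coefficient_kappa(ratings)
print(round(kappa, 2))  # O = 0.7, E = 0.5, so kappa is about 0.4
```

In this example the rater gave identical ratings to 70% of the responses, but 50% would be expected from the marginal proportions alone, so kappa is well below the raw proportion of identical ratings.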
In what respects are the raters studied in pilot studies?
The consistency within raters and the agreement between raters
What can pilot studies reveal?
- Many raters rate many items inconsistently (the rating procedure is inadequate)
- Some raters are inconsistent (they do not understand the instructions; remove these raters)
- Some items are rated inconsistently (remove these items)
- Raters disagree on many items (the rating procedure is inadequate)
What kind of questions should be placed at the start of the test and why?
Easier items, because test takers are often nervous at the start and easy items help them build confidence