Week 4 Test Development Flashcards
Define test conceptualisation
Starts with a question.
- Review the literature: relevant theories/constructs, definitions, parameter setting
- Obtain a clear, theory-informed conceptualisation and definition of the target construct
- Develop an initial item pool
Define test construction
The stage in which the test is drafted: items are written (following writing guidelines) and scaling methods and response formats are chosen.
Define tryout
Administering the test to a representative sample using standardised instructions.
Data from the tryout are used to narrow down the number of items.
What are 8 writing guidelines in test construction?
-
-
-
-
-
-
-
-
What are the four types of scales used in psychological test construction?
.
What are the pros/cons of Likert scales?
Pros:
- simple to construct and administer
- lets respondents express degrees of agreement rather than a forced yes/no
- summing across items yields a wide range of scores
Cons:
- prone to response biases (central tendency, acquiescence)
- intervals between scale points may not be equal
What are the pros/cons of binary choice scales?
Pros:
- quick to construct, administer, and score
- scoring is objective
Cons:
- high probability (0.5) of guessing an item correctly
- restricts items to statements that are unambiguously true or false
What are the 5 types of response formats in test construction?
- Likert
- B. C S
- P. C
- C. S
- E or W
What are the pros/cons of essay/ written format tests?
- can assess depth of knowledge, organisation, and reasoning
- items are quick to write
Cons:
- time-consuming to score
- scoring is subjective (lower inter-rater reliability)
- samples only a narrow range of content
- writing ability can confound the construct being measured
What are the 3 benefits of having an expert review an initial item pool before administering to the target sample?
- confirm/invalidate definition of construct by asking…
- evaluate the items c___ and c___
- identify other
Articulate the criteria that assess whether an item is a ‘good item’
Good test items make good tests.
Achieved through item analysis.
Criteria:
- reliability
- validity
- discriminates at different levels of the trait/ability
What are the item properties which you may investigate when trying to assess whether an item is a ‘good item’?
- item difficulty/distribution
- dimensionality (factor analysis)
- item reliability
- item discrimination (item-discrimination index, item-characteristic curve)
what is the formula for item difficulty index?
what does a high index mean?
What must also be considered when looking at optimal item difficulty index?
formula:
item-difficulty index = number of examinees who answered the item correctly / total number of examinees
index = ranges between 0 and 1
high index = an easy item (most examinees answered it correctly)
consideration: probability of ___
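The formula above can be sketched in a few lines (toy tryout data, not from the deck):

```python
# Item-difficulty index: proportion of examinees who answered the item correctly
def item_difficulty(responses):
    """responses: 1 = correct, 0 = incorrect, one entry per examinee."""
    return sum(responses) / len(responses)

# Hypothetical tryout: 8 of 10 examinees answered correctly -> 0.8 (an easy item)
print(item_difficulty([1, 1, 1, 1, 1, 1, 1, 1, 0, 0]))  # 0.8
```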
Binary choice scales (BCS) (true/false) give a high probability (0.5 out of 1) of guessing an item correctly.
a) What is an optimal item-difficulty index for BCS?
A multiple-choice question with 4 response options gives a 0.25 out of 1 probability of guessing correctly.
b) What is an optimal item-difficulty index for multiple choice questions?
a) 0.75 out of 1 on the item-difficulty index. The optimal index is set halfway between the chance level and 1: (1 + 0.5)/2 = 0.75, because test takers have a high chance (0.5) of guessing correctly.
b) 0.625, i.e. (1 + 0.25)/2. It is lower because MCQ takers have a lower probability of guessing correctly (0.25).
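Both answers follow a midpoint rule: the optimal index sits halfway between the guessing probability and 1.0. A minimal check:

```python
# Optimal item difficulty under guessing: halfway between the chance
# level g and a perfect score of 1.0
def optimal_difficulty(g):
    return (g + 1.0) / 2

print(optimal_difficulty(0.5))   # true/false: 0.75
print(optimal_difficulty(0.25))  # 4-option multiple choice: 0.625
```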
In item analysis, there is an item property called dimensionality (factor analysis). Why would you use it?
Having a set of items doesn't mean you have a scale:
- items may share no underlying variable, OR
- they may reflect multiple underlying variables
What can factor analysis help with?
- determining the # of underlying latent variables or constructs
- help condense information
- define content or meaning of factors
- identify items that are performing better/worse, e.g. items which don't fit into any factor or fit into multiple factors –> candidates for elimination
What are the factor analysis decisions?
Number of factors to extract:
- eigenvalues (>1)
- scree plot
Rotation (helps interpret the factors):
- oblique: assumes factors are correlated
- orthogonal: assumes factors are uncorrelated
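The eigenvalue (>1) rule can be illustrated on a toy correlation matrix (hypothetical values chosen so two pairs of items reflect two underlying factors):

```python
import numpy as np

# Toy correlation matrix for 4 items: items 1-2 correlate with each other,
# items 3-4 correlate with each other (values are illustrative only)
R = np.array([
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
])

# Kaiser criterion: retain factors whose eigenvalue exceeds 1
eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted descending
n_factors = int(np.sum(eigenvalues > 1))
print(eigenvalues)   # [1.8 1.8 0.2 0.2]
print(n_factors)     # 2 factors to extract
```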
How is item reliability measured?
Via the internal consistency of the test, examined at the item level using:
- item-discrimination index
- item-characteristic curve
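The item-discrimination index is commonly computed by comparing extreme groups of scorers; a sketch (the 27% split and the helper name are conventions assumed here, not from the deck):

```python
# Item-discrimination index via the extreme-groups method:
# d = (proportion correct in the top-scoring group)
#   - (proportion correct in the bottom-scoring group)
def discrimination_index(item_correct, total_scores, frac=0.27):
    n = len(total_scores)
    k = max(1, int(n * frac))                       # size of each extreme group
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_correct[i] for i in upper) / k
    p_lower = sum(item_correct[i] for i in lower) / k
    return p_upper - p_lower

# Toy data: the item is answered correctly only by high scorers -> d = 1.0
scores = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
answers = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(discrimination_index(answers, scores))  # 1.0 (discriminates well)
```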
When a test developer is deciding to retain or delete an item, is poor performance on ONE task/aspect of item analysis sufficient to delete the item?
No.
What indicates a good item-characteristic curve?
A curve with a positive slope: individuals with low ability score low, those with moderate ability score in the middle range, and those with high ability score highly on the test.
In the final stage of test development, test revision, one must undergo cross validation. what is cross validation?
Administering the revised test to another population to determine its applicability to that population.
It is a test of validity; sometimes validity shrinkage occurs.