Test Development Flashcards
What are the first six steps to developing a measure?
1) overall goal and pre-planning
2) content definition
3) test specifications
4) item development
5) test design and assembly
6) test production
What are the last six steps to developing a measure?
7) test administration
8) scoring responses
9) establishing passing scores
10) reporting results
11) item banking
12) test technical report
What is the first step for developing a measure and what does it aim to do?
overall goal and pre-planning: provides a systematic framework for the project
what happens during the overall goal and pre-planning stage of developing a measure?
(1) what is the aim or purpose of the test
(2) what construct will be measured
(3) what will the test format be
(4) how will it be administered? Also decide on a timeline for developing the test, how the quality of the test will be checked, and who will produce, publish or print it.
What is an aim?
the aim specifies what you want to achieve with the test/measure; it has to be stated clearly in the first step.
what are the different types of assessment?
screening or in-depth assessment
describe screening tests
include fewer items and cover less content; quick and easy to administer.
describe in-depth assessments
more items, cover more content of the construct. A detailed test is generally more reliable, but can be much more time-consuming. Often detailed tests may also need special training on how to administer and interpret the test.
what are the modes for interpreting outcomes?
Normative, Ipsative and Criterion-referenced
what is the normative mode of interpretation?
compares scores to a norm group: the score is compared to the average score of the rest of the sample of test-takers.
what is the Ipsative mode of interpretation?
a test that makes it possible to compare different aspects within the same individual, i.e. it allows for intra-individual comparisons
what is the criterion-referenced mode of interpretation?
the performance is compared to a pre-defined standard, e.g. what constitutes a ‘pass’ or ‘fail’
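The three modes of interpretation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of a z-score for the normative mode, the deviation-from-own-mean approach for the ipsative mode, and all scores are hypothetical and do not come from the flashcards.

```python
from statistics import mean, pstdev

def normative(raw_score, norm_scores):
    """Normative: express a score relative to a norm group (here as a z-score)."""
    return (raw_score - mean(norm_scores)) / pstdev(norm_scores)

def ipsative(subscale_scores):
    """Ipsative: compare each subscale with the person's own average."""
    own_mean = mean(list(subscale_scores.values()))
    return {name: score - own_mean for name, score in subscale_scores.items()}

def criterion_referenced(raw_score, cut_score):
    """Criterion-referenced: compare a score with a pre-defined cut-off."""
    return "pass" if raw_score >= cut_score else "fail"

# Hypothetical data for illustration only.
norm_group = [40, 45, 50, 55, 60]
profile = {"verbal": 12, "numerical": 8, "spatial": 10}

print(normative(60, norm_group))         # above the norm-group mean, so positive
print(ipsative(profile))                 # strengths and weaknesses within one person
print(criterion_referenced(60, 50))      # cut-off of 50 is an assumed value
```

The same raw score of 60 can therefore be read in different ways depending on the mode chosen during planning.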
what is the second step of developing a measure and what does it aim to do?
content definition: aims to operationally define the construct you are measuring
what does the word ‘operationalization’ mean?
refers to the act of defining a ‘fuzzy’ concept so that it becomes measurable: you state what you mean by the concept you are measuring. A concept can be defined in many ways, e.g. the construct of intelligence.
What is the second part of content definition when developing a measure (step two)?
defining the purpose of the measure - what will you use the scores on the test for?
What is step three of developing a measure and what does it aim to do?
test specifications: the test blueprint
what are some factors in test specification?
1) test/response format
2) item format
3) test length
4) content areas of the constructs tested
5) whether items will contain visual stimuli
6) how test scores will be interpreted
7) time limits
what are the types of response formats?
selected response, constructed response, performance response
define the selected response format
A selected response is one where the test-taker selects one of the options provided, hence ‘selected response’. An example is a questionnaire with a Likert scale, where the test-taker chooses to what extent they agree or disagree with a statement.
define the constructed response format
A constructed response is one where the test-taker must construct the answer, generating it from their own knowledge, e.g. where you are expected to respond by writing an essay or filling in the blank.
define the performance response format
a performance response is one where the test-taker needs to perform a task, such as where you are asked to build a puzzle, or build a design.
what are the two ways a response format can be structured?
objective or subjective
what is an objective response format?
an objective response format is very structured: you pick one response, and there is a right or wrong (or otherwise definite) answer. Examples are an MCQ test, or a Likert scale in a questionnaire assessing personality (pick to what extent something resonates with you/applies to you).
what is a subjective response format?
a subjective format is one where the interpretation of the response depends on the examiner’s judgement, so it is more unstructured.
give examples of psychological tests that use subjective response formats
Rorschach or TAT.
what are the types of item formats?
open-ended items, forced-choice items, sentence-completion items, performance-based items
define an open-ended item format
where there are no limitations on the test-taker.
define a forced-choice item format
limited in how you can answer: you have to select from the options you are given
define a sentence-completion item format
asked to complete a sentence - some open-endedness
define a performance-based item format
items require you to perform a task, such as writing an essay or giving an oral presentation
what is important when considering test length?
the amount of administration time available, what the purpose of the measure is, test fatigue, enough items to measure in enough detail
what does ‘test content areas’ refer to?
ensure that all domains of the construct are tested
how do we ensure all domains of a construct are tested?
a test structure (blueprint) outlining content areas (the different categories/subcategories of the construct) and manifestations (how the construct manifests).
content areas go in the table columns, manifestations in the table rows.
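A blueprint of this kind can be sketched as a small table in Python, with manifestations as rows and content areas as columns. The row and column labels, the item counts, and the helper names below are hypothetical placeholders, not part of any real test.

```python
# Rows = manifestations, columns = content areas.
# Each cell holds the number of items planned for that combination.
# All labels and counts are made up for illustration.
blueprint = {
    "cognitive":   {"content_area_A": 3, "content_area_B": 2},
    "behavioural": {"content_area_A": 2, "content_area_B": 0},
}

def uncovered_cells(table):
    """Return the (manifestation, content area) pairs that have no items yet."""
    return [
        (row, column)
        for row, columns in table.items()
        for column, item_count in columns.items()
        if item_count == 0
    ]

def total_items(table):
    """Total planned test length implied by the blueprint."""
    return sum(sum(columns.values()) for columns in table.values())

print(uncovered_cells(blueprint))  # cells that still need items
print(total_items(blueprint))      # planned number of items
```

Checking for empty cells is one simple way to see whether every domain of the construct is represented before item writing starts.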
what is the fourth step for developing a measure and what does it aim to do?
item development: involves setting the items in the test
what are the guidelines for item development?
1) use clear wording to avoid ambiguity
2) use appropriate vocabulary
3) avoid double negatives
4) if writing a test for children consider a different format
5) don’t make questions too obvious
what is the fifth step for developing a measure and what does it aim to do?
Test design and assembly: assembling the psychological test
what things need to be considered when assembling/designing a test?
1) placement of correct items
2) check for errors
3) manual or computer assembly
4) how are people going to answer the test?
5) does the test look aesthetically pleasing?
what is pre-testing?
where you administer the test to a representative sample from the target population to see how well it is working or if it is working in the way that you want it to.
what does pre-testing aim to do?
running a pilot assessment in this way can tell us whether people answer how we expect them to, whether there are any problems or issues with the test, whether the items are too easy or too difficult, and whether the time limit set for the test is enough.
what is the sixth step for developing a measure and what does it aim to do?
test production: everything that goes into putting together the final version of the test. This includes finalizing which items are included, the sequence they are presented in, and any visual stimuli (such as images) you would like to include. It also involves quality control.
what is the seventh step for developing a measure and what does it aim to do?
test administration: administering your test to a sample. It is the most public and visible aspect of testing.
what are important factors of test administration?
security and standardization of testing conditions
name methods for standardizing test administration
control extraneous variables, make conditions identical for all examinees, standard instructions, same time limits
why is standardizing test administration important?
comparability between samples, fairness when interpreting test scores, increases validity of findings, increases reliability of findings
what are steps eight and nine in developing a measure and what do they entail?
scoring responses and establishing a passing score: deciding how we will score a test or questionnaire
what aspects go into scoring criteria for a psychological test?
developing a scoring key, deciding when to drop participants from the study sample, investigating whether there is a response bias, and evaluating how good or bad items are through item analysis
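As a minimal sketch of what a scoring key can look like for an objective, selected-response test, the snippet below maps each item to its keyed answer and counts matches. The item labels, keyed answers, and function name are hypothetical.

```python
def score_responses(answers, scoring_key):
    """Count how many of the examinee's answers match the scoring key.

    Unanswered items (missing from `answers`) simply score zero.
    """
    return sum(
        1
        for item, keyed_answer in scoring_key.items()
        if answers.get(item) == keyed_answer
    )

# Hypothetical three-item multiple-choice test.
scoring_key = {"q1": "b", "q2": "d", "q3": "a"}
examinee = {"q1": "b", "q2": "c", "q3": "a"}

print(score_responses(examinee, scoring_key))
```

Keeping the key separate from the scoring function means the same function can score any test form once its key has been written.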
what is item analysis?
analyze whether the test items are good or bad to decide whether to keep or omit their data
what are the considerations when doing item analysis?
whether the items are congruent with the test objective, whether they are valid and reliable, which poorly performing items need to be discarded, which items are easy or difficult, and how long it takes an examinee to complete each item
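Two of these considerations, item difficulty and spotting poorly performing items, can be illustrated with a short sketch. It assumes right/wrong (1/0) scoring and uses two common classical indices that the flashcards do not name: the proportion of examinees answering correctly as difficulty, and the correlation between an item and the total of the remaining items as discrimination. The data and function names are hypothetical.

```python
from statistics import mean

def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (1 = correct)."""
    return mean([row[item] for row in responses])

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5

def item_discrimination(responses, item):
    """Correlation between the item and the total score on the other items."""
    item_scores = [row[item] for row in responses]
    rest_scores = [sum(row) - row[item] for row in responses]
    return pearson(item_scores, rest_scores)

# Hypothetical data: rows are examinees, columns are items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

for item in range(3):
    print(item, item_difficulty(responses, item), item_discrimination(responses, item))
```

An item that nearly everyone passes or fails, or one that correlates weakly or negatively with the rest of the test, would be a candidate for revision or removal.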
what is step nine in developing a measure and what does it aim to do?
establishing passing scores: set up norms (what the distribution of scores looks like in our target population), set performance standards, or decide on cut-off scores
what do you have to achieve before establishing passing scores can take place?
ensure the test is reliable and valid
what is step ten in developing a measure and what does it entail?
reporting results: all examinees have a right to accurate, timely and useful reports of their performance, in understandable language.
what is step eleven in developing a measure and what does it entail?
Item banking: refers to storing items for future use.
NB! security
what is step twelve in developing a measure, and what does it aim to do?
test technical report (publishing and refinement): a technical report detailing how one would go about administering and scoring the test, along with much of the information detailed in the previous steps. You would then also submit the measure to be classified as a psychological test or not, and sort out publishing and marketing should you want to charge for it.
what factors need to be considered when developing a multi-cultural test?
education level and language
why is it important to develop multi-cultural tests?
South African samples are often multi-cultural. If you are targeting South Africans in general, you then have to be especially mindful during the planning phase of several things that might influence performance on tests.