week two - planning a measure & writing good items Flashcards
questionnaires
Knowledge-based questionnaires: examining ability, aptitude, or achievement
Person-based questionnaires: examining personality, clinical symptoms, or attitudes
- series of individual items
- other names (survey, measure, inventory)
step 1 for test construction
purpose and conceptual foundation
- what is the test for
- why does the test need to exist
- what is the purpose of the test
- how will the material relate to the test
WHY measure is being made
WHAT it is measuring
WHO is the target population
domain of content
there should be a meaningful and logical connection between the test and the items
step 2 of test construction
Table of specification
- blueprint of our test
- what content should we cover
- grid or table where content areas are found along horizontal axis
content analysis
brainstorming questions and topic areas
- cover everything that is relevant to purpose of questionnaire
ex. observation of extreme behaviours can help psychologists identify what to ask individuals in school settings
manifestations
should reflect the domain of interest
- behavioural, cognitive, affective
- should take into account response behaviours
table of specification
rows and columns form what is called a matrix
each row crosses each column at a cell - the total number of cells is the number of rows multiplied by the number of columns
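As a rough illustration of the matrix, the sketch below builds a blueprint grid and counts its cells. The content areas and manifestations are hypothetical labels chosen only for the example.

```python
from itertools import product

# Hypothetical labels, chosen only for illustration.
content_areas = ["home", "school", "peers"]                  # one axis of the grid
manifestations = ["behavioural", "cognitive", "affective"]   # the other axis

# Each (manifestation, content area) pair is one cell of the blueprint.
cells = list(product(manifestations, content_areas))

# Number of cells = number of rows x number of columns.
n_cells = len(manifestations) * len(content_areas)
print(n_cells)  # 9
```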
weighting
decisions whether to give different weightings to the cells
ex. some cells may warrant having more items than others or a specific content area may be deemed more important
number of items
- consider factors such as size of blueprint
- amount of time available
- min 3 items to run any analyses
time and reliability
reliability is impacted by time
- time limits can impact reliability and too long a time given will reduce reliability
- min of 12 items for adequate reliability
ensure your questionnaire is long enough to improve reliability but short enough that your population of interest can complete it in a reasonable time frame
pilot versions
include at least 50% more questions in pilot version than final
- pilot = longer
- pilot test questionnaire first
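The 50% rule can be worked out with a little arithmetic. This is a minimal sketch; the function name `pilot_length` and the choice to round up are assumptions made for the example.

```python
import math

def pilot_length(final_items: int, extra: float = 0.5) -> int:
    """Smallest pilot pool that is at least `extra` (default 50%) larger than the final form."""
    # Rounding up is an assumption, so the pilot never falls short of the rule.
    return math.ceil(final_items * (1 + extra))

print(pilot_length(12))  # 18
print(pilot_length(15))  # 23
```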
number of items
work out how many items to write for each cell
- multiply the row and column percentages by the total number of items
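The calculation can be sketched as follows. The weights, labels, and total of 40 items are hypothetical; with other weights, rounding may leave the cell counts slightly off the intended total, so they may need adjusting by hand.

```python
# Hypothetical weights; each set sums to 1.0 (i.e. 100%).
row_weights = {"behavioural": 0.50, "cognitive": 0.25, "affective": 0.25}
col_weights = {"home": 0.50, "school": 0.50}
total_items = 40

# Items for a cell = row percentage x column percentage x total number of items.
items_per_cell = {
    (row, col): round(rw * cw * total_items)
    for row, rw in row_weights.items()
    for col, cw in col_weights.items()
}

for cell, n in items_per_cell.items():
    print(cell, n)

print(sum(items_per_cell.values()))  # 40
```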
step 3 of test construction
Select Population of Interest
- identify individuals who should be targeted
- may be straightforward
- consultation of subject matter or community experts
sampling: selection of elements following prescribed rules from defined population
population: collection of elements sharing a defining characteristic
ELEMENTS ARE TEST TAKERS
4 things to keep in mind when sampling
- who should the sample consist of
- how credible is this group as representative of the population of interest
- what obstacles may we encounter when obtaining our sample
- how can we address or avoid those obstacles
2 sampling methods
non-probabilistic sampling: individuals are selected based on some criteria - there is no defined probability of selecting a person
ex. student volunteers agreeing to take a test
probabilistic sampling: each person has a nonzero chance of being selected and the selection process is random
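The contrast between the two methods can be sketched in code. This example uses a simple random sample, which is only one form of probabilistic sampling; the ID list and the sample size of 20 are hypothetical.

```python
import random

# Hypothetical sampling frame: one ID per test taker in the defined population.
population = [f"student_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)  # fixed seed only so the example is repeatable

# Probabilistic: simple random sample without replacement,
# so every element has the same nonzero chance of selection.
probabilistic_sample = rng.sample(population, k=20)

# Non-probabilistic: e.g. the first 20 volunteers who agree to take the test.
# No selection probability is defined for anyone here.
volunteer_sample = population[:20]

print(len(probabilistic_sample), len(volunteer_sample))  # 20 20
```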
step 4 of test construction
Item Construction
- choice of item format, scale of item
- diff scales have diff purposes
- format = scale of the item
scaling
process of transforming and modifying the mathematical properties of an item
Alternate-Choice / Dichotomous
only 2 response options
True/False
advantages: simple, ease of administration, ease of scoring
- requires 100% absolute judgement
- used in knowledge based questions
- personality questions
Multiple-Choice / Polytomous
- more than 2 response options
ex. multiple choice tests
consists of two parts
1. the stem - statement or question that contains the problem
2. options - list of possible responses: one correct, the others distractors
- good balance
- take more time and skill (disadvantage)
Wrong answers on a polytomous item are called
distractors
- largest difficulty on tests
- effects of guessing are reduced by distractors
rating scale items
responses are on an ordinal scale along a continuum
- responses are marked
ex. I am not a superstitious person:
A. Strongly Disagree
B. Disagree
C. Agree
D. Strongly Agree
Likert scales
measure the degree of agreement a person has with a statement
- these are ordinal
ex. Not at All, Not Often, Neutral, Often, Very Often
advantages and disadvantages of rating scale items
advantages:
- capture a wider range and more precise measurement
disadvantages:
- prone to response behaviours
(ex. always selecting strongly agree or picking neutral answers)
categorical formatted items
rating scale formats for larger continuums
(typically 7-10)
- used to discriminate more finely bw individuals
- hard to do with 10+ responses
ex. on a scale from 1-10 how hungry are you
Psychometric theory indicates
fewer than four response options will reduce the reliability of the item, and seven response options is the point where reliability begins to diminish
adjective checklist
commonly used in personality measurement
- a list of adjectives is provided
- see if patterns are found
- not used much anymore
Q sort
choices are listed on cards and individuals place cards in piles
- anywhere from 2-10 piles
- piles based on degree of how much they agree
13 Item Guidelines
1) Match your blueprint
2) Write all items clearly and simply
3) Avoid irrelevant material; keep the options short.
4) Ask only one question or make only one statement. This avoids what are called double-barrelled items.
5) Generate a pool of items so you have multiple choices to pull from.
6) Avoid statements written in the past tense.
7) Avoid using words with absolutes, such as only, just, always, or none.
8) Where possible, avoid subjective words such as “frequently”, as these may be interpreted differently by different respondents.
9) It is important that all options function as feasible responses.
10) Avoid statements that would be selected by everyone.
11) Keep the reading level of the test appropriate for the individuals who will be taking it.
12) Items should be sensitive to ethnic and cultural differences.
13) Having correct spelling and grammar is essential
3 tips for knowledge based questionnaires
1) Make sure that alternate-choice items can undoubtedly be classified as true or false; otherwise some respondents will think of an exception to the rule.
2) For multiple-choice items, ensure that each item has only one correct or best response.
3) Each distractor option should be used equally by respondents who do not choose the correct response.
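Tip 3 can be checked on pilot data by tallying how often each distractor was chosen. The sketch below uses made-up responses to a single item, with "B" assumed to be the keyed answer.

```python
from collections import Counter

# Hypothetical responses to one multiple-choice item; "B" is the keyed answer.
responses = list("BBABCBDBBCADBBDBCBAB")
correct = "B"

# Count only the respondents who did NOT choose the correct response.
distractor_counts = Counter(r for r in responses if r != correct)

print(sorted(distractor_counts.items()))  # [('A', 3), ('C', 3), ('D', 3)]
```

Roughly equal counts, as here, suggest each distractor is doing its job; a distractor that almost nobody picks is a candidate for rewriting.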
common problems in MC writing
- unfocused stem
- negative stem
- window dressing
- unequal option length
- negative options
- clues to correct answer
person based questionnaires
Acquiescence - the tendency to agree with items regardless of their content
Social Desirability - the tendency to respond to an item in a socially acceptable manner
Indecisiveness - the tendency to use the “don’t know” or “uncertain” response option (solution: omit the middle category)
Extreme Response - the tendency to choose an extreme option regardless of direction (solution: use clear, specific items)
Item Order strategies
- knowledge based - order items by increasing difficulty
- order items based on content areas
- randomly place items - make adjustments to ensure a given content area does not occur too often
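The first two strategies can be sketched with a simple sort. The items are hypothetical, and using the proportion of pilot respondents answering correctly as the difficulty measure is an assumption made for the example.

```python
# Hypothetical items: (item id, content area, proportion correct in a pilot).
items = [
    ("q1", "fractions", 0.45),
    ("q2", "decimals", 0.90),
    ("q3", "fractions", 0.70),
    ("q4", "decimals", 0.30),
]

# Assumption: a higher proportion correct means an easier item,
# so sorting in descending order runs from easiest to hardest.
by_difficulty = sorted(items, key=lambda item: item[2], reverse=True)

# Alternative: group items by content area (the sort is stable,
# so the original order is kept within each area).
by_content = sorted(items, key=lambda item: item[1])

print([item[0] for item in by_difficulty])  # ['q2', 'q3', 'q1', 'q4']
print([item[0] for item in by_content])     # ['q2', 'q4', 'q1', 'q3']
```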
what to include in a questionnaire
- background information (age, name, gender)
- instructions (how to respond)
- ensure the layout is appropriate for the test being used
- pilot the questionnaire
Scoring
allocate a score to each response option
- knowledge based: each item scored out of 1; the higher the score, the better the performance
- person based: scores should be on a continuous scale
ex. setting always = 5, usually = 4, occasionally = 3, hardly ever = 2, and never = 1
reverse coding
negatively worded items/negative dimensions are turned into a positive
Example: Positive items are scored as: always = 5, usually = 4, occasionally = 3, hardly ever = 2, and never = 1.
* Then reverse scoring the negative items would be: always = 1, usually = 2, occasionally = 3, hardly ever = 4, and never = 5
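Reverse coding on this 1 to 5 scale amounts to subtracting the raw score from (highest + lowest), i.e. 6. A minimal sketch, with the names `SCORES` and `score_item` made up for the example:

```python
# Scoring key taken from the example above.
SCORES = {"always": 5, "usually": 4, "occasionally": 3, "hardly ever": 2, "never": 1}

def score_item(response: str, reverse: bool = False) -> int:
    """Score one response; set reverse=True for negatively worded items."""
    raw = SCORES[response]
    if reverse:
        # (max + min) - raw maps 5->1, 4->2, 3->3, 2->4, 1->5.
        return max(SCORES.values()) + min(SCORES.values()) - raw
    return raw

print(score_item("always"))                     # 5
print(score_item("always", reverse=True))       # 1
print(score_item("hardly ever", reverse=True))  # 4
```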