Week 9 Flashcards
What is psychometrics?
Psychometrics is a branch of psychology that deals with the design, administration, and interpretation of quantitative tests for the measurement of psychological variables such as intelligence, aptitude, and personality traits.
- normally starts with a theoretical construct (eg a theory of personality)
- a test can then be designed that should measure the construct of interest. Then we can use statistics to check that the test does mea
Steps in test/questionnaire construction
- Defining the test
- Selecting a scaling method
- Constructing the items
- Testing the items
- Revising the test
- Publishing the test
Step one of developing a test: 1. Defining the test
What is a test/questionnaire?
- a set of items that allows measurement of some attribute of an individual:
- problems for which correct answers must be found
- questions about behaviours, feelings or thoughts.
- questions about attitudes, preferences
What is an item?
- generic word for the various forms of content in a psychological test or questionnaire (eg a question; a test with 50 words to define is a 50 item test):
- permits measurement of an attribute
- carefully selected (not random)
What is it you are seeking to measure?
- develop clear idea or specification of the attribute
- use existing theory as a guide where possible
- write a document containing specifications for the development of items. It will include: a clear definition of the attribute; the outcome of a literature search of central theoretical claims and findings; and, if more than one attribute is to be measured, a separate specification for each.
Has there already
Kaufman and Kaufman model of the test definition process
- Measure attribute/construct from a strong theoretical and research basis
- Must have capacity to distinguish between different attributes
- Yield scores that are translatable to an intervention
- Include novel tasks or questions
- Be easy to administer and objective to score
- Be sensitive to the diverse needs of the groups being assessed
Number two of creating a test/questionnaire: 2. Selecting a scaling method
Measurement - the assignment of numbers to objects according to a set of rules
Types of data
Four fold classification system (Stevens, 1951)
- Nominal - just a group you put someone in - it is categorical. You can assign a number based on the group the person belongs to, but the actual numerical value assigned to each group is meaningless (eg dog person = group 1)
- Ordinal - still categorical, but in a ranking order (eg 1st place, 2nd place in a race). Tells us the order, but we cannot tell the distance between each point
- Interval - this is where you can obtain continuous data, eg temperature (in Celsius). There are equal distances between points, but there is no true zero point (eg 20 degrees isn’t twice as hot as 10 degrees). People can also respond on an ordered response option scale, often referred to as a Likert-type scale (eg 1=strongly agree, 2=agree, 3=disagree, 4=strongly disagree). Treating these responses as interval data assumes the options are equal distances apart.
- Ratio - this is a continuous measurement with a true zero point. Both differences and ratios between points on the scale are meaningful (eg weight, length). Ratio scales are rare in psychological measurement. Is there a meaningful zero in psychology? Not really.
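The temperature example can be checked with a little arithmetic. The sketch below is only an illustration: it uses the Kelvin scale (not mentioned in the notes, but it has a true zero) to show why a ratio of Celsius values is misleading while a difference is fine.

```python
# Interval vs ratio: why "20 degrees is twice as hot as 10 degrees" fails.
celsius_a, celsius_b = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = celsius_b - celsius_a

# The naive ratio depends on an arbitrary zero point, so it is not meaningful.
naive_ratio = celsius_b / celsius_a


def to_kelvin(celsius):
    """Convert Celsius to Kelvin, a scale with a true zero (ratio scale)."""
    return celsius + 273.15


# On the ratio scale the two temperatures are only about 3.5% apart.
true_ratio = to_kelvin(celsius_b) / to_kelvin(celsius_a)

print(difference)            # 10.0
print(naive_ratio)           # 2.0
print(round(true_ratio, 3))  # 1.035
```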
Step three of creating a test: 3. Constructing the items
Initial questions in test and questionnaire construction:
- homogeneity vs. heterogeneity:
Does it allow for attributes to be reliably measured (homogeneity)?
Does it allow for adequate differentiation of people (heterogeneity)?
Expect to construct excess items in the beginning, because not all items will be deemed suitable
Table of specifications:
Details the information and tasks the person is being assessed on (eg a content-by-process matrix)
Multiple choice format
Common in student exams/tests of knowledge or ability
Sometimes used for psychological tests (eg the Reading the Mind in the Eyes test)
Usually one correct answer
Permits quick scoring
Problems:
Difficult to construct items (eg need good distractor items)
Provides cues for the correct response (doesn’t assess free recall)
True/false questions
Often used in personality tests
Easy to understand and answer
Problems:
answers may reflect social desirability
Doesn’t permit much variability
Forced choice methodology
Often used in personality tests
Overcomes the social desirability problem of true/false questions
Problems: people don’t always fit in either category (the same problem as true/false)
Likert-type scales
- one of the most used types of response formats
- can better account for individual differences (many shades of answers)
- good for assessing attitudes and perceptions
- reduces social desirability bias
Problems:
- are the items consistently measuring the construct in question?
- are all of the items appropriate and contributing to the overall interpretation of the test?
- assumes strength/intensity of an attitude is linear
- people don’t always fit into specified options
- social desirability can still occur
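As a rough illustration of how Likert-type responses become numbers, the sketch below uses the coding given earlier in these notes (1=strongly agree to 4=strongly disagree). The function name and the answers are hypothetical. Note that adding the codes together relies on the assumption that the options are equally spaced, which is exactly the linearity problem listed above.

```python
# Coding taken from the notes: 1=strongly agree ... 4=strongly disagree.
CODES = {
    "strongly agree": 1,
    "agree": 2,
    "disagree": 3,
    "strongly disagree": 4,
}


def score_responses(responses):
    """Convert verbal responses to numeric codes and return (codes, total)."""
    codes = [CODES[r.lower()] for r in responses]
    return codes, sum(codes)


# Hypothetical answers from one person to a four-item scale.
answers = ["agree", "strongly agree", "disagree", "agree"]
codes, total = score_responses(answers)

print(codes)  # [2, 1, 3, 2]
print(total)  # 8
```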
Tips for writing good test/questionnaire items
- questions should be simple and to the point
- use words with a clear meaning (eg most vs. the majority of)
- avoid double-barrelled questions (eg is your lecturer easy to listen to and do they have a good sense of humour?)
- offer an ‘out’ for questions that don’t apply (eg a middle response option in a Likert scale)
- avoid offering too few or too many options
- don’t ask about things that are too hard to remember (more common events = shorter window of recall)
Step four of a test: 4. Testing the items
- Conduct a pilot study (small exploratory study) to first ensure the items are clear and can easily be answered (refine items if needed)
- Administer the questionnaire to a large participant sample
- Do some number crunching in special statistical software
- Investigate psychometric properties for the individual items
- item characteristics (eg difficulty levels)
- create statistically sound subscales (eg factor analysis)
- throw out non-performing items
- Determine reliability and validity for the subscales / overall test
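A minimal sketch of the "number crunching" step, using only the Python standard library. The data are made up, and the choice of statistics is an assumption: item difficulty is computed as the proportion of people answering correctly, and reliability is estimated with Cronbach's alpha, one common internal-consistency index that the notes do not name specifically.

```python
from statistics import mean, pvariance

# Hypothetical pilot data: rows are people, columns are items
# (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]


def item_difficulty(data):
    """Proportion of people answering each item correctly (higher = easier)."""
    return [mean(column) for column in zip(*data)]


def cronbach_alpha(data):
    """Internal-consistency estimate from item variances and total-score variance."""
    k = len(data[0])                                       # number of items
    item_variances = [pvariance(column) for column in zip(*data)]
    total_variance = pvariance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


difficulty = item_difficulty(scores)
alpha = cronbach_alpha(scores)

print(difficulty)       # [0.8, 0.6, 0.4, 0.2]
print(round(alpha, 2))  # 0.8
```

Items that almost everyone or almost no one answers correctly tell us little about differences between people, so they are candidates for removal.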
Step five of developing a test: 5. Revising the test
- Using the newly developed test/questionnaire, collect data in a new sample
- Repeat previous steps (from testing the items)
- Make necessary refinements
- Cross-validate (does the test perform just as well in a new sample?)
- Obtain feedback
Testing the items and revising the test may be repeated over several cycles
Step six of developing a test: 6. Publishing the test
Produce testing materials
Develop a technical and user’s manual
- background information
- development history
- administration instructions
- reliability
- validity
- normative information
Publish a scientific paper
Standardisation group
Once we have an individual’s score we want to know where that score fits in comparison with the individual’s peers
Large groups of people are tested and their scores are used to work out test norms
- we can use the mean and standard deviation of these scores to work out where an individual sits in comparison to others
Depending on the
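One common way to use the mean and standard deviation of a standardisation group is a z-score: how many standard deviations an individual's score lies above or below the group mean. The sketch below is an illustration only. The norm scores are invented, and treating the norm group as the whole reference population (hence `pstdev`) is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical standardisation sample (mean 50, standard deviation 4).
norm_scores = [44, 48, 50, 52, 56]


def z_score(raw, norms):
    """Distance of a raw score from the norm-group mean, in standard deviations."""
    return (raw - mean(norms)) / pstdev(norms)


print(z_score(58, norm_scores))  # 2.0
print(z_score(46, norm_scores))  # -1.0
```

A positive z-score means the individual scored above the average of their peers, and a negative one means they scored below it.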
Score to percentile conversions
Percentiles: when we have a group of scores, we can also work out where a score fits in a distribution
This can be done for specific subcategories as well
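A score-to-percentile conversion can be sketched as follows. This assumes the simple "percentage of scores below" definition of a percentile rank (other definitions handle tied scores differently). The norm data and the age groupings are hypothetical.

```python
def percentile_rank(score, norms):
    """Percentage of scores in the norm group that fall below the given score."""
    below = sum(1 for s in norms if s < score)
    return 100 * below / len(norms)


# Hypothetical norm group for the whole standardisation sample.
norm_scores = [12, 15, 15, 18, 20, 22, 25, 27, 30, 34]
print(percentile_rank(25, norm_scores))  # 60.0

# The same conversion within hypothetical subcategories (here, age groups).
norms_by_group = {
    "age 18-25": [20, 22, 24, 26, 28],
    "age 26-40": [10, 14, 18, 22, 30],
}
print(percentile_rank(25, norms_by_group["age 18-25"]))  # 60.0
print(percentile_rank(25, norms_by_group["age 26-40"]))  # 80.0
```

The same raw score can sit at a different percentile depending on which group it is compared with.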