Psychometrics Flashcards
What is psychometrics?
Branch of psychology that deals with the design, administration and interpretation of quantitative tests for the measurement of psychological variables (intelligence, personality, etc.)
What are the different ways in which a test score in psychometrics can be interpreted?
- against a cut-off score: did the person reach the cut-off or not
- against the person's own earlier scores: repeated testing shows whether there is a change in scores
- against the scores of many others: shows whether the person is average, above average, etc.
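The three interpretations above can be sketched in Python. This is only an illustration, not a standard scoring procedure: the function names, the cut-off of 65 and the norm-group scores are all made-up assumptions.

```python
from statistics import mean, stdev

def passes_cutoff(score, cutoff):
    """Cut-off interpretation: True if the score reaches the cut-off."""
    return score >= cutoff

def change_in_score(first, second):
    """Repeated-testing interpretation: positive means the score went up."""
    return second - first

def z_score(score, norm_scores):
    """Comparison with others: distance from the norm-group mean in
    standard deviation units (0 = average, positive = above average)."""
    return (score - mean(norm_scores)) / stdev(norm_scores)

# Hypothetical scores, for illustration only
norm_group = [90, 95, 100, 105, 110]
print(passes_cutoff(70, 65))
print(change_in_score(70, 78))
print(round(z_score(110, norm_group), 2))
```

The z-score here uses the sample standard deviation of the norm group; a published test would normally supply its own norm tables instead.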
What is a norm in psychometrics?
Norms refer to the performance of defined groups on particular tests
What does it mean to standardise a test?
The process of administering a test to a representative sample for the purpose of establishing norms
What are age norms?
Average performance of different samples of test-takers who were at various ages at the time of testing
What are grade (educational) norms?
Average performance of a person who has completed a specific level of education
What is a subgroup norm?
A norm derived from a group of people identified by specific factors
E.g. differences between ethnic groups
What are some things to think about when testing?
Is the test appropriate?
Is it a good test?
Are you using it appropriately?
Is the person administering it appropriately?
Are they qualified or trained?
Is it ethical?
Have all details been appropriately recorded and have the results been clearly explained to the person?
What does being culturally fair mean?
No systematic differences in the way people from different cultures interpret the test
There might be a different typical score for people of a given culture, so it is important to gather normative data from different cultures
What are the steps in creating a test or questionnaire?
Defining the test
Selecting a scaling method
Constructing the items
Testing the items
Revising the test
Publishing the test
What kind of things are included in defining the test?
Using existing theory as a guide where possible
Developing clear ideas of the attributes to be measured
Test development is time-consuming and costly, so it is good to check whether a suitable test has already been developed
Kaufman and Kaufman model of test definition process?
Measure the attribute/construct from a strong theoretical and research basis
Have the capacity to distinguish between different attributes
Yield scores that are translatable to an intervention
Include novel tasks or questions
Be easy to administer and objective to score
Be sensitive to the diverse needs of the groups being assessed
What is categorical data?
Data sorted into groups or categories
Things like gender, age group and political party
What is numerical data?
Data that is measurable and expressed as numbers
Can be divided into discrete or continuous data
What is discrete data?
Data that can only take separate, countable values, usually whole numbers.
E.g. number of children, assignment mark
What is continuous data?
Data that can take any value within a range, including fractions and decimals (e.g. height, reaction time)
What are the different levels of measurement?
Nominal: named variables
Ordinal: named and ordered variables
Interval: named, ordered and with equal intervals between values
Ratio: named, ordered, with equal intervals between values, and can accommodate a true zero
What are floor effects?
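Scores pile up at the bottom of the scale, e.g. everyone says no to a particular question, so the item cannot distinguish between people
What are ceiling effects?
Scores pile up at the top of the scale, e.g. everyone says yes to a particular question, so the item cannot distinguish between people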
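One rough way to check an item for floor or ceiling effects is to count how many responses sit at the lowest and highest possible values. The sketch below assumes a yes/no item coded 0 = no and 1 = yes; the function name and the data are hypothetical.

```python
def floor_ceiling_proportions(responses, scale_min, scale_max):
    """Return the proportion of responses at the lowest and at the
    highest possible value of the item."""
    n = len(responses)
    at_floor = sum(1 for r in responses if r == scale_min) / n
    at_ceiling = sum(1 for r in responses if r == scale_max) / n
    return at_floor, at_ceiling

# Hypothetical yes/no item: 0 = no, 1 = yes
responses = [0, 0, 0, 0, 1]
at_floor, at_ceiling = floor_ceiling_proportions(responses, 0, 1)
print(at_floor, at_ceiling)
```

A proportion close to 1 at either end suggests the item is not separating people. Where to draw the line is a judgment call and is not specified in these notes.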
What is item format when constructing the items?
Item format should be related to the scaling method of choice. There are dozens of formats available.
E.g. multiple choice, true/false, forced choice or Likert scale
Multiple choice?
They are easy to administer and complete. Common in student exams to test knowledge or ability, and sometimes used in psychological tests.
However, it is difficult to construct good items, and the options provide cues for the correct response.
True/false questions?
Often used in personality tests. Easy to understand and answer.
However, answers may reflect social desirability more than personality traits, and the format doesn’t permit much variability.
Forced-choice methodology?
Often used in personality tests. Overcomes the social desirability problem of true/false questions.
However, sometimes people don’t fit into either category.
Likert scales?
Can account better for individual differences. Good for assessing attitudes and perceptions. Reduces social desirability bias.
However, assumes the strength and intensity of an attitude is linear, and people don’t always fit into the specified options. Social desirability can still occur.
How to go about testing the items?
You should conduct a pilot study first to ensure the items are clear and can easily be answered. If not, they can be refined.
Revising the test?
Using the newly developed test, collect data in a new sample
Repeat previous steps in testing the items
Make necessary refinements
Cross validate
Obtain feedback from participants
Percentiles?
When we have a group of scores, we can work out where one particular score fits into the distribution. A percentile gives the percentage of scores in the group that fall at or below that score.
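A small Python sketch of locating a score in a distribution. It assumes the "percentage of scores at or below" convention; other conventions exist (for example, counting only scores strictly below), so treat this as one possible definition. The function name and the scores are made up.

```python
def percentile_rank(score, scores):
    """Percentage of scores in the group that are at or below `score`."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)

# Hypothetical group of test scores
group = [50, 60, 70, 80, 90]
print(percentile_rank(70, group))  # 3 of the 5 scores are at or below 70 -> 60.0
```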
What is validity?
Reflects a test's ability to assess the construct it was designed to measure
What are the different types of validity?
Content, construct and criterion-related
What is content validity?
Content validity means the test measures appropriate content
What is construct validity?
Construct validity means the test measures the skills/abilities that it should measure.
What is factor analysis?
Statistical technique to determine the pattern of correlations or variability amongst the items.
What is criterion validity?
The extent to which the test predicts or is related to an outcome
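The relationship between test scores and an outcome is often summarised with a correlation coefficient. The sketch below computes Pearson's r by hand. Using Pearson's r here is an assumption (these notes do not name a statistic), and the test scores and outcome values are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)

# Hypothetical data: selection-test scores and later job-performance ratings
test_scores = [10, 12, 15, 18, 20]
outcome = [2, 3, 3, 5, 5]
print(round(pearson_r(test_scores, outcome), 2))
```

Values near +1 or -1 indicate a strong relationship between test and outcome; values near 0 indicate little or none.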
What is reliability?
Concerns measurement consistency or the ability of a test to produce consistent results
What are the two types of reliability?
Internal and external
What is internal reliability?
Concerns the extent to which a measure is consistent within itself
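A common index of internal consistency is Cronbach's alpha. These notes do not mention it, so it is introduced here as an assumption. The sketch uses the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the item data are made up.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per item, all in the same respondent order.
    Needs at least two items and some variation in the total scores."""
    k = len(items)
    sum_item_variances = sum(variance(item) for item in items)
    # Total score per respondent = sum across that respondent's item scores
    totals = [sum(respondent) for respondent in zip(*items)]
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))

# Hypothetical 3-item questionnaire answered by 5 people
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(items), 2))
```

Higher values mean the items vary together, i.e. the measure is more consistent within itself.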
What is external reliability?
Concerns the extent to which a measure varies from one use to another
Test-retest reliability: stability over time
Inter-rater reliability: the degree to which different raters give consistent estimates of the same behaviour
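For two raters who sort the same observations into categories, one widely used index of agreement is Cohen's kappa, which corrects raw agreement for agreement expected by chance. It is not named in these notes, so it is introduced here as an assumption. The function name, ratings and category labels are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same observations.
    Undefined (division by zero) if both raters use one identical
    category throughout."""
    n = len(rater_a)
    # Proportion of observations on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of the same four behaviours
rater_1 = ["aggressive", "aggressive", "calm", "calm"]
rater_2 = ["aggressive", "aggressive", "calm", "calm"]
print(cohen_kappa(rater_1, rater_2))  # perfect agreement -> 1.0
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance.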