Ch3 - Test Score Interpretation Flashcards
Raw score
A number that summarizes an aspect of a person’s performance on a test
• No meaning by itself - it’s impossible to interpret a score without a frame of reference (is high a good or bad result?) - and even then we can be misled
Norms
test performance of 1+ reference groups
○ Norm-referenced test interpretation uses standards based on the performance of specific groups
○ Useful to compare individuals with one another
Normative sample
the groups we use to establish norms
• Performance criteria
○ Criterion-referenced interpretation: makes use of procedures designed to assess whether and to what extent the desired performance criteria have been met
Norm-Referenced Test Interpretation
Score is used to place the test taker’s performance within a pre-existing distribution and compare it
Developmental norms
Ordinal Scales Based on Behavioural Sequences
• The sequence of development can be used as an ordinal scale
• Frame comes from observing/noting uniformities in the order/timing of behavioural attainments across many individuals
Ex:
• Provence Birth-to-Three Developmental Profile: Example of developmental norm using ordinal scale
○ Information about the timeliness with which a child attains developmental milestones in relation to their age, in 8 domains and for various age ranges
○ Scores are added to create a performance age, compared with the chronological age
Theory-Based Ordinal Scales
The ordinal scales are based on factors other than age
Example: Ordinal Scales of Psych Development
○ Based on Piaget’s delineation of the order in which cognitive competencies are acquired during infancy / childhood
Age equivalent scores (AKA test ages or test-age equivalents)
○ A way of comparing the test taker’s performance on a test with the average performance of the normative age group with which it corresponds
§ Ex: a child’s raw score = the mean raw score of 9-year-olds in the normative group
○ Problematic because development varies within age groups
○ Has LOTS of limitations - not much used in psych because of that
How does it work
• Ex: a test whose items range from easy to harder
○ The same test is administered to children in a range of grades (grade 2 to 6)
○ Expectation: younger kids will get less far than older ones
○ *ONLY the means are recorded, not the SD
§ Does not take into account the range of grade distributions - major flaw
○ The means increase for each grade
○ In other words, everyone with a raw score of 15 has a grade equivalent of 2.0 (because that is what grade 2 students scored on average at the start of the year)
○ Grade equivalent scores are established through interpolation
§ Between 15 and 25, there are 10 raw score points
§ Between 2.0 and 3.0, there are 10 grade units (tenths of a grade)
§ If someone scores 17, their grade equivalent score will be 2.2
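The interpolation in the example above can be sketched in a few lines of Python. This is only an illustration: the function name `grade_equivalent` and the `anchors` list are hypothetical, and the two anchor points (raw 15 = grade 2.0, raw 25 = grade 3.0) are the ones from the example.

```python
def grade_equivalent(raw, anchors):
    """Linearly interpolate a grade equivalent from (mean raw score, grade) pairs."""
    anchors = sorted(anchors)
    # Find the pair of neighbouring grade means that brackets the raw score.
    for (r0, g0), (r1, g1) in zip(anchors, anchors[1:]):
        if r0 <= raw <= r1:
            return g0 + (raw - r0) * (g1 - g0) / (r1 - r0)
    raise ValueError("raw score outside the range covered by the norms")

# Anchor points taken from the flashcard example (only the means are known).
anchors = [(15, 2.0), (25, 3.0)]
print(round(grade_equivalent(17, anchors), 1))  # 2.2
```

Note that only grade means go in, so the result says nothing about how widely scores spread within each grade, which is the flaw flagged above.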
Grade Equivalent Scores
Another way of interpreting developmental norms - made possible by the uniformity of the school curriculum
derived by locating the performance of test takers within the norms of the students at each grade level in the standardization sample
○ Ex: a child scores at the 5th grade level in English (does NOT mean that he knows 5th grade English) and at the 3rd grade level in maths
• Can also be misleading
○ Curriculums still vary
○ The advance expected between grades varies
○ Not all children will attain their grade scores, and it’s OK
Within-Group norms
Compare one’s score to the performance of one or more reference groups
The Normative Sample Requirements
• Should be representative of the kinds of individuals for whom the tests are intended
• Needs to be sufficiently large, to ensure the stability of the values obtained
○ Tests that require specialized samples may have smaller samples
• Needs to be recent
Standardization sample
group on whom the test is originally standardized in terms of administration/scoring procedures and the establishment of norms
Reference group
Any group of people against which test scores are compared
Subgroup Norms
A large sample can be further divided into smaller subgroups (age, gender, etc) for which norms can be established
Local Norms
• Reference groups drawn from a specific geographic/institutional setting
Convenience Norms
• Norms based on people who were available at the time of testing
Percentile score (disadvantages)
relative position of a test-taker compared to the reference group
○ Most test-takers understand them easily
○ Raw scores can easily be compared with percentile ranks
• Disadvantages:
○ In a normative sample, there is a lowest and a highest score - those can be said to be the 0th and 100th percentile, but this is impossible to pin down when we interpret the scores of a larger population
○ Because scores cluster in the middle and spread out at the ends, percentile units are unequal: percentile differences exaggerate score differences near the middle and understate them at the extremes
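A minimal sketch of how a percentile rank locates a score within a reference group. It assumes the simple definition "percentage of the reference group scoring below this score" (some texts also count half of the tied scores); the function name and the reference scores are made up for illustration.

```python
def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring strictly below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100 * below / len(reference_scores)

# Hypothetical normative sample of 10 raw scores.
reference = [10, 12, 14, 15, 15, 16, 18, 20, 22, 25]
print(percentile_rank(18, reference))  # 60.0
```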
Test Ceiling and Test Floor
- Test ceiling: highest score attainable on an already standardized test - someone reaching it means that the test might be too easy (insufficient ceiling)
- Test floor: if a person fails all the items or scores lower than anyone in the normative sample, the test might be too hard (insufficient floor)
Linear transformation
changes the units in which scores are expressed while leaving the interrelationships among them unaltered
○ The shape of a linearly derived scale score distribution is the same as that of the original score distribution
1- Convert raw scores into z scores: a z score indicates the relative position of a score within a distribution
*The value of a z score represents the original score's distance from the mean in standard deviation units
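As a rough sketch, a z score is computed as (raw score - mean) / standard deviation. The example below assumes the population SD of the group (`statistics.pstdev`); the function name and the raw scores are invented for illustration.

```python
from statistics import mean, pstdev

def z_scores(raw_scores):
    """Convert raw scores to z scores: distance from the mean in SD units."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD; an assumption for this sketch
    return [(x - m) / sd for x in raw_scores]

# Hypothetical raw scores with mean 5 and SD 2.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
print([round(z, 2) for z in z_scores(raw)])
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

Every score is shifted and rescaled by the same amounts, so the order of the scores and the relative gaps between them stay the same, which is what makes this a linear transformation.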