Chapters 2, 3, 4, Test Development and Psychometrics Flashcards
What are the four basic types of measurement scales?
nominal, ordinal, interval, and ratio
the most elementary of the measurement scales, involves classifying by name based on characteristics of the person or object being measured.
nominal scale
provides a measure of magnitude and, thus, it often provides more information than does a nominal scale.
ordinal scale
units are in equal intervals; thus, the 5-point difference between 45 and 50 represents the same amount of change as the 5-point difference between 85 and 90.
Interval scale
Has the same properties as an interval scale, with the addition of a meaningful (absolute) zero.
ratio scale
an individual’s score is compared with scores of other individuals who have taken the same test
norm-referenced instrument
an individual’s score is compared with an established standard or criterion
criterion-referenced instrument
a predetermined cutoff score indicates whether the person has attained an established level
Mastery component
converting scores into a _________ helps you understand how your score compares with those of others who also took the test
Frequency distribution
often used in assessment because this graphic representation makes the data easier to understand (x-axis: horizontal; y-axis: vertical)
Frequency polygon
What are the three measures of central tendency that are useful in interpreting an individual’s results on an instrument?
Mode
Median
Mean
is the most frequent score in a distribution - highest frequency of any of the scores
Mode
the score at which 50% of the people had a score below it and 50% of the people had a score above it.
Median
The arithmetic average of the scores
Mean
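The three measures above can be checked on a small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical raw scores for seven test takers (not from the source).
scores = [70, 75, 75, 80, 85, 90, 95]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # half the scores fall below it, half above
mean = statistics.mean(scores)      # arithmetic average

print(mode, median, round(mean, 2))
```

Here the mean sits above the median because the higher scores pull the average upward.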
Provides a measure of the spread of scores and indicates the variability between the highest and the lowest scores. _________ is calculated by simply subtracting the lowest score from the highest score.
Range
To avoid the problem of the deviations from the mean summing to zero, square the deviations, add them together, and divide by the number of scores
Variance or mean square deviation
The square root of the variance is the ___________
standard deviation
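The range, variance, and standard deviation cards can be worked through step by step. The sketch below uses hypothetical scores and divides by the number of scores, as the variance card describes; note that sample statistics often divide by N - 1 instead.

```python
import math

# Hypothetical scores (not from the source).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Deviations from the mean always sum to zero, which is why they are squared.
mean = sum(scores) / len(scores)
deviations = [x - mean for x in scores]

# Variance (mean square deviation): average of the squared deviations.
variance = sum(d ** 2 for d in deviations) / len(scores)

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(score_range, sum(deviations), variance, sd)
```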
scores on many instruments tend to fall into a ______
normal distribution or normal curve (bell-shaped)
A ________ is one in which the majority of scores are at the lower end of the score range.
positively skewed distribution
Where the majority of scores are on the higher end of the distribution
negatively skewed distribution
What are three examples of standard scores?
z scores
T scores
Stanines
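The three standard scores can be derived from one raw score. This is a sketch under stated assumptions: T scores use a mean of 50 and a standard deviation of 10, and the stanine is computed with the common approximation 2z + 5, rounded and limited to 1 through 9. The function names and the example values are hypothetical.

```python
import math

def z_score(raw, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw - mean) / sd

def t_score(z):
    """Rescale a z score to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

def stanine(z):
    """Approximate stanine (1-9): 2z + 5, rounded, then clipped to the 1-9 range."""
    return max(1, min(9, math.floor(2 * z + 5.5)))

# Hypothetical example: raw score 65 on a test with mean 50 and SD 10.
z = z_score(65, 50, 10)
print(z, t_score(z), stanine(z))
```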
What are the three major theories related to reliability?
classical test theory
generalizability theory
item response theory
Based on the degree to which there is error within the instrument. _________ suggests that every score has two hypothetical components: a true score and an error component
Classical test theory
_______ means there is a system: methods are planned, orderly, and methodical
Systematic
means lack of system; occurrences are presumed to be unsystematic (e.g., not consistent, such as a coffee spill on just one person’s test)
Random error
estimates the range of scores a person would obtain if they took the test over and over again.
Standard error of measurement
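A common way to compute the standard error of measurement is SEM = SD × √(1 − r), where r is the instrument's reliability coefficient. The sketch below assumes that formula; the function name and the values (SD of 15, reliability of .91, observed score of 100) are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

# Hypothetical instrument: SD = 15, reliability = .91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)

# Band of plus or minus 1 SEM around a hypothetical observed score of 100.
observed = 100
band = (observed - sem, observed + sem)

print(round(sem, 2), tuple(round(b, 2) for b in band))
```

The band is usually read as the range in which the person's score would be expected to fall about 68% of the time on repeated testing.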
Name for the test-retest reliability coefficient when there is a long period between the administrations of the instrument
coefficient of stability
Measures how consistently the items within a test measure a particular construct
Internal consistency
Helps measure internal consistency by splitting the test in half and comparing scores on the two halves, which should be consistent
Split-half Reliability
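One way to sketch split-half reliability is to score the odd and even items separately, correlate the two half scores, and then apply the Spearman-Brown formula to estimate reliability for the full-length test. The response data and the odd/even split are assumptions made for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical item responses: one row per person, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# Split the test (not the group) into two halves: odd items and even items.
odd_half = [sum(row[0::2]) for row in responses]
even_half = [sum(row[1::2]) for row in responses]

# Correlation between the two half-test scores.
r_half = pearson_r(odd_half, even_half)

# Spearman-Brown correction estimates reliability of the full-length test.
r_full = 2 * r_half / (1 + r_half)

print(round(r_half, 3), round(r_full, 3))
```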
Measurement of consistency between test scores assigned by different test scorers.
Interscorer (interrater) reliability
Two forms of an instrument are given to the same people, and the scores on the two forms are then compared for reliability.
Parallel forms
Norming Groups. Created through:
*Simple random sample (every person in the population has the same chance of being sampled)
*Stratified sample (used in assessments, where the test developer matches the sample to the percentages of the population in terms of ethnic group, geographic location, socioeconomic status)
*Cluster sample (create the sample from existing groups, e.g., a sample of children for an achievement test obtained through random selection of clusters such as schools).
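The three sampling approaches can be contrasted on a toy population. This is a sketch only: the population of 100 people, the `region` and `school` fields, and the sample sizes are all hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 60 urban and 40 rural people spread over 10 schools.
population = [
    {"id": i, "region": "urban" if i < 60 else "rural", "school": i % 10}
    for i in range(100)
]

# Simple random sample: every person has the same chance of selection.
simple = rng.sample(population, 10)

# Stratified sample: each region is sampled in proportion to its population share.
stratified = []
for region in ("urban", "rural"):
    stratum = [p for p in population if p["region"] == region]
    n = round(10 * len(stratum) / len(population))
    stratified.extend(rng.sample(stratum, n))

# Cluster sample: randomly pick whole schools, then take everyone in them.
chosen_schools = rng.sample(range(10), 2)
cluster = [p for p in population if p["school"] in chosen_schools]

print(len(simple), len(stratified), len(cluster))
```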
Consistency of a measure.
Reliability
Typically Pearson r – estimates a test’s reliability – ranges from -1 to 1 – where 1 or -1 is a perfect relationship between the variables with no measurement error. The closer to 1 or -1, the stronger the relationship (for reliability, the closer to 1, the better). 0 indicates there is no association between the variables.
Small association: .1 to .3 or -.1 to -.3
Medium association: .3 to .5 or -.3 to -.5
Strong association: .5 to 1.0 or -.5 to -1.0
Correlation NOT causation!
Reliability Coefficient
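As an illustration, a test-retest reliability coefficient can be estimated by correlating scores from two administrations of the same instrument. The sketch below computes Pearson r by hand; the two sets of scores are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 14, 16, 18]
second_administration = [11, 13, 14, 17, 19]

r = pearson_r(first_administration, second_administration)
print(round(r, 3))
```

A value close to 1 indicates that people kept nearly the same relative standing across the two administrations.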
The extent to which a test measures what it’s meant to measure.
Validity
How well the test or assessment evaluates all aspects of the topic, construct or behaviour
Content-related validity
Extent to which a test or instrument is related to an outcome criterion (e.g., the SAT as a predictor of college performance)
Criterion-related validity
Extent to which test or instrument measures or reflects the intended construct (such as intelligence).
Construct validity
Allows us to predict values for a response.
Regression
(Regression analysis is used to estimate the slope)
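A simple linear regression can be sketched with the least-squares formulas for slope and intercept. The function name `fit_line` and the predictor and response values are hypothetical, chosen only to show how a fitted line is used to predict a response.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical predictor (x) and response (y) values.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x, y)

# Predict the response for a new predictor value of 6.
predicted = intercept + slope * 6
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```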