Scale development Flashcards
difference between reliability and validity of scales
reliability is a measure of consistency: the items measure the same thing across a sample, because a latent variable is driving the variation in scores on the scale
whereas validity concerns whether the latent variable of interest is actually being measured
how could you assess the reliability of a scale
split half
internal consistency
test-retest
how could you assess the validity of a scale
content
construct
predictive
guidelines for scale development
DeVellis:
- define the latent variable
- generate an item pool
- review the items for content
- administer the items to a development sample
- evaluate the items (reverse code, mean and SD, skewness and kurtosis)
- compute coefficient alpha
- validate the scale - correlate it with other constructs / does it predict anything
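The "evaluate the items" step can be sketched in a few lines. This is a minimal illustration, assuming a 1-5 Likert scale; the helper name `reverse_code` and the response data are hypothetical, not taken from DeVellis.

```python
from statistics import mean, stdev

def reverse_code(score, low=1, high=5):
    """Reverse a Likert response so a high score becomes a low one (assumes a low..high scale)."""
    return low + high - score

# Hypothetical responses from five people to one negatively worded item.
item = [1, 2, 2, 4, 5]

# Reverse code first, then describe the recoded item.
recoded = [reverse_code(s) for s in item]
item_mean = mean(recoded)
item_sd = stdev(recoded)  # sample standard deviation (n - 1 in the denominator)

print(recoded, item_mean, round(item_sd, 3))
```

Reverse coding comes before any descriptive statistics or alpha so that all items point in the same direction.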
kurtosis
how spiky / peaked the distribution is
skewness
positive skew: scores bunched on the left, i.e. lots of low scores (tail to the right)
negative skew: scores bunched on the right, i.e. lots of high scores (tail to the left)
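A rough sketch of how skewness and kurtosis can be computed from raw scores, assuming the simple moment-based definitions (population moments, with 3 subtracted so a normal distribution has kurtosis 0). Statistics packages often apply small-sample corrections, so their values may differ slightly. The function names and data are illustrative only.

```python
def central_moment(scores, order):
    """Average of (score - mean) ** order."""
    m = sum(scores) / len(scores)
    return sum((s - m) ** order for s in scores) / len(scores)

def skewness(scores):
    """Moment-based skewness: positive when the tail points to the right."""
    return central_moment(scores, 3) / central_moment(scores, 2) ** 1.5

def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3 (0 for a normal distribution)."""
    return central_moment(scores, 4) / central_moment(scores, 2) ** 2 - 3

# Hypothetical item scores with lots of low values (positive skew).
mostly_low = [1, 1, 1, 2, 2, 3, 5]
# Mirror image on a 1-5 scale: lots of high values (negative skew).
mostly_high = [6 - s for s in mostly_low]

print(skewness(mostly_low) > 0)   # tail to the right
print(skewness(mostly_high) < 0)  # tail to the left
```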
if z > 3.08
equates to p
a reliable measure should have
inter-correlations between its items (caused by the same latent variable)
because variance in the items is hopefully due to the underlying construct
split half reliability
when scores on the first half of a scale (e.g. sensation seeking) are correlated with scores on the second half
e.g. where scores on the odd items are correlated with scores on the even items
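The odd/even version can be sketched as follows. The response matrix is made up (rows are respondents, columns are items), and `pearson` and `split_half` are hypothetical helper names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half(responses):
    """Correlate each respondent's odd-item total with their even-item total."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, ...
    return pearson(odd, even)

# Hypothetical data: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

r_half = split_half(responses)
print(round(r_half, 3))
```

A different way of splitting the items (first half vs second half, say) would generally give a different correlation.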
internal consistency
gives a much better idea of consistency than split half - why?
alpha coefficient indicates the proportion of the variance in the scale scores that is attributable to the true score
- based on how strongly the items covary (the item variances relative to the variance of the total score), which is closely related to the item-total correlations
- provides a more accurate measure of internal reliability than a split half correlation!
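As a worked illustration, coefficient alpha can be computed with the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data below are invented and `cronbach_alpha` is a hypothetical helper name; treat it as a sketch rather than a replacement for a statistics package.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])                                # number of items
    items = list(zip(*responses))                        # one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical data: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Unlike a single split-half correlation, this uses all the items at once, so it does not depend on how the scale happens to be split.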
alpha above .90
the scale can be shortened
because some items are redundant
caveats of alpha
dependent on
1) magnitude of correlations among items
2) number of items
it’s not a measure of dimensionality or unidimensionality
(i.e. a high alpha may not indicate a single underlying construct).
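The two dependencies can be made concrete with the standardized form of alpha, k * r / (1 + (k - 1) * r), where k is the number of items and r is the average inter-item correlation. This formula is brought in here for illustration and assumes equally weighted, standardized items; `standardized_alpha` is a hypothetical name.

```python
def standardized_alpha(k, mean_r):
    """Standardized alpha from the number of items and the mean inter-item correlation."""
    return k * mean_r / (1 + (k - 1) * mean_r)

# Same modest inter-item correlation, different scale lengths.
short_scale = standardized_alpha(5, 0.3)
long_scale = standardized_alpha(20, 0.3)

# Same length, stronger inter-item correlations.
stronger_items = standardized_alpha(5, 0.6)

print(round(short_scale, 3), round(long_scale, 3), round(stronger_items, 3))
```

With the average correlation held at .3, simply adding items pushes alpha up, which is why a high alpha on a long scale says little about whether one construct underlies the items.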
test-retest
scores at one time point correlate with scores on a retest at another time point
however, a low test-retest correlation may reflect poor temporal stability in the underlying construct rather than an unreliable scale (e.g. personality might not change but stress might)
content validity
does the content of the items reflect the variable of interest
can judge this by interviewing members of the target population to devise items (pilot study)
or ask experts
construct validity
does the scale relate to another construct (based on theoretical expectation) e.g. anxiety and depression
or does it correlate with an established measure of the same construct