Scale Development Flashcards
Learning objectives
Overview:
- Measurement scales and latent variables
- Reliability
- Validity
- Guidelines for developing a scale
- Using SPSS to develop a scale
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Learning objectives:
1. Understand and be able to explain… measurement scales and latent variables.
2. Understand and be able to explain… the difference between the reliability and validity of a scale.
3. Understand and be able to explain… different ways to assess the reliability and validity of a scale.
4. Understand and be able to explain… the different steps to follow to develop a reliable and valid scale.
5. Be able to… conduct, report and interpret reliability and validity analyses.
The issue – how to measure a psychological variable using a questionnaire?
Simply putting together questions to measure a construct for which there is no pre-existing scale leaves the nagging doubt that the items may not reliably or validly measure what we aim to investigate.
We can't directly measure psychological phenomena such as attitudes, so we use questionnaire items.
Scales and latent variables
Y → X1
Reliability
A reliable scale is a scale in which variation in scale scores can be attributed to a latent variable that exerts a causal influence over all the items.
i.e. if people respond similarly to all the items, the scale is reliably measuring the effect of the latent variable: all the items are being causally influenced in a similar way by the latent variable.
Measures of reliability
List them (3)
- Split-Half Reliability
- Internal Consistency
- Test-Retest Reliability
Measures of reliability
- Split-Half Reliability
e.g. where scores on the first half of the scale are correlated with scores on the second half of the scale.
e.g. where scores on the odd items are correlated with scores on the even items.
If the scale is reliable, the two halves of the scale should correlate strongly.
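An odd-even split avoids fatigue effects (which would affect mainly the second half in a first-half/second-half split), and it is easy to check whether fatigue is an issue by comparing the two kinds of split.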
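As a rough illustration of the odd-even version of split-half reliability, the sketch below sums the odd items and the even items for each respondent and correlates the two totals. The function name `split_half_correlation` and the `responses` data are made up for this example, and NumPy is assumed to be available.

```python
import numpy as np

def split_half_correlation(items: np.ndarray) -> float:
    """Correlate total scores on the odd items with total scores on the even items.

    `items` is a (respondents x items) array of item scores.
    """
    odd_total = items[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even_total = items[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
    return float(np.corrcoef(odd_total, even_total)[0, 1])

# Hypothetical data: 6 respondents answering 4 items on a 1-5 scale.
responses = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [4, 5, 5, 4],
    [5, 5, 4, 5],
])

r = split_half_correlation(responses)
print(f"Split-half (odd vs even) correlation: {r:.2f}")
```

A strong positive correlation here would be read as the two halves behaving consistently.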
Measures of reliability
- Internal consistency
Coefficient alpha (α) indicates the proportion of the variance in the scale scores that is attributable to the true score (the score that would be obtained if there were no error), i.e. how much of the variance is a result of the latent variable alone.
Coefficient alpha is computed from the variances of the individual items and the variance of the total score, so it reflects how strongly the items correlate with one another.
It provides a more accurate measure of internal reliability than a split-half correlation.
Scores: alpha goes from 0 to 1, and the closer to 1 the better. DeVellis considers an α of .60 or below unacceptable and .80-.90 very good. Above .90 means the scale can likely be shortened.
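A minimal sketch of how coefficient alpha could be computed by hand, assuming the standard variance-based formula α = k/(k-1) × (1 - sum of item variances / variance of total scores), which these notes do not spell out. The function name `cronbach_alpha` and the `responses` data are hypothetical, and NumPy is assumed.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for a (respondents x items) array of item scores."""
    k = items.shape[1]                               # number of items
    item_variances = items.var(axis=0, ddof=1)       # sample variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # sample variance of total scores
    return float(k / (k - 1) * (1 - item_variances.sum() / total_variance))

# Hypothetical data: 6 respondents answering 4 items on a 1-5 scale.
responses = np.array([
    [1, 2, 1, 2],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [4, 3, 4, 4],
    [4, 5, 5, 4],
    [5, 5, 4, 5],
])

alpha = cronbach_alpha(responses)
print(f"Coefficient alpha: {alpha:.2f}")
```

With these made-up responses alpha comes out above .90, which under the guideline above would suggest the scale could be shortened.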
Disadvantages of internal consistency:
- Coefficient alpha is dependent not only on the magnitude of the correlations among items, but also on the number of items in the scale. Scales with more items tend to have higher alpha coefficients, so the effect of unreliable items is diminished.
- Coefficient alpha is not a measure of dimensionality, nor a test of unidimensionality. A high alpha coefficient cannot necessarily be taken to reflect unidimensionality (the scale might be measuring more than one latent variable and we wouldn't know).
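The dependence of alpha on the number of items can be illustrated with the standardized form of alpha, α = k·r / (1 + (k - 1)·r), where k is the number of items and r is the mean inter-item correlation. This formula is an assumption brought in for illustration (it is not given in these notes), and the function name `standardized_alpha` is hypothetical.

```python
def standardized_alpha(n_items: int, mean_r: float) -> float:
    """Standardized alpha from the number of items and the mean inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)

# Hold the mean inter-item correlation fixed at a modest .30 and vary scale length.
alphas = {k: round(standardized_alpha(k, 0.3), 2) for k in (5, 10, 20)}
print(alphas)  # {5: 0.68, 10: 0.81, 20: 0.9}
```

The items are no more strongly related in the 20-item case than in the 5-item case, yet alpha rises from about .68 to about .90 purely because the scale is longer.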
Measures of reliability
- Test-Retest Reliability
Scores on the test at one point in time are correlated with scores on the test at another point in time.
If the scale has high test-retest reliability, a strong correlation will be found between the two administrations of the scale.
However, low test-retest reliability may reflect temporal instability in the underlying latent variable (e.g. exam stress measured 4 weeks and 2 weeks before an exam).
Validity
A valid scale is a scale which measures the latent variable of interest (it's measuring what we want it to).
You might have a reliable scale that is not a valid one, i.e. it is consistently measuring a construct, but not the one we think it is.
Measures of validity
List them (3)
- Content validity
- Construct validity
- Predictive validity
Measures of validity
- Content validity
To what extent does the content of the scale items reflect the latent variable of interest?
e.g. conduct interviews with members of the target population to generate items for the scale
e.g. ask experts to rate the extent to which each item reflects the proposed latent variable of interest
…Do the items look like they’re measuring the right construct?
Increasing face validity → interview members of the target population or ask experts to rate the items
- Construct validity
Does the scale relate to other constructs in line with theoretical expectations?
e.g. administer the scale to a sample of participants along with another related measure. A strong correlation would indicate construct validity.
e.g. a measure of personal strengths would be expected to correlate with a measure of self-esteem.
We expect the scale to correlate reasonably well with some related constructs, e.g. depression with anxiety.
- Predictive validity
Does the scale have an association with some other external criterion?
e.g. correlate scores on the scale with an external criterion/measure.
e.g. a measure of medication beliefs would be expected to correlate with an objective measure of medication adherence.
Is there an association with some gold standard, e.g. a behaviour or another scale?
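Both construct and predictive validity checks come down to correlating scale scores with another measure. The sketch below follows the medication example above using entirely made-up numbers; the names `pearson_r`, `scale_scores` and `adherence` are hypothetical, and NumPy is assumed.

```python
import numpy as np

def pearson_r(x, y) -> float:
    """Pearson correlation between two sets of scores from the same participants."""
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical data for 8 participants.
scale_scores = np.array([12, 15, 9, 20, 17, 11, 14, 18])   # medication beliefs scale totals
adherence = np.array([70, 80, 55, 95, 85, 60, 75, 90])     # % of doses taken (external criterion)

r_criterion = pearson_r(scale_scores, adherence)
print(f"Correlation with external criterion: {r_criterion:.2f}")
```

The same helper could be used for construct validity by swapping the criterion for scores on a related scale.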
Guidelines for Scale Development (DeVellis, 1991)
- Define the latent variable of interest.
- Generate an item pool.
  - brainstorm, conduct interviews, pluck from other measures
- Review items for content.
  - reliable? valid? reduce number
- Administer items to a development sample.
  - choose best items
- Evaluate the items…
  - reverse coding (pos-neg coding, same direction of scores must mean same thing)
  - mean and SD (mean in middle? adequate variation?)
  - skewness and kurtosis (normal distribution?)
- Compute coefficient alpha.
  - reliability
- Validate the scale.
  - e.g. running other studies etc
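The first two item-evaluation checks in the guidelines above (reverse coding, then mean and SD) might look like the sketch below. It assumes a 1-5 response scale on which the second item is negatively worded; the function name `reverse_code` and the data are hypothetical, and NumPy is assumed.

```python
import numpy as np

def reverse_code(scores: np.ndarray, low: int, high: int) -> np.ndarray:
    """Flip scores so that low becomes high and vice versa (e.g. 1<->5, 2<->4)."""
    return (low + high) - scores

# Hypothetical data: 5 respondents x 3 items on a 1-5 scale.
# The second item is assumed to be negatively worded.
responses = np.array([
    [5, 1, 4],
    [4, 2, 5],
    [2, 4, 2],
    [1, 5, 1],
    [3, 3, 3],
])

# Reverse-code the negatively worded item so high scores mean the same thing on every item.
responses[:, 1] = reverse_code(responses[:, 1], low=1, high=5)

item_means = responses.mean(axis=0)        # is each mean near the middle of the scale?
item_sds = responses.std(axis=0, ddof=1)   # is there adequate variation?
print(item_means, item_sds)
```

Items with means near the ends of the scale or with very small SDs would be candidates for removal.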
Skewness and Kurtosis
Skewness is where the hump of the bell curve is off to one side rather than in the centre.
Kurtosis is where the hump of the bell curve is either very thin and peaked or flat and wide (or non-existent).
Both represent departures from a normal distribution.
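As a rough sketch, skewness and kurtosis can be computed from the moments of the scores. The version below uses the simple moment-based definitions (skewness = m3 / m2^1.5, excess kurtosis = m4 / m2^2 - 3), which is an assumption for illustration; statistics packages often apply small-sample corrections, so their output may differ slightly. Function names are hypothetical and NumPy is assumed.

```python
import numpy as np

def skewness(x) -> float:
    """Moment-based skewness: 0 for a symmetric distribution, > 0 for a long right tail."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.mean(d**3) / np.mean(d**2) ** 1.5)

def excess_kurtosis(x) -> float:
    """Moment-based excess kurtosis: 0 for a normal distribution, < 0 flat, > 0 peaked."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(np.mean(d**4) / np.mean(d**2) ** 2 - 3.0)

symmetric_item = [1, 2, 3, 4, 5]   # hypothetical evenly spread responses
skewed_item = [1, 1, 1, 2, 5]      # hypothetical responses bunched at the low end

print(skewness(symmetric_item), excess_kurtosis(symmetric_item))
print(skewness(skewed_item), excess_kurtosis(skewed_item))
```

Values near zero on both statistics are consistent with an approximately normal distribution of item scores.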