Reliability and Validity Flashcards

1
Q

Types of reliability - reliability across time

A

· Test-retest reliability
· Involves two administrations of the scale
- Assumes that the construct is stable across time

2
Q

Types of reliability - internal consistency:

A

· Split-half reliability
· Cronbach’s alpha
· McDonald’s omega (hierarchical and total)
· Involves only one administration of the scale
- How most papers test reliability
- What we will be doing for the lab report

3
Q

Reliability across time - test-retest:

A

· Test-retest reliability:
- The consistency of your measurement when it is used under the same conditions with the same participants
- Example procedure: Administer your scale at two separate times for each participant. Compute the correlation between the two scores
· Test-retest reliability doesn’t work if you’re studying a construct that is expected to vary across time points.
· Example: Mood
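The procedure above can be sketched in a few lines. This is a minimal illustration with made-up scores (Python is used here for portability, even though the lab uses R):

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical scale totals for six participants, measured twice
time1 = [12, 18, 15, 22, 9, 17]
time2 = [13, 17, 16, 21, 10, 18]

# A high correlation indicates high test-retest reliability
print(round(pearson_r(time1, time2), 2))
```
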

4
Q

Internal consistency - split-half:

A

· Split-half reliability:
- One of the most straightforward ways of testing internal consistency
- You can split a scale into two halves (e.g., a 6-item scale would be split into two sets of 3)
- You calculate a composite score for each half of the scale (e.g., each participant gets a score averaged across the 3 items)
- Calculate the correlation between those two half-scale scores; strong correlation = high split-half reliability
BUT: reliability will depend on exactly how you split the data!
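A minimal sketch of the split-half steps above, using hypothetical responses (Python for illustration; the lab uses R):

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical responses to a 6-item scale (one row per participant)
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 2, 2, 2],
]

# Split into items 1-3 and items 4-6; composite = mean of each half
half1 = [mean(row[:3]) for row in responses]
half2 = [mean(row[3:]) for row in responses]

split_half_r = pearson_r(half1, half2)
```

A different split (e.g., odd- vs. even-numbered items) would give a different value, which is exactly the problem flagged above.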

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Refining split-half reliability:

A

· We can refine the split-half method
· We can split the items on the scale every possible way and compute correlations for all splits
· We can obtain an average of all these correlations to give us a sense of the scale’s internal consistency
- This is roughly what Cronbach’s α does!
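The refinement can be sketched directly: enumerate every split of six items into two halves of three and average the correlations. Note the average of all split-half correlations is only roughly α, which is computed from variances rather than as a literal average (hypothetical data again):

```python
from itertools import combinations
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical responses to a 6-item scale (one row per participant)
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 2, 2, 2],
]

items = range(6)
split_rs = []
for half in combinations(items, 3):
    if 0 not in half:
        continue  # skip mirror-image splits so each split is counted once
    other = [i for i in items if i not in half]
    h1 = [mean(row[i] for i in half) for row in responses]
    h2 = [mean(row[i] for i in other) for row in responses]
    split_rs.append(pearson_r(h1, h2))

avg_split_half = mean(split_rs)  # averaged over the 10 distinct splits
```
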

6
Q

Interpreting Cronbach’s alpha:

A

· Interpret as Pearson’s r:
- Varies from 0 (no internal consistency) to 1 (perfect internal consistency)
- No negative values! If you get a negative value, something has gone wrong
· Rule-of-thumb for acceptable reliability:
- For cognitive tests, α > 0.8 is appropriate; for other tests, α > 0.7 is fine (Kline, 1999)
· HOWEVER: This depends on the type of construct, the progress of the research, etc.

7
Q

Other statistics that help interpret Cronbach’s alpha:

A

· α if item removed
- Calculates α as described but leaves out each item one at a time
- If α improves -> scale is more reliable without it
- Helps identify the worst item (to consider getting rid of it)
· Item-total correlation
- = Correlation between the score on an item and score on the scale as a whole
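Both statistics can be sketched in a few lines. The data and helper functions below are hypothetical illustrations in Python; in R, these are the quantities reported by e.g. the psych package's `alpha()` function:

```python
from statistics import mean, stdev, variance

def cronbach_alpha(data):
    """Cronbach's alpha; data = one list of item scores per participant."""
    k = len(data[0])
    item_cols = list(zip(*data))          # transpose: one column per item
    totals = [sum(row) for row in data]
    return k / (k - 1) * (1 - sum(variance(c) for c in item_cols) / variance(totals))

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical responses to a 6-item scale (one row per participant)
responses = [
    [4, 5, 4, 3, 4, 5],
    [2, 1, 2, 2, 1, 1],
    [3, 3, 4, 3, 3, 4],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 2, 2, 2],
]

alpha = cronbach_alpha(responses)
totals = [sum(row) for row in responses]

# Alpha if item removed: recompute alpha with each item dropped in turn;
# if it goes UP for some item, the scale is more reliable without that item
alpha_if_removed = [
    cronbach_alpha([[v for i, v in enumerate(row) if i != drop] for row in responses])
    for drop in range(len(responses[0]))
]

# Item-total correlation: each item's column vs. the whole-scale totals
item_total = [pearson_r(list(col), totals) for col in zip(*responses)]
```
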

8
Q

Cronbach’s alpha - a history of (mis)use:

A

· Cronbach’s alpha is widely used
- 87% of papers that report any measure of internal consistency report Cronbach’s alpha
- The paper in which Cronbach’s alpha was first introduced (Cronbach, 1951) is one of the most cited English-language research articles in any discipline (just under 65,000 citations)
- And this is a conservative indicator of its popularity! Many papers report Cronbach’s alpha without citing the original paper
· Sources of misuse:
- Cronbach’s alpha makes assumptions about the shape of the factor model (i.e., how items relate to factors); if these assumptions are not met, Cronbach’s alpha is misleading
- Using Cronbach’s alpha as evidence of a scale’s dimensionality: a big no-no!
- Cronbach’s alpha is sensitive to the number of items in a scale

9
Q

The assumptions of alpha - tau-equivalence:

A

· Alpha assumes tau-equivalence. This corresponds to a factor model with the following features:
- Items have equal loadings
- Items indicate only one factor
- This is unrealistic. Ideally, items have strong primary factor loadings, but they still have loadings on other factors even if they are weak (and even if we ignore them!)
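The contrast above can be written as factor equations (standard psychometric notation, added here as background rather than taken from the slides):

```latex
\text{Tau-equivalent (what } \alpha \text{ assumes):} \qquad x_i = \lambda f + e_i
\quad \text{(one factor, the same loading } \lambda \text{ for every item)}

\text{Congeneric (more realistic):} \qquad x_i = \lambda_i f + e_i
\quad \text{(loadings } \lambda_i \text{ free to differ across items)}
```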

10
Q

Alpha and unidimensionality:

A

· Alpha is designed to be computed for unidimensional scales (i.e., scales with only one factor); it doesn’t work properly otherwise! But you can’t use it as a measure of unidimensionality

11
Q

Cronbach’s alpha is sensitive to number of items:

A

· Having more items in a scale leads to a higher Cronbach’s alpha, regardless of the actual internal consistency of the scale
· Measures of internal consistency should tell us how strongly items relate to one another: high inter-item correlations indicate high internal consistency. But that’s not always the case for Cronbach’s alpha…
· When looking at scales with Cronbach’s alpha of .80, Cortina (1993) found that:
- A scale with 3 items had an average inter-item correlation of .57
- A scale with 10 items had an average inter-item correlation of .28
· In other words: the 10-item scale has worse internal consistency (i.e., lower inter-item correlations) than the 3-item scale, but Cronbach’s α is identical
- You can trick alpha into suggesting good internal consistency simply by adding more items to your scale
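Cortina’s point can be reproduced with the standardized-alpha formula, which expresses alpha in terms of the number of items k and the average inter-item correlation r̄ (a standard identity, added here as background rather than taken from the slides):

```python
def alpha_from_avg_r(k, r_bar):
    """Standardized Cronbach's alpha for k items with average inter-item correlation r_bar."""
    return k * r_bar / (1 + (k - 1) * r_bar)

# Cortina's (1993) two scales: very different inter-item correlations, same alpha
short_scale = alpha_from_avg_r(3, 0.57)
long_scale = alpha_from_avg_r(10, 0.28)
print(round(short_scale, 2), round(long_scale, 2))  # both round to 0.8
```

Adding items mechanically inflates alpha: with the same r̄ = .28, a 20-item scale would score even higher than the 10-item one.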

12
Q

McDonald’s omega:

A

· A great alternative to Cronbach’s alpha
· Omega (both total and hierarchical) does NOT assume tau-equivalence or unidimensionality
- Omega uses the factor structure obtained by running a factor analysis (see last week’s lecture!)
· Omega (both total and hierarchical) assumes the existence of a general factor
- This is usually not a problem in scale development: we assume that even items belonging to different factors are related. After all, they were all designed to capture the same construct (even if we later learn that our construct is actually a set of related constructs)!

13
Q

Omega - assumed factor structures:

A

· Omega hierarchical ωh
- Appropriate for unidimensional scales: items share variance with a general factor
· Omega total ωt
- Appropriate for multidimensional scales: items share variance with both the extracted factors AND the general factor (an overarching, higher-order factor)
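As formulas (one common formulation, e.g. as implemented in R’s psych package, added as background rather than taken from the slides), where λ_{g,i} is item i’s loading on the general factor and λ_{F,i} its loading on group factor F:

```latex
\omega_h \;=\; \frac{\bigl(\sum_i \lambda_{g,i}\bigr)^2}{\operatorname{Var}(X)}
\qquad\qquad
\omega_t \;=\; \frac{\bigl(\sum_i \lambda_{g,i}\bigr)^2 \;+\; \sum_{F}\bigl(\sum_{i \in F} \lambda_{F,i}\bigr)^2}{\operatorname{Var}(X)}
```

Omega hierarchical credits only variance shared through the general factor, while omega total also credits variance shared within each group factor, which is why ω_t fits multidimensional scales.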

14
Q

McDonald’s omega - testing assumptions:

A

· Both Omega hierarchical ωh and Omega total ωt assume that there is a general shared factor
· The R output shows you a table of factor loadings where the first column shows each item’s loadings with the general factor
· The second column shows you each item’s loadings for each of the factors extracted in your factor analysis (in this case 4)
· These loadings should be similar to the ones obtained in your factor analysis but they are not identical—unlike your original factor model, this model also includes the general factor

15
Q

Interpreting omega scores:

A

· Omega is interpreted in the same way as alpha: 0 (no internal consistency) to 1 (perfect internal consistency)
· Omega hierarchical is smaller than omega total
- Makes sense! We have a multidimensional scale (i.e., more than 1 factor), so omega hierarchical, which tries to fit a single-factor model, will capture less of the items’ shared variance
· You should also look at the omega total scores for each of your factors

16
Q

Reverse-coding items:

A

· Both Cronbach’s alpha and McDonald’s omega (both hierarchical and total) assume that your items are all coded in the same direction
· In other words: having a high score on your scale items means the same thing (e.g., having high levels of the target construct)
· Reverse-phrased items measure the same idea but in opposite directions
· If you agree that statistics make you cry, then you are unlikely to agree that standard deviations excite you
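Before computing alpha or omega, reverse-phrased items therefore need to be re-coded. A minimal sketch, assuming a hypothetical 1–5 Likert scale:

```python
def reverse_code(score, low=1, high=5):
    """Flip a Likert response so every item points in the same direction."""
    return low + high - score

# "Standard deviations excite me" is reverse-phrased relative to
# "Statistics make me cry": strong agreement (5) should count as LOW fear
print(reverse_code(5))  # -> 1
print(reverse_code(3))  # -> 3 (the midpoint stays put)
```
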

17
Q

Reverse-coding and composite scores:

A

· Composite score = a single score obtained by aggregating (e.g., summing, averaging) the items from a scale
- This is the scale score (intended to represent a participant’s level of the target construct)
- This is what we use when testing relationships between our measure and other measures; and, based on these, we infer relationships between the underlying constructs

18
Q

An interesting case - composite scores and multiple factors:

A

· A scale may capture one construct or several related constructs (factors)
- We need separate composite scores for each factor
- Each factor’s composite score will be obtained by aggregating (e.g., summing, averaging) the items belonging to each factor
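A sketch of per-factor composites; the item-to-factor assignment, factor names, and responses below are hypothetical:

```python
from statistics import mean

# Hypothetical 6-item scale where factor analysis suggested two factors;
# the item indices per factor are made up for illustration
factor_items = {"worry": [0, 1, 2], "avoidance": [3, 4, 5]}

responses = [
    [4, 5, 4, 2, 1, 2],
    [2, 1, 2, 4, 5, 4],
]

# One composite score (here: a mean) per factor, per participant
composites = [
    {factor: mean(row[i] for i in idx) for factor, idx in factor_items.items()}
    for row in responses
]
```
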

19
Q

External validity:

A

· There are many types of scale validity, but here we will focus on external validity: how does our scale relate to other existing scales? This is sometimes called construct validity and has two subtypes:
· Convergent validity
- Tests whether constructs that should be related in theory are related in reality (i.e., their measures correlate)
- Example: You would expect pro-environmental attitudes to relate to recycling behaviour
· Discriminant validity
- Tests whether constructs that should NOT be related in theory are indeed NOT related in reality (i.e., their measures don’t correlate)
- Example: You would NOT expect pro-environmental attitudes to relate to an unrelated construct (e.g., extraversion)

20
Q

Construct validity in R:

A

· Fear of statistics was significantly negatively correlated with average stats grade (r = -.28, p = .031)
- Convergent validity
· Fear of statistics was not significantly correlated with average grade on non-stats modules (r = .02, p = .332)
- Discriminant validity
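The same checks can be sketched outside R with correlations on hypothetical data (in the lab you would use R’s `cor.test()`, which also provides the p-values reported above):

```python
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation between two lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

# Hypothetical data for six students
fear = [4.2, 3.1, 2.5, 4.8, 1.9, 3.6]      # fear-of-statistics composite
stats_grade = [55, 62, 71, 48, 78, 60]      # should correlate negatively (convergent)
history_grade = [62, 62, 63, 61, 61, 63]    # should be roughly unrelated (discriminant)

r_convergent = pearson_r(fear, stats_grade)
r_discriminant = pearson_r(fear, history_grade)
```
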