Reliability (cont.) Flashcards

1
Q

summary of sources of measurement error

A
  • Time Sampling (when the test is given)
  • Item Sampling (which items were selected)
  • Internal Consistency (whether the items are all measuring the trait of interest)
  • Inter-rater Differences (whether different raters assign the same score)
2
Q

explain item sampling

A
  • DOMAIN = infinite pool of potential items
  • Any test must sample items
  • The process of sampling items introduces error, because we cannot be certain that we sampled the items randomly
3
Q

explain alternate form

A
  • Two parallel (alternate) forms of the same test are constructed
  • Each form is the same length
  • The forms contain equivalent (but not identical) items
  • The forms (A and B) are given to the same sample of examinees on the same day
  • Order of administration is counterbalanced
  • The correlation between the scores on Form A and the scores on Form B is known as the alternate form reliability
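In practice this boils down to a single correlation. A minimal sketch (the score values are invented for illustration):

```python
import numpy as np

# Hypothetical scores for the same six examinees on Form A and Form B,
# given on the same day with administration order counterbalanced.
form_a = np.array([12, 18, 25, 9, 30, 22])
form_b = np.array([14, 17, 27, 10, 28, 21])

# Alternate form reliability = correlation between Form A and Form B scores.
alternate_form_r = np.corrcoef(form_a, form_b)[0, 1]
print(round(alternate_form_r, 3))  # about 0.98 for this toy data
```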

4
Q

what does it mean if you have alternate form reliability

A

Results are not due to error; the test is measuring the construct reliably, and the error that is involved is under control

5
Q

problem and solution with alternate form reliability

A
  • Problem: for some tests it is difficult to create alternate forms
  • Solution: split a single test in half and correlate the two halves
  • This is known as the split-half method.
6
Q

what is split half method

A
  • A single test is split into two halves and the scores on the two halves are correlated
  • How do we split the test? Usually by odd/even numbered items
  • A first-half/second-half split is typically avoided because on some tests the second half is harder
  • Can’t be used with speed tests (e.g., Coding)
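A minimal sketch of the odd/even split, assuming a small invented 0/1 item-response matrix (rows = examinees, columns = items):

```python
import numpy as np

# Hypothetical binary (0/1) responses: rows = examinees, columns = items.
responses = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 1, 1, 1],
])

# Odd/even split (items 1, 3, 5, ... vs. 2, 4, 6, ...), not first half vs. second half.
odd_half = responses[:, 0::2].sum(axis=1)
even_half = responses[:, 1::2].sum(axis=1)

# Correlation between the two half scores = uncorrected split-half reliability.
split_half_r = np.corrcoef(odd_half, even_half)[0, 1]
print(round(split_half_r, 3))  # about 0.93 for this toy data
```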
7
Q

problem with split half reliability

A
  • Reliability is related to test length: all other things being equal, longer tests have higher reliability than shorter tests (more observations on longer tests, so more opportunity for the +/- errors to cancel out)
  • Because each half is only half as long as the full test, the split-half method will underestimate the Alternate Form Reliability (and consequently overestimate the amount of error associated with item sampling)
8
Q

solution to the problem with split half reliability

A

Spearman-Brown Formula

-Enables us to predict what the Alternate Form Reliability would be from the Split Half Reliability

9
Q

Spearman Brown Formula in words

A
  • Step 1. Calculate n (New Test Length divided by Old Test Length)
  • Step 2. Multiply this by the current (“old”) test reliability
  • Step 3. Subtract 1 from n (Step 1) and multiply this by the current (“old”) test reliability
  • Step 4. Add 1 to Step 3.
  • Step 5. Divide the result of Step 2 by the result of Step 4.
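A sketch of these five steps as a function; the doubling example (split-half r = .80) is illustrative, not from the cards:

```python
def spearman_brown(old_reliability, old_length, new_length):
    """Predict the reliability of a lengthened (or shortened) test."""
    n = new_length / old_length                  # Step 1: ratio of test lengths
    numerator = n * old_reliability              # Step 2
    denominator = 1 + (n - 1) * old_reliability  # Steps 3 and 4
    return numerator / denominator               # Step 5

# Doubling a half-test (n = 2) with split-half r = .80 predicts the full-length reliability.
print(round(spearman_brown(0.80, old_length=1, new_length=2), 3))  # 0.889
```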
10
Q

what does the general form of Spearman Brown Formula allow us to estimate

A
  • what the reliability of the test would be if we added items to the test
  • what the reliability of the test would be if we deleted items from the test
  • how many items we would have to add to the test in order to achieve a desired reliability
11
Q

explain reliability and test length

A
  • When we increase the length of the test from 100 to 120 items, the reliability INCREASES from .90 to .915
  • When we decrease the length of the test from 100 to 80 items, the reliability DECREASES from .90 to .878
  • Reliability is related to test length
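These figures can be checked directly with the Spearman-Brown formula (starting from a 100-item test with reliability .90):

```python
def spearman_brown(old_reliability, old_length, new_length):
    n = new_length / old_length
    return (n * old_reliability) / (1 + (n - 1) * old_reliability)

print(round(spearman_brown(0.90, 100, 120), 3))  # 0.915 (lengthened test)
print(round(spearman_brown(0.90, 100, 80), 3))   # 0.878 (shortened test)
```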
12
Q

how to use SBF to Estimate how many Items to Add

A

Rearrange the equation and solve for n

13
Q

SBF to estimate how many items to add in words

A
  • Step 1. Subtract the current reliability (rtt) from 1
  • Step 2. Multiply result of Step 1 by the desired reliability (rnn)
  • Step 3. Subtract the desired reliability from 1
  • Step 4. Multiply the result of Step 3 by the current reliability
  • Step 5. Divide the result of Step 2 by the result of Step 4.
  • Finally, multiply result of Step 5 (n) by the current test length to get the length of the test needed to get the desired reliability
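A sketch of these steps in code; the current reliability (.80), desired reliability (.90), and current length (50 items) are hypothetical values chosen for illustration:

```python
def length_needed(current_reliability, desired_reliability, current_length):
    """How many items are needed to reach the desired reliability?"""
    step2 = desired_reliability * (1 - current_reliability)  # Steps 1-2
    step4 = current_reliability * (1 - desired_reliability)  # Steps 3-4
    n = step2 / step4                                        # Step 5: lengthening factor
    return n * current_length                                # factor x current test length

# Hypothetical example: a 50-item test with r = .80; how long must it be for r = .90?
print(length_needed(0.80, 0.90, 50))  # 112.5 -> about 113 items
```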
14
Q

caution for SBF

A
  • The items that are ADDED or ELIMINATED must not change what the test measures
  • The added items must be selected from the same domain, i.e., they must be EQUIVALENT in terms of measurement properties to the original items
  • The deleted items must be deleted RANDOMLY
15
Q

what is internal consistency

A
  • A group of items (i.e., scale) is homogeneous or internally consistent when all the items are measuring the same construct equally well
  • BUT items are usually not equally good measures of the construct, which introduces error
16
Q

how can we be sure that the total score accurately reflects standing on the construct, regardless of which specific items were passed or failed?

A
  • The total score is an accurate measure of the construct if the scale is internally consistent, i.e., the items are interchangeable as equally good measures of the construct
  • If the scale lacks internal consistency, then we can’t be sure that the total score on the scale always has the same meaning
17
Q

ways of assessing internal consistency

A

  • Inter-item correlation (will not use)
  • Item-total correlation (will not use)
  • Formulas: Kuder-Richardson and Cronbach’s Alpha
18
Q

four kinds of correlations

A

Pearson, point-biserial, phi coefficient, Spearman rho

19
Q

pearson correlation

A

continuous (interval, ratio) and continuous

20
Q

point biserial

A

continuous and binary (nominal)

21
Q

phi coefficient

A

binary and binary

22
Q

spearman rho

A

ranks (ordinal) and ranks
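A sketch pairing each coefficient with a function that can compute it (scipy is assumed to be available; the phi coefficient is simply a Pearson correlation computed on two binary variables). The data are invented:

```python
import numpy as np
from scipy import stats

continuous_1 = np.array([2.1, 3.4, 5.0, 4.2, 6.8, 7.1])
continuous_2 = np.array([1.9, 3.8, 4.6, 4.0, 7.2, 6.5])
binary_1 = np.array([0, 1, 1, 0, 1, 1])
binary_2 = np.array([0, 1, 0, 0, 1, 1])
ranks_1 = np.array([1, 2, 3, 4, 5, 6])
ranks_2 = np.array([2, 1, 3, 5, 4, 6])

print(stats.pearsonr(continuous_1, continuous_2))    # Pearson: continuous x continuous
print(stats.pointbiserialr(binary_1, continuous_1))  # point-biserial: binary x continuous
print(stats.pearsonr(binary_1, binary_2))            # phi: binary x binary (Pearson on 0/1 data)
print(stats.spearmanr(ranks_1, ranks_2))             # Spearman rho: ranks x ranks
```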

23
Q

inter-item correlation

A
  • The correlation between each pair of items on the scale is calculated
  • If there are n items on the test, then there are n(n-1)/2 unique correlations.
  • These are phi coefficients (binary X binary)

The mean of these correlations is a measure of the scale’s internal consistency
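A minimal sketch of this procedure on a small invented 0/1 response matrix; with binary items, each pairwise Pearson correlation is a phi coefficient:

```python
import numpy as np

# Hypothetical binary responses: rows = examinees, columns = items.
responses = np.array([
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
])

n_items = responses.shape[1]
corr_matrix = np.corrcoef(responses, rowvar=False)  # item x item (phi) correlation matrix

# Keep the n(n-1)/2 unique off-diagonal correlations and average them.
unique_corrs = corr_matrix[np.triu_indices(n_items, k=1)]
print(len(unique_corrs))              # n(n-1)/2 = 6 correlations for 4 items
print(round(unique_corrs.mean(), 3))  # mean inter-item correlation
```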

24
Q

item total correlation

A

Correlation between the item score (0, 1) and the total score on the test

  • If there are n items, there will be n item-total correlations
  • This is the point-biserial (binary X continuous)

The mean of these n item-total correlations is a measure of the scale’s internal consistency
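A sketch of the item-total procedure on the same kind of invented data; each item column's correlation with the total score is a point-biserial:

```python
import numpy as np

# Hypothetical binary responses: rows = examinees, columns = items.
responses = np.array([
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
])

total = responses.sum(axis=1)  # total score for each examinee

# One point-biserial correlation per item (item score 0/1 vs. total score).
item_total_corrs = [np.corrcoef(responses[:, i], total)[0, 1]
                    for i in range(responses.shape[1])]
print([round(r, 3) for r in item_total_corrs])
print(round(np.mean(item_total_corrs), 3))  # mean item-total correlation
```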

25
Q

when can kuder richardson be used

A

binary items only
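For reference, the usual Kuder-Richardson coefficient for binary items is KR-20, which uses each item's pass rate p and failure rate q = 1 - p. A sketch (the response matrix is invented):

```python
import numpy as np

def kr20(responses):
    """KR-20 for a binary (0/1) response matrix (rows = examinees, columns = items)."""
    k = responses.shape[1]                        # number of items
    p = responses.mean(axis=0)                    # proportion passing each item
    q = 1 - p                                     # proportion failing each item
    total_variance = responses.sum(axis=1).var()  # variance of the total scores
    return (k / (k - 1)) * (1 - (p * q).sum() / total_variance)

responses = np.array([
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
])
print(round(kr20(responses), 3))  # about 0.36 for this toy data
```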

26
Q

explain Cronbach’s alpha

A
  • Most common statistic for estimating the internal consistency of a scale
  • Values can range from 0 to 1
  • Larger values = greater internal consistency
  • Alpha is the average of all possible split-half correlations
  • What’s the minimum acceptable value?
  • Certainly not lower than 0.6
  • Low Alpha = consider constructing smaller subscales (factor analysis)
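A sketch of the coefficient itself, where the SD_i^2 terms are the individual item variances and SD_t^2 is the variance of the total scores: alpha = (k / (k - 1)) * (1 - sum(SD_i^2) / SD_t^2). The ratings below are invented:

```python
import numpy as np

def cronbach_alpha(responses):
    """Cronbach's alpha for a response matrix (rows = respondents, columns = items)."""
    k = responses.shape[1]                        # number of items
    item_variances = responses.var(axis=0)        # the SD_i^2 terms, one per item
    total_variance = responses.sum(axis=1).var()  # SD_t^2: variance of the total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 5-point ratings: rows = respondents, columns = items.
ratings = np.array([
    [4, 5, 4, 3],
    [2, 3, 2, 2],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
    [4, 4, 5, 4],
])
print(round(cronbach_alpha(ratings), 3))  # about 0.91 for this toy data
```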
27
Q

caution with cronbach’s alpha

A

Very large values for alpha should be viewed with suspicion

  • Items could be redundant, i.e., identical, sampling the same behavior using different words (e.g., “I am rarely sad”; “I am usually happy”)
  • In this case, the inter-item correlations will also be very high (approaching 1.00)

Redundancy is LESS likely if:

  • Alpha is large
  • Item-item correlations are only moderately large