Reliability and coefficient alpha Flashcards

1
Q

What is reliability?

A

The desired consistency or reproducibility of test scores

-does my test give me the same accurate measurement each time?

2
Q

Test score theory

A

Every person has a true score that we try to measure, but no test is free from error

X = T + e (X = observed score, T = true score, e = error)
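
A minimal sketch of the X = T + e model in Python; the true scores and the error SD of 5 are made-up values, not from the cards:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true scores for five people and random error on one testing occasion
T = np.array([70.0, 85.0, 60.0, 90.0, 75.0])     # true scores (assumed values)
e = rng.normal(loc=0.0, scale=5.0, size=T.size)  # random error with mean 0 (assumed SD of 5)

X = T + e  # observed score = true score + error
print(X)
```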

3
Q

Classical test theory: 4 assumptions

A
  1. each person has a true score we could obtain if there were no measurement error
  2. there is measurement error, but this error is random
  3. the true score of an individual doesn’t change with repeated applications of the same test, even though their observed score does
  4. the distribution of random errors will be the same for all people
4
Q

Classical test theory:

The domain sampling model

A

If we construct a test on something, we can’t ask all possible questions

  • So we only use a few test items (a sample)
  • Using fewer test items can introduce error
5
Q

The domain sampling model

formula

A

reliability = variance of observed scores on the shorter test / variance of true scores

As the sample of items gets larger, the estimate becomes more accurate

6
Q

Other things can affect performance…

A
  • the person might be tired on the day they take the test (giving different scores on different days)
7
Q

Types of reliability

A
  1. test-retest reliability
  2. parallel forms reliability
  3. internal consistency reliability (split-half, Kuder-Richardson 20, Cronbach’s alpha)
  4. inter-rater reliability
8
Q

Test-retest reliability

A
  • Give someone the same test at two different points in time.
  • If the scores are highly correlated, we have good test-retest reliability
  • Correlation between the 2 scores is also known as the coefficient of stability (see the sketch below)
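
A minimal sketch of estimating test-retest reliability; the six pairs of scores are hypothetical:

```python
import numpy as np

# Hypothetical scores for the same six people tested on two occasions
time_1 = np.array([12, 18, 25, 30, 22, 15])
time_2 = np.array([14, 17, 27, 29, 20, 16])

# Coefficient of stability: Pearson correlation between the two administrations
r_stability = np.corrcoef(time_1, time_2)[0, 1]
print(round(r_stability, 3))
```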
9
Q

Source of error in test-retest reliability

A

time sampling

10
Q

Issues with test-retest reliability

A

Can we use it when measuring things like mood, stress, etc.?

Won’t the person’s score increase the 2nd time because of practice effect?

What if we want to measure changes between 1st and 2nd administration?

Can the actual experience of being tested change the thing being tested?

What if some event happens in between the 1st and 2nd administration to change the thing being tested?

11
Q

Parallel forms reliability

A
  • Two different forms of the same test (i.e., measuring the same construct)
  • Correlation between the two forms is known as the coefficient of equivalence
12
Q

Parallel forms reliability- source of error

A

item sampling

13
Q

Parallel forms reliability- Ways to change the form of test

A
  • reword the question response alternatives
  • change the order of items (to reduce practice effects)
  • reword the questions
14
Q

Parallel forms reliability: issues

A

What if we give the different forms to people at two different times?

Do we give the different forms to the same people, or different people?

What if people work out how to answer the one form from doing the other form?

Difficult to generate a big enough item pool

15
Q

Internal consistency reliability

A

Do the different items within one test all measure the same thing to the same extent?
I.e., Are items within a single test highly correlated?
Split-half reliability
Coefficient alpha

16
Q

Internal consistency reliability: source of error

A

-internal consistency/reliability of one test administered on one occasion

17
Q

Split-half reliability

A

A test is split in half
Each half scored separately
Total scores for each half correlated
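
A minimal sketch of a split-half estimate; the 5-person × 6-item score matrix is made up, and the odd/even split is just one possible way of forming the halves:

```python
import numpy as np

# Hypothetical item scores: rows = people, columns = items
items = np.array([
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
])

# Split into odd and even items, score each half separately, then correlate the half totals
half_a = items[:, ::2].sum(axis=1)   # items 1, 3, 5
half_b = items[:, 1::2].sum(axis=1)  # items 2, 4, 6
r_hh = np.corrcoef(half_a, half_b)[0, 1]
print(round(r_hh, 3))
```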

18
Q

Split-half reliability- advantage

A
  • we only need 1 test (we don’t need 2 forms)
19
Q

Split-half reliability- disadvantage

A

-challenging to divide the test into equal halves

20
Q

SPEARMAN-BROWN CORRECTION

A

solves the problem of split-half tests having reduced reliability compared to the total test

r(sb) = 2r(hh) / (1 + r(hh))

r(sb) = predicted reliability of the full-length test
r(hh) = reliability of the half-test (correlation between the halves)
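
A small sketch of the correction; the half-test correlation of .60 is an assumed example value:

```python
def spearman_brown(r_hh: float) -> float:
    """Predicted reliability of the full-length test from the correlation between its two halves."""
    return (2 * r_hh) / (1 + r_hh)

# Example: if the two halves correlate at .60, the full-length test
# is predicted to have a reliability of 1.2 / 1.6 = .75
print(spearman_brown(0.60))
```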

21
Q

Split-half reliability: issues

A
  1. We have taken one test and split it into two tests that are half the length – won’t this underestimate reliability?

Example: We have a test of 20 items, split in half, and correlate each half
Similar to 2 tests of 10 items
The fewer items we have, the lower our reliability

  2. Won’t the correlation change each time depending on which items we put in each half?

Yes – we will get a different reliability coefficient for each different split

Ideally the halves should be equivalent

22
Q

Coefficient/Cronbach’s alpha

A

Takes the average of all possible split-half correlations for a test

alpha = kr / (1 + (k - 1)r)

k = number of items (indicators)
r = mean inter-item correlation
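
A minimal sketch of this formula; the 10 items and mean inter-item correlation of .30 are assumed example values:

```python
def alpha_from_mean_r(k: int, mean_r: float) -> float:
    """Coefficient alpha from the number of items (k) and the mean inter-item correlation."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)

# Example: 10 items with an average inter-item correlation of .30
print(round(alpha_from_mean_r(10, 0.30), 3))  # roughly .81
```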

23
Q

Cronbach’s alpha

-number of items

A

Rapid increase in internal consistency reliability from 2 to 10 items
Steady increase from 11 to 30
Tapers off after about 40 items

24
Q

Interpreting Cronbach’s alpha

A
  • .00 = no consistency in measurement
  • 1.00 = perfect consistency in measurement
  • .70 = exploratory research
  • .80 = basic research
  • .90 = applied scenarios
25
Q

Cronbach’s alpha can be affected by:

A

Multidimensionality
Bad test items
Number of items

26
Q

inter-rater reliability

A

Measures how consistently 2 or more raters/judges agree on rating something

27
Q

Cohen’s kappa & Fleiss’ kappa

A

Cohen: 2 raters/judges

Fleiss: more than 2 raters/judges

Ranges from -1 to 1 (1 = perfect agreement)
> .75 excellent agreement
.50-.75 satisfactory
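
A minimal sketch of Cohen’s kappa for two raters, computed from observed agreement versus chance agreement; the eight ratings are hypothetical:

```python
import numpy as np

def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa for two raters who rated the same set of cases."""
    rater_1, rater_2 = np.asarray(rater_1), np.asarray(rater_2)
    categories = np.union1d(rater_1, rater_2)

    p_observed = np.mean(rater_1 == rater_2)  # proportion of exact agreements
    # Chance agreement: product of the raters' marginal proportions, summed over categories
    p_chance = sum(np.mean(rater_1 == c) * np.mean(rater_2 == c) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical yes/no ratings of eight cases by two judges
judge_1 = ["yes", "no", "yes", "yes", "no", "no", "yes", "no"]
judge_2 = ["yes", "no", "no", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(judge_1, judge_2), 3))  # 0.5: satisfactory agreement
```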

28
Q

Source of error of split-half, alpha and KR-20?

A

internal consistency

29
Q

Source of error of kappa?

A

observer differences

30
Q

What is coefficient alpha?

A

-Coefficient (Cronbach’s) alpha is one way of calculating the reliability of a test
-Specifically, it tells us about the internal consistency of a test
-Coefficient alpha measures the error associated with each individual test item
As well as error associated with how well the test items fit together

31
Q
What is coefficient alpha? 
 according to Cortina (1993)
A

The mean of all split-half reliabilities

A measure of first-factor saturation

32
Q

Cronbach’s alpha vs standardised item alpha

A

Cronbach’s alpha
Deals with variance and covariance (the amount by which variables/items vary together, i.e., co-vary)

Standardized item alpha
Deals with the inter-item correlations (i.e., the correlation of each item with every other item) and the sum of the inter-item correlation matrix

33
Q

When do we use Cronbach’s alpha?

A
  • used when you want to use raw scores (the actual score)

- item variance affects the score, and Cronbach’s alpha takes this into account

34
Q

When do we use standardised item alpha?

A

Used when you want to use standard scores

Standard scores are scores that have been transformed by taking into account things like age, in particular
E.g., raw scores on any WAIS subtest are converted to standard scores with a mean of 10 and an SD of 3

35
Q

Cronbach’s alpha is calculated by…

A

using inter-item variances and covariances
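
A minimal sketch of one standard way of writing raw-score alpha, alpha = (k / (k - 1)) × (1 - sum of item variances / variance of total scores), where the total-score variance bundles the item variances and covariances together; the 5 × 4 response matrix is made up:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha from an item-score matrix (rows = people, columns = items)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)      # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)  # variance of the total scores
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses of five people to four items
scores = np.array([
    [4, 3, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
])
print(round(cronbach_alpha(scores), 3))
```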

36
Q

Variance

A

a measure of how scores vary

  • usually taken as a measure of error
  • Because if scores differ a lot between people, the item or test is not very accurate – less accuracy means more error
37
Q

Covariance

A

a measure of how much scores on items go together
E.g., if a person with a high score on Item 1 of a test also gets a high score on Item 2 of a test
Then items 1 and 2 of the test have a lot of covariance (shared variance)
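
A minimal sketch computing a variance and a covariance for two hypothetical items:

```python
import numpy as np

# Hypothetical scores on two items for five people
item_1 = np.array([4, 2, 5, 3, 1])
item_2 = np.array([5, 2, 4, 3, 2])

variance_1 = item_1.var(ddof=1)            # how much scores on item 1 spread out across people
covariance = np.cov(item_1, item_2)[0, 1]  # how much the two items vary together
print(variance_1, covariance)
```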