Stats - reliability Flashcards

1
Q

Name 5 types of reliability

A

1) Test-Retest
2) Inter-Rater
3) Internal consistency
4) Parallel forms
5) Split-Half

SPIT-I

2
Q

What type of reliability?

Assesses the stability of a measure over time. The same test is administered to the same group at two different points in time, and the scores are compared. High test-retest reliability indicates that the measure produces consistent results across time intervals. For instance, an anxiety questionnaire should yield similar scores if administered to the same patient over short, stable periods.

A

Test-Retest
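
To illustrate, a minimal sketch in Python (NumPy assumed; the scores are invented for illustration): test-retest reliability is commonly estimated as the correlation between the two administrations.

import numpy as np

# Hypothetical anxiety questionnaire scores for the same six patients,
# administered twice over a short, stable interval (illustrative values only).
time1 = np.array([12, 18, 25, 9, 14, 20])
time2 = np.array([13, 17, 24, 10, 15, 19])

# Test-retest reliability estimated as the Pearson correlation between administrations.
r = np.corrcoef(time1, time2)[0, 1]
print(f"Test-retest reliability (Pearson r): {r:.2f}")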

3
Q

What type of reliability?
Measures the consistency of test scores between different raters or observers. It is particularly relevant in clinical settings where different clinicians might assess the same patient. For example, a high inter-rater reliability in a structured psychiatric interview indicates that different clinicians are likely to arrive at similar diagnoses for the same patient.

A

Inter-rater

4
Q

What type of reliability?
Assesses how consistently items within a test measure the same construct. It is often evaluated using statistical measures like Cronbach’s alpha. For example, in a depression inventory, items assessing sadness, fatigue, and hopelessness should all correlate well with each other if they are truly measuring the same underlying construct of depression.

A

Internal Consistency
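
A worked sketch of Cronbach's alpha in Python (NumPy assumed; the item scores are invented for illustration): alpha = k/(k-1) * (1 - sum of item variances / variance of total score).

import numpy as np

# Hypothetical responses to a four-item depression inventory
# (rows = respondents, columns = items); values are invented.
items = np.array([
    [3, 2, 3, 3],
    [1, 1, 2, 1],
    [4, 3, 4, 4],
    [2, 2, 1, 2],
    [3, 3, 3, 2],
])

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of respondents' total scores

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total-score variance)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha: {alpha:.2f}")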

5
Q

What type of reliability?
Determined by administering two equivalent forms of a test to the same group. If both forms yield similar results, the test is considered reliable. This is useful for assessing constructs that might be influenced by practice effects, such as cognitive function, where exposure to the same items may lead to improved scores on retesting.

A

Parallel forms

6
Q

What type of reliability?
Assesses the consistency of test scores by dividing the test into two halves (e.g., odd and even items) and comparing scores on each half. High split-half reliability indicates that the test measures the construct consistently throughout its items. This method is particularly useful in long assessments, where item consistency can be confirmed without requiring repeated administrations.

A

Split-Half
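
A minimal sketch in Python (simulated, purely illustrative responses): split the items into odd and even halves, correlate the half-scores, and apply the Spearman-Brown correction to estimate reliability of the full-length test.

import numpy as np

# Simulate 30 respondents answering 10 items that all reflect one
# underlying trait (illustrative data only).
rng = np.random.default_rng(0)
trait = rng.normal(size=30)
items = trait[:, None] + rng.normal(scale=0.8, size=(30, 10))

# Score the odd-numbered and even-numbered items separately.
odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Correlate the halves, then apply the Spearman-Brown correction,
# r_full = 2r / (1 + r), to estimate full-test reliability.
r_half = np.corrcoef(odd_half, even_half)[0, 1]
r_full = (2 * r_half) / (1 + r_half)
print(f"Split-half r: {r_half:.2f}; Spearman-Brown corrected: {r_full:.2f}")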

7
Q

What are 4 things that can affect reliability?

A

Measurement error: Inconsistent results can arise from errors in measurement, such as poorly calibrated tools or unclear test instructions. Measurement error reduces reliability by introducing variability unrelated to the construct being measured.

Sample variability: Highly heterogeneous samples may show lower reliability due to individual differences affecting scores, while highly homogeneous samples may exaggerate reliability as they yield similar responses.

Testing conditions: Changes in testing conditions, such as room environment, time of day, or examiner behaviour, can affect reliability. Standardised testing conditions are essential to maintain reliability in psychiatric and psychological assessments.

Practice effects: For assessments where familiarity with the test content can influence results, repeated administration may lead to practice effects that artificially inflate reliability. This is particularly relevant for cognitive and neuropsychological testing.

MPST

8
Q

What does this describe, in relation to things that can affect reliability?

Inconsistent results can arise from errors in measurement, such as poorly calibrated tools or unclear test instructions. Measurement error reduces reliability by introducing variability unrelated to the construct being measured.

A

Measurement error

9
Q

What does this describe, in relation to things that can affect reliability?

Highly heterogeneous samples may show lower reliability due to individual differences affecting scores, while highly homogeneous samples may exaggerate reliability as they yield similar responses.

A

Sample Variability

10
Q

What does this describe, in relation to things that can affect reliability?

Changes such as room environment, time of day, or examiner behaviour can affect reliability. Standardised testing conditions are essential to maintain reliability in psychiatric and psychological assessments.

A

Testing Conditions

11
Q

What does this describe, in relation to things that can affect reliability?

For assessments where familiarity with the test content can influence results, repeated administration may lead to practice effects that artificially inflate reliability. This is particularly relevant for cognitive and neuropsychological testing.

A

Practice Effects

12
Q

What is Kappa?
What range can it take?
What type of data can it be used on?

A

The Kappa statistic (Cohen’s Kappa coefficient) is a widely used measure to assess the magnitude of agreement between two independent observers or raters, accounting for agreement occurring by chance.

Kappa values range from -1 to 1:

1 indicates perfect agreement.
0 indicates agreement equivalent to chance.
-1 indicates perfect disagreement.

Kappa is used for categorical data (e.g., diagnostic classifications). Cohen’s Kappa assesses agreement between two independent observers; extensions such as Fleiss’ Kappa cover situations with more than two observers evaluating the same phenomenon.
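
A worked sketch in Python (NumPy assumed; the agreement counts are hypothetical): Kappa = (observed agreement - chance agreement) / (1 - chance agreement).

import numpy as np

# Hypothetical 2x2 agreement table for two clinicians independently rating
# the same 100 patients (rows = rater A, columns = rater B).
table = np.array([[40, 10],
                  [ 5, 45]])

n = table.sum()
p_observed = np.trace(table) / n  # proportion of cases where the raters agree
# Chance agreement: product of each rater's marginal proportions, summed over categories.
p_expected = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2

# Cohen's kappa: observed agreement corrected for chance agreement.
kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"Cohen's kappa: {kappa:.2f}")  # 0.70 for these illustrative counts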

13
Q

How are Kappa scores interpreted?

A

Interpretation of Kappa scores: while there are no universal cut-offs, the following guidelines are commonly used, based on the degree of agreement:

< 0: Poor agreement (less agreement than expected by chance).
0.01 – 0.20: Slight agreement.
0.21 – 0.40: Fair agreement.
0.41 – 0.60: Moderate agreement.
0.61 – 0.80: Substantial agreement.
0.81 – 1.00: Almost perfect (near-complete) agreement.
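
One way to encode these bands is a small helper function, sketched below in Python (the labels follow the conventional bands above; the cut-offs remain conventions, not universal rules).

def interpret_kappa(kappa: float) -> str:
    # Descriptive bands as listed above (Landis & Koch-style conventions).
    if kappa < 0:
        return "Poor agreement (less than chance)"
    if kappa <= 0.20:
        return "Slight agreement"
    if kappa <= 0.40:
        return "Fair agreement"
    if kappa <= 0.60:
        return "Moderate agreement"
    if kappa <= 0.80:
        return "Substantial agreement"
    return "Almost perfect agreement"

print(interpret_kappa(0.70))  # Substantial agreement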

14
Q

What are 3 limitations to Kappa?

A

Kappa is sensitive to prevalence. When the prevalence of a condition is very high or very low, Kappa may underestimate or overestimate agreement (see the sketch below).

Unequal distributions (skew) of categories (e.g., one diagnosis much more common than others) can distort Kappa values.

It requires independent observations; Kappa cannot account for systematic bias between observers.
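
The prevalence sensitivity is easy to demonstrate with a short Python sketch (invented counts): two tables with identical raw agreement can yield very different Kappa values when one category dominates.

import numpy as np

def cohen_kappa(table):
    # Kappa from a square agreement table: (p_observed - p_expected) / (1 - p_expected).
    n = table.sum()
    p_o = np.trace(table) / n
    p_e = (table.sum(axis=1) * table.sum(axis=0)).sum() / n**2
    return (p_o - p_e) / (1 - p_e)

# Both tables show 90% raw agreement between two raters on 100 cases.
balanced = np.array([[45, 5],
                     [5, 45]])   # condition present in about half the sample
skewed = np.array([[89, 5],
                   [5, 1]])      # condition present in nearly all of the sample

print(f"Balanced prevalence: kappa = {cohen_kappa(balanced):.2f}")  # about 0.80
print(f"Extreme prevalence:  kappa = {cohen_kappa(skewed):.2f}")    # about 0.11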
