Week 11 and 12: Reliability and Validity Flashcards

1
Q

Define reliability

A

Consistency in measurement

2
Q

List 3 ways consistency of scores can be assessed when re-examining the same people

A
  • the same test on different occasions
  • different set of items measuring the same thing
  • different conditions of testing
3
Q

What is standard error of measurement?

A

An estimate of the amount of error usually attached to an examinee’s obtained score
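
A minimal Python sketch of the usual formula, SEM = SD * sqrt(1 - reliability), with hypothetical values:

    import math

    sd = 15        # test standard deviation (hypothetical, IQ-like scale)
    r_xx = 0.90    # reliability coefficient (hypothetical)
    sem = sd * math.sqrt(1 - r_xx)   # SEM = SD * sqrt(1 - reliability)
    print(round(sem, 2))             # 4.74 score points of typical error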

4
Q

What is a confidence interval?

A

A range of values within which you have a specified level of confidence (e.g. 95%) that the true value, such as the population mean, lies
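
A minimal sketch (hypothetical scores) of a 95% confidence interval around an obtained test score, using the SEM from the previous card:

    import math

    obtained = 110                   # examinee's obtained score (hypothetical)
    sd, r_xx = 15, 0.90              # test SD and reliability (hypothetical)
    sem = sd * math.sqrt(1 - r_xx)   # standard error of measurement
    lower = obtained - 1.96 * sem    # z = 1.96 gives a 95% interval
    upper = obtained + 1.96 * sem
    print(f"95% CI: {lower:.1f} to {upper:.1f}")   # about 100.7 to 119.3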

5
Q

What are some sources of random error?

A
  • test construction
  • test administration
  • test scoring and interpretation
6
Q

List the ways of testing reliability

A
  • Cronbach’s alpha
  • test-retest
  • split-half
  • item-total correlations
7
Q

How big should a reliability coefficient be?

A

Above .8, preferably .9

8
Q

What does Cronbach’s alpha measure?

A

The average of all possible split-half correlations among the test items; an index of internal consistency
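
A small hand-rolled sketch of computing alpha from an item-response matrix (the function name and data layout are assumptions):

    import numpy as np

    def cronbach_alpha(items):
        """items: 2-D array, rows = respondents, columns = test items."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                         # number of items
        item_vars = items.var(axis=0, ddof=1)      # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)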

9
Q

What is split-half reliability?

A

Correlating scores on one half of the test items with scores on the other half
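
A hedged sketch of an odd-even split, with the standard Spearman-Brown correction (not named on the card) to adjust the half-test correlation back up to full test length:

    import numpy as np
    from scipy.stats import pearsonr

    def split_half(items):
        """Odd-even split-half reliability. items: rows = people, cols = items."""
        items = np.asarray(items, dtype=float)
        odd = items[:, 0::2].sum(axis=1)     # total score on odd-numbered items
        even = items[:, 1::2].sum(axis=1)    # total score on even-numbered items
        r_half, _ = pearsonr(odd, even)      # correlation between the two halves
        return (2 * r_half) / (1 + r_half)   # Spearman-Brown full-length estimate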

10
Q

What are item-total correlations?

A

Correlating each individual item with the total score on the rest of the scale
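
A minimal sketch of corrected item-total correlations, each item against the sum of the remaining items (function name and layout are assumptions):

    import numpy as np

    def item_total_correlations(items):
        """items: 2-D array, rows = respondents, columns = items."""
        items = np.asarray(items, dtype=float)
        corrs = []
        for j in range(items.shape[1]):
            rest = np.delete(items, j, axis=1).sum(axis=1)  # total without item j
            corrs.append(np.corrcoef(items[:, j], rest)[0, 1])
        return corrs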

11
Q

What is test-retest reliability?

A
  • correlation between scores from two testing occasions
  • stability over time
  • uses Pearson’s r
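
For illustration, Pearson's r via scipy on two administrations of the same test (scores are made up):

    from scipy.stats import pearsonr

    time1 = [12, 15, 9, 20, 14, 18]   # first administration (hypothetical)
    time2 = [11, 16, 10, 19, 15, 17]  # same people re-tested later
    r, p = pearsonr(time1, time2)
    print(round(r, 2))                # 0.97: scores are stable over time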
12
Q

What are some problems with test-retest reliability?

A
  • affected by factors associated with how the test is administered on each occasion
  • carryover effects: remembering answers, practice effects
  • only meaningful for constructs expected to be stable over the interval
13
Q

Internal consistency

A

The correlations between different items on the same test, or with the entire test

14
Q

Kuder-Richardson reliability and coefficient alpha

A
  • based on the intercorrelations among all comparable parts of the test
15
Q

Kuder-Richardson formula 20

A
  • calculated from the proportion of people who pass and fail each item and the variance of the total test scores
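
A hedged sketch of KR-20 for dichotomously scored (0/1) items; the sample-variance convention and data layout are assumptions:

    import numpy as np

    def kr20(items):
        """items: 2-D 0/1 array, rows = examinees, columns = items (1 = pass)."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        p = items.mean(axis=0)                     # proportion passing each item
        q = 1 - p                                  # proportion failing each item
        var_total = items.sum(axis=1).var(ddof=1)  # variance of total test scores
        return (k / (k - 1)) * (1 - (p * q).sum() / var_total)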
16
Q

Inter-rater reliability

A
  • agreement between multiple raters
  • measured using a kappa statistic

17
Q

Kappa statistic

A

Measures inter-rater agreement for qualitative (categorical) items
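
For illustration, scikit-learn's Cohen's kappa for two raters (ratings below are made up):

    from sklearn.metrics import cohen_kappa_score

    rater_a = ["yes", "yes", "no", "yes", "no", "no"]   # hypothetical ratings
    rater_b = ["yes", "no", "no", "yes", "no", "yes"]
    print(cohen_kappa_score(rater_a, rater_b))   # about 0.33: agreement beyond chance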

18
Q

Parallel-forms reliability

A

Equivalent forms of the same test are administered to the same group

19
Q

Types of reliability

A
  • inter-rater
  • test-retest
  • split half
  • parallel forms
20
Q

Validity

A

The extent to which a test measures what it is supposed to measure

21
Q

What are the three types of validity?

A
  • content
  • criterion related
  • construct
22
Q

Content validity

A

Degree to which the content (items) represents behaviours/characteristics associated with that trait

23
Q

What are the two types of criterion validity?

A

Predictive and concurrent

24
Q

What is criterion validity?

A

The relationship between test scores and some type of criterion or outcome, such as ratings, classifications or other test scores

25
Q

Concurrent validity

A

Refers to whether the test scores are related to some CURRENTLY AVAILABLE criterion measure

26
Q

Predictive validity

A

The correlation between a test and a criterion obtained at a FUTURE time e.g. ATAR scores predicting success at uni

27
Q

Validity coefficient

A

The correlation between test scores and some criterion

28
Q

What are the two types of construct validity?

A

Convergent and discriminant

29
Q

Construct validity

A

The extent to which a test measures a psychological construct or trait

30
Q

Convergent validity

A

Convergent validity takes two measures that are supposed to be measuring the same construct and shows that they are related.

31
Q

Discriminant validity

A

Discriminant validity shows that two measures that are not supposed to be related are, in fact, unrelated.
32
Q

List the types of reliability

A
  • test-retest
  • internal
  • inter-rater

33
Q

In test-retest reliability, what are some sources that might affect a result?

A
  • time
  • place
  • mood
  • temperature
  • noise

34
Q

What are some core issues with content validity?

A
  • the appropriateness of the questions and domain relevance
  • comprehensiveness
  • level of mastery assessed

35
Q

What are some procedures to ensure content validity?

A
  • specialist panels to map the content domain
  • accurate test specifications
  • communication of validation procedures in the test manual

36
Q

What are some applications of content validity?

A
  • achievement and occupational tests
  • usually not appropriate for personality or aptitude tests
37
Q

What is standard error?

A

The standard deviation of the sampling distribution of a statistic (e.g. the sample mean); it estimates how far the sample statistic typically falls from the population value

38
Q

Do we want a small or large SEM?

A

Small, because a larger SEM reflects lower reliability and widens confidence intervals

39
Q

Which confidence level is most common?

A

95% (z = 1.96)
40
Q

Why are confidence intervals better than p-values?

A
  • the conventional p-value cutoff (.05) is an arbitrary number
  • p-values are biased by large sample sizes
  • p-values don’t pick up on small effects that recur consistently

41
Q

When is a confidence interval result significant?

A

When the confidence interval doesn’t overlap 0

42
Q

Why are effect sizes beneficial?

A

They address significant effects that don’t mean much in real life e.g. does someone scoring .5 higher on a depression scale really have a worse time?
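
As a rough illustration of one common effect size, Cohen's d (the standardised mean difference; my choice of example, not named on the card):

    import numpy as np

    def cohens_d(a, b):
        """Standardised mean difference between two groups, using pooled SD."""
        a, b = np.asarray(a, float), np.asarray(b, float)
        pooled_sd = np.sqrt(((len(a) - 1) * a.var(ddof=1) +
                             (len(b) - 1) * b.var(ddof=1)) / (len(a) + len(b) - 2))
        return (a.mean() - b.mean()) / pooled_sd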
43
Q

How can test scoring and interpretation be a source of random error (reliability)?

A

Because projective tests are all answered differently, there is a large role for inter-rater disagreement e.g. TAT, Rorschach

44
Q

What is the domain sampling model?

A

Test items represent a sample of all possible items in the domain

45
Q

What is the reliability ratio?

A

The variance of the true score (the score on the long, whole-domain test) divided by the variance of the observed score on the test
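
A tiny worked example with hypothetical variance estimates:

    # reliability = true-score variance / observed-score variance
    var_true, var_observed = 8.0, 10.0   # hypothetical estimates
    reliability = var_true / var_observed
    print(reliability)                   # 0.8: 80% of score variance is "true"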
46
Q

How many items should you have for optimal reliability?

A

10

47
Q

List some examples of concurrent validity

A
  • depression scale and clinical interview
  • 2 measures taken at a similar time
  • IQ and exam scores

48
Q

To be concurrently valid, what kind of assessment should the measure be correlated with?

A

The gold standard

49
Q

What kind of test do you use for predictive validity?

A

Multivariate ANOVA

50
Q

What test do you use for convergent validity?

A

Factor analysis

51
Q

The lower the reliability, the...

A

Higher the error in a test

52
Q

The larger the standard error of measurement, the...

A

Less precise the measurements and the larger the confidence intervals