Week 11 and 12: Reliability and Validity Flashcards

1
Q

Define reliability

A

Consistency in measurement

2
Q

List 3 ways that consistency of scores occurs when re-examining the same people

A
  • the same test on different occasions
  • different set of items measuring the same thing
  • different conditions of testing
3
Q

What is standard error of measurement?

A

An estimate of the amount of error usually attached to an examinee’s obtained score
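The card does not give a formula, but a common classical test theory expression for the SEM is SD × sqrt(1 − r), where SD is the standard deviation of test scores and r is the reliability coefficient. A minimal sketch under that assumption (the function name and example values are illustrative only):

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical test with SD = 15 and reliability = .91
sem = standard_error_of_measurement(sd=15.0, reliability=0.91)
print(round(sem, 2))
```

With these made-up values the SEM comes out at roughly 4.5 score points; a more reliable test would give a smaller value.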

4
Q

What is a confidence interval?

A

A range of values around an estimate (e.g. a sample mean or an observed score) that is expected to contain the true value at a stated level of confidence, such as 95%
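In a testing context the interval is often built around one person's observed score as score ± z × SEM. The sketch below assumes that form, with SEM = SD × sqrt(1 − r) and z = 1.96 for 95%; the function name and numbers are hypothetical.

```python
import math


def score_confidence_interval(observed, sd, reliability, z=1.96):
    """Return (low, high) bounds: observed +/- z * SEM."""
    sem = sd * math.sqrt(1.0 - reliability)  # standard error of measurement
    margin = z * sem
    return observed - margin, observed + margin


# Hypothetical examinee scoring 100 on a test with SD = 15, reliability = .91
low, high = score_confidence_interval(observed=100.0, sd=15.0, reliability=0.91)
print(round(low, 2), round(high, 2))
```

A larger SEM widens the margin, so the same observed score carries less precise information about the true score.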

5
Q

What are some sources of random error?

A
  • test construction
  • test administration
  • test scoring and interpretation
6
Q

List the ways of testing reliability

A
  • Cronbach’s alpha
  • test-retest
  • split-half
  • item-total correlations
7
Q

How big should a reliability coefficient be?

A

Above .8, preferably .9

8
Q

What does Cronbach’s alpha measure?

A

Internal consistency: how closely the items in a test relate to one another, based on the correlations between all possible pairs of items
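One standard way to compute alpha, not stated on the card, is k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. A rough sketch assuming that formula, with made-up data (rows are respondents, columns are items):

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha from a list of respondent rows (one value per item)."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical responses from five people to three items
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [4, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Items that rise and fall together across respondents push alpha towards 1; unrelated items pull it towards 0.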

9
Q

What is split-half reliability?

A

Correlating scores on one half of the items with scores on the other half
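As a sketch of the idea, the code below splits items into odd and even positions (one of several possible splits) and correlates the two half scores. The Spearman-Brown step, 2r/(1 + r), is an addition not mentioned on the card; it is commonly used to estimate full-length reliability from the half-test correlation. All names and data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def split_half_reliability(scores):
    """Return (half-test r, Spearman-Brown corrected r) for an odd/even split."""
    odd_half = [sum(row[0::2]) for row in scores]   # items 1, 3, ...
    even_half = [sum(row[1::2]) for row in scores]  # items 2, 4, ...
    r = pearson_r(odd_half, even_half)
    return r, (2 * r) / (1 + r)


# Hypothetical responses from five people to four items
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
    [4, 5, 4, 4],
]
r_half, corrected = split_half_reliability(responses)
print(round(r_half, 2), round(corrected, 2))
```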

10
Q

What are item-total correlations?

A

The correlation between each item and the total score on the rest of the scale
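A small sketch of this comparison, assuming the "corrected" version in which each item is correlated with the sum of the remaining items (so the item is not correlated with itself). Function names and data are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def item_total_correlations(scores):
    """For each item, correlate it with the total of all other items."""
    results = []
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        rest = [sum(row) - row[i] for row in scores]  # total excluding item i
        results.append(pearson_r(item, rest))
    return results


# Hypothetical responses from five people to three items
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [4, 5, 4],
]
correlations = item_total_correlations(responses)
print([round(r, 2) for r in correlations])
```

An item with a low or negative value here is a candidate for revision, since it does not move with the rest of the scale.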

11
Q

What is test-retest reliability?

A
  • correlation between scores from two testing occasions
  • stability over time
  • uses Pearson’s r
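The points above can be sketched as a Pearson correlation between the same people's scores on two occasions. The scores below are invented for illustration, and the helper is written out by hand rather than taken from a library.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists (non-constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical scores for five people tested twice, some weeks apart
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 16, 17, 21]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```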
12
Q

What are some problems with test-retest reliability?

A
  • affected by factors associated with how the test is administered on each occasion
  • carryover effects: remembering answers, practice effects
  • should only be used for meaningful data
13
Q

Internal consistency

A

The correlations between different items on the same test, or with the entire test

14
Q

Kuder-Richardson reliability and coefficient alpha

A
  • based on the intercorrelations among all comparable parts of the test
15
Q

Kuder-Richardson formula 20 (KR-20)

A
  • calculated by the proportion of people who pass and fail each item and the variance of the test scores
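A sketch of the calculation described above, assuming the usual form KR-20 = k/(k − 1) × (1 − Σpq / variance of total scores), where p is the proportion passing an item and q = 1 − p. Population variance is used here so that it matches the p × q item variances; the data are made up (1 = pass, 0 = fail).

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items; rows are people, columns are items."""
    n_people = len(responses)
    k = len(responses[0])
    pq_sum = 0.0
    for item in zip(*responses):
        p = sum(item) / n_people   # proportion who passed the item
        pq_sum += p * (1 - p)      # p * q for that item
    total_var = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - pq_sum / total_var)


# Hypothetical pass/fail results for five people on four items
results = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr20_value = kr20(results)
print(round(kr20_value, 2))
```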
16
Q

Inter-rater reliability

A
  • agreement among multiple raters
  • measured using a kappa statistic

17
Q

Kappa statistic

A

Measures inter-rater agreement for qualitative (categorical) items
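The card does not say which kappa is meant. The sketch below assumes Cohen's kappa for two raters, (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance. The ratings are invented.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    # Observed agreement: proportion of cases given the same label
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[c] / n) * (counts_b[c] / n) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical yes/no judgements by two raters on ten cases
rater_a = ["y", "y", "n", "n", "y", "n", "y", "n", "y", "y"]
rater_b = ["y", "n", "n", "n", "y", "n", "y", "y", "y", "y"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 cases, but because some of that agreement is expected by chance, kappa is noticeably lower than .80.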

18
Q

Parallel-forms reliability

A

Equivalent forms of the same test are administered to the same group

19
Q

Types of reliability

A
  • inter-rater
  • test-retest
  • split-half
  • parallel forms
20
Q

Validity

A

The extent to which a test measures what it is supposed to measure

21
Q

What are the three types of validity?

A
  • content
  • criterion-related
  • construct
22
Q

Content validity

A

Degree to which the content (items) represents the behaviour/characteristics associated with that trait

23
Q

What are the two types of criterion validity?

A

Predictive and concurrent

24
Q

What is criterion validity?

A

The relationship between test scores and some type of criterion or outcome, such as ratings, classifications or other test scores

25
Q

Concurrent validity

A

Refers to whether the test scores are related to some CURRENTLY AVAILABLE criterion measure

26
Q

Predictive validity

A

The correlation between a test and criterion obtained at a FUTURE time e.g. ATAR scores predicting success at uni

27
Q

Validity coefficient

A

Correlation between test scores and some criterion

28
Q

What are the two types of construct validity?

A

Convergent and discriminant

29
Q

Construct validity

A

The extent to which a test measures a psychological construct or trait

30
Q

Convergent validity

A

Convergent validity takes two measures that are supposed to be measuring the same construct and shows that they are related.

31
Q

Discriminant validity

A

Discriminant validity shows that two measures that are not supposed to be related are, in fact, unrelated.

32
Q

List the types of reliability

A
  • test-retest
  • internal consistency
  • inter-rater
33
Q

In test-retest reliability, what are some sources that might affect a result?

A
  • time
  • place
  • mood
  • temperature
  • noise
34
Q

What are some core issues with content validity?

A
  • the appropriateness of the questions and domain relevance
  • comprehensiveness
  • level of mastery assessed
35
Q

What are some procedures to ensure content validity?

A
  • specialist panels to map content domain
  • accurate test specifications
  • communication of validation procedures in test manual
36
Q

What are some applications of content validity?

A
  • achievement and occupational tests
  • usually not appropriate for personality or aptitude tests

37
Q

What is standard error?

A

The standard deviation of the sampling distribution of a statistic, i.e. how much an estimate such as the mean varies from sample to sample

38
Q

Do we want a small or large SEM?

A

Small, because a larger SEM means lower reliability and wider confidence intervals

39
Q

Which confidence level is most common?

A

z = 1.96 (95%)

40
Q

Why are confidence intervals better than p-values?

A
  • the p-value cut-off (e.g. .05) is arbitrary
  • p-values are sensitive to sample size: with large samples, even trivial effects reach significance
  • p-values don’t pick up on small effects that recur consistently
41
Q

When is a confidence interval result significant?

A

When the confidence interval does not include 0

42
Q

Why are effect sizes beneficial?

A

They address significant effects that don’t mean much in real life, e.g. does someone scoring .5 higher on depression really have a worse time?

43
Q

How can test scoring and interpretation be a source of random error (reliability)?

A

Because responses to projective tests (e.g. the TAT, Rorschach) are open-ended, scoring depends on rater judgement, which leaves a lot of room for inter-rater disagreement

44
Q

What is the domain sampling model?

A

Test items represent a sample of all possible items

45
Q

What is the reliability ratio?

A

True-score variance (the long-run score over the whole item domain) divided by the variance of observed scores on the test

46
Q

How many items should you have for optimal reliability?

A

10

47
Q

List some examples of concurrent validity

A
  • depression scale and clinical interview
  • 2 measures at a similar time
  • IQ and exam scores
48
Q

To be concurrently valid, what kind of assessment should the measure be correlated with?

A

The gold standard

49
Q

What kind of test do you use for predictive validity?

A

Multivariate ANOVA

50
Q

What test do you use for convergent validity?

A

Factor analysis

51
Q

The lower the reliability, the…

A

The higher the error in the test

52
Q

The larger the standard error of measurement, the…

A

The less precise the measurement and the wider the confidence intervals