Lecture 17 Flashcards

1
Q

What do measures attempt to quantify? Example?

A

Measures attempt to quantify the “true value” of a latent (or hidden) psychological construct

“How extroverted are you?”

The true value is the way the construct would present itself to us if we assessed it perfectly: the measurement would show exactly what it is.

2
Q

Are humans stochastic?

A

yes

3
Q

What does stochastic mean?

A

Humans are functionally stochastic: unpredictable in ways we don’t understand.

4
Q

Will measures ever be perfectly reliable?

A

We humans are not perfectly reliable
Thus, measures will never be perfectly reliable

5
Q

what is measurement error?

A

Measurement error is the difference between the value we observe and the true value.

6
Q

What is the best we can do?

A

The best we can do is estimate psychological constructs.

An estimate combines a guess about the true value with measurement error. We will never know what the mixture is: how much is true value and how much is measurement error.

7
Q

what is the goal around measurement error?

A

Minimize measurement error on average and, hopefully, maximize true value on average. Because this works only on average, it limits what we can say about any one individual.
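The idea behind this goal (observed score = true value + error) can be sketched in a small simulation. The true value, error spread, and number of occasions below are made-up illustration values, not figures from the lecture:

```python
import random

random.seed(42)  # make the sketch reproducible

TRUE_VALUE = 7.5   # hypothetical true value of the construct
ERROR_SD = 1.0     # assumed spread of measurement error

def measure():
    """One measurement occasion: observed = true value + random error."""
    return TRUE_VALUE + random.gauss(0, ERROR_SD)

# Any single observation can land far from the true value,
# which is why one score says little about the individual...
one_score = measure()

# ...but errors cancel out on average, so the mean of many
# occasions drifts toward the true value.
scores = [measure() for _ in range(10_000)]
estimate = sum(scores) / len(scores)
print(round(estimate, 1))
```

With enough occasions the average estimate sits close to the true value even though individual scores scatter widely, which is exactly the "minimize error on average" goal.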

8
Q

What example was provided for measurement error?

A

Assume a person knows for certain that their true value is 7.5 out of 10, but the scale you provided doesn’t allow half numbers. The measurement instrument itself is forcing the participant into measurement error: we are already getting half a unit of measurement error before anything else goes wrong.

9
Q

give examples of sources of measurement error.

A

“Should I select 7 or 8?”
Response: 7

“Oops, flipped the scale options”
Response: 2 (but meant 8)
Many people also flip the scale, thinking the small numbers are the better end.

“Recently hanging out with gregarious friends; I’m comparatively less extroverted”
Response: 6

“Does extroverted mean extraordinary?”
Response: 3

“I’m bored, choose middle”
Response: 5

9
Q

Whats the intelligence test example of measurement error?

A

Intelligence test (IQ)
True value: 100

Sources of measurement error:

Unusually stressful day (score will probably be lower than true value)

Oops, had too much coffee (score will probably be lower than true value)

Oops, had too little coffee (score will probably be lower than true value)

Had to guess on 4 items (any good IQ test requires that you guess on some questions; otherwise we would have a ceiling effect)

Guessed all 4 correct!! Score = 115

Guessed all 4 wrong!!! Score = 85

Guessed 2 correct, 2 wrong. Score = 100

10
Q

Why is estimation fundamentally limited, and what can we do?

A

We cannot remove all measurement error
Thus, we never obtain a person’s true value

Best we can do is estimate a person’s true value
While minimizing measurement error
While trying to measure only the intended psychological construct and nothing else

(Our intelligence test shouldn’t correlate with cultural background, first language, gender, etc. If it correlates with things we don’t think it should, that reflects measurement error.)

11
Q

Are measurement instruments created equal?

A

Measurement instruments are not created equal! We revise our instruments over time.

12
Q

Do you need reliability before you get validity?

A

yes

13
Q

What is reliability?

A

Reliability: How consistent is a measurement tool?

Is my score similar each time?
Try the color test: www.colorquiz.com

14
Q

If we can assume that the true value isn’t changing what should happen?

A

If we can assume that the true value isn’t changing, your score should be similar over time.

15
Q

What is validity?

A

Validity: Does the tool measure the psychological construct it claims to?

Is my score representative of something meaningful?
Color test: Does my color score reflect my personality?

16
Q

What are the 2 measurement goals?

A
  1. be consistent (reliability)
  2. hit the target (validity)
17
Q

What is the visual analogy of the true value?

A

Visual metaphor: a dartboard, where the bullseye is the true value and each lightning bolt (dart) is one measurement occasion.

Tests can be reliable yet still not hit the right spot.

If our darts land all over the board, we know they are not at the true value most of the time. We need consistency: the darts need to land in the same place multiple times.

18
Q

What questions do we need to ask ourselves about measurement error?

A

These questions derive from the theories we have about the characteristic and its true value. Once we have an idea of what the true value should be, we need to make sure that the scale allows people to express it.

What is the nature of the true value?
Does my measurement instrument allow people to express their true value?
If not, estimates will appear to fluctuate even though true value is stable

19
Q

How does the Myers-Briggs Type Indicator (16 personality types test) encourage measurement error?

A

It introduces error by pushing people who fall in the middle toward one of 2 sides, such as introverted or extraverted.

20
Q

Does the Big Five personality test encourage measurement error?

A

Not so much. It derives theoretically from the idea that introversion and extraversion form a mixture identified by a general trend, and it predicts that most people fall in between the 2 extremes.

21
Q

On the dartboard analogy, how would we know we are being reliable and valid?

A

Reliable: Hitting the same place
Valid: Hitting the target

22
Q

On the dartboard analogy, how would we know we are being reliable but not valid?

A

Reliable: Hitting the same place
Invalid: Off-target

Estimates are biased

23
Q

On the dartboard analogy, how would we know we are being neither reliable nor valid?

A

Unreliable: Not hitting the same place

Low validity: No bias, but rarely on target

23
Q

What is internal consistency?

A
  1. Internal consistency: Do the items in the measure correlate with each other?

Example: IQ test
If you fail the easier questions, you should also fail the harder questions.

Question 3: What is the square root of pi?
This is probably measuring crystallized knowledge, not intelligence. It probably won’t correlate with performance on pattern matching.

Question 4: What is your favourite colour?
If this doesn’t correlate with the pattern shown by the other measures, we would say it is measuring something different, because it doesn’t fall into the same cluster.

23
Q

What is test-retest reliability?

A

How consistent is the measure over time?

Example: Raven’s matrices. If scores go down over a long interval, we expect the true value itself to have changed.

24
Q

What is interrater reliability?

A

Do observers agree on ratings?

Observers are like items on a scale: we want them all to agree on what they see.

The coding manual will bias the results: it increases the consistency of scores, so the process of choosing what the observers pay attention to is part of the validity process.

The measurement process often works backwards: we start out by assuming that infants are attached to their caregivers in different ways.

25
Q

How is reliability often expressed?

A

Reliability is often expressed in terms of correlation: the Pearson r correlation coefficient (or reliability coefficient).

26
Q

What are the ways that you can measure internal consistency?

A

Correlate estimates between items
Method 1: Split-half procedure

Split the measurement instrument into 2 equal parts, get the score from one half, and compare it to the score on the other half. For example, take all the even items and calculate a score, then calculate a score using only the odd items; the even and odd scores should be similar if they are measuring the same thing.

Method 2: Cronbach’s alpha procedure (𝛂 or 𝛚)

Take every possible pair of items on the measurement instrument and correlate performance within each pair, then average all of those correlations. The nice thing about this procedure is that every item matters in establishing the reliability of the scale. It also tells you which items are most problematic, because a problematic item will have low correlations with all of the other items.

26
Q

What is the benchmark for internal consistency using Cronbach’s alpha?

A

Benchmark internal consistency:

r = .80 is a good starting point
For internal consistency, we want to see a positive r of at least .80.

27
Q

How do you measure test-retest reliability?

A

Test-retest reliability
Correlate estimates between measurement occasions
Participants complete same measure at Time 1 and Time 2
(Repeated-measures design)

A repeated-measures design where time is the independent variable.

28
Q

What are moderators of test-retest reliability?

A

Moderators of test-retest reliability:
How much time between time 1 and time 2?
How stable is the construct?

IQ vs. attitudes
We think IQ is stable, but attitudes toward things are more changeable.

28
Q

What is the benchmark for test-retest reliability?

A

Benchmark test-retest reliability: r = .80 is a good starting point
Assuming little time between measurements
Assuming construct should be relatively stable

29
Q

How do you measure interrater reliability?

A

Interrater reliability
Correlate estimates between observers
Observers rate same behavior
Kappa = 𝛋
Kappa is the correlation-style coefficient used to express interrater reliability.

Often lower than is desirable!
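Kappa corrects raw percent agreement for the agreement two raters would reach by chance alone, which is one reason it tends to look lower than a raw agreement score. A sketch with made-up codings (the attachment-style labels and data are hypothetical):

```python
# Two observers code the same 10 behaviors
rater_a = ["secure", "secure", "avoidant", "secure", "anxious",
           "avoidant", "secure", "anxious", "secure", "avoidant"]
rater_b = ["secure", "secure", "avoidant", "anxious", "anxious",
           "avoidant", "secure", "anxious", "secure", "secure"]

n = len(rater_a)

# Raw agreement: proportion of behaviors coded identically
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: probability both raters pick the same category
# if each rated independently at their own base rates
categories = set(rater_a) | set(rater_b)
expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
               for c in categories)

# Cohen's kappa: agreement beyond chance, scaled by the maximum possible
kappa = (observed - expected) / (1 - expected)
print(round(kappa, 2))
```

Here the raters agree on 8 of 10 behaviors, yet kappa comes out noticeably lower than .80 once chance agreement is removed.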

30
Q

Do we have benchmarks for interrater reliability?

A

no

31
Q

How do you increase interrater reliability?

A

Increasing interrater reliability:
Generate concrete, easily observable guidelines
Initiate dialogue between observers
Practice with feedback
Compare your ratings against the expected ratings on practice samples.

31
Q

How would you look at reliability with the Raven’s matrices example?

A

Internal consistency
Split half reliability – odd & even items correlated, r = .93-.96

Cronbach’s Alpha – average item correlations, r = .88-.90

Test-retest reliability
2 days, r = .97
4 weeks, r = .87
79 years, r = .54
This is not concerning, because we expect the correlation to weaken over time.

Interrater reliability: Not applicable

32
Q

What is construct validity?

A

An evaluation of whether the measurement instrument quantifies the psychological construct

32
Q

What are the 7 types of validity?

A

Face validity
Content validity
Predictive validity
Concurrent validity
Convergent validity
Discriminant validity
Reactivity

33
Q

What is face validity?

A

Face validity: Does the measure appear to assess the psychological construct?

Do Raven’s matrices seem like intelligence?
When I look at the measurement, does it look like it captures the construct? Everyone can form their own judgement.
This is a good place to start, but a judge can be wrong about the validity.
Researchers may intentionally obscure face validity to prevent people from knowing what is being measured.

34
Q

What is content validity?

A

Content validity: Does the measure appear to assess ‘the whole construct and nothing but the construct’?

Are Raven’s comprehensive?
Is it contaminated by other variables, like language background?

35
Q

What is predictive validity?

A

Predictive validity: Does the measure correlate with future behaviors relevant to the construct?
- If ‘no,’ why measure the construct?
- Raven’s correlates with job performance, r = .3-.5

If you have someone take the Raven’s test and ask their boss to rate their job performance, you will see a correlation. The correlation gets higher for jobs that require more intellect. This shows predictive validity.

36
Q

What is concurrent validity?

A

Concurrent validity: Does the measure correlate with current behavior?
In the same test session or on the same day.

36
Q

What is convergent validity?

A

Convergent validity: Does the measure correlate with other defensible operationalizations of the construct?

Raven’s matrices correlate with Weschler’s test, r = .85

If you score highly on one, you should score highly on the other, if they are both measuring the same construct.

37
Q

What is discriminant validity?

A

Discriminant validity: Does the measure not correlate with unrelated constructs?
IQ scores do not correlate with favorite colors, r = .05

The measure shouldn’t correlate with unrelated things.

38
Q

What is reactivity?

A

Reactivity: Does awareness of the construct change the psychological construct that is measured?
If so, the measure cannot be valid