Lecture 24 - Critical Thinking about Psychological Constructs Flashcards

1
Q

What do LeBel & Peters base their criticisms on?

A

Bem’s experiments.
Bem ran experiments that claimed evidence for psi, the anomalous retroactive influence of future events on an individual’s current behaviour.
Also known as precognition, aka the ability for your actions to be influenced by future events.
He tested this by having participants do a computer task where they pick which set of curtains to open; behind some are erotic scenes, behind others neutral ones.
He claimed that since people picked the erotic curtains more often than chance (like 51% or something), psi must exist.

This is a lot of text, but I put it here just so you know which experiment they discuss throughout the paper.
It might be mentioned in the exam, idk; just remember the general gist of it.

2
Q

What’s problematic about Bem’s studies?

A

The main issue is that Bem followed the modal research practice (MRP) perfectly, yet got these results. So he didn’t do bad science; the guide for doing science is bad.

3
Q

What can research findings be related to?

A

Research findings can be related to

  • Theory-relevant beliefs (TRBs)
    • Beliefs about the theoretical mechanisms that produce observable behaviour
  • Method-relevant beliefs (MRBs)
    • Beliefs about the procedures with which we produce and analyse data
4
Q

What is the Interpretation Bias?

A

The tendency to interpret failures to confirm predicted outcomes in terms of MRBs, but confirmed predictions in terms of TRBs.

5
Q

Why is this particularly problematic in psychology according to LeBel & Peters?

What perpetuates this?

A
  • MRBs are too peripheral
  • TRBs are too central

Conservatism in science makes this worse, because favouring minimal revision to the knowledge system perpetuates the two issues above.

6
Q

What does it mean to say “In an ideal world, research procedures and measuring instruments are unambiguously defined and validated”?

A
  • Ideally, the operationalisation of ‘implicit bias’, ‘aggression’, or ‘attention’ should be so clear that it is done the same way across many studies.
    Unexpected results from such measurements couldn’t then be dismissed as a failed pilot study.
7
Q

What is the result of having more central MRBs?

A

We are forced to be conservative in the interpretation of results

  • Meaning a preference for the interpretation that keeps established knowledge structures intact as much as possible.
  • This constrains the field of alternative explanations and so makes empirical tests more diagnostic
8
Q

What do LeBel & Peters mean when they say TRBs are often too central to psychology?

A

By this they mean that empirical predictions are often indistinguishable from very general assumptions about human behaviour.

  • The more general these assumptions are, the less stringently the theory can be tested.

Look at figure 1 and flashcard 11 for example/more info.

9
Q

What is the degree of corroboration?

A

It is the degree of relative confidence assigned to one hypothesis over another based on test performance

10
Q

What does the degree of corroboration depend on?

A

It depends on how strict your test is, i.e. to what extent you expose the theory to falsification.

However, the stringency of our tests is debatable, due to the aforementioned reliance on conceptual replication as well as the reliance on the null ritual.

11
Q

Look at figure 1. What’s wrong, and how can it be fixed?

A

It shows how the red circles can’t really be disproved, because they sit within a large general assumption.
The solution would be weakening their logical status (aka moving the red bubble outside the blue one), making the theoretical beliefs easier to test and reject, and reducing interpretation bias.

12
Q

What methodological flaws do Bem’s experiments reveal in science?

A
  • Overemphasis on conceptual replication
  • Problems with the way NHST is implemented.
  • Insufficient attention to verifying the integrity of measurement instruments and experimental procedures
13
Q

How does overemphasis on conceptual replication affect the problem?

A

Continuous theoretical advancement prioritises conceptual over close replication, making findings less reliable.
Failures to produce significant results are treated as failed pilot studies that end up in the file drawer.
Instead of examining why these results occurred and what they mean for the theory, we just say the procedure was bad and do something else.

Central TRBs and peripheral MRBs once again, folks.

14
Q

How does problematic implementation of NHST worsen the issue?

A

It’s a straw-target fallacy.
Setting up H0 as zero difference/association is a rather weak test of a theory.
There is a fairly good chance that some difference will always be found.
Thus, given enough power, finding a significant difference is virtually guaranteed when H0 = zero difference.

So, use of NHST can add ‘support’ to bad, overly central theories.
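To see why a zero-difference H0 is such a straw target, here’s a minimal simulation (my own illustration, not from the paper; all numbers are made up): a trivial true difference of 0.07 SD is almost never “significant” at n = 25 per group, but is “significant” nearly every time at n = 10,000 per group.

```python
import random
import statistics

# Toy sketch: with a tiny but nonzero true effect, power against
# H0 = "zero difference" climbs towards 1 as the sample grows.
random.seed(1)

def significant(n, true_diff=0.07, sd=1.0):
    """Two-group comparison; True if |t| > 1.96 (roughly p < .05)."""
    a = [random.gauss(true_diff, sd) for _ in range(n)]
    b = [random.gauss(0.0, sd) for _ in range(n)]
    se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
    t = (statistics.fmean(a) - statistics.fmean(b)) / se
    return abs(t) > 1.96

reps = 100
power_small = sum(significant(25) for _ in range(reps)) / reps
power_large = sum(significant(10_000) for _ in range(reps)) / reps
print(power_small, power_large)  # rejection rate climbs towards 1 as n grows
```

The point: “significance” here tells you almost nothing about the theory, only that the sample was big enough to detect *some* nonzero difference.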

15
Q

Insufficient attention to verifying the integrity of measurement instruments and experimental procedures

A

We should put more effort and time into the verification and validation of our measures.
Particularly,

- Reliability of DVs and personality measurements
- The use of ad-hoc instead of validated measures
16
Q

What attribute of psychological processes is troublesome for psych measurements?

A

Psychological processes are context sensitive which makes validation of psychological measurement very difficult.

17
Q

What do L&P recommend we do to improve science? (Each one refers to the respective error above)

A
  1. Close replication
    • Emphasize exact replications to validate findings and reduce type I errors, especially in early research stages
  2. Robust hypothesis testing
    • Adopt Bayesian analysis to incorporate prior knowledge and reduce ambiguity.
  3. Methodological rigor
    • Routinely verify the reliability and validity of instruments
    • Separate pilot testing from substantive hypothesis testing

Just remember the main points, the subpoints are there to give reasons. (Tell me if you’d like me to split these points into 3 flashcards)
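A toy illustration of the Bayesian idea in recommendation 2 (my own sketch; every number here is made up): posterior odds = prior odds × likelihood ratio, so the same data move belief far less when the prior against the hypothesis, as with psi, is strong.

```python
# Hypothetical numbers only: the same evidence (likelihood ratio of 10)
# leaves an extraordinary claim like psi very unlikely, while it makes
# an ordinary effect quite credible.
likelihood_ratio = 10.0      # assume the data favour H1 over H0 ten to one
prior_odds_psi = 1 / 10_000  # strong prior scepticism about precognition
prior_odds_mundane = 1.0     # even prior odds for an ordinary effect

posterior_odds_psi = prior_odds_psi * likelihood_ratio
posterior_odds_mundane = prior_odds_mundane * likelihood_ratio
print(posterior_odds_psi, posterior_odds_mundane)  # ~0.001 vs 10.0
```

This is why incorporating prior knowledge reduces ambiguity: Bem’s data and a mundane priming effect can have the same p-value yet deserve very different conclusions.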

18
Q

What are psychological measurements designed to do?

A

Measurement procedures are designed to enable inferences about unobserved psychological processes.
Sometimes measurement is considered more direct (surveys), sometimes more indirect (reaction-time measurement).

Collectively these techniques are known as psychometrics.

19
Q

What makes the psychometric approach useful?

A

Its statistical nature.
- The content of a question doesn’t even have to be related to the construct; as long as the question consistently shows a high degree of statistical association with the construct, it’s good.
- (e.g. if all the ‘good’ people give one answer and all the ‘bad’ people give a different answer, then it helps you identify this).

20
Q

So why is ignoring this statistical nature bad?

A

If you focus on the semantic content of the question, you end up reducing the resulting data to the status of what some people say about what they think they think.

21
Q

Now, what are some issues with what people say about what they think they think.
AKA, what are limitations of self-report research?

A

Some limitations (not all, but the ones mentioned) are

  • Assumptions relating to the formation and accessibility of thoughts
  • The willingness of participants to share their thoughts
  • The degree to which such thoughts truly represent the discrete constructs being investigated.
22
Q

Ok, we shall now take a closer look at 3 such measurement procedures

A
  • Survey responses
  • Reaction time: IAT
  • Blood oxygenation: fMRI
23
Q

Ok so Roeland had some fun talking about an example of a dodgy survey; I’ll give a brief description about it (shock, it involves the Dutch being a tad racist)

A

Dutch politician Geert Wilders reacted to a survey on the Dutch population’s opinion about Islam.
It made it seem like the Dutch have had enough of Islam, but the survey was designed in a way that persuades you to agree with the last question.
Basically, a survey that was administered legitimately and completed by lots of people, yet whose design reveals some issues… (you can finally go to the next flashcard)

24
Q

What factors of surveys mess with measurement?

A
  • Surveys are inherently subjective
    • Your interpretation and phrasing might be different compared to your respondent’s
    • Language is inherently vague, and some questions and answers are more vague than others
  • Survey measurement is context sensitive
    • Social desirability
    • Test-retest reliability is rarely checked in general surveys
25
Q

The systematic nature of research using surveys derives in large part from…

A

The psychometric properties of the test

  • Test-retest reliability
  • Internal consistency
  • Predictive validity
  • Construct validity
26
Q

What is Conflation?

A

Using two measurement tools that supposedly measure two discrete constructs, and finding a correlation between them.
However, in reality the two tools measure the same construct.
An example of conflation: researchers studied children aged 6–8 and found a positive correlation between exercise, measured by self-reported physical activity, and literacy and numeracy skills, measured by a literacy and numeracy test.

27
Q

Look at figure 2. Can you see why these two variables are connected? Are they conflated?
Remember that the self-reports are from 6-8 year olds

A

It’s actually conflated: a child who is good at counting and reading will be able to report their physical activity more accurately. So both are measures of literacy and numeracy skills, because if the kid can’t keep track of and report their physical activity, they’ll put down a lower number (which will then correlate with the low test score).

Figure 3 shows this relationship, they both measure one thing.

If you’d like more examples, watch 1:11:06→1:14:55. One is conflating positive emotion with life satisfaction, no shit they’re correlated.

28
Q

What are some issues with conflation in psychology?

A

Conflation is very common in psychology. It’s an issue because it implies that the two variables are separate constructs that have co-occurred in a way that reveals something we did not know before.

29
Q

What’s one issue with having people report ‘discrete’ events?

A

It’s an issue particularly with anecdotal data, because we can’t verify whether what the person calls discrete is actually discrete.
So someone might report ‘discrete’ events and you base correlations off of that, but you’ve just conflated, ya fool.

30
Q

Summary of issues with taking survey measurements at face value

A
  • Context sensitivity
    • Did the researcher create the response or does the response reflect an actual attitude
  • Conflation
    • The fact that we can ask different questions does not mean that we’re actually measuring the same thing
    • The fact that A & B are correlated might just reflect that we’re measuring the same thing twice.
31
Q

Briefly back to self-reports for one flashcard
What’s an issue when asking about controversial topics?

A

Self-reported attitudes on segregation usually don’t correlate with behavioural measures. Both are intended to measure racism/prejudice, but people don’t want to appear racist, so the self-report doesn’t correlate.

32
Q

So, how do you measure difficult problems, given that explicit self-reports are unreliable?

A

✨The Implicit association test (IAT)✨

33
Q

Summary of the IAT

A

Designed to bypass socially desirable answers
Popular in research
A basis for popular explanations of persisting bias and discrimination
IAT scores reflect differences in average reaction times

You don’t need to memorise this, just a quick refresher :)

34
Q

What does the usefulness of any implicit measure in better understanding or even changing prejudiced behaviour depend on?

A
  • Predictive validity
  • Construct validity
  • Test-retest reliability

These checks are essential, especially given that it’s hard to argue the face validity of the IAT.

35
Q

Does the IAT have predictive validity?

A

The IAT has slight predictive validity.
For example:
“Automatic white preference” would predict different things (such as more favourable ratings of white (vs black) job applicants).
IAT scores are weak to moderate predictors of discriminatory behaviour.

36
Q

How much do IAT scores predict in lab-based measures of discrimination?

A

2–5% of explained variance in lab-based measures of discrimination.

HOWEVAH, lab-based measures of discrimination, included in these meta-analyses, are often far removed from realistic forms of discrimination.
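For intuition about how small that is (my own arithmetic, not a figure from the paper): explained variance is the squared correlation, so 2–5% of variance corresponds to correlations of only about r = .14–.22.

```python
# Converting "% explained variance" back to a correlation: r = sqrt(R^2).
r_low = 0.02 ** 0.5    # r for 2% explained variance
r_high = 0.05 ** 0.5   # r for 5% explained variance
print(round(r_low, 2), round(r_high, 2))  # 0.14 0.22
```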

37
Q

Construct validity and the IAT

A

It ain’t really got it; the IAT doesn’t have a consistent relationship with explicit measures of prejudice.

38
Q

How should we interpret the absence of a consistent relationship with explicit measures of prejudice?

A
  • Does the IAT measure an implicit part of the same attitude, or something else altogether?
  • Does the IAT reflect culturally transmitted knowledge/associations, rather than endorsement of prejudice and stereotypes?

If an IAT measures implicit attitudes, why do most men not exhibit implicit sexism?

39
Q

Test-retest reliability and the IAT

A

IAT tests reliably produce IAT effects in general.
However, IAT scores turn out to be a poor predictor of future scores by the same individuals on the same test.

So do IAT scores reflect individual implicit bias? Not really; the IAT is ill-suited for diagnostic purposes.

40
Q

Social cognition models

A

The Hughes chapter talks about the theory of planned behaviour for quite a while, and eventually concludes that it actually involves conflation. I won’t get into it on the flashcards, but if you so desire, you may request a splendid explanation by moi.

41
Q

What is an arbitrary metric?

A

A metric is arbitrary when it is not known
- where a given score locates an individual on the underlying psychological dimension,
- or how a one-unit change in the observed score reflects the magnitude of change on the underlying dimension.

An individual’s observed score provides only an indirect assessment of their position on the unobserved psychological construct.

42
Q

If we aren’t familiar with the metric, what happens?

A

We can’t understand the scores given (duh).
We have to know how to convert an observed measurement score into an unobserved construct score.
In the absence of links to external referents, we can’t make that conversion.

Is this an issue? find out soon!!

43
Q

In theory testing (understanding how two variables are related), do we need a nonarbitrary metric?

A

It doesn’t matter if the metric is arbitrary. Researchers just want to know whether the scores are consistent with what the theory proposes. Whether the scores reflect small or big differences doesn’t matter.

44
Q

So in what field do we want nonarbitrary metrics?

A
  • Diagnostics/assessment: What is the standing of this individual/this group on the construct dimension of interest?
    • To find this, researchers perform research that ties specific scores on a metric to specific events that are meaningful in the lives of the respondents (giving you external referents to understand the scale!!!! YAHOOOO)
45
Q

What are the two suspect sources of information that researchers sometimes try to infer nonarbitrary meaning from?

A
  1. Meter reading
  2. Norming
46
Q

What is Meter reading?

A

Meter reading: treating a score on the measurement level as if it were exactly the same on the construct level.
Researchers simply use the score on the observed metric to infer location on the underlying dimension.
E.g. someone with a high unobserved construct score should score highly, and someone with a low one, low.

A big issue with this: what if your theory is wrong? What if your measurement tool isn’t correct? hmmmm?? think critically young ones, soon you will be like Roeland (smart, not bald)

47
Q

Now back to our friend, how is meter reading an issue on the IAT?

A

43% of people who did the IAT were told they had a “strong automatic preference for White people”.
Based on what? On the assumptions that a score of 0 reflects neutral behaviour and that positive scores reflect White preference.

48
Q

What’s the issue with assuming a score of zero reflects behavioural neutrality?

A

It’s hard to properly know what a score of 0 reflects in terms of behaviour.
The zero point doesn’t necessarily reflect behavioural neutrality.

Therefore looking at someone’s IAT score of 0.5 and thinking ‘oh, they’re a bit racist’ could be wrong. In reality, an IAT score of ~0.5 reflects neutral behaviour towards race.
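A minimal sketch of the idea (my own illustration; the offset value is made up): if the procedure adds a constant shift to everyone’s score, e.g. from response biases, then the observed zero no longer marks behavioural neutrality.

```python
# Toy sketch: a method-induced constant shift moves the whole observed
# metric, so "observed score = 0" no longer means "behaviourally neutral".
offset = 0.5  # assumed method-induced shift; the value is invented

def observed_score(true_position):
    """Map the unobserved behavioural dimension onto the observed metric."""
    return true_position + offset

score_of_neutral_person = observed_score(0.0)
print(score_of_neutral_person)  # behaviourally neutral people score 0.5, not 0
```

Meter reading (“0 on the dial = neutral person”) fails exactly because of shifts like this.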

49
Q

So, a summary, and a note.

A

So overall, don’t do meter reading; you have to try to understand what actually reflects neutral behaviour (e.g. a score of 0.5) rather than assuming 0 = neutral.

There’s more info about this, but it’ll be a pain in the ass to explain via flashcard, so I’ll do it in person (no requesting this explanation, you’re getting it :] )

50
Q

What is Norming?

A

si, the statistics one

Raw scores are transformed into standardized scores or percentiles on the basis of normative data and interpretations are imposed on the basis of this new metric.

51
Q

How is norming an issue on the IAT?
How can we learn from this?

A

Standardizing scores does nothing to make them less arbitrary; you just have standardized scores. What does a standard score actually reflect? If your test is biased, then your standardized midpoint isn’t the midpoint of the unobserved construct.
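A quick sketch of why (my own illustration; the scores are made up): z-scoring re-labels the numbers relative to the *sample*, but if the raw zero point is biased, the truly neutral point still doesn’t land at z = 0.

```python
# Toy sketch: standardize some hypothetical raw scores and check where the
# (assumed) truly neutral raw score lands on the new standardized metric.
raw_scores = [0.2, 0.4, 0.5, 0.6, 0.8, 1.0, 0.3, 0.7]  # invented raw scores
true_neutral_raw = 0.5  # suppose this raw score actually reflects neutral behaviour

mean = sum(raw_scores) / len(raw_scores)
sd = (sum((s - mean) ** 2 for s in raw_scores) / len(raw_scores)) ** 0.5

z_of_true_neutral = (true_neutral_raw - mean) / sd
print(round(z_of_true_neutral, 2))  # not 0: z = 0 still isn't "neutral"
```

Norming only tells you where someone sits relative to other test-takers, not where they sit on the underlying construct.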

52
Q

Who didn’t learn from the IAT?

A

THE LTE PAPER
Look at figure 4

Here they say anything above 3 is an association, but is that actually true? Does a score of 3 reflect a neutral association? You can link this to biases in how people respond to tests.

53
Q

Blood oxygenation: fMRI
A summary

A

Roeland went fast over this, but he showed a paper that said blood oxygenation can be measured and correlates with depression, so we can measure depression with fMRI (wow, science).

It falls prey to neuro-heuristics.

54
Q

What are Neuro-Heuristics?

A

The heuristic of equating a brain measurement with a psychological construct and saying it’s a better, more direct measurement of a psychological state.

55
Q

The next few flashcards will be about the marital satisfaction example, so refer to figure 5 (por favor)

A

The paper talks a lot about it; I’ll give the main points, but I can also yap about it in person if you so please

56
Q

Can you infer true extremity simply based on the extremity of an observable metric?

A

NUH UH

Because on scale X, person A is extremely unsatisfied, but in reality they are only kinda unsatisfied.

With scale Z, a person’s observed score, by and of itself, does not allow one to make a formal inference about the location of the individual on the true underlying dimension. (Ya can’t say shit based on Z alone)

57
Q

What should we be cautious with regards to magnitude of change?

A

You should be cautious not to make inferences about the magnitude of change on the true psychological dimension based simply on the magnitude of observed change on the observed metric.

e.g. If person B moves a lil to the left, under scale X they’ve decreased from a 3 to a 2 (whoa buddy, do you hate your partner), but they’ll stay at a 4 on scale Z, so ya can’t say shit.
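The same point as a sketch (my own illustration; both scale forms and all numbers are invented, not the scales from figure 5): two monotone mappings of the same latent satisfaction report different magnitudes for the exact same true change.

```python
# Toy sketch: the same true drop in latent satisfaction (0.81 -> 0.64)
# looks small on one observed scale and large on another.
def scale_x(latent):          # compresses differences near the top
    return 1 + 6 * latent ** 0.5

def scale_z(latent):          # stretches differences near the top
    return 1 + 6 * latent ** 2

before, after = 0.81, 0.64    # one and the same true change
change_on_x = scale_x(before) - scale_x(after)
change_on_z = scale_z(before) - scale_z(after)
print(round(change_on_x, 2), round(change_on_z, 2))  # 0.6 vs 1.48
```

Neither observed change is “the” magnitude of the true change; each is an artefact of the arbitrary mapping.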

58
Q

To really understand why arbitrary metrics are an issue, look at figure 6

A

This is like reality.
We wouldn’t have the first line (so I made it go poof), because we don’t know the true values. How would you decide which scale is best then? Would you be comfortable doing so based on one scale alone?

59
Q

When is it important to have non-arbitrary metrics?

A
  • At the individual level
    • Making a psychological diagnosis on the basis of an individual’s observed score on a psychological inventory
  • At the group level
    • Evaluating real-world importance of psych interventions for groups of people.
60
Q

What’s an issue with using physical metrics for quantifying psychological constructs?

A

Different physical metrics such as time (e.g. reaction time in milliseconds) or behaviour counts (e.g. number of times you smoked marijuana this month) are good metrics when quantifying a physical reality, but become arbitrary when used to quantify psychological constructs.

E.g. if I count how many times you and I smoke cigarettes, I can say that you smoke x times more than me. But if I say smoking reflects stress, I can’t say that you are x times more stressed than me based on how many cigarettes you smoke; the metric loses meaning.