Lecture 24 - Critical Thinking about Psychological Constructs Flashcards
What do LeBel & Peters base their criticisms on?
Bem’s experiments.
Bem ran experiments that claimed evidence for psi: the anomalous retroactive influence of future events on an individual's current behaviour.
Also known as precognition, i.e. the ability for your current behaviour to be influenced by future events.
He tested this by having participants do a computer task in which they pick which of two on-screen curtains to open; behind some curtains are erotic images and behind others neutral ones.
He claimed that because participants picked the curtain hiding the erotic image slightly more often than chance (roughly 53% rather than the expected 50%), psi must exist.
This is a lot of text, but it's here so you know which experiment they discuss throughout the paper.
It might come up in the exam; just remember the general gist of it.
What’s problematic about Bem’s studies?
The main issue is that Bem followed the modal research practice (MRP) perfectly and still got these results. So the problem isn't that he did bad science; it's that the standard recipe for doing science is flawed.
What can research findings be related to?
Research findings can be related to:
- Theory-relevant beliefs (TRBs)
  - The theoretical mechanisms that produce observable behaviour
- Method-relevant beliefs (MRBs)
  - The procedures with which we produce and analyse data
What is the Interpretation Bias?
The tendency to interpret failures to confirm predicted outcomes in terms of MRBs, but confirmed predictions in terms of TRBs.
Why is this particularly problematic in psychology according to LeBel & Peters?
What perpetuates this?
- MRBs are too peripheral
- TRBs are too central
Conservatism in science makes this worse, because favouring minimal revision of the knowledge system perpetuates the two issues above.
What does it mean to say “In an ideal world, research procedures and measuring instruments are unambiguously defined and validated”?
- Ideally, the operationalisation of ‘implicit bias’, ‘aggression’ or ‘attention’ should be so clear that it is done in the same way across many studies.
Unexpected results from such measurements couldn’t then be dismissed as a failed pilot study.
What is the result of having more central MRBs?
We are forced to be conservative in the interpretation of results
- Meaning a preference for the interpretation that keeps established knowledge structures intact as much as possible.
- This constrains the space of alternative explanations and so makes empirical tests more diagnostic.
What do LeBel & Peters mean when they say TRBs are often too central to psychology?
By this they mean that empirical predictions are often indistinguishable from very general assumptions about human behaviour.
- The more general these assumptions are, the less stringently the theory can be tested.
Look at figure 1 and flashcard 11 for example/more info.
What is the degree of corroboration?
It is the degree of relative confidence assigned to one hypothesis over another based on test performance
What does the degree of corroboration depend on?
It depends on how strict your test is/to what extent you expose the theory to falsification
However, the stringency of our tests is debatable, due to the reliance on conceptual replication as well as on the null ritual (both discussed below).
Look at Figure 1: what’s wrong and how can it be fixed?
It shows how the red circles can’t really be disproved because they sit inside a large, general assumption (the blue circle).
The solution would be to weaken their logical status (i.e. move the red bubble outside the blue one), making the theoretical beliefs easier to test and reject and so reducing interpretation bias.
What methodological flaws do Bem’s experiments reveal in science?
- Overemphasis on conceptual replication
- Problems with the way NHST is implemented.
- Insufficient attention to verifying the integrity of measurement instruments and experimental procedures
How does the overemphasis on conceptual replication worsen the problem?
The push for continuous theoretical advancement prioritizes conceptual over close replication, making findings less reliable.
Failures to produce significant results are treated as failed pilot studies that end up in the file drawer.
Instead of examining why these results occurred and what they mean for the theory, we just say the procedure was bad and try something else.
Central TRBs and peripheral MRBs once again, folks.
How does the problematic implementation of NHST worsen the issue?
It’s a straw-target fallacy:
Setting up H0 as zero difference/association is a rather weak test of a theory.
In practice, some (possibly tiny) difference is almost always present.
Thus, given enough power, finding a significant difference is virtually guaranteed when H0 = zero difference.
So the use of NHST can add ‘support’ to bad, overly central theories (see the sketch below).
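A quick simulation sketch (mine, not from the paper; the effect size and sample sizes are made-up illustration values) of why a nil H0 is a straw target: even a trivially small true difference gets flagged as ‘significant’ almost every time once the sample is big enough.

```python
# Simulate studies with a practically meaningless true effect and count how
# often a standard t-test against H0 = "zero difference" rejects the null.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_effect = 0.03            # tiny standardized mean difference (illustrative value)
for n in (50, 500, 50_000):   # per-group sample sizes (illustrative values)
    rejections = 0
    for _ in range(200):      # 200 simulated studies per sample size
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_effect, 1.0, n)
        _, p = stats.ttest_ind(a, b)
        rejections += p < .05
    print(f"n={n:>6}: rejected H0 (zero difference) in {rejections/200:.0%} of studies")
```

With these numbers the rejection rate should sit near the 5% false-alarm level for the small samples and climb towards 100% for the huge one, even though the ‘effect’ is practically zero.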
Insufficient attention to verifying the integrity of measurement instruments and experimental procedures
We should put more effort and time into the verification and validation of our measures.
In particular:
- The reliability of DVs and personality measurements
- The use of ad-hoc instead of validated measures
What attribute of psychological processes is troublesome for psych measurements?
Psychological processes are context-sensitive, which makes the validation of psychological measurements very difficult.
What do L&P recommend to improve science? (Each recommendation addresses the respective flaw above.)
- Close replication
  - Emphasize exact replications to validate findings and reduce type I errors, especially in early research stages
- Robust hypothesis testing
  - Adopt Bayesian analysis to incorporate prior knowledge and reduce ambiguity (see the sketch after this list)
- Methodological rigor
  - Routinely verify the reliability and validity of instruments
  - Separate pilot testing from substantive hypothesis testing
Just remember the main points; the subpoints are there to give reasons. (Tell me if you’d like me to split these points into 3 flashcards.)
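For the ‘robust hypothesis testing’ recommendation, here’s a hedged sketch of what a Bayesian comparison could look like: instead of only asking whether a 50% chance null can be rejected, pit that null directly against an alternative with a Bayes factor. The beta-binomial setup and the 53-out-of-100 hit count are illustrative assumptions, not LeBel & Peters’ actual analysis.

```python
# Compare H0 (hit rate fixed at chance) against H1 (hit rate uncertain) for a
# hypothetical Bem-style result, using marginal likelihoods.
from scipy import stats

hits, trials = 53, 100          # hypothetical hit count (made up for illustration)

# Marginal likelihood under H0: hit rate fixed at chance (0.5)
m0 = stats.binom.pmf(hits, trials, 0.5)

# Marginal likelihood under H1: hit rate given a Beta(1, 1) prior (beta-binomial)
m1 = stats.betabinom.pmf(hits, trials, 1, 1)

bf01 = m0 / m1
print(f"Bayes factor BF01 = {bf01:.2f}  (values > 1 favour the chance-only hypothesis)")
```

With these made-up counts the Bayes factor comes out above 1, i.e. a result that is ‘significant’ under a nil-null NHST can still favour the chance-only hypothesis once a concrete alternative is specified.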
What are psychological measurements designed to do?
Measurement procedures are designed to enable inferences about unobserved psychological processes
Sometimes measurement is considered to be more direct (survey), sometimes more indirect (reaction time measurement)
Collectively, this approach to measurement is known as psychometrics.
What makes the psychometric approach useful?
Its statistical nature.
- The content of a question doesn’t even have to be related to the construct; as long as the question consistently shows a high degree of statistical association with the construct, it’s a good item.
- (e.g. if all the ‘good’ people give one answer and all the ‘bad’ people give a different answer, the question helps you tell them apart, whatever it actually asks).
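A toy sketch (my own illustration, not from the lecture) of that statistical logic: two hypothetical questionnaire items are judged purely by how strongly their responses correlate with the construct, regardless of what their wording says.

```python
# Judge items by their statistical association with the construct, not content.
import numpy as np

rng = np.random.default_rng(1)
n = 300
construct = rng.normal(size=n)                  # latent trait we want to measure

# Two hypothetical items: one whose wording "sounds" on-topic but barely tracks
# the trait, and one whose wording is unrelated yet tracks the trait well.
item_on_topic  = 0.1 * construct + rng.normal(scale=1.0, size=n)
item_off_topic = 0.8 * construct + rng.normal(scale=0.6, size=n)

for name, item in [("on-topic wording, weak item", item_on_topic),
                   ("off-topic wording, strong item", item_off_topic)]:
    r = np.corrcoef(item, construct)[0, 1]
    print(f"{name}: item-construct correlation r = {r:.2f}")
# Psychometrically, the second item is the better indicator, whatever it asks.
```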
So why is ignoring this statistical nature bad?
If you instead focus on the semantic content of the question, you end up reducing the resulting data to the status of ‘what some people say about what they think they think’.
Now, what are some issues with ‘what people say about what they think they think’?
AKA, what are the limitations of self-report research?
Some limitations (not all, but the ones mentioned) are
- Assumptions relating to the formation and accessibility of thoughts
- The willingness of participants to share their thoughts
- The degree to which such thoughts truly represent the discrete constructs being investigated.
Ok, we shall now take a closer look at 3 such measurement procedures
- Survey responses
- Reaction time: IAT
- Blood oxygenation: fMRI
Ok, so Roeland had some fun talking about an example of a dodgy survey; I’ll give a brief description of it (shock: it involves the Dutch being a tad racist).
Dutch politician Geert Wilders reacted to a survey on the Dutch population’s opinion about Islam.
It made it seem like the Dutch have had enough of Islam, but the survey was designed in a way that nudges you towards agreeing with the final question.
Basically: a survey that was run legitimately and completed by lots of people, yet whose design reveals some issues. (You can finally go to the next flashcard.)
What are factors of surveys that mess with measurement?
- Surveys are inherently subjective
  - Your interpretation and phrasing might differ from your respondent’s
- Language is inherently vague, and some questions and answers are more vague than others
- Survey measurement is context-sensitive
- Social desirability (respondents answer in ways that make them look good)
- Test-retest reliability is rarely checked in general surveys
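And a small sketch of that last point (simulated data, my own illustration): the test-retest check just means giving the same respondents the same items twice and correlating the scores, which one-off surveys almost never do.

```python
# Estimate test-retest reliability from two administrations of the same survey.
import numpy as np

rng = np.random.default_rng(2)
n = 200
true_opinion = rng.normal(size=n)                       # respondents' stable opinion
time1 = true_opinion + rng.normal(scale=0.5, size=n)    # scores at first administration
time2 = true_opinion + rng.normal(scale=0.5, size=n)    # same respondents, weeks later

r_test_retest = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest reliability r = {r_test_retest:.2f}")   # low values flag unstable measurement
```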