Exam #2 (1 of 2) Flashcards
LABORATORY EXPERIMENTS
LABORATORY EXPERIMENTS
Laboratory experiments: Advantages
Advantages of laboratory experiments:
- Establish causation
- Experimental control (lot of control over environment)
- Usually high internal validity
Laboratory experiments: Disadvantages
Disadvantages of laboratory experiments:
- Many IVs of interest can’t be manipulated
- External validity is a concern (poor generalizability)
- Possibility of oversimplification (may end up missing many relevant variables)
Two main types of laboratory experiments
Two main types of laboratory experiments:
- Impact experiment
* Participants = active participants in a series of events; they react to these events as they occur
* Very involving
* Usually one person at a time
- Judgment experiment
* Participants recall, recognize, classify, etc., materials presented by the experimenter
* Less involving, less investment
* Usually easier to do
* Can have multiple people at a time
- Researchers can convert impact experiments to judgment experiments, and vice versa
- Impact or judgment? Depends on research Q’s
mundane realism
mundane realism = the extent to which events occurring in the research setting are likely to occur in the normal course of participants’ lives
- many laboratory experiments don’t have high mundane realism
- not the most important kind of realism
- example of laboratory experiment with high mundane realism: reading a newspaper article and answering questions about it
experimental realism
experimental realism = extent to which the experiment is involving to the participants and is taken seriously by them
- more important than mundane realism
- Impact studies –> usually high experimental realism
- Judgment studies –> usually lower experimental realism
- It takes researcher skill to create experimental realism
Psychological realism
Psychological realism = the extent to which the psychological processes that occur in an experiment are the same as psychological processes that occur in everyday life
- most important type of realism for increasing generalizability
Four Stages of Laboratory Experimentation
Four Stages of Laboratory Experimentation:
- Setting the stage for the experiment
- Constructing the IV
- Measuring the DV
- Planning the post-experimental follow-up
1. Setting the stage
1. Setting the stage:
- Researchers need to explain the research in a way that makes sense to participants
- May need a cover story = false rationale for research
Cover stories should:
Cover stories should:
* Be as simple as possible
* Capture participants’ attention
* Encompass all features of the experiment and satisfy participants’ curiosity so they don’t speculate about the hypothesis
- (Cover stories tend to be more elaborate in impact experiments)
2. Constructing the IV
2. Constructing the IV:
- How to operationalize the conceptual variable
- Pilot testing: test IV on small # of participants to see if it brings about intended state
- Manipulation check: extent to which an experimental treatment has intended effect on each participant
In short, pilot testing is about testing the overall design of the study, while manipulation checks are about confirming the effectiveness of specific experimental interventions.
What are experimenter effects?
Experimenter effects = unintended influences that a researcher’s presence, behavior, or expectations can have on the results of a study
- These effects can alter participants’ responses or behaviors, affecting outcomes
What does SCC call experimenter effects?
SCC would call experimenter effects an “experimenter expectancies” threat to construct validity
Avoiding experimenter effects
Avoiding experimenter effects:
- Experimenters can be kept naive to research hypothesis (especially research assistants)
- Experimenters could be kept unaware of condition assignment
- Multiple experimenters could be involved in each session, each with only limited (and different) information about the experiment
What is participant awareness bias?
Participant awareness bias = participants change behavior simply because they know they are being observed or are part of an experiment.
This self-consciousness can change responses or actions, affecting the study’s validity
What does SCC call participant awareness biases?
SCC would call participant awareness bias a reactivity to the experimental situation threat to construct validity
Avoiding participant awareness biases
Avoiding participant awareness biases:
- Can use deception or cover story
- Can use an “accident” or “whoops” manipulation
- Unrelated-experiments technique (e.g., IV manipulated in ‘first experiment,’ DV measured in ‘second experiment’)
- Confederate = fake participant who performs a specific function in the study
3. Measuring the DV
3. Measuring the DV:
- Many ways to operationally define a conceptual variable
- Choice of DV depends mostly on research Q
- Also depends on practical considerations
Types of DV measures:
- Self-report
- Behavioral
- Behavioroid = measurement of behavioral intentions (what people say they plan on doing, like self-reports of self-care intentions)
- Physiological
What type of DV measurement is research about prejudice more likely to use?
Research about prejudice may rely more on behavioral measures, given that people may lie about how they feel
What’s one way to hide that you’re doing a posttest?
E.g., “We lost your pretest responses! Please complete it again.”
Ways to disguise the DV
Ways to disguise the DV:
- Embed DV within larger questionnaire (e.g., use distractor questions around central DV)
- Use unrelated-experiments technique
- Measure DV in setting that participants think is not part of the experiment
- Use the “whoops” technique (e.g., lost pretest)
- Use behavioral or physiological measures that would make it difficult for participants to alter their scores
4. Post-experimental follow-up
4. Post-experimental follow-up
- @ end of sessions, important to provide debrief
- Ensure participants are in good/healthy frame of mind
- Probe for suspicion
- Explain experimental procedures, including possible deceptions
- Learn participants’ thoughts about experiment
- Express appreciation for participants’ time and effort
MODERATORS AND MEDIATORS IN EXPERIMENTAL RESEARCH
MODERATORS AND MEDIATORS IN EXPERIMENTAL RESEARCH
What do moderators and mediators do in terms of our understanding?
Moderators and mediators help us to better understand causal effects (albeit in different ways)
What is a moderator?
A moderator is a variable that affects the strength or direction of the relationship between other variables in a study.
It essentially modifies how or to what extent these variables are related.
More facts about moderators
Moderators:
- A moderator qualifies the effect of an IV (or treatment) on a DV (or outcome)
- The moderator tells us when, for whom, or under what circumstances that causal effect occurs
- A moderator might alter the strength, direction, or existence of the effect
Moderator: qualifies | when | for whom | under what circumstances
What can moderators be?
Moderators can be:
- a variable manipulated by the researchers
- or a characteristic of participants that the researchers measure (usually a stable trait characteristic, like a Big Five personality dimension)
Moderation Model (diagram)
How do moderators affect external validity?
Moderators can limit external validity. They add conditions to the effect of the IV on the DV.
What does Social Facilitation Theory relate to?
Social Facilitation Theory relates to the idea that the presence of others (IV) can have an impact on performance (DV)
Example: Task difficulty, presence of others, performance
Example: Task difficulty may moderate the effect that the presence of others has on performance (free throw example, pool example)
Mediators
Mediators:
- A mediator explains the causal effect of an IV (or treatment) on the DV (or outcome)
- The mediator tells us how, by what mechanism, or through what pathway the IV (or treatment) exerts its effect on the DV (or outcome)
Ask: What variables may be in between the IV and DV? What does the IV change in people that results in the DV?
Mediators: explains | how | by what mechanism | through what pathway
Mediation model example (social facilitation theory)
Complete mediation & partial mediation
Complete mediation occurs when the mediator fully explains the effect of the IV (or treatment) on the DV (or outcome)
Partial mediation occurs when the mediator partially explains the effect of the IV (or treatment) on the DV (or outcome)
What can partial mediators indicate?
- Partial mediators can indicate that other mediators exist but haven’t been accounted for in the model
- Can also point to measurement errors
Theory development (regarding moderators and mediators)
IOW, which results suggest the possible presence of moderators? Which results suggest the possible presence of mediators?
- Moderators often get considered when an unexpectedly weak or inconsistent finding emerges
- Mediators often get considered when a strong effect exists and we want to understand/explain it
When to measure moderators and mediators
- If a proposed moderator is being measured, the measurement should occur before the IV is manipulated
- A proposed mediator should be measured after the IV has been manipulated and before the DV is measured
How to describe moderators and mediators in correlational research?
- In correlational research, moderators qualify and mediators explain the RELATIONSHIP between the predictor variable and the outcome variable
Not the impact of the IV on the DV, but the relationship
What statistical testing is often used for moderators?
For moderators, we typically use ANOVA or regression.
Which one to use if at least one IV is continuous? –> Regression.
Either way, it’s the interaction term that tells us whether the proposed moderator is actually acting as a moderator (see the regression sketch below)
How do we test mediation models?
Previously, a lot of people used Baron and Kenny’s method for mediation.
More recent statistical analyses use bootstrapping (with certain macros, like Hayes’ PROCESS macro); the core resampling idea is sketched below
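A bare-bones sketch of the bootstrap idea with numpy. This is not Hayes’ PROCESS macro, just the core resampling logic; the arrays iv, mediator, and dv and the simulated usage are hypothetical:

```python
import numpy as np

def bootstrap_indirect_effect(iv, mediator, dv, n_boot=5000, seed=0):
    """95% bootstrap CI for the indirect effect a*b, where a = slope of
    the mediator on the IV and b = slope of the DV on the mediator
    (controlling for the IV)."""
    rng = np.random.default_rng(seed)
    n = len(iv)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)                  # resample cases with replacement
        x, m, y = iv[idx], mediator[idx], dv[idx]
        a = np.polyfit(x, m, 1)[0]                   # a path: mediator on IV
        X = np.column_stack([np.ones(n), m, x])
        b = np.linalg.lstsq(X, y, rcond=None)[0][1]  # b path, IV controlled
        estimates[i] = a * b
    return np.percentile(estimates, [2.5, 97.5])

# Hypothetical usage with simulated data:
rng = np.random.default_rng(1)
x = rng.integers(0, 2, 150).astype(float)
m = 0.5 * x + rng.normal(0, 1, 150)
y = 0.6 * m + rng.normal(0, 1, 150)
print(bootstrap_indirect_effect(x, m, y))
```

If the 95% CI excludes zero, the indirect effect (and thus mediation) is supported.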
What variables could not work as a mediator?
Variables that are relatively stable and that cannot change in response to an IV can’t really work as mediators
E.g., a lot of identity variables
STATISTICAL POWER
STATISTICAL POWER
What is a research hypothesis?
Research hypothesis = a specific and falsifiable relationship between two or more variables
- Should include the existence of a relationship AND the direction of the relationship
In this context, “falsifiable” basically means “testable”
null hypothesis
null hypothesis = the pattern of data that would occur if the research hypothesis were false, i.e., if the data occurred simply by chance
Research goal and accepting/rejecting the null hypothesis
Research goal: determine whether the pattern of observed data can be explained by chance
If yes? We fail to reject the null
If no? We reject the null
* This means that the pattern seen in the data is different enough from what would be expected on the basis of chance that it makes sense to conclude that a real relationship between the variables exists and is responsible for the data
What’s something you should keep in mind about rejecting the null or failing to reject the null?
- Rejecting the null does not necessarily mean that the null hypothesis is false, only that it does not seem to account for the collected data
- Failing to reject the null does not necessarily mean that the null hypothesis is true, only that it cannot be rejected on the basis of the collected data
True State of Affairs (diagram: beta, alpha, Type 1 & 2, etc.)
Type 1 error = false positive, Type 2 error = false negative
What is alpha usually set to? What is beta usually set to? Why?
Alpha usually set to 0.05. Beta usually set to 0.20.
Why? We’re more concerned about committing Type 1 errors than Type 2 errors
Note: The study’s power is usually lower than we realize/expect (i.e., beta is usually higher than 0.20)
Type 1 Error
Type 1 Error = reject the null when the null is true
IOW, conclude we have found an effect when that effect does not actually exist
Type 1 error probability = alpha
Type 1 error = False positive
Type 2 Error
Type 2 Error = fail to reject the null when the null is false
IOW, conclude we have not found an effect when that effect really is there
Type 2 error probability = beta
Type 2 errors more common when power is low
Power
Power = 1 - beta
- Power: the probability that the research will, on the basis of the observed data, be able to reject the null hypothesis, assuming that the null is actually false and should be rejected
- Power typically set to .80, or 80%
- So, Type 2 error rate of beta = .20
What does power partly depend on?
Power partly depends on:
* Type 1 error rate (alpha)
* Effect size
* Sensitivity of the study:
  - Sample size (can use power analysis to determine it; e.g., G*Power or the sketch below)
  - Reliability, validity, and sensitivity of measures
  - Controlling extraneous variables
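For instance, a minimal a priori power analysis with Python’s statsmodels (an alternative to G*Power), assuming a two-group independent-samples t-test and a medium effect size (d = 0.5):

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group n needed for alpha = .05 and power = .80.
n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05,
                                          power=0.80, alternative="two-sided")
print(round(n_per_group))  # roughly 64 participants per group
```

Raising the desired power to .90 or expecting a smaller effect size pushes the required sample size up quickly.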
Mnemonic to remember 11 methods to increase power
Many Cats Love Eating Wholesome Peas, Munching Silently, Viewing Happy Rabbits
What are the 11 methods to increase power?
11 Methods to Increase Power:
- Use matching, stratifying, blocking
- Measure and correct for covariates
- Use larger sample sizes
- Use equal cell sample sizes
- Use a within-participants design
- Ensure that powerful statistical tests are used and their assumptions are met
- Improve measurement
- Increase the strength of the treatment
- Increase the variability of the treatment
- Use homogenous participants selected to be responsive to treatment
- Reduce random setting irrelevancies
Methods to increase power (1–6 of 11)
Methods to increase power:
- Use matching, stratifying, blocking (increasing equivalence between groups)
- Measure and correct for covariates (see the covariate sketch after this list)
- Use larger sample sizes
- Use equal cell sample sizes
- Use a within-participants design
- Ensure that powerful statistical tests are used and their assumptions are met
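A minimal sketch of correcting for a covariate with statsmodels, assuming a pre-treatment baseline score that predicts the DV; all names and data are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: the baseline covariate strongly predicts the DV.
rng = np.random.default_rng(1)
n = 100
df = pd.DataFrame({"treatment": rng.integers(0, 2, n),
                   "baseline": rng.normal(0, 1, n)})
df["outcome"] = 0.4 * df["treatment"] + 0.8 * df["baseline"] + rng.normal(0, 1, n)

unadjusted = smf.ols("outcome ~ treatment", data=df).fit()
adjusted = smf.ols("outcome ~ treatment + baseline", data=df).fit()
# The covariate soaks up error variance, shrinking the treatment effect's
# standard error and thereby increasing power.
print(unadjusted.bse["treatment"], adjusted.bse["treatment"])
```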
Methods to increase power (7th)
What ways improve measurement?
7. Improve measurement
- Define the construct well before creating/selecting items to measure it (this avoids the inadequate explication of constructs threat to construct validity)
- Use reverse-scored items to counter acquiescence bias = participant tendency to agree with all statements (recoding sketch below)
- Increase measurement sensitivity (wider response scale; use state-oriented rather than trait-oriented measures)
- Use measures appropriate for the cultural background of all participants
- Use measures that are reliable and valid
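For instance, on a 1–5 scale a reverse-scored item can be recoded as (min + max) − score; a tiny illustration with hypothetical responses:

```python
import numpy as np

responses = np.array([1, 2, 4, 5, 3])   # hypothetical answers on a 1-5 scale
reverse_scored = (1 + 5) - responses    # maps 1<->5, 2<->4; 3 stays 3
print(reverse_scored)                   # [5 4 2 1 3]
```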
Reliability and Validity
Techniques for evaluating the relationship between measured variables and conceptual variables
Reliability
* How consistently does the measured variable capture the conceptual variable it’s intended to assess?
* The extent to which a measure is free from random error
Validity
* How accurately does the measured variable capture the conceptual variable it is intended to assess?
* The extent to which the measure is free from systematic error
Random & Systematic Error
Random error = chance fluctuations in measured variables
* Examples: Participant marks some answers incorrectly or misreads a question
Systematic error
* Scores on the measured variable are influenced by other conceptual variables that are not part of the conceptual variable of interest
Random and Systematic Error (diagram)
Test-retest reliability
Test-retest reliability = extent to which scores on the same measure, administered at two different times, correlate with each other
* Retesting is a potential problem
* Test-retest reliability better for trait measures than state measures
Equivalent-forms reliability
Equivalent-forms reliability = extent to which scores on similar (but not identical) measures, administered at two different times, correlate with each other
* higher the correlation, the greater the equivalent-forms reliability
* helps reduce (but not fully eliminate) potential problem of retesting effects
Internal consistency
Internal consistency = extent to which scores on the items of a scale correlate with each other
Can assess internal consistency with:
* split-half reliability = correlating scores on one half of the items with scores on the other half of the items
* Cronbach’s coefficient alpha = estimates the average correlation among all the items on the scale
* item-total correlation = a statistic that measures how strongly an individual item correlates with the sum of the other items (see the sketch below)
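A minimal numpy sketch of Cronbach’s alpha (alpha = k/(k−1) × (1 − sum of item variances / variance of total scores)) and item-total correlations; the responses below are hypothetical:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha: rows = participants, columns = items."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # per-item variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of scale totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def item_total_correlations(items):
    """Correlate each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    totals = items.sum(axis=1)
    return np.array([np.corrcoef(items[:, j], totals - items[:, j])[0, 1]
                     for j in range(items.shape[1])])

# Hypothetical responses: 5 participants on a 4-item scale
scale = np.array([[3, 4, 3, 4],
                  [2, 2, 3, 2],
                  [5, 4, 5, 5],
                  [1, 2, 1, 2],
                  [4, 4, 4, 3]])
print(round(cronbach_alpha(scale), 2))       # >= .70 is conventionally acceptable
print(item_total_correlations(scale).round(2))
```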
How do internal consistency measures work for trait vs. state measures?
Internal consistency measures work just as well for trait measures as they would for state measures
Face validity
Face validity = extent to which the measured variable appears to be an adequate measure of the conceptual variable
- “Does it seem like a good measure?”
- Good face validity can be a bad thing at times. Sometimes we want participants blind to what we’re assessing (e.g., prejudice)
Content validity
Content validity = extent to which the measured variable has adequately sampled from the potential domain of questions relevant to the conceptual variable of interest
- It’s about ensuring that the test fully represents the construct being assessed
How do we assess face validity and content validity?
We assess face validity and content validity not with statistics but with expert opinion
Convergent validity
Convergent validity = extent to which a measured variable/test correlates with other measured variables/tests designed to assess the same conceptual variable
Discriminant validity
Discriminant validity = extent to which a measured variable/test does NOT correlate with measured variables/tests designed to assess other (related) conceptual variables
Criterion validity
Criterion validity = extent to which a self-report measure correlates with a behavioral measure of the same conceptual variable
* Concurrent validity = when the behavioral measure is assessed at the same time as the self-report measure
* Predictive validity = when the behavioral measure is assessed at some point in the future
Known-groups validity = extent to which groups expected to differ on the construct show the predicted differences (e.g., meditators are expected to have higher trait mindfulness)
Social desirability
Social desirability = tendency of participants to answer questions in a manner that will be viewed favorably by others
* Social desirability may produce systematic error and be a threat to the validity of a measure
* May be useful to administer a social desirability measure to correlate with the scale as a whole, or with individual items on the scale
* Example: Marlowe-Crowne = popular social desirability measure
Methods to increase power (8–11)
8. Increase the strength of the treatment
#9. Increase the variability of the treatment
#10. Use homogenous participants selected to be responsive to treatment
#11. Reduce random setting irrelevancies
8. Increase the strength of the treatment
8. Increase the strength of the treatment:
- Emphasize/Repeat instructions
- Add elements to manipulation
- Pilot testing and/or manipulation checks
- Increase experimental realism
- Try to prevent or minimize: partial treatment implementation, crossover and treatment diffusion, attrition and differential attrition
Efficacy study vs. Effectiveness study
Efficacy study:
* Full implementation of the treatment is desired
* Goal is to determine whether the treatment has a meaningful effect under optimal conditions
Effectiveness study
* Full implementation of the treatment may not be desired
* Goal is to determine what effects the treatment has under the conditions of ordinary use
Example: medications administered at same time every day (optimal conditions) vs. imprecise administration (ordinary use)
Methods to increase power:
- Increase the variability of the treatment
9. Increase the variability of the treatment:
- Increase the difference between the levels of the IV
  –> helps rule out restriction of range (statistical conclusion validity threat)
- (We don’t want levels to be too close together; we want them to be different enough to detect differences)
Methods to increase power:
- Use homogenous participants selected to be responsive to treatment
10. Use homogenous participants selected to be responsive to treatment
- Helps rule out the heterogeneity of respondents threat to statistical conclusion validity
- May reduce external validity (findings may not apply to other people with different characteristics)
Methods to increase power:
#11. Reduce random setting irrelevancies
11. Reduce random setting irrelevancies
- Experimenter exercises experimental control to keep everything constant except for IV
- Standardization of conditions
  - Can use a script or protocol that contains all the information about what the experimenter will say/do during a session
  - Can use an automated device to present instructions and manipulations, and even to measure responses