Exam 1: Experimental Designs Flashcards
Treatments
Levels of the independent variable that you control
Denotations of variables/designs:
X: independent variable
O: observation (a measurement of the dependent variable)
R: random assignment
What is an independent variable
One or more factors that you manipulate or control
True Experiment
- IV AND participants are randomly assigned (R)
* Strong evidence for causal conclusions
Quasi-Experiment
• IV and participants are not randomly assigned
• Weaker evidence for causal conclusions
Ex: degrees of HL, age of people
- Manipulate the IV, but can’t/don’t randomly assign participants to groups or conditions (e.g., intact groups)
- More susceptible to threats to internal and external validity; weaker evidence for cause-effect conclusions
Factorial Design
- May include both true and quasi-experimental components
* May be considered a true experiment if even one IV meets the criteria, but strong causal evidence holds only for the truly manipulated IV
Posttest Only
Randomized Treatment Groups
• No-Treatment Control
R X O
R    O
• Alternative Treatment Control
R X1 O
R X2 O
• What are other possibilities for post-test only designs? Why would (or wouldn’t) you use this design?
Random assignment means that differences between groups (age, education level, sex, etc.) should be naturally “washed out,” but there may actually be differences between these subgroups; you can’t always do pretesting, depending on the situation
Pretest-Posttest
Randomized Control Group Design
• Also known as a mixed model: randomly assign, make a measurement, apply the treatment, and re-measure
• Within-subjects and between-subjects components
R O X O
R O     O
Why would (or wouldn’t) you use this design? Learning effects from repeated exposure; if the learning effect is big in the control group, it may swamp the effects of the treatment
Solomon Randomized Four-Group Design
need double the participants and the time
Pretest-Posttest
R . O X O
R . O . O
Posttest Only
R . X O
R . O
Switching Replications Design
R  O  X  O      O
R  O      O  X  O
• Similar to a crossover design: each person serves as their own control
• We hope our interventions/therapies carry over, but we can’t be sure the no-treatment epochs are the same
• Differences in exposure between the groups
Factorial
• Multiple IVs examined in one design
• Examine main effects and interactions
The # of #s indicates how many IVs there are
The # itself tells you how many levels that IV has
For example: 2 X 3
group (younger vs. older) x treatment type (new1, new2, standard) = 6 conditions
A 2 X 2 X 2 has 8 conditions (see the sketch below)
Interaction (dependency): when the effect of one IV depends on the other, e.g., HAs work well when the patient is young but not when old
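A minimal sketch (Python, not from the slides; variable names and levels are made up) showing that the number of conditions in a factorial design is the product of the levels of each IV:

```python
from itertools import product

# Hypothetical 2 x 3 factorial: group (2 levels) x treatment type (3 levels)
groups = ["younger", "older"]
treatments = ["new1", "new2", "standard"]

conditions = list(product(groups, treatments))
print(len(conditions))          # 6 conditions (2 * 3)
for condition in conditions:
    print(condition)

# Likewise, a 2 x 2 x 2 design has 2 * 2 * 2 = 8 conditions
print(len(list(product([0, 1], [0, 1], [0, 1]))))   # 8
```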
Nonequivalent
Control Group Design
• Compare intact groups
- N (or omitted) denotes nonrandom assignment
• Typically pretest-posttest:
N O X O
N O     O
• What are some other options? What design option might strengthen cause-effect conclusions?
if you find that the treatment is really effective, what do you tell the people who were denied treatment?
Double Pretest
N O O X O
N O O O
Repeated Measures
• AKA within-subjects design
• Two or more measures from the same individual
• May be experimental (e.g., group) or non-experimental (e.g., time)
• Measurements may either occur within a particular session (e.g., task conditions) or across multiple sessions
X1 O X2 O
• Why is this a quasi-experimental design? What are the benefits/limitations? not random b/c everyone gets everything
Counterbalancing
• Designed to avoid order effects
• Randomly assign people to a given order
R X1 O X2 O
R X2 O X1 O
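A minimal sketch (Python; participant IDs are made up) of randomly assigning people to the two counterbalanced orders in the notation above:

```python
import random

# Hypothetical participant IDs and the two condition orders (X1 then X2, or X2 then X1)
participants = ["P01", "P02", "P03", "P04", "P05", "P06"]
orders = [("X1", "X2"), ("X2", "X1")]

random.shuffle(participants)                   # random assignment to orders
assignments = {p: orders[i % len(orders)]      # alternate so each order is used equally often
               for i, p in enumerate(participants)}

for participant, order in assignments.items():
    print(participant, "->", " then ".join(order))
```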
Single-Subject Design (concepts)
• AKA single case
• Not necessarily conducted on a single participant, but the data are presented in terms of individuals rather than groups
Single subject does not = case study
• Repeated measures: many measures of the target behavior taken over many sessions
• Lacks random assignment
need a lot of samples to know what's going on
Single-Subject Design Notations
Notation
• A = baseline (no treatment)
• B = first treatment
• C (etc.) = second treatment
• Simplest A-B sequence is weak, seldom used
• Many other explanations for changes from pre to posttest
• Might be used as part of documentation
Treatment Withdrawal Design
• A1-B-A2
• Also called reversal design
On-off: see if there’s an effect
stronger - if I stop providing the therapy, you should go back to baseline
* stronger causal claims if you can show changes when treatment is both added and taken away
Treatment Replication - withdrawal
• A1-B1-A2-B2
• What if the person doesn’t return to baseline?
if A2 doesn’t return totally to baseline, then B2 should be even better
Multiple Treatment
• A1-B1-A2-C-A3
• Draw conclusions about B, but note that C always appears after B (serial order confound)
• How might you deal with this if you wanted to draw conclusions about C?
get another person and change the order
Multiple Baselines
• Across Behaviors
• Observe 2+ behaviors
- One you want to treat first
- One you will treat differently
• e.g., lists that test perception/production of speech sounds
• Similar in difficulty, but no overlap in content
• What sorts of treatments is this design good for?
• Across Participants
- Balance treatment orders across more than one individual
- Deals with carryover effects
- Example here used with a treatment replication design
Levels of Evidence
- Strength of the evidence
- How well can you draw causal conclusions?
- True experiments are sometimes called RCTs
There is not one agreed-upon hierarchy.
Background information; case reports/case-control studies; cohort studies; RCTs; critically appraised topics (CATs); systematic reviews
What is depth in terms of levels of evidence?
The amount of converging evidence
Internal validity
History, maturation, statistical regression, instrumentation, selection, mortality
- Extent to which conclusions about cause-effect relationships are accurate
- Might there be other explanations (covert variables) for the patterns you observed?
History
Threat to internal validity
• Outside events influence participants in the course of the experiment or between repeated measures of the DV
Maturation
Threat to internal validity
• Participants may change in the course of the experiment or between repeated measures of the DV due to the passage of time
- Permanent changes: e.g., biological growth
- Temporary changes: e.g., fatigue
address this with a control group
maturation tends to be biological growth or fatigue
Statistical Regression
Threat to internal validity
• AKA Regression to the Mean
• You might select participants to study because they perform at an extreme (very high or very low) on a first measure
• However, participants with extreme scores on a first measure of the DV tend to have scores closer to the mean on a second measure
• Extreme scores are more likely to occur via chance
eliminate the possibility that the response was an outlier (eliminate random noise)
• Example:
• A person scored 750 out of 800 on the quantitative portion of a standardized test
• Assuming the same test difficulty and no learning/practice effect, what would you expect the person to score on the second test?
• Ways someone could score 750
1. “true” score is 750 & they had exactly average luck
2. “true” score is < 750 & they had better than average luck
3. “true” score is > 750 & they had worse than average luck
• Few people have “true” scores >750 (~6 in 1,000); more people have true scores 700-750 (~17 in 1,000)
• Thus, more likely that someone scoring 750 is from group 2 than group 3
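A minimal simulation sketch of regression to the mean (Python); the population mean, spread, and “luck” values are assumptions for illustration, not the course’s numbers:

```python
import random

random.seed(1)

def observed(true_score):
    """One test administration = true score plus random 'luck' (measurement noise)."""
    return true_score + random.gauss(0, 40)

# Assumed population of stable "true" abilities
true_scores = [random.gauss(600, 70) for _ in range(100_000)]

# Select people whose FIRST observed score was extreme (>= 750)...
selected = [t for t in true_scores if observed(t) >= 750]

# ...then look at their SECOND observed score: on average it falls back toward the mean,
# because the selected group contains more "lucky" scorers than "unlucky" ones.
second_scores = [observed(t) for t in selected]
print("n selected:", len(selected))
print("mean second score:", round(sum(second_scores) / len(second_scores), 1))  # < 750
```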
Instrumentation
Threat to internal validity
• The reliability of the instrument used to gauge the DV or manipulate the IV may change over the course of an experiment
- Changes in calibration of a measuring device
- Changes in the proficiency of an observer/ interviewer (human “instruments”)
• Possible in any pretest-posttest design
• Importance of recruiting study groups in parallel
Selection
Threat to internal validity
• Groups differ in a systematic, non-random way prior to a study
• Intact groups (e.g., two groups of children in classrooms)
• Only weaker cause-effect conclusions are possible
• May co-occur with maturation differences (Different learning rate prior to the study)
• May co-occur with history differences (Differences in teacher style in the two classrooms)
Mortality
- In the course of an experiment, some subjects may drop out before it is completed (attrition)
- Particularly problematic if the reason for dropping out is non-random (examples?)
- How might you deal with this problem?
- Pre-test comparisons
- Intention-to-treat vs. as-treated analysis
non-compliance = people only showed up for 3/5 sessions
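A minimal sketch (Python, made-up illustrative numbers) contrasting intention-to-treat with as-treated analysis, showing how analyzing only the compliant participants can reintroduce selection bias:

```python
# Each tuple: (assigned_group, compliant, outcome_score) -- hypothetical data
participants = [
    ("treatment", True,  8), ("treatment", True,  7),
    ("treatment", False, 3), ("treatment", False, 4),
    ("control",   True,  4), ("control",   True,  5),
    ("control",   True,  4), ("control",   True,  5),
]

def mean(values):
    return sum(values) / len(values)

# Intention-to-treat: analyze everyone in the group they were randomized to, compliant or not
itt_mean = mean([s for g, c, s in participants if g == "treatment"])

# As-treated: analyze only those who actually received the treatment as intended
as_treated_mean = mean([s for g, c, s in participants if g == "treatment" and c])

print("ITT treatment mean:       ", itt_mean)          # 5.5 - preserves randomization
print("As-treated treatment mean:", as_treated_mean)   # 7.5 - may reintroduce selection bias
```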
Experimenter/Participant Bias
Threats
Rosenthal effect, placebo effect, nocebo effect, Hawthorne effect; blinding is used to reduce these biases
Rosenthal effect
• AKA Pygmalion effect or self-fulfilling prophecy
• Set up expectations (intentionally or unintentionally)
• Researcher Robert Rosenthal told some teachers that children in their classes would undergo a jump in intellectual ability
• Differences arose likely just due to different expectations/interactions
Blinding
• Single Blind: participant doesn’t know what group they’re in
• Double Blind: participant AND researcher don’t know
• Triple Blind: participant AND researcher AND the person scoring the data don’t know
Placebo Effect
Positive effect simply due to believing you’ve received treatment
Nocebo Effect
Negative effect simply due to believing you’ve received treatment
Hawthorne effect
- AKA Reactivity
- Named for a study of worker productivity at the Hawthorne plant of an electric company
- Productivity improved regardless of what was changed
- Behavior changes just because individuals know they are being studied
Case Example:
• There were an equal number of boys and girls in the class, so for convenience the boys were assigned to the Control Group and the girls to the Experimental Group.
• One day, the boys were told to go to one room and the girls to another room, where they were given their respective videos. Two days later, the Generalization Probe was conducted.
• The mean score for the Control Group was 1.2 and for the Experimental Group was 3.4. We conclude the video improved turn-taking behavior for children with autism.
What is this type of design called? What is the problem (i.e., is there a threat to internal validity)?
posttest-only, non-randomized (nonequivalent) control group design
selection issue
Case Example:
• On Day 1, all children viewed the 20-minute cartoon; On Day 2, the Generalization Probe for all students was conducted; On Day 3, all children viewed the 20-minute video; On Day 4, a second Generalization Probe was conducted for all students.
• The mean score for the Control video was 1.2 and for the Experimental video was 3.4. We conclude the video improved turn-taking behavior for children with autism.
What is the design type?
Is there a problem?
repeated measures
repeated testing threat
history threats
Case Example: There were an equal number of boys and girls, so for convenience the boys were assigned to the Control Group and the girls to the Experimental Group. During a class early in the school year, a Generalization Probe was conducted for all children. The experimenter fell ill soon afterwards, and so it wasn't until a class late in the school year that the children were separated into groups, with the control children viewing the 20-minute cartoon and the experimental children viewing the 20-minute video. Two days after, a second Generalization Probe was conducted. We conclude that the 20-minute interactive video improved the children's turn-taking skills.
What is the design type? Is there a problem?
non-random assignment (nonequivalent groups)
pretest-posttest design
maturation threat (given the long delay between pretest and posttest)
Case Example:
• The name of each child in the classes was written on a separate slip of paper. All the slips were put in a bowl and mixed up thoroughly. Students were assigned to the Experimental and Control Groups alternately as their names were pulled out of the bowl.
• One day at school, the children in the Control Group were told to go to one room and children in the Experimental Group to another room, where they were exposed to their respective conditions.
• Some of the children in the Experimental Group appeared bored by the interactive video, became disruptive, and were removed from the room. Two days later, the Generalization Probe was conducted.
• The mean score for children in the Control Group was 1.2 and the mean score for the remaining children in the Experimental Group was 3.4. We conclude that the 20-minute interactive video improved the children’s turn-taking skills.
What is the design type? Is there a problem?
Random assignment, posttest-only design
mortality - loss of the kids who were disruptive; systematically losing kids from one group but not the other
history - one group had a more disruptive atmosphere
Case Example:
- Andy swore a lot, a bad habit he hoped to kick. His wife, Betty, suggested that Andy record every time he used profanity during the day by transferring a bead from one pocket to another, and counting the beads transferred at the end of the day.
- “Do this for the first week,” she said, “and we’ll see how often you typically swear. Then, the next week, continue counting, but we’ll add a new twist: I will only cook dinner on days when you have sworn 50% less than your average daily count from the first week.”
single-subject design - a kind of pre/post experiment (pretest/baseline, then treatment)
instrumentation - he is measuring himself
self-fulfilling prophecy / experimenter bias
Writing research methods:
Participants
• Who was enrolled in your experiment
• Include major demographics that have an impact on the results of the experiment (e.g., if race is a factor, provide a breakdown by race).
• The accepted term for describing a person who participates in research studies is “participant,” not “subject”; when talking about a design, you usually still say “within-subjects” or “between-subjects.”
Writing research methods:
Equipment
- AKA apparatus
- Any specialized equipment used for data collection (Eye trackers, Audio systems)
- Again, think about what information someone would need to replicate the study in a new lab
Writing research methods:
Materials
- Can include scripts, surveys, or software used for data collection
- Type and number of stimuli (e.g., 200 NU-6 words), how they were processed (e.g., low pass filtered at 4000 Hz), normed, counterbalanced
- May want to provide specific examples of materials or prompts, depending on your study
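A minimal sketch (Python/SciPy; the sampling rate, filter order, and noise stand-in for a recorded word are assumptions) of the kind of stimulus processing a Materials section should report, e.g., low-pass filtering at 4000 Hz:

```python
import numpy as np
from scipy import signal

fs = 44100                                    # assumed sampling rate (Hz)
stimulus = np.random.randn(fs)                # 1 s of noise standing in for a recorded NU-6 word

# 8th-order Butterworth low-pass filter with a 4000 Hz cutoff
sos = signal.butter(8, 4000, btype="low", fs=fs, output="sos")
filtered = signal.sosfiltfilt(sos, stimulus)  # zero-phase (forward-backward) filtering

print(filtered.shape)                         # same length as the input stimulus
```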
Writing research methods:
Procedure
- How to run the experiment
- A description of the experimental design and how participants were assigned to conditions
- Identify IV, DV, control variables. Give variables clear, meaningful names
- Provide the key instructions to participants
- A step-by-step listing in chronological order of what participants did (Sometimes a picture is worth a 1000 words)
Writing research methods:
Analyses
- Any info critical for someone being able to conduct the same analysis as you did (but not the Results yet)
- Data cleaning/pre-processing steps (If you removed certain trials or participants from analyses, justify why and report how many)
- The type of analysis (e.g., 2x2 ANOVA with levels…), whether you will test for interactions (If you’re doing something non-standard, be sure to cite the literature to help you justify this approach)
- Statistical software, including version number
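A minimal sketch (Python with pandas/statsmodels; column names and data are hypothetical) of running a 2x2 ANOVA with an interaction term and reporting the software version:

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data: one row per participant, two 2-level IVs ("group", "treatment")
# and a continuous DV ("score")
df = pd.DataFrame({
    "group":     ["younger"] * 4 + ["older"] * 4,
    "treatment": ["new", "new", "standard", "standard"] * 2,
    "score":     [8, 7, 5, 6, 4, 5, 6, 7],
})

# 2x2 between-subjects ANOVA: main effects of group and treatment plus their interaction
model = smf.ols("score ~ C(group) * C(treatment)", data=df).fit()
print(anova_lm(model, typ=2))     # Type II sums of squares

# The Methods section should also report the statistical software and version, e.g.:
import statsmodels
print("statsmodels", statsmodels.__version__)
```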