Final Flashcards
Threats to internal validity (8 points)
History, maturation, instrumentation, testing, statistical regression, intact groups, selection, mortality
Threats to external validity (4 points)
Selection bias, reactive effects of experimental arrangements, reactive effects of testing, multiple-treatment interference
How to counter threats to internal validity?
Use true experimental designs
What are the true experimental designs?
Pretest-posttest randomized control group design, posttest-only randomized control group design, Solomon randomized four-group design.
Nippold Article (9 points)
Focuses on treatment studies. Studies must be duplicable, target the right participants and ensure there is no interference, collect a baseline, include a control group, have matched experimental and control groups, control for test-retest reactivity, show improvements that generalize beyond treatment setting, control for placebo effect, have no conflict of interest.
Gillam and Gillam
PICO, levels of evidence. Must weigh internal evidence (student-parent and clinician-agency factors) against external evidence (what the research says).
PICO
Patient, Intervention program, Comparison treatment, Outcome
Levels of evidence
1 = RCTs and SRs; 2 = nonrandomized studies, multiple baseline designs, SRs; 3 = studies of multiple cases who receive the same treatment; 4 = single case studies; 5 = expert opinion
Social validation
Are results meaningful? Can others notice the difference?
Content validity
Appropriateness of the content of a measure. Broad sample of content is better than a narrow one, important material should be emphasized, questions should be written to measure the appropriate skills.
Face validity
Opinion of experts on measure
Criterion validity
Comparing the results of a test with a meaningful, related criterion
Predictive validity
The extent to which a test predicts the outcome it is supposed to predict.
Construct validity
Using subjective judgements to create a hypothesis that is tested using empirical methods.
Judgmental-empirical validity
Indirect evidence
Criterion-related validity
Direct evidence
Correlation coefficients
Use with quantitative data
IRB and Belmont Report
The Belmont Report sets forth the basic ethical principles required for research with human subjects. FWA = Federalwide Assurance. Three ethical principles form the basis of the HHS regulations (45 CFR 46): respect for persons, beneficence, and justice. The Secretary of the Department of Health and Human Services has oversight of the system protecting human subjects.
3 famous violations of ethics
Willowbrook studies, radiation tests on mentally impaired boys, Tuskegee Syphilis study
Monster study
Orphans put under psychological stress to induce stuttering
Test-retest
Should only be used when learning/maturation is not possible. Compute correlation coefficient for scores.
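The correlation step for test-retest reliability can be sketched as follows. This is a minimal illustration, assuming a Pearson correlation is the coefficient wanted; the function name `pearson_r` and the scores are hypothetical and not taken from the course materials.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each administration.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five people on two administrations.
first_administration = [10, 12, 15, 18, 20]
second_administration = [11, 13, 14, 19, 21]
print(round(pearson_r(first_administration, second_administration), 3))
```

A coefficient near +1 suggests the test ranks people consistently across the two administrations.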
Pros and Cons of Reversal
Pros of reversal: demonstrates experimental control; time confounds (e.g., maturation) are unlikely. Cons of reversal: difficult when treatment effects are long-lasting; withdrawing an effective treatment raises ethical concerns.
Pros and Cons of Multiple Baseline
Pros: No need for reversal, so it works for treatments with long-lasting effects. Cons: Individual differences, can take a long time.
What are SSDs good for?
Learning, treatment studies
Treatment fidelity/procedural validity
Otherwise known as reliability of IV. Calculated using point-by-point or total % agreement. Strategies that monitor and enhance accuracy and consistency of intervention to ensure it is implemented as planned, each component delivered to all participants the same way over time. Can also use a checklist to ensure procedural validity.
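The two agreement calculations named above can be sketched like this. It is a hedged illustration: the function names and observer records are hypothetical, and total agreement is assumed to be the smaller count divided by the larger count.

```python
def point_by_point_agreement(observer_a, observer_b):
    """Percent of intervals on which two observers recorded the same value."""
    if len(observer_a) != len(observer_b):
        raise ValueError("Both observers must score the same intervals")
    agreements = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100 * agreements / len(observer_a)

def total_agreement(count_a, count_b):
    """Percent agreement on overall counts: smaller total / larger total."""
    smaller, larger = sorted((count_a, count_b))
    if larger == 0:
        return 100.0  # Neither observer recorded anything.
    return 100 * smaller / larger

# Hypothetical records: 1 = step delivered as planned, 0 = not delivered.
observer_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
observer_b = [1, 0, 0, 1, 0, 1, 1, 1, 1, 1]

print(point_by_point_agreement(observer_a, observer_b))   # 80.0
print(total_agreement(sum(observer_a), sum(observer_b)))  # 100.0
```

In this example both observers count 7 delivered steps, so total agreement is 100% even though they disagree on two intervals. Point-by-point agreement is the stricter check.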
Reliability definition
Test yields consistent results
Validity definition
Test measures what it is intended to measure (e.g., it covers the correct content)
Simple random sample
Every member of a population is given an equal chance of being included in a sample
Systematic sampling
Every nth person is selected
Stratified random sampling
Participants are randomly selected from each stratum in proportion to its size in the population
Cluster sampling
Researchers draw groups, not individuals. Must use a large number of clusters because the clusters are often more homogeneous than the general population.
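The four probability sampling methods above (simple random, systematic, stratified, cluster) can be compared in a short sketch. All function names and the example data are hypothetical; the systematic version assumes a random starting point, which the card does not specify.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random start, then take every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum.

    `strata` maps a stratum name to its list of members. Rounding means
    the total can differ slightly from fraction * population size.
    """
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Draw whole groups at random and keep every member of each group."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)  # Fixed seed so the sketch is repeatable.
population = list(range(100))
strata = {"younger": population[:60], "older": population[60:]}
clusters = {"school_a": [1, 2, 3], "school_b": [4, 5], "school_c": [6, 7, 8]}

print(simple_random_sample(population, 10, rng))
print(systematic_sample(population, 10, rng))
print(stratified_sample(strata, 0.1, rng))
print(cluster_sample(clusters, 2, rng))
```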
Purposive sampling
Participants are chosen based on their usefulness in the study.
Snowball sampling
Used for participants who are hard to find. Recruit one participant from the group, who then recruits others from the same group to join the study.
Measurement scales with examples
Nominal (Which L1 spoken), Ordinal (Reading level), Interval (math test), Ratio (weight)
Cookies experiment in class
t-test
Gummy bear experiment in class
2-way ANOVA
Chi square
Used for nominal data. Example of a bivariate analysis.
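As an illustration of how the statistic is built from nominal counts, here is a minimal sketch of the chi-square calculation for a two-variable table. The function name and the counts are hypothetical; turning the statistic into a p-value needs a chi-square distribution table or a statistics package, which is left out here.

```python
def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = group, columns = yes/no response.
observed = [[20, 10],
            [10, 20]]
chi2, df = chi_square_statistic(observed)
print(round(chi2, 2), df)  # 6.67 1
```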
Practical vs statistical significance
Statistically significant: differences in group means are not due simply to chance. In large samples, a difference can be statistically significant without being practically significant.
Cohen’s d
Shows effect size. Calculated by dividing the difference between the 2 means by the standard deviation. Typically falls between -3 and +3.
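A worked sketch of the calculation, under one stated assumption: the card does not say which standard deviation to use, so this version uses the pooled standard deviation of the two groups. The function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(group1, group2):
    """Difference between two means in pooled standard deviation units."""
    n1, n2 = len(group1), len(group2)
    s1, s2 = stdev(group1), stdev(group2)  # Sample standard deviations.
    # Assumption: pool the two variances, weighted by degrees of freedom.
    pooled_sd = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / pooled_sd

# Hypothetical posttest scores.
treatment = [12, 14, 16, 18, 20]
control = [10, 12, 14, 16, 18]
print(round(cohens_d(treatment, control), 2))  # 0.63
```

The sign only shows which group scored higher; the size of d is what gets a label.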
Labels for Cohen’s d
.2=small, .5=medium, .8=large, 1.10=very large, 1.40=extremely large
Effect size for correlation?
r² (r squared, the coefficient of determination)
Effect size for t-test vs. ANOVA
t-test = Cohen’s d; ANOVA (F) = eta squared, the treatment sum of squares divided by the total sum of squares
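For ANOVA, one common effect size built from the treatment sum of squares is eta squared: the treatment (between-groups) sum of squares divided by the total sum of squares. The sketch below assumes that is the measure intended; the function name and scores are hypothetical.

```python
def eta_squared(groups):
    """Proportion of total variability explained by group membership."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    # Total variability of every score around the grand mean.
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    # Variability of the group means around the grand mean.
    ss_treatment = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_treatment / ss_total

# Hypothetical scores for three treatment groups.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(eta_squared(groups), 2))  # 0.9
```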
Meta-analysis
Set of statistical methods for combining the results of previous studies. Uses the mean. Meta-analyses pool large samples and are often reliable, but they are no more valid than their component studies.
Meta-analyses and effect size
Calculating Cohen’s d for all studies to be included permits the averaging of values of d to get a meaningful result. This solves the problem of the different scales of the various included studies’ results (unable to be averaged).
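The averaging step can be sketched as follows. The simple mean matches the card; the sample-size-weighted mean is an added assumption, shown only because larger studies are often given more weight. Function names and values are hypothetical.

```python
def mean_effect_size(d_values):
    """Simple average of Cohen's d across studies."""
    return sum(d_values) / len(d_values)

def weighted_mean_effect_size(studies):
    """Average of d weighted by each study's sample size.

    `studies` is a list of (d, n) pairs. Weighting by n is an assumption
    of this sketch, not something stated in the card.
    """
    total_n = sum(n for _, n in studies)
    return sum(d * n for d, n in studies) / total_n

# Hypothetical studies: (Cohen's d, number of participants).
studies = [(0.2, 20), (0.5, 50), (0.8, 30)]
print(round(mean_effect_size([d for d, _ in studies]), 2))  # 0.5
print(round(weighted_mean_effect_size(studies), 2))         # 0.53
```

Because every study is expressed in standard deviation units, results measured on different scales can be combined this way.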
SSDs and ASD
SSDs are good for ASD research. It is unethical to leave kids with ASD without treatment, so placing them in a control group in an experimental study is not possible. Also, kids with ASD are very heterogeneous, so the experimental and control groups would not be similar.
In the study by Koegel and Koegel, what was the independent variable introduced in the B condition?
maintenance trials interspersed with acquisition trials
In the Kern et al. study, what was the baseline condition?
typical morning greeting routine
What type of research design was used in the Kern et al. study?
withdrawal design
The study by Koegel and Koegel used a multiple baseline design. What were the intervention conditions replicated across?
Academic tasks
In the Kern study, what type of reliability was calculated?
percent agreement using the point by point method
According to Cardon and Azuma, what are the two most common types of SSRDs used in autism intervention research?
withdrawal and multiple baseline
According to Cardon and Azuma, why is it important to replicate findings across additional labs?
to ensure that results are not due to the specific set of researchers and their students
What ethical violation occurred in the famous Willowbrook Case?
children in a mental hospital were given hepatitis
How did Sigafoos, Didden and O’Reilly measure reliability in their study on speech output in children with developmental disabilities?
they compared percent agreement between two observers
In terms of validity, what is the term for a standard by which a test is judged?
criterion
Part of the Belmont Report states that researchers should obtain maximum benefits and minimize possible harm. What principle is this?
Beneficence
You are writing your paper using APA style. What does APA stand for?
American Psychological Association
According to the Ethics video, how long is it recommended that researchers keep data?
5-7 years
Romski article
Randomized group design, NO CONTROL. This study compared the language performance of young children with developmental delays who were randomly assigned to 1 of 3 parent-coached language interventions. Differences in performance on augmented and spoken word size and use, vocabulary size, and communication interaction skills were examined. Sixty-eight toddlers with fewer than 10 spoken words were randomly assigned to augmented communication input (AC-I), augmented communication output (AC-O), or spoken communication (SC) interventions; 62 children completed the intervention. This trial assessed the children’s symbolic language performance using communication measures from the language transcripts of the 18th and 24th intervention sessions and coding of target vocabulary use. All children in the AC-O and AC-I intervention groups used augmented and spoken words for the target vocabulary items, whereas children in the SC intervention produced a very small number of spoken words. Vocabulary size was substantially larger for AC-O and AC-I than for SC groups.
Justice article
This study examined the impact of teacher use of a print referencing style during classroom-based storybook reading sessions conducted over an academic year. Impacts on preschoolers’ early literacy development were examined, focusing specifically on the domain of print knowledge. This randomized, controlled trial examined the effects of a print referencing style on 106 preschool children attending 23 classrooms serving disadvantaged preschoolers. Following random assignment, teachers in 14 classrooms used a print referencing style during 120 large-group storybook reading sessions during a 30-week period. Teachers in 9 comparison classrooms read at the same frequency and with the same storybooks but used their normal style of reading. Children whose teachers used a print referencing style showed larger gains on 3 standardized measures of print knowledge: print concept knowledge, alphabet knowledge, and name writing, with medium-sized effects.
Kern article
Withdrawal design. ABAB, ABCAC. This study evaluated the effects of individually composed songs on the independent behaviors of two young children with autism during the morning greeting/entry routine into their inclusive classrooms. A music therapist composed a song for each child related to the steps of the morning greeting routine and taught the children’s teachers to sing the songs during the routine. The effects were evaluated using a single subject withdrawal design. The results indicate that the songs, with modifications for one child, assisted the children in entering the classroom, greeting the teacher and/or peers and engaging in play. For one child, the number of peers who greeted him was also measured, and increased when the song was used.
Koegel and Koegel article
The article assesses the effects of task-sequencing variables on the academic performance of an 8-year-old severe stroke victim. Within a multiple baseline design, previously acquired (maintenance) task trials were systematically interspersed at designated points in treatment among new (acquisition) task trials. The results showed improvements in both academic responding and subjective ratings of motivation in each of four treated areas (spelling, reading, word-finding, and memory). Social validation data obtained from standardized school placement examinations suggested marked improvement in a variety of related areas of academic functioning. Results suggest that children suffering severe strokes may be capable of learning more than has previously been suspected, and that behavioral treatments may improve such children’s functioning.