Research Design, Statistics, & Test Construction Flashcards
Quasi-experimental design
at least one IV is manipulated, but there is no random assignment of participants (typically because they are already in pre-existing groups)
Within-subjects design
groups compared are correlated or related; three conditions lead to this: repeated measures of same participants, subjects matched prior to assignment to groups, subjects have an inherent relationship (e.g., twins)
Latin square
most sophisticated form of counterbalancing the order of conditions across subjects in a repeated measures design
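As a rough illustration of the counterbalancing idea, the sketch below builds a simple cyclic Latin square in which every condition appears once in each ordinal position. The function name `cyclic_latin_square` and the condition labels are assumptions made for this example; a cyclic square balances position only, not which condition immediately precedes which.

```python
def cyclic_latin_square(conditions):
    """Return one presentation order per row; row i is the list rotated by i."""
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]


# Hypothetical three-condition repeated measures study
orders = cyclic_latin_square(["A", "B", "C"])
for row in orders:
    print(row)
# ['A', 'B', 'C']
# ['B', 'C', 'A']
# ['C', 'A', 'B']
```

Each row could be given to a different subgroup of subjects, so that no condition is always first or always last.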
Mixed design
includes groups that are both independent and correlated (e.g., patients randomly assigned to two different treatment groups and measured before and after treatment)
Idiographic
refers to single subject approaches (single or few participants studied intensely); designs include AB, ABAB, multiple baseline, simultaneous treatment, and changing criterion
Nomothetic
group approaches to research design (as opposed to single subject)
Autocorrelation
effect of measuring same person repeatedly; results in highly correlated data; problem of single subject design
AB design
baseline condition (A) followed by treatment condition (B); most significant problem is threat of history (difficult to determine whether intervention or other event caused change)
ABAB design
baseline (A) and treatment (B) alternated in ABAB sequence; protects against threat of history; two potential problems: failure of behavior to return to baseline, issues of ethics with removing effective treatment
Multiple baseline design
treatment is applied sequentially or consecutively across subjects, situations, or behaviors
Simultaneous (alternating) treatment design
two or more interventions implemented concurrently during the treatment phase that are balanced and varied across time of day
Changing criterion design
attempt is made to change behavior in increments to match a changing criterion (e.g., slowly reducing number of cups of coffee)
Momentary time sampling
simply recording whether target behavior is present or absent at moment that time interval ends
Whole-interval sampling
scoring target behavior positively only if exhibited for full duration of time interval
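The two scoring rules can give different results for the same observation. A minimal sketch, assuming the observation is coded as one True/False value per second; the function names and the 12-second timeline are hypothetical:

```python
def momentary_time_sampling(present, interval):
    """Score each interval by whether the behavior is present at its final moment."""
    return [present[end - 1] for end in range(interval, len(present) + 1, interval)]


def whole_interval_sampling(present, interval):
    """Score an interval positively only if the behavior lasts the full interval."""
    return [
        all(present[start:start + interval])
        for start in range(0, len(present) - interval + 1, interval)
    ]


# Hypothetical 12-second observation, one entry per second, scored in 4-second intervals
timeline = [True, True, True, True,
            False, True, True, True,
            True, True, False, False]

momentary = momentary_time_sampling(timeline, 4)
whole = whole_interval_sampling(timeline, 4)
print(momentary)  # [True, True, False]
print(whole)      # [True, False, False]
```

In the second interval the behavior is present when the interval ends but not for its full duration, so only momentary time sampling scores it.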
Analogue research
evaluates treatment under conditions that only resemble or approximate clinical situations; typically for less severe conditions; tight experimental control but limited generalizability (e.g., grad student clinicians using manual)
Clinical trials
outcome investigations conducted in clinical settings; often involve methodological compromises and sacrifices
Cross-sequential research
also called cohort-sequential research; takes several cross sections and follows them over briefer periods of time
Stratified random sampling
population is first divided into strata (e.g., age levels, income levels, ethnic groups), and then a random sample of equal size from each stratum is selected
Proportional sampling
individuals are randomly selected in proportion to their representation in the general population
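The contrast between equal-size stratified sampling and proportional sampling can be sketched as follows. The strata, sizes, and function names are invented for illustration, and the proportional version simply rounds each stratum's share, so the total may be off by one or two in less tidy cases.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample of the same size from every stratum."""
    return {name: rng.sample(members, n_per_stratum) for name, members in strata.items()}


def proportional_sample(strata, total_n, rng):
    """Draw from each stratum in proportion to its share of the population."""
    population = sum(len(members) for members in strata.values())
    return {
        name: rng.sample(members, round(total_n * len(members) / population))
        for name, members in strata.items()
    }


rng = random.Random(0)  # seeded only so the sketch is repeatable
strata = {
    "young": [f"y{i}" for i in range(60)],
    "middle": [f"m{i}" for i in range(30)],
    "older": [f"o{i}" for i in range(10)],
}

equal = stratified_sample(strata, 5, rng)
proportional = proportional_sample(strata, 20, rng)
print({name: len(chosen) for name, chosen in equal.items()})
# {'young': 5, 'middle': 5, 'older': 5}
print({name: len(chosen) for name, chosen in proportional.items()})
# {'young': 12, 'middle': 6, 'older': 2}
```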
Systematic sampling
selecting every kth element after a random start (e.g., if 100 out of 1000 persons are needed, every tenth person is selected); the list must be ordered in a way that does not introduce bias
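The card's own example (100 of 1000 persons, every tenth person) might be sketched like this. The function name and the numbered list of people are assumptions for illustration.

```python
import random


def systematic_sample(population, k, rng):
    """Pick a random start within the first k elements, then take every kth element."""
    start = rng.randrange(k)
    return population[start::k]


rng = random.Random(1)          # seeded only so the sketch is repeatable
people = list(range(1, 1001))   # hypothetical list of 1000 persons
sample = systematic_sample(people, 10, rng)
print(len(sample))  # 100
```

If the list has a periodic pattern that lines up with k (for example, every tenth person is a team leader), the sample would be biased, which is the caution the card raises.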
Cluster sampling
identifying naturally occurring groups of subjects (clusters) and randomly selecting certain clusters (e.g., classes or departments at a university, or schools within a particular school district)
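A minimal sketch of cluster sampling, assuming whole clusters are kept once selected. The school names, member labels, and function name are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters; everyone in a selected cluster is included."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]


rng = random.Random(2)  # seeded only so the sketch is repeatable
schools = {
    "School A": ["a1", "a2", "a3"],
    "School B": ["b1", "b2"],
    "School C": ["c1", "c2", "c3", "c4"],
    "School D": ["d1", "d2"],
}
sample = cluster_sample(schools, 2, rng)
print(sample)
```

The unit of random selection here is the school, not the individual student.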
History
threat to internal validity; incidents that intervene between measuring points, either in or outside of the experimental situation; best control is a control group
Maturation
threat to internal validity; factors that affect the subjects’ performance because of the passing of time (fatigue, maturing); best control is a control group
Testing or test practice
threat to internal validity; occurs when familiarity with testing affects scores on repeated testing; best control is Solomon Four-Group design
Solomon Four-Group design
controls for testing threats to validity; subjects are divided into four groups: (1) measured pre and post, gets intervention; (2) measured pre and post, does not get intervention; (3) measured post only, gets intervention; (4) measured post only, does not get intervention
Instrumentation
threat to internal validity; changes in observers or the calibration of equipment; control group corrects for this
Statistical regression
threat to internal validity; tendency for extreme scores (scores very much above or below the mean) to become less extreme (closer to the mean) on retesting, even without any type of intervention; control group controls for this
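Regression toward the mean can be seen in a small simulation with no intervention at all. This is only a sketch under invented assumptions: each observed score is a stable "true" score plus independent measurement error, and all numbers (mean 100, SDs of 10, 5000 simulated people) are arbitrary.

```python
import random
import statistics

rng = random.Random(42)  # seeded only so the sketch is repeatable

# Assumed model: observed score = true score + random measurement error
true_scores = [rng.gauss(100, 10) for _ in range(5000)]
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]  # retest, no intervention

# Select the people with the most extreme (highest) scores on the first test
cutoff = sorted(test1)[-500]
extreme = [i for i, score in enumerate(test1) if score >= cutoff]

mean_first = statistics.mean(test1[i] for i in extreme)
mean_retest = statistics.mean(test2[i] for i in extreme)
print(round(mean_first, 1), round(mean_retest, 1))
```

The retest mean of the selected group should fall between its first-test mean and the overall mean of 100, which is why an untreated control group selected the same way is needed for comparison.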
Selection bias
threat to internal validity; caused by non-random assignment; best avoided with random assignment of subjects to groups
Attrition or experimental mortality
threat to internal validity; differential loss of subjects from the groups; to assess for this, compare subjects who drop out with those who remain, using t-tests on relevant variables
Diffusion
threat to internal validity; occurs when the no-treatment group gets some of the treatment; difficult to eliminate completely, but tighter control over experimental situation can help
Construct validity
threats to construct validity are factors other than the intended specifics of the intervention that result in differences; often lumped under threats to external validity; i.e., not measuring what you think you are measuring
Attention and contact with clients
threat to construct validity; difficult to tell whether changes are due to treatment or attention
Experimenter expectancies
threat to construct validity; cues or clues transmitted to the subjects by the experimenter; Rosenthal effect; can be controlled by masking experimenter to conditions
Rosenthal effect
refers to experimenter expectancies
Demand characteristics
threat to construct validity; factors in the procedures that suggest how the subject should behave; control by masking subjects to their condition
John Henry effect
threat to construct validity; occurs when persons in a control group try harder than usual in the spirit of competition with the experimental group; control by making sure experimental and control groups do not know about each other and, if that is not possible, by not giving the groups any sense of competition
Threats to external validity
interfere with generalizability of effects
Sample characteristics
threat to external validity; difference between sample and population
Stimulus characteristics
threat to external validity; features of the study with which the intervention is associated (e.g., research assessing memory functioning in the laboratory may not be generalizable to memory functioning in naturalistic settings)
Contextual characteristics
threat to external validity; conditions in which intervention is embedded; e.g., reactivity
Reactivity
subjects behave in a certain way just because they are participating in research and being observed
Low power
threat to statistical conclusion validity; diminished ability to find significant results; small sample size and inadequate interventions can contribute
Unreliability of measures
threat to statistical conclusion validity; unreliable outcome measure
Variability in procedures
threat to statistical conclusion validity; inconsistency in treatment procedures; especially of concern in psychotherapy outcome research
Subject heterogeneity
threat to statistical conclusion validity; subject heterogeneity makes it more difficult to find significant differences between groups
Varies directly with
as one variable increases so does the other (e.g., a varies directly with b in a = b/c)
Varies indirectly with
as one variable increases the other decreases, i.e., the variables vary inversely (e.g., a varies indirectly with c in a = b/c)
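A quick numeric check of both relationships in a = b/c, using arbitrary positive values chosen for illustration:

```python
def a(b, c):
    """The example relationship from the cards: a = b / c."""
    return b / c


# Direct: hold c fixed and double b, and a doubles
print(a(2, 4), a(4, 4))   # 0.5 1.0

# Indirect (inverse): hold b fixed and double c, and a is halved
print(a(4, 4), a(4, 8))   # 1.0 0.5
```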
Ordinal data
involve tallying people to see which ordered category a person falls into (e.g., Likert scale, SES, percentile rank); group means cannot be calculated
Interval data
involve obtaining numerical scores for each person, where the score values have equal intervals; there is no zero score, or zero is arbitrary rather than absolute (e.g., IQ test, t-score, temperature); group means can be calculated
Ratio data
involve obtaining numerical scores for each person, where the score values have equal intervals and an absolute zero (e.g., score on EPPP, money in bank, weight, number of children)
Standard deviation
average deviation (or spread) from the mean in a given set of scores
Variance
standard deviation squared
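The relationship between the two cards can be checked with Python's standard library. The scores are made up, and the population formulas (dividing by N) are an assumption here; sample formulas divide by N - 1 and give slightly larger values.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical set of scores

mean = statistics.mean(scores)
variance = statistics.pvariance(scores)  # mean squared deviation from the mean
sd = statistics.pstdev(scores)           # square root of the variance

print(mean, variance, sd)
print(math.isclose(sd ** 2, variance))   # True
```

For these scores the mean is 5, the variance is 4, and the standard deviation is 2.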
Positive skew
higher proportion of scores in the lower range of values (mode has lowest value, mean highest)
Negative skew
higher proportion of scores in the higher ranges of values (mean has lowest value, mode highest)
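The ordering of mode and mean under each kind of skew can be seen with two small made-up data sets, the second being a mirror image of the first. The helper name `central_tendency` is hypothetical.

```python
import statistics


def central_tendency(scores):
    """Return (mode, median, mean) for a list of scores."""
    return statistics.mode(scores), statistics.median(scores), statistics.mean(scores)


positive = [1, 1, 1, 2, 2, 3, 4, 8, 14]   # scores pile up at the low end
negative = [15 - x for x in positive]     # mirror image: scores pile up at the high end

print(central_tendency(positive))  # mode 1 < median 2 < mean 4
print(central_tendency(negative))  # mean 11 < median 13 < mode 14
```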
Kurtosis
refers to how peaked a distribution is
Leptokurtic
distribution with a very sharp peak
Platykurtic
distribution that is very flat
Criterion-referenced or domain-referenced score
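score interpreted against a fixed standard of content mastery rather than against other test takers; example is percentage correct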
Norm-referenced score
provides information on how person performed relative to group
Standard scores
based on standard deviation from the sample
Z-scores
standard scores that correspond directly to standard deviation units; transforming into Z-scores does not normalize a distribution (exact same distribution shape); z score = (score - mean)/SD
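The z-score formula on this card, applied to a few scores. The IQ-style mean of 100 and SD of 15 are assumed values for illustration.

```python
def z_score(score, mean, sd):
    """z = (score - mean) / SD"""
    return (score - mean) / sd


iq_scores = [85, 100, 115, 130]            # hypothetical raw scores
z = [z_score(s, 100, 15) for s in iq_scores]
print(z)  # [-1.0, 0.0, 1.0, 2.0]
```

Because the transformation is linear, the scores keep their order and relative spacing, which is why converting to z-scores does not change the shape of the distribution.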
Z-scores and percentile ranks
approximate percentile ranks in a normal distribution: -3 = 0.1, -2 = 2.3, -1 = 16, 0 = 50, 1 = 84, 2 = 97.7, 3 = 99.9
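Flashcard figures like these are rounded. As a sketch, percentile ranks for whole-number z-scores can be computed from the cumulative normal distribution using `statistics.NormalDist` from Python's standard library; this assumes the scores are normally distributed.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Percentile rank = percentage of the distribution falling below a given z-score
percentile_ranks = {z: 100 * standard_normal.cdf(z) for z in range(-3, 4)}

for z, pr in percentile_ranks.items():
    print(f"z = {z:+d}: percentile rank {pr:.1f}")
```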
Parameters
population values
Statistics
sample values
Standard error of the mean
average amount of deviation of sample means from the population mean; equal to population SD divided by square root of sample size
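The formula on this card, population SD divided by the square root of the sample size, can be sketched as follows. The SD of 15 and the sample sizes are assumed values for illustration.

```python
import math


def standard_error_of_mean(population_sd, n):
    """SEM = population SD / sqrt(sample size)"""
    return population_sd / math.sqrt(n)


sem_25 = standard_error_of_mean(15, 25)
sem_100 = standard_error_of_mean(15, 100)
print(sem_25, sem_100)  # 3.0 1.5
```

Quadrupling the sample size halves the standard error, so larger samples give sample means that cluster more tightly around the population mean.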