Research Design & Statistics Flashcards
aim of science
discover systematic explanations for and/or rules governing natural phenomena
research
conduct systematic investigations and inquiries into the phenomenon (or phenomena) in question
research design
plan that specifies the research strategy — how subjects will be selected, how variables will be defined and measured, the conditions under which the research will be conducted, etc.
basic sequence of a scientific inquiry
1) a hypothesis (or proposition) regarding the relationship between 2+ variables is formulated
2) hypothesis is operationally defined (specify what exactly we should observe if the hypothesis is true)
3) collect and analyze data to test the hypothesis
variable
simply anything that varies;
not consistent or having a fixed pattern; liable to change
constant
something that does not vary;
factors that do not change during the experiment
independent variable (IV)
input variable — the event or treatment manipulated by the researcher
other names for IV
the treatment variable or experimental variable
dependent variable (DV)
the outcome variable;
what is hypothesized to change as a result of manipulations of the independent variable;
measured to determine whether it changes as a result of the experimental manipulations
correlational research
investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them;
variables are measured not manipulated;
finding an association, not causation;
status on one variable can be used to predict status on another variable
predictor variable
variable that is suspected to predict or correlate with an outcome variable
criterion variable
the outcome, result, or effect that researchers try to predict or explain in a study
levels
when applied to a variable, refers to the values it could take
factorial design
experimental design that lets you study the effects of multiple factors (IVs) simultaneously;
each level of one independent variable is combined with each level of the others to produce all possible combinations
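A minimal sketch of how the cells of a factorial design can be enumerated. The two IVs and their levels ("therapy", "dose") are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical 2 x 3 factorial design: two IVs with 2 and 3 levels.
factors = {
    "therapy": ["CBT", "supportive"],
    "dose": ["0 mg", "10 mg", "20 mg"],
}

# Cross every level of one IV with every level of the other.
cells = list(product(*factors.values()))

for therapy, dose in cells:
    print(f"{therapy} + {dose}")

print(len(cells), "experimental conditions")
```

The number of conditions is the product of the numbers of levels, here 2 x 3 = 6.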
internal validity
possible to determine whether a causal relationship exists between the IV and DV;
reasonably sure that the IV, rather than an extraneous (irrelevant) variable, is causally responsible for any observed change in the DV
one-group, pretest/post-test design
the dependent variable is measured once before the treatment is implemented and once after it is implemented;
subjects in one group are measured before and after they receive a treatment;
poor internal validity
extraneous variable
any variable not being investigated that has the potential to affect the outcome of a research study;
any factor not considered an independent variable that can affect the dependent variables or controlled conditions
confounded
experiment that is contaminated by an extraneous variable
equivalence
ensure that all the groups involved in a study are equivalent in every respect, except for their status on the IV
Threats to Internal Validity
history, maturation, testing, instrumentation, statistical regression, selection, differential mortality, experimenter bias
TISSDEMH
history
any external event, besides the experimental treatment, that affects scores or status on the dependent variable
maturation
any internal (biological or psychological) change that occurs in the subjects while the experiment is in progress and exerts a systematic effect on the DV;
fatigue, boredom, hunger, physical or intellectual development
testing
testing is always a threat to internal validity in the one-group pretest/post-test design;
when the pretest and post-test are similar, subjects may show improvement on the post-test simply from their experience with the pretest
instrumentation
when the nature of the measuring instrument has changed;
ex: raters’ assessment abilities improve over time;
one way to control for this threat is to use highly reliable (dependable and consistent) measuring instruments
statistical regression
tendency of extreme (very high or very low) scores to fall closer to the mean (average) upon re-testing;
ex: a 6-year-old child who scores 180 on an IQ test will likely score lower 3 years later;
can threaten internal validity whenever extreme scorers are used as research subjects (e.g., very depressed individuals)
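A small simulation can make statistical regression concrete. This is only a sketch under an assumed model (observed score = stable true score + random measurement error); the means, standard deviations, and 5% cutoff are arbitrary illustration values.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Assumed model: observed score = stable true score + random error.
true_scores = [rng.gauss(100, 10) for _ in range(10_000)]
test_1 = [t + rng.gauss(0, 10) for t in true_scores]
test_2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select the top 5% of scorers on the first test (the extreme scorers).
cutoff = sorted(test_1)[int(0.95 * len(test_1))]
extreme = [i for i, score in enumerate(test_1) if score >= cutoff]

first_mean = mean(test_1[i] for i in extreme)
retest_mean = mean(test_2[i] for i in extreme)

print(f"extreme group, test 1: {first_mean:.1f}")
print(f"extreme group, test 2: {retest_mean:.1f}")
```

No treatment is applied between the two tests, yet the extreme group's retest mean falls back toward 100. In a real study that drop could be mistaken for a treatment effect.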
selection
pre-existing subject factors that account for scores on a DV
motivation, intelligence, self-esteem, etc.
differential mortality
when people who drop-out of one of the groups differ in systematic ways from people who remain in the study;
applies when a study involves 2+ groups
experimenter bias
behavior of subjects changes as a result of experimenter expectancies, rather than as a result of the independent variable;
ex: researcher may unconsciously communicate expectations to the subjects; researcher, consciously or unconsciously, makes errors in the direction of the research hypothesis when scoring or reporting the results
Rosenthal and Jacobson (1968) “Pygmalion in the Classroom”
teacher’s preconceived notions of a student’s ability resulted in the student’s grades and even IQ scores moving in the expected direction, even though the students themselves hadn’t changed
experimenter expectancy
AKA Rosenthal effect and the Pygmalion effect;
how the perceived expectations of an observer can influence the people being observed
how to overcome experimenter bias effects
using the “double-blind” technique, in which neither the subjects nor the experimenter know which group (experimental or control) subjects have been assigned to
double-blind technique
neither the subjects nor the experimenter know which group (experimental or control) subjects have been assigned to
random assignment (or randomization)
for all subjects in the experiment, the probability of being assigned to a particular group is the same;
considered the most “powerful” method for controlling extraneous variables;
all extraneous characteristics (including ones the researcher has not measured or even thought of) should be distributed to the groups equally
Random Assignment vs. Random Selection
random selection: method of selecting subjects into a research study; all members of the population under study have an equal chance of being selected to participate in the research
random assignment: something that takes place after the subjects have been selected; the probability of subjects who have already been selected being assigned to each group is the same
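The two steps can be sketched as follows. The population, sample size, and group names are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of 500 people.
population = [f"person_{i}" for i in range(1, 501)]

# Random selection: every member has an equal chance of entering the study.
sample = rng.sample(population, 20)

# Random assignment: shuffle the selected subjects, then split into groups.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
experimental, control = shuffled[:half], shuffled[half:]

print("experimental:", experimental)
print("control:", control)
```

Selection decides who is in the study (relevant to external validity); assignment decides which group each selected subject ends up in (relevant to internal validity).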
matching
identifying subjects (through a pretest) who are similar in terms of their status on the extraneous variable, then grouping similar subjects and randomly assigning members of the matched group to the treatment groups
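A minimal sketch of matched-pairs assignment, assuming one extraneous variable measured by a pretest. The subject IDs and scores are made up for illustration.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical pretest scores on the extraneous variable (e.g., anxiety).
pretest = {"S1": 12, "S2": 30, "S3": 14, "S4": 29, "S5": 21, "S6": 22}

# Rank subjects by pretest score so that neighbors are similar.
ranked = sorted(pretest, key=pretest.get)

treatment, control = [], []
# Form matched pairs, then randomly assign one member to each group.
for i in range(0, len(ranked), 2):
    pair = ranked[i:i + 2]
    rng.shuffle(pair)
    treatment.append(pair[0])
    control.append(pair[1])

print("treatment:", treatment)
print("control:", control)
```

Each group receives one member of every matched pair, so the groups are roughly equivalent on the pretest variable even with only six subjects.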
when is matching useful
when the sample size is small;
with few subjects, random assignment cannot be counted on to ensure equivalence among the groups in terms of the extraneous variable
blocking
studying the effects of an extraneous variable (a pre-existing subject characteristic) to determine if and to what degree it is accounting for scores on the DV;
making the extraneous variable another IV
matching vs. blocking
matching: ensure equivalency in terms of the extraneous variable; doesn’t add an IV
blocking: determine the effects of the extraneous variable; add a new IV and, therefore, add additional experimental groups
Holding the Extraneous Variable Constant
including only subjects who are homogeneous in terms of their status on the extraneous variable;
completely eliminates the effects of an extraneous variable;
con: results cannot be generalized to populations that were not sampled
analysis of covariance (ANCOVA)
statistical strategy for increasing internal validity;
after the data are obtained, DV scores are adjusted so that subjects are equalized in terms of their status on one or more extraneous variables
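The adjustment step can be sketched with the standard adjusted-means formula: adjusted mean = group DV mean - b * (group covariate mean - grand covariate mean), where b is the pooled within-group slope of the DV on the covariate. The data are hypothetical, and the sketch covers only the adjustment, not the significance test that a full ANCOVA also performs.

```python
from statistics import mean

# Hypothetical data: (covariate X, DV score Y) per subject in two groups.
groups = {
    "treatment": [(10, 20), (12, 23), (14, 25)],
    "control": [(6, 12), (8, 15), (10, 17)],
}

grand_x = mean(x for rows in groups.values() for x, _ in rows)

# Pooled within-group regression slope of Y on X.
num = den = 0.0
for rows in groups.values():
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    num += sum((x - mx) * (y - my) for x, y in rows)
    den += sum((x - mx) ** 2 for x, _ in rows)
slope = num / den

# Adjusted mean: the group's DV mean re-expressed at the grand covariate mean.
adjusted = {}
for name, rows in groups.items():
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    adjusted[name] = my - slope * (mx - grand_x)

for name, value in adjusted.items():
    print(f"{name}: adjusted DV mean = {value:.2f}")
```

In this made-up data the raw group difference is 8 points, but the treatment group also started higher on the covariate. After adjustment the difference shrinks to 3 points.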
external validity
the generalizability of the results of a research study to other settings, times, or people
interaction
some variable has one effect under one set of circumstances, but a different effect under another set of circumstances;
term implies that a given effect is not generalizable; that is, it doesn’t work the same way under all circumstances
interaction between selection and treatment
effects of a given treatment would not generalize to other members of the population of interest (or target population)
Interaction Between History and Treatment
effects of a treatment do not generalize beyond the setting and/or time period in which the experiment was done
Interaction Between Testing and Treatment
results of research in which pretests are used might not generalize to cases in which pretests are not used
pretest sensitization
effect in which the administration of a pretest affects the subsequent responses of a participant to experimental treatments
demand characteristics
cues in the research setting that allow subjects to guess the research hypothesis
The Hawthorne effect
tendency of subjects to behave differently due to the mere fact they are participating in research
Order Effects (AKA Carryover Effects and Multiple Treatment Interference)
when participants’ responses in the various conditions are affected by the order of conditions to which they were exposed;
effect of being tested in one condition on participants’ behavior in later conditions
repeated measures design
studies in which the same subjects are exposed to more than one treatment
random selection
all members of the population under study have an equal chance of being selected to participate in the research
stratified random sampling
taking a random sample from each of several subgroups of the total target population;
purpose is to ensure proportionate representation of the defined population subgroups
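A minimal sketch of proportionate stratified sampling. The strata (class years), their sizes, and the sample size of 50 are hypothetical.

```python
import random

rng = random.Random(11)  # fixed seed so the sketch is repeatable

# Hypothetical population split into strata by class year.
strata = {
    "freshman": [f"F{i}" for i in range(400)],
    "sophomore": [f"So{i}" for i in range(300)],
    "junior": [f"J{i}" for i in range(200)],
    "senior": [f"Se{i}" for i in range(100)],
}
total = sum(len(members) for members in strata.values())
sample_size = 50

sample = {}
for name, members in strata.items():
    # Each stratum contributes in proportion to its share of the population.
    n = round(sample_size * len(members) / total)
    sample[name] = rng.sample(members, n)

for name, chosen in sample.items():
    print(name, len(chosen))
```

The sample keeps the population's 40/30/20/10 split, which simple random sampling would only approximate.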
cluster sampling
unit of sampling is a naturally occurring group of individuals, rather than the individual
multistage cluster sampling
taking of samples in stages using smaller and smaller sampling units at each stage
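The stages can be sketched as nested random draws. The sampling frame below (districts, schools, students) and the number drawn at each stage are hypothetical.

```python
import random

rng = random.Random(5)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: districts -> schools -> students.
frame = {
    f"district_{d}": {
        f"school_{d}_{s}": [f"student_{d}_{s}_{k}" for k in range(30)]
        for s in range(4)
    }
    for d in range(6)
}

# Stage 1: randomly sample districts (the largest clusters).
districts = rng.sample(sorted(frame), 2)

# Stage 2: randomly sample schools within each chosen district.
# Stage 3: randomly sample students within each chosen school.
students = []
for d in districts:
    for school in rng.sample(sorted(frame[d]), 2):
        students.extend(rng.sample(frame[d][school], 5))

print(len(students), "students drawn from", districts)
```

Stopping after stage 1 and studying everyone in the chosen districts would be plain cluster sampling; the later stages shrink the sampling unit each time.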
naturalistic observation
behavior is observed and recorded in its natural setting or in a setting as similar to the natural one as possible;
lacks internal validity
analogue research
results of lab research studies are used to draw conclusions about a real-world phenomenon;
researchers draw analogies to real-world phenomena based on studies involving contrived laboratory situations;
good internal, bad external validity
single-blind study
subjects are not informed of the purpose of the study and do not know which treatment they have been assigned to
counterbalancing
different subjects or groups of subjects receive the treatments in a different order;
to control for order effects
Latin square design
ordering the administration of treatments so that each appears once and only once in every position
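One simple way to build such an ordering is the cyclic construction sketched below; the four treatment labels are hypothetical. Each row is the treatment order for one subject or group.

```python
treatments = ["A", "B", "C", "D"]
k = len(treatments)

# Cyclic construction: each row is the previous row shifted by one position.
square = [
    [treatments[(row + col) % k] for col in range(k)]
    for row in range(k)
]

for order in square:
    print(" ".join(order))
```

Every treatment appears exactly once in each row and once in each position (column). A cyclic square does not balance which treatment immediately precedes which, so it controls position effects but not every carryover pattern.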
true experiment
investigator randomly assigns subjects to different groups, which receive different levels of a manipulated variable;
greatest internal validity
quasi-experimental designs
used when random assignment of subjects to groups is not possible;
involves the use of intact groups, rather than groups that are constructed on the basis of random assignment
developmental studies
assessing variables as a function of time (e.g., physical and psychological development)
3 types of developmental designs
longitudinal, cross-sectional, and cross-sequential
longitudinal study
same people are studied over a long period of time
cons of longitudinal studies
high cost (time and money); high subject dropout rates; and, in studies that involve assessing performance on a task, practice effects
why longitudinal designs tend to underestimate true age-related change
1) subjects who drop out of longitudinal designs tend to be those who are less able on the task studied, so the remaining subjects will be relatively high in ability and the data will show a misleadingly low level of age-related decline;
2) practice effects can facilitate performance on the dependent variable
cross-sectional design
different groups of subjects, divided by age, are assessed at the same time;
tend to overestimate true age-related declines in performance
cohort effects (AKA intergenerational effects)
observed differences between different age groups may have to do with experience rather than age
cross-sequential design
representative samples of different age groups are assessed on two or more occasions
time-series design
taking multiple measurements over time (usually multiple pretest and post-test measures) to assess the effects of an IV;
the series of measurements on the DV is interrupted by the administration of a treatment
advantage of multiple measurements
allow one to rule out many threats to internal validity, such as maturation, regression, and testing;
biggest threat is history
two-group time-series design
take the same measurements from a comparison “control” group that is comparable to the one studied
Single-Subject Designs
number of subjects is one;
well-suited to research on behavior modification since the researcher is able to analyze the behavior before and during treatment; the DV is measured several times during both phases
types of single subject designs
“AB” design, “reversal” design, and “multiple baseline” design
AB design
involves a single baseline phase and a single treatment phase;
con: easy for any observed change in behavior in the treatment phase to be due to a historical event or other extraneous factor
Reversal (or Withdrawal) Design
treatment is withdrawn and data are collected to determine if the behavior returns to its original level upon this withdrawal;
ABAB design, in which the treatment is re-applied after the second baseline phase
Multiple-baseline designs
used when a reversal design cannot be used;
applying the treatment sequentially (across different baselines);
treatment may be applied sequentially across different behaviors of the same subject (multiple baseline across behaviors), to the same subject in different settings (multiple baseline across settings), or to the same behavior of different subjects (multiple baseline across subjects)
qualitative or descriptive research
type of research in which the investigator doesn’t start with a theory; theory is developed from the data rather than derived a priori (beforehand)
qualitative methods of research
participant observation, nonparticipant observation, interviews, surveys, case studies
surveys
used in areas such as attitude measurement, consumer preferences, and worker satisfaction studies;
3 basic techniques: personal interviews, telephone surveys, and mail surveys
case study
detailed examination of a single case (single individual, group, or phenomenon);
based on the assumption that the case under study can be viewed as an example of a more general class;
from an experimental POV, case studies don’t allow one to conclude the nature of relationships between variables (lack internal validity), and their results may not be generalizable to other cases (may lack external validity)
protocol analysis
loosely applies to research involving the collection and analysis of verbatim reports;
subject is asked to think aloud as he or she is performing a task while the researcher records everything the subject says (this record is referred to as a protocol);
researcher analyzes the data in an attempt to identify cognitive processes involved in performing the task;
analysis is based on the researcher’s interpretation of the verbal protocol
statistics
methods of measuring variables and organizing and analyzing data
descriptive statistics
describe a set of data collected from a sample
inferential methods
used to make inferences about an entire population on the basis of sample data
nominal
divides a variable into unordered categories into which the data may fall;
qualitative data that groups variables into categories that do not overlap;
categories are not ordered;
“sex,” “diagnostic category,” “hair color”
ordinal
variables have natural, ordered categories and the distances between the categories are not known;
Category 1 has less (or more) of the given attribute than Category 2;
ranks, satisfaction ratings, education level
interval
numbers are scaled at equal distances, but the scale itself has no absolute zero point;
measured along a numerical scale that has equal distances (intervals) between adjacent values;
can add and subtract but can’t multiply or divide;
IQ, temperature (Celsius or Fahrenheit)
ratio
identical to interval scales, except they have an absolute zero point;
multiplication and division require a ratio scale;
money, distance, time
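A short worked example of why ratios need an absolute zero point, using temperature. The two temperatures are arbitrary illustration values.

```python
# Celsius is an interval scale: its zero point is arbitrary.
c_low, c_high = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = c_high - c_low

# Ratios are not: "20 is twice 10" holds only for the Celsius numbers.
naive_ratio = c_high / c_low

# Kelvin has an absolute zero, so it behaves as a ratio scale.
k_low, k_high = c_low + 273.15, c_high + 273.15
true_ratio = k_high / k_low

print(f"difference: {difference} degrees")
print(f"ratio of Celsius values: {naive_ratio}")
print(f"ratio of Kelvin values: {true_ratio:.3f}")
```

The 10-degree difference is the same on both scales, but 20 °C is only about 3.5% warmer than 10 °C in absolute terms, not twice as warm.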
3 types of descriptive statistics
frequency distributions, measures of central tendency, measures of variability
frequency distribution
provides a summary of a set of data;
indicates the number (frequency) of cases that fall at a given category or score or within a given score range;
can be displayed as a table or graphically as a frequency polygon or histogram
cumulative frequency (cf)
total number of observations that fall at or below the given category or score
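A minimal sketch of a frequency table with cumulative frequencies, using a hypothetical set of quiz scores.

```python
from collections import Counter
from itertools import accumulate

scores = [2, 3, 3, 4, 4, 4, 5, 5, 6]  # hypothetical quiz scores

freq = Counter(scores)              # frequency (f) of each score
values = sorted(freq)               # scores in ascending order
cum = list(accumulate(freq[v] for v in values))  # running total (cf)

print("score  f  cf")
for v, cf in zip(values, cum):
    print(f"{v:>5} {freq[v]:>2} {cf:>3}")
```

The last cumulative frequency always equals the total number of observations, here 9.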
histogram
scores are plotted on the x-axis (or abscissa), and frequency of occurrence of each score is plotted on the y-axis (or ordinate)
normal distribution
data are symmetrically distributed with no skew;
most values cluster around a central region, with values tapering off as they go further away from the center
negatively skewed
larger proportion of the scores falls toward the high end of the scale and relatively few scores fall toward the low end of the range of scores;
has a long tail on the left (the negative end of the distribution) and “lump” of scores on the right;
negatively skewed = easy test
positively skewed
larger number of scores at the low end of the scale (to the left side of the range of scores) and a long tail to the right (the positive end);
positively skewed = difficult test
mean
arithmetic average;
most useful measure of central tendency;
very sensitive to extreme values
median (Md)
middle value of the data when ordered from the lowest to the highest;
more useful measure of central tendency when a distribution is skewed
mode
most frequent value in a collection of numbers
multimodal
distribution with multiple modes
bimodal distribution
distribution with two modes
Relationship Between the Mean, the Median, and the Mode
normal distribution: 3 measures are equal;
positively skewed distribution: mean > median > mode;
negatively skewed distribution: mode > median > mean
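These orderings can be checked on a small data set. The scores below are hypothetical and were chosen to be positively skewed; mirroring them produces a negatively skewed set.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most are low, a few are very high.
positive = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

# Mirror the scores around 10 to get a negatively skewed set.
negative = [20 - s for s in positive]

for label, data in [("positive skew", positive), ("negative skew", negative)]:
    print(label, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```

The extreme scores in the long tail pull the mean furthest, the median less, and leave the mode unchanged, which is why the mean always sits closest to the tail.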