Ch. 8 Internal and External Validity Flashcards
Internal and External Validity
INTERNAL VALIDITY – only the Independent Variable has an effect on the Dependent Variable; no extraneous or confounding factors are at work. (This is absolutely necessary.)
EXTERNAL VALIDITY – The results of the study are generalizable to the general public or the greater intended population of the study (women, Americans, etc.) (NOT absolutely necessary, but desirable)
- After you have satisfied yourself that your experiment (or an experiment that you are evaluating) is internally valid, then you can focus on its potential for external validity.
- POPULATION GENERALIZATION – concerns applying our findings to organisms beyond our actual experimental participants, that is, whether our results will apply to a larger group than the one we tested.
- ENVIRONMENTAL GENERALIZATION – refers to the question of whether our experimental results will apply to situations or environments that are different from those of our experiment.
- TEMPORAL GENERALIZATION – describes our desire that our research findings apply at all times and not only to a certain time period.
Threats to Internal Validity
THREATS TO INTERNAL VALIDITY:
- HISTORY – refers to any significant event, other than the Independent Variable, that occurs between Dependent Variable measurements. We must beware of the possible extraneous variable of history when we test or measure experimental participants more than once.
- Ex: In a test of Nazi propaganda, the Dependent Variable was measured both before and after the invasion of Paris. The later attitude measurement was probably influenced much more by that historical event than by the propaganda.
- Such an event would then serve as an extraneous variable, and the experiment would be confounded.
- MATURATION – refers to systematic time-related changes, but mostly of a shorter duration than we would typically expect from the term. For example, MATURATION would occur when participants grow tired, hungry, or bored through the course of an experiment.
- Maturation is possible in any experiment that extends over some amount of time. How much time must pass before maturational changes can take place varies widely depending on the experiment’s demands on the participants.
- Maturational changes are more likely to occur in an experiment that uses repeated measurements of the DV.
- TESTING – Testing is a definite threat to internal validity if you test participants more than once. If you take the same test more than once, scores on the second test may vary systematically from the first scores simply because you took the test a second time. This is known as a PRACTICE EFFECT, in which there is a beneficial effect on a DV caused by previous experience with the DV.
- REACTIVE MEASURES – A measurement that is reactive changes the behavior in question simply by measuring it.
- Ex: If you measure an attitude about something and then come back later and measure the same attitude, the attitude may have changed simply because you already measured it.
- Many attitude questionnaires are reactive measures.
- Once people have been measured in some way, they may become defensive and guard against giving answers that would make them look prejudiced.
- NONREACTIVE MEASURES – do not alter the participant’s response by virtue of measuring it.
- Tools and techniques for nonreactive measures: one-way mirrors, hidden cameras and microphones, naturalistic observation, deception, and so on.
- If we can obtain some type of behavioral measure that is harder to fake than a questionnaire response, we may get a truer measure of the participant’s attitude.
- INSTRUMENTATION (INSTRUMENTATION DECAY) – originally referred to changes in measurement made by various apparatuses, although the original list of examples now seems quite outdated (e.g., “the fatiguing of a spring scales, or the condensation of water vapor in a cloud chamber,” p. 299). Failures of human observers, judges, raters, and coders are also included in this category.
- Thus, the broad definition of instrumentation refers to changes in the measurement of the Dependent Variable that are due to the measuring “device,” whether that device is an actual piece of equipment or a human.
- STATISTICAL REGRESSION – If you remeasure participants who have extreme scores (either high or low), their subsequent scores are likely to regress or move toward the mean.
- When you score near the high end of a test’s possible score, there isn’t room to do much except do worse.
- If you select participants on the basis of extreme scores, you should beware of regression to the mean as a possible explanation for higher or lower scores on a repeated (or similar) test.
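Regression to the mean can be illustrated with a short simulation. The sketch below is only an illustration under stated assumptions, not part of the original notes: it assumes each observed score is a stable "ability" plus independent random noise, and the function name, score scale (mean 100, SD 10), and sample size are all hypothetical choices. Participants selected for extreme scores on the first test tend to score closer to the overall mean on the second test, even though nothing was done to them in between.

```python
import random
import statistics


def simulate_regression_to_mean(n=10_000, top_fraction=0.1, seed=0):
    """Simulate two testings and compare top scorers across them.

    Assumed (hypothetical) model: observed score = stable ability + noise,
    with fresh, independent noise on each testing occasion.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(100, 10) for _ in range(n)]
    test1 = [a + rng.gauss(0, 10) for a in ability]
    test2 = [a + rng.gauss(0, 10) for a in ability]

    # Select participants on the basis of extreme (high) first-test scores.
    ranked = sorted(range(n), key=lambda i: test1[i], reverse=True)
    top = ranked[: int(n * top_fraction)]

    return {
        "overall_mean": statistics.mean(test1),
        "top_test1": statistics.mean([test1[i] for i in top]),
        "top_test2": statistics.mean([test2[i] for i in top]),
    }


if __name__ == "__main__":
    for name, value in simulate_regression_to_mean().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions, the selected group's second-test mean falls between its first-test mean and the overall mean, which is the pattern to rule out before crediting a treatment with the change.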
- SELECTION – selection can serve as a threat to internal validity. Before we conduct our experiment, it is imperative that we can assume that our selected groups are equivalent. Starting with equal groups, we treat them identically except for the IV. If groups are not equal before the experiment, then those differences CONFOUND the experiment.
- Don’t confuse the selection problem with the assignment of participants to groups.
- Selection typically refers to using participants who are already assigned to a particular group by virtue of their group membership.
- Ex: it is likely that there are differences between people who watch and people who do not watch soap operas that are unrelated to the actual content of the soap operas. Thus, using soap opera viewers and non-soap opera viewers as groups to study the impact of a particular soap opera episode dealing with rape is probably not the best way to study the impact of that episode.
- MORTALITY – In human research, the term mortality typically refers to experimental dropouts. In animal research, it could literally mean the death of the participants (ex: rats).
- Mortality could become a threat to internal validity if a particular treatment were so severe that significant numbers of participants in the treatment group died/dropped out.
- Although the other groups would still represent random samples, those in the particular treatment group would not.
- If one group shows a higher dropout rate, your experiment may lack internal validity.
- INTERACTIONS WITH SELECTION – Interactions with selection can threaten internal validity when the groups we have selected show differences on another variable (i.e., maturation, history, or instrumentation) that varies systematically by group.
- Ex: Comparing children’s language proficiency at age 2, when it can differ profoundly among socioeconomic groups, could produce a selection–maturation interaction that threatens internal validity, because the same children might show very similar proficiency at age 1 and at age 6.
- Ex: A selection–history interaction that would jeopardize internal validity could occur if you chose your different groups from different settings, for example different countries, states, cities, or even schools within a city. This problem is especially acute in cross-cultural research.
- Ex: A selection–instrumentation interaction (based on human “instrumentation”) would be using different interpreters or scorers for participants who come from two different countries, assuming the IV was related to the country of origin.
- DIFFUSION OR IMITATION OF TREATMENT – creates a problem with internal validity because it negates or minimizes the difference between the groups in your experiment.
- Ex: If the informed participants communicate the vital information to the supposedly uninformed participants, then the two groups may behave in similar manners.
- Experiments dealing with learning and memory are particularly susceptible to the imitation of treatments.
PROTECTING INTERNAL VALIDITY: Two approaches:
- Implement the various control procedures.
- Use a standard procedure.
- Experimenters use standard research procedures, called experimental designs, to help ensure internal validity.
- Internal validity is the most important property of any experiment!
Threats to External Validity
THREATS TO EXTERNAL VALIDITY (BASED ON METHODS):
- INTERACTION OF TESTING AND TREATMENT – The interaction of testing and treatment is the most obvious threat to external validity, and it occurs for the pretest-posttest control group design. External validity is threatened because both groups of participants are pretested and there is no control to determine whether the pretesting has had an effect.
- Recall the REACTIVITY EFFECT, where merely taking a pre-test changes the subjects’ attitudes toward the treatment that comes afterward.
- Because of a pretest, your participants’ reactions to the treatment will be different. The pretest has a sensitizing effect on your participants; it is somewhat like giving a giant hint about the experiment’s purpose before beginning the experiment.
- The pretesting effect is particularly troublesome for experiments that deal with attitudes and attitude change.
- This testing and treatment interaction is the reason why researchers developed nonpretesting designs such as the posttest-only control-group design.
- Also, although the Solomon four-group design does incorporate pretests for two groups, it includes no pretesting of the remaining two groups, thus allowing measurement of any pretesting effect.
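As a rough sketch of how a Solomon four-group design can expose a pretesting effect, the code below compares the treatment effect among pretested groups with the treatment effect among unpretested groups. This is an illustration only: the function name, the dictionary layout, and the example scores are hypothetical and not taken from any study, and a real analysis would use a proper statistical test rather than a bare difference of means.

```python
from statistics import mean


def pretest_sensitization(posttest):
    """Compare treatment effects with and without a pretest.

    `posttest` maps (pretested, treated) -> list of posttest scores,
    one entry for each of the four Solomon groups (hypothetical layout).
    """
    effect_with_pretest = mean(posttest[(True, True)]) - mean(posttest[(True, False)])
    effect_without_pretest = mean(posttest[(False, True)]) - mean(posttest[(False, False)])
    # A nonzero difference suggests the pretest changed how participants
    # responded to the treatment (a testing-treatment interaction).
    interaction = effect_with_pretest - effect_without_pretest
    return effect_with_pretest, effect_without_pretest, interaction


if __name__ == "__main__":
    # Made-up posttest scores, for illustration only.
    scores = {
        (True, True): [8, 9, 10],
        (True, False): [5, 5, 5],
        (False, True): [6, 7, 8],
        (False, False): [5, 5, 5],
    }
    print(pretest_sensitization(scores))
```

If the two treatment effects differ noticeably, results from the pretested groups may not generalize to people who were never pretested.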
- INTERACTION OF SELECTION AND TREATMENT – An interaction of selection and treatment occurs when the effects that you demonstrate hold true only for the particular groups that you selected for your experiment.
- The threat of a selection–treatment interaction becomes greater as it becomes more difficult to find participants for your experiment. As it becomes harder to get participants, it becomes more likely that the participants you do locate will be unique and not representative of the general population.
- REACTIVE ARRANGEMENTS – refer to those conditions of an experimental setting (other than the Independent Variable) that alter our participants’ behavior.
- Ex: a laboratory situation in which we attempt to measure a real-world behavior may be so contrived that it fails dismally to produce real-world behavior.
- The reason is that such conditions in the experimental situation cue the participant into a “play-acting, outguessing, up-for-inspection, I’m-a-guinea-pig” mindset, or whatever attitude generates an unrealistic response.
- DEMAND CHARACTERISTICS – These characteristics can convey the experimental hypothesis to the participants and give them clues about how to behave.
- Rather than responding to you as an individual, people are responding to what they think the study is looking for OR to cues that they see in the contrived experimental setting.
- It is impossible to design an experiment without demand characteristics.
- Reactive arrangements tend to increase demand characteristics.
- MULTIPLE TREATMENT INTERFERENCE – this threat to external validity can occur only in experiments that involve presenting more than one treatment to the same participants (repeated-measures designs), and it entails a confounding effect among the multiple treatments.
- If participants received only one treatment, the experimental results might be different.
THREATS TO EXTERNAL VALIDITY (BASED ON PARTICIPANTS):
- INFAMOUS WHITE RAT – Norwegian rats are BY FAR the nonhuman animal used most in psychological studies. Two obvious concerns with external validity arise from this fact.
- First, if you are interested in the behavior of nonhuman animals, generalizing from rats (and pigeons) to all other animals may be a stretch.
- Second, if you are interested in generalizing from animal to human behavior, there are certainly closer approximations to humans than rats (and pigeons) in the animal kingdom.
- UBIQUITOUS COLLEGE STUDENT – College students account for most study participants (CONVENIENCE SAMPLING), so generalizing from college students to the general public may not be accurate.
- “WEAKER” SEX – Beware of drawing sexist conclusions from your research.
- EVERYBODY WAS WHITE – Include minority groups in your studies.
- EVEN THE MINORITIES WERE AMERICAN – if all participants are American, then you can only generalize across America – not a true cross-cultural study.
External Validity NOT Always Necessary
EXTERNAL VALIDITY IS NOT ALWAYS NECESSARY – there is much to be learned from studies that are not generalizable.
- REPLICATION – When we replicate an experimental finding, we are able to place more confidence in that result. As we begin to see the same result time after time, we become more comfortable with the idea that it is a predictable, regularly occurring result.
- REPLICATION WITH EXTENSION – once you find something interesting using college students, for example (clearly without external validity), then you can EXTEND to other groups (e.g. the elderly, Hispanics, etc.), thus increasing external validity with each extension.
- In fact, complete External Validity in any one experiment is IMPOSSIBLE.
- However, when you have the choice, opt for greater EXTERNAL VALIDITY.