Research Methods Flashcards
Between vs Within Subjects Designs
Within-subjects designs are studies in which participants are tested repeatedly (i.e., repeated measures designs). Participants might be tested pre- and post-treatment, placed in both the experimental and control conditions, or tested repeatedly across time points. This design is beneficial in that individual characteristics remain stable across conditions, reducing error, and a smaller sample is required. However, within-subjects designs are prone to order, practice, fatigue, carryover, and sensitization effects due to repeated testing. An example of a within-subjects design might be testing a drug on individuals with an extremely rare disease, where additional participants cannot be recruited, and comparing improvements over time.
Between-subjects designs are studies in which different groups of participants undergo different aspects of an experiment, each typically receiving only one condition. Participants are tested in either the control or experimental condition and are usually randomly assigned to reduce error and confounding effects. Between-subjects designs can include pretest-posttest control groups (4) or intervention/control groups (3). An example of a between-subjects design might be a CBT trial in which individuals are randomly assigned to a waitlist, control (e.g., TAU), or intervention condition, and the groups’ symptoms are compared five weeks post-condition.
External validity and generalizability
External Validity & Generalizability - External validity is the extent to which the results of an experiment can be generalized beyond the specific conditions of that experiment to other groups or settings (i.e., out of the lab and into the real world). External validity comprises both generalizability, or whether results can be replicated under different conditions, and ecological validity, or whether results can be applied to specific real-world settings. Considering external validity is important because the relationship between an independent and dependent variable might rely on other variables or be specific to the circumstances of that study, limiting a finding’s generalizability to other groups and situations. Subsequent studies can help answer these questions about confounding variables and interactions. A researcher hoping to increase the external validity of their research might aim to increase its generalizability to broader groups (e.g., extending findings with children to teens); however, because sample-based findings must be applied to a target population, findings may not extend to other groups without replication in those groups. Overgeneralization occurs when findings from a restricted sample are applied without proper replication studies under expanded conditions; for instance, it would be inappropriate to prescribe a drug to people of varied ethnic backgrounds if it had only ever been tested among European individuals.
Internal validity
Internal Validity – Internal validity is the extent to which a study successfully rules out or makes implausible alternative explanations for the results, usually by careful study design. In other words, greater internal validity indicates that effects on the DV can be confidently attributed to the IV. Successful internal validity can result from careful control of confounding factors that might have jeopardized interpretation of the results.
Threats to external validity
Threats to external validity can be raised against most studies but come across as superficial unless it is plausible that the factor actually restricts generalizability of the result. For instance, researchers can easily criticize a study by noting that the investigator did not examine the effect within an older population.
1) Sample characteristic threats: results rely on the demographic or natural characteristics of the participants (e.g., undergraduates).
2) Narrow stimulus sampling threats: results are limited to the specific stimuli, materials, or researchers involved, limiting extension to conditions outside the experiment.
3) Reactivity of experimental arrangements and of assessment: participants’ awareness that they are part of an experiment may limit generalizability to non-experimental settings, and awareness of being assessed may influence how participants respond to questions (e.g., answering favorably).
4) Test sensitization threats: participants become aware of pre-test assessment procedures, which influences or triggers them to respond differently at follow-up (e.g., greater insight into the experiment; “aha! I was asked xx”).
5) Multiple-treatment interference threats: order effects, or the difficulty of ascertaining whether treatment effects result from the separate treatments or from the order in which conditions were given.
6) Novelty effect threats: new aspects of the experimental environment contribute to results rather than the intervention itself (e.g., a new therapy setting).
Threats to internal validity
Threats to internal validity are important to control in order to confidently determine that effects on a DV truly result from an IV. Random assignment of participants to groups is integral to increasing internal validity; using control groups, replication studies, and intent-to-treat analyses can further strengthen it.
1) History threats: events external to the intervention that occur to all participants in their lives (e.g., a natural disaster) and could account for the results (e.g., greater PTSD).
2) Maturation threats: naturally occurring processes within participants over time (e.g., growing stronger throughout the year).
3) Testing threats: changes in scores attributable to repeated assessment and familiarity with the test (e.g., a participant purposefully keeping responses consistent from memory).
4) Instrumentation threats: artificial differences in scoring the dependent variables due to miscalibration or observer drift, in which coders become less reliable over time (e.g., fatigue).
5) Statistical regression threats: when participants are selected for extreme scores, random error on subsequent testing tends to pull scores toward the mean and away from the extremes, which can masquerade as a treatment effect.
6) Selection bias threats: pre-existing differences between participants (e.g., volunteers) may produce group differences in results rather than the treatment conditions themselves.
7) Attrition/mortality threats: participants who drop out may share specific characteristics, so treatment effects may apply only to the “survivors” of the study.
8) Diffusion of treatment threats: the control group accidentally receives parts of the intervention, or the intervention group fails to receive all parts; this dilutes true treatment effects in the results.
9) Compensatory equalization of treatment threats: those administering the study compensate the control group for the withheld intervention (e.g., giving kids without the math program more access to laptops), which can dilute the apparent effects of the intervention.
Regression to the mean
Statistical regression (regression to the mean) refers to a phenomenon that occurs when participants are selected for their extreme scores; random error on subsequent testing tends to pull scores toward the mean and away from the extremes, which can masquerade as, or mask, a treatment effect.
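The effect can be demonstrated with a short simulation; the numbers below (population mean 100, noise SD 10, top 5% selected) are purely illustrative:

```python
import random
import statistics

random.seed(42)

# Each person has a stable "true" ability; each test adds independent noise.
true_ability = [random.gauss(100, 10) for _ in range(10_000)]
test1 = [a + random.gauss(0, 10) for a in true_ability]
test2 = [a + random.gauss(0, 10) for a in true_ability]

# Select the most extreme scorers on test 1 (the top 5%).
cutoff = sorted(test1)[int(0.95 * len(test1))]
extreme = [i for i, score in enumerate(test1) if score >= cutoff]

mean_t1 = statistics.mean(test1[i] for i in extreme)
mean_t2 = statistics.mean(test2[i] for i in extreme)

# With no intervention at all, the extreme group's retest mean drifts
# back toward the population mean of 100.
print(f"extreme group, test 1: {mean_t1:.1f}")
print(f"extreme group, test 2: {mean_t2:.1f}")
```

Even though nothing was done between the two tests, the retest mean of the extreme group sits closer to 100, which is exactly the drift that can be mistaken for (or can hide) a treatment effect.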
Non-Random Assignment
For some research questions, random assignment is not feasible, so participants are placed in groups in a non-random fashion. In such cases, factors (i.e., confounds or covariates) that affect the observed relationship need to be minimized. Non-random assignment may sometimes be used to rule out rival hypotheses that threaten internal validity, or when it would be unethical to randomly assign members. The researcher needs to determine the relevant covariates, measure them adequately, and adjust for their effects either by design or by analysis; otherwise, the validity of the study may be reduced. It is also necessary to describe the methods used to attenuate sources of bias, including plans for minimizing dropouts, noncompliance, and missing data.
Demand characteristics
Demand characteristics are cues in a study that lead participants to infer its purpose and adjust their behavior accordingly. Because individuals naturally attempt to understand what is happening to them and the meaning of events, participants frequently try to work out the nature of the research and hypothesize about its goals. In doing so, they may attempt to adjust the outcome of the research as a form of reactivity, drawing assumptions from the information provided through informed consent or from the study procedures themselves. Demand characteristics can be controlled by 1) reducing cues within the research, including obvious manipulations, or limiting participants to a single condition (e.g., ageism revealed only when the same participant sees both older and younger pictures), 2) increasing participants’ motivation by reminding them of active choice (e.g., they can leave at any time), 3) incorporating “fake good” role-playing procedures to estimate true versus purposefully faked participant responses, and 4) separating the dependent variable from the study (e.g., deception practices that lead participants to believe the study has ended).
Manipulation check
Manipulation checks are important for ensuring the construct validity of experiments and interventions. They are especially helpful when an intervention has no effect, or when experimental conditions need to be clearly distinct to examine appropriate levels of an intervention. For instance, researchers might want to verify that participants in each condition received the intended level of the intervention and that only the independent variables of interest were manipulated. Participants can provide self-report data to determine whether the intervention had its intended effect (e.g., did showing kitten pictures lead to more joy?), or experimenters can add dependent variables to the study as manipulation checks for the same purpose (e.g., in an eye-gaze aversion task, asking people to rate the powerfulness of the people posing, for corroboration). Drawbacks include that manipulation checks might accidentally sensitize participants to the manipulation, that excluding participants who fail manipulation checks can introduce selection bias, and that these checks are not usually validated or sensitive measures.
Independent vs dependent variable
Studies in psychology are usually designed to test specific hypotheses, or “if-then” statements. The “if” portion refers to the IV and the “then” portion to the dependent variable, or outcome. The IV comprises the conditions that are varied or manipulated to produce changes or differences among conditions in a study. IVs come in three broad types: 1) environmental (variation in what is done to, with, or by the subject, e.g., experimental vs. control group), 2) instructional (variations in what participants are told or are led to believe), and 3) subject/individual difference (attributes or characteristics of the individual subject, e.g., gender or race). The researcher typically controls the independent variable, and IVs can be quantitative or qualitative, discrete or continuous.
A DV is the variable hypothesized to be affected. The aim of an experiment is to learn whether and how the DV is affected by the IV, usually measured through behavior. In correlational research, the DV is what one hopes to predict or explain. A DV is often a continuous variable but can be categorical. DVs are expected to be related to IVs, but such a relationship cannot be determined with absolute certainty.
Experimenter expectancy effects
Experimenter expectancy effects occur when experimenters’ expectations of participant responses result in behaviors, facial expressions, or attitudes that affect participants’ responses. This occurs because experimenters hope to find data that support their hypotheses, or they hope that early detected patterns will result in later predicted data patterns. These can result in 1) biased observations, in which hopes and expectations might result in biased ratings (e.g., expecting female participants to be emotional results in greater emotion ratings), or 2) influencing participant responses by treating participants differently based on experimenter assumptions (e.g., nonverbal or verbal feedback to correct answers). Experimenter expectancy effects can be reduced by minimizing contact with participants, utilizing detailed scripts, masking conditions from the experimenters, and disallowing snooping for patterns in data.
Power
Power is defined as the probability of correctly rejecting a false null hypothesis (i.e., 1 - beta). Power is influenced by (1) alpha level, (2) effect size, and (3) sample size; as each of these increases, so does power. Many researchers recommend at least 80% power.
Alpha level: decreasing (making more stringent) the alpha level moves the rejection threshold farther out on the tail, so it becomes harder to reject the null and we are more likely to fail to reject it when it is false (beta increases, power decreases). Increasing alpha makes it easier to reject the null, decreasing the probability of failing to reject a false null, so power (1 - beta) increases.
Effect size: increasing the effect size (a greater distance between the centers of the distributions) decreases the probability of a Type II error (beta), so power increases.
Sample size: as sample size increases, our estimates of the population mean and SD improve and the variability of sample means shrinks, so power increases.
Low statistical power means a low probability of detecting a difference between groups when one truly exists; subject heterogeneity and unreliable measures also lower power. A power analysis requires any three of the four quantities (effect size, alpha, power, and number of participants) and solves for the fourth.
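These relationships can be sketched numerically. Below is a minimal, illustrative function (not from the source) that approximates the power of a two-sided, two-sample z-test with alpha fixed at .05:

```python
import math

def norm_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_group(d, n_per_group):
    """Approximate power for a two-sided, two-sample z-test at alpha = .05.

    d is the standardized effect size (Cohen's d); with equal group sizes,
    the test statistic is centered at d * sqrt(n / 2) under the alternative.
    """
    z_crit = 1.959964  # critical z for alpha = .05, two-sided
    nc = d * math.sqrt(n_per_group / 2)
    # P(|Z| > z_crit) when Z is centered at nc instead of 0.
    return (1 - norm_cdf(z_crit - nc)) + norm_cdf(-z_crit - nc)

print(round(power_two_group(0.5, 64), 2))  # near the conventional 80% target
print(round(power_two_group(0.5, 26), 2))  # smaller n, lower power
print(round(power_two_group(0.8, 26), 2))  # larger effect raises power again
```

Plugging in values shows the trade-offs from the card: shrinking the sample lowers power, while a larger effect size restores it; solving this relationship for n given d, alpha, and a target power is exactly what an a priori power analysis does.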
Mediation vs Moderation
In both mediation and moderation, as outlined by Baron and Kenny (1986), a third variable plays an important role in governing the relationship between two other variables. To claim a mediation effect, the first step is to show a significant relationship between the independent variable and the mediator, then a significant relationship between the mediator and the dependent variable, and then a significant relationship between the IV and DV (though others have argued that these requirements result in very low power). The final step is to show that when the mediator and independent variable are used simultaneously to predict the dependent variable, the previously significant path between the independent and dependent variable is reduced, if not rendered nonsignificant. Mediation therefore tests a causal chain: X leads to change in the mediator, which leads to change in Y, explaining why X affects Y. Ex.: SES predicts parental education levels; parental education levels predict child reading ability; parental education levels are shown to explain the relationship between SES and reading ability.
Moderation refers to situations in which the relationship between the independent and dependent variables changes as a function of the level of a third variable (the moderator). The moderator can influence the magnitude and/or direction of the relationship between the two variables. Testing for moderation involves checking whether a predictor x moderator interaction term significantly predicts the outcome variable. Unlike a mediator, a moderator does not explain why X affects Y. Ex.: gender moderates the relationship between work experience and salary.
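The mediation logic can be sketched with a simulation of the SES example; the effect sizes here are hypothetical, and the partial slope uses the standard formula for a two-predictor regression on standardized variables:

```python
import random

random.seed(1)

def corr(a, b):
    """Pearson correlation, computed from scratch."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

n = 5_000
ses = [random.gauss(0, 1) for _ in range(n)]
# Mediator (parental education) depends on SES; the outcome (reading
# ability) depends only on the mediator, i.e., full mediation by design.
par_edu = [0.6 * s + random.gauss(0, 1) for s in ses]
reading = [0.7 * m + random.gauss(0, 1) for m in par_edu]

r_xm = corr(ses, par_edu)      # step 1: IV -> mediator
r_my = corr(par_edu, reading)  # step 2: mediator -> DV
r_xy = corr(ses, reading)      # step 3: IV -> DV

# Final step: the IV's standardized slope once the mediator is controlled.
b_x_given_m = (r_xy - r_my * r_xm) / (1 - r_xm ** 2)

print(f"SES -> reading alone: {r_xy:.2f}")                         # nonzero
print(f"SES -> reading controlling education: {b_x_given_m:.2f}")  # near zero
```

The simple SES-reading correlation is clearly nonzero, but it collapses toward zero once parental education enters the model, which is the signature of mediation in the Baron and Kenny sense.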
Random Sampling (& Stratified)
Random sampling refers to every individual/element in the population having an equal chance of being included in the sample. Random sampling enhances external validity and increases the likelihood of obtaining a representative sample. In an example of simple random sampling, one would consider the population of all the university’s athletes, choose a sample size (N), assign a number to each athlete, and randomly select numbers, assuming that participant characteristics will be reasonably dispersed. In stratified random sampling, the sampling frame is divided into strata by a chosen variable, and a quota matrix maintains the proportion of that characteristic when sampling (e.g., ensuring race is proportional across participants or matched to the U.S. population); it is used when researchers need a sample with specific proportions of certain types of examinees.
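Both procedures can be sketched in a few lines; the athlete frame, sport labels, and 10% sampling fraction are made up for illustration:

```python
import random

random.seed(0)

# Hypothetical sampling frame: 400 university athletes tagged by sport.
sports = ["soccer", "swimming", "track", "golf"]
athletes = [{"id": i, "sport": random.choice(sports)} for i in range(400)]

# Simple random sampling: every athlete has an equal chance of selection.
simple_sample = random.sample(athletes, k=40)

# Stratified random sampling: sample within each sport so each stratum's
# share of the sample matches its share of the frame (10% from each).
strata = {}
for athlete in athletes:
    strata.setdefault(athlete["sport"], []).append(athlete)

stratified_sample = []
for members in strata.values():
    stratified_sample.extend(random.sample(members, k=round(0.10 * len(members))))
```

Simple random sampling leaves stratum proportions to chance, whereas the stratified version guarantees each sport appears in the sample in (approximately) its population proportion.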
Random Assignment
Random assignment allows researchers to balance individual differences across groups, on the assumption that randomly assigned groups will have comparable personal characteristics (e.g., high- and low-creativity participants end up in both groups in similar proportions); however, special attention must be paid to situations in which individual differences would naturally have led to an effect (e.g., random assignment in violent-television research removes participants’ natural tendency to choose to watch violent television). Researchers might use simple random assignment procedures, such as assigning incoming participants to conditions from a pre-generated random table of IDs. Matched random assignment may be used by pre-measuring participant characteristics and then balancing group membership (e.g., matching dogs on breed and prior training before randomly splitting each matched pair).
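Both procedures can be sketched as follows (the participant count and baseline measure are hypothetical); note that random sampling decides who enters the study, while random assignment decides which condition each participant receives:

```python
import random

random.seed(7)

participants = list(range(20))

# Simple random assignment: shuffle, then split into two equal groups.
shuffled = participants[:]
random.shuffle(shuffled)
half = len(shuffled) // 2
control, treatment = shuffled[:half], shuffled[half:]

# Matched random assignment: measure a characteristic first (here a
# hypothetical baseline skill score), rank participants, pair adjacent
# ranks, and randomly split each pair across the two groups.
baseline = {p: random.gauss(50, 10) for p in participants}
ranked = sorted(participants, key=baseline.get)

matched_control, matched_treatment = [], []
for i in range(0, len(ranked), 2):
    pair = [ranked[i], ranked[i + 1]]
    random.shuffle(pair)
    matched_control.append(pair[0])
    matched_treatment.append(pair[1])
```

The matched version guarantees the two groups are balanced on the pre-measured characteristic while still leaving the assignment within each pair to chance.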