306 final Flashcards
Pearson correlation
- for linear relationships only (linear implies monotonic); requires ratio or interval data
- describes direction (positive/negative), form (linear), consistency (strength, degree of correlation) of a relationship
- correlation values are ordinal: r of .8 is not twice as strong as .4 (does not increase in equal increments)
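A minimal Python sketch (hypothetical scores) of computing Pearson r with scipy:

```python
import numpy as np
from scipy.stats import pearsonr

# hypothetical interval-scale scores for two variables
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.5, 8.0])

r, p = pearsonr(x, y)  # r gives direction (sign) and consistency (magnitude)
print(f"r = {r:.2f}, p = {p:.3f}")
```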
Spearman correlation
- Spearman rho (rs) used for ordinal data
- for monotonic non-linear data
- compute the rank order of scores (order from smallest to largest, assign ranks, then correlate the ranks, which corrects for non-linearity)
- you need at least 5 pairs of data (ideally more than 8 pairs)
- Spearman correlation will be 1 when data are monotonically related even if not linear (will pick up on this relationship better than Pearson)
- less sensitive to outliers than Pearson (ranks cannot be outliers because they will fall into the same range as the rest of the data)
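A minimal sketch (hypothetical data) showing that Spearman rho is just Pearson computed on ranks, and that it detects a monotonic non-linear relationship that Pearson understates:

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# hypothetical monotonic but non-linear data
x = np.arange(1, 9)
y = x ** 3  # always increases with x, but not linearly

rho, _ = spearmanr(x, y)    # 1.0: perfectly monotonic
r, _ = pearsonr(x, y)       # < 1: penalized for the non-linearity
rho_by_hand, _ = pearsonr(rankdata(x), rankdata(y))  # same as spearmanr
print(rho, r, rho_by_hand)
```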
monotonic
as one variable increases, the other consistently increases (or consistently decreases), but the changes are not necessarily the same size
point-biserial correlation
- when one variable is non-numerical and has two levels, you can convert those levels into 0 and 1
- the sign is meaningless, the idea of a linear correlation is meaningless
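A minimal sketch (hypothetical 0/1 coding) using scipy's point-biserial function:

```python
import numpy as np
from scipy.stats import pointbiserialr

# hypothetical dichotomous variable coded 0/1, plus a numerical score
group = np.array([0, 0, 0, 1, 1, 1, 1, 0])
score = np.array([3.1, 2.8, 3.5, 5.0, 4.7, 5.2, 4.9, 3.0])

r_pb, p = pointbiserialr(group, score)
print(f"r_pb = {r_pb:.2f}")  # the sign only reflects the arbitrary 0/1 coding
```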
chi-square test
- both variables are non-numerical, so you organize the data into a matrix according to the frequency of individuals in each cell
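A minimal sketch (hypothetical frequency matrix) using scipy's chi-square test of independence:

```python
import numpy as np
from scipy.stats import chi2_contingency

# hypothetical matrix: rows = levels of variable A, columns = levels of variable B
observed = np.array([[20, 10],
                     [5, 25]])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}, df = {dof}")
```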
phi-coefficient
- both variables are non-numerical and each have two levels - code them both as 0 and 1
- sign and linearity are meaningless
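A minimal sketch (hypothetical 2x2 counts) of the phi-coefficient computed directly from cell frequencies:

```python
import math

# hypothetical 2x2 frequency table for two dichotomous (0/1) variables
a, b = 20, 10  # variable A = 0 row
c, d = 5, 25   # variable A = 1 row

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(f"phi = {phi:.2f}")  # identical to Pearson r computed on the 0/1 codes
```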
coefficient of determination
- r2: how much variability in one variable is explained by its relationship with the other variable (shared variance)
- like a Venn diagram showing the degree of overlap (the more overlap, the stronger the relationship)
- if only given the r2 value, you can tell the strength of the correlation (square root), but not the direction (a square root can be negative or positive)
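A tiny worked example of why r2 hides the direction:

```python
import math

r_squared = 0.25           # hypothetical: 25% shared variance
r = math.sqrt(r_squared)   # recovers the strength...
print(f"r = +/-{r}")       # ...but the sign could be +0.5 or -0.5
```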
what can you use the correlational strategy for
- predictions about future behaviour or the other variable (using regression, a predictor variable and a criterion variable)
- determine test-retest reliability and concurrent validity
- often used for preliminary research to indicate further research is required
problems with correlational strategy
- third-variable problem: an unidentified variable is controlling the levels of both variables
- directionality problem: cannot determine which variable is the cause and which is the effect
small, medium, large correlations
- no relationship: 0 - 0.1
- small/weak: r = 0.1 (or 0.1 - 0.3), r2 = 0.01
- medium/moderate: r = 0.3 (or 0.3 - 0.7), r2 = 0.09
- large: r = 0.5 (or 0.7 - 1), r2 = 0.25 (or 0.49)
multiple regression
- for multivariate relationships: an individual variable relates to a multitude of other variables
- one criterion variable can be better predicted by a set of variables than just one at a time (but you can still examine the individual relationships)
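A minimal sketch (hypothetical data) of predicting one criterion from a set of predictors via ordinary least squares:

```python
import numpy as np

# hypothetical criterion influenced by two predictors at once
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
y = 2.0 * x1 + 0.5 * x2 + rng.normal(scale=0.5, size=50)

X = np.column_stack([np.ones(50), x1, x2])     # intercept + predictor set
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)  # least-squares fit
print(coefs)  # [intercept, weight for x1, weight for x2]
```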
correlational strategy
- describes the direction and degree of the relationship between two or more variables
- variables are only measured, not manipulated (observations, surveys, physiological measures)
- high external validity
- unit of analysis is often time or a person - measuring both X and Y
- if data is numerical, represented in a scatterplot (each point is independent; one point per unit of analysis) with a regression line (line of best fit)
spearman rho interpretation
- weak: 0.21 - 0.40
- moderate: 0.41 - 0.60
- strong: 0.61 - 0.80
- very strong: 0.81 - 1.00
what relies on monotonicity?
- both Pearson and Spearman
what relies on linearity
- Pearson
which is robust to outliers
- Spearman
outliers
- data point that differs significantly from others in the set
- defined relative to the variability of the variable (e.g., 2-3 standard deviations from the mean)
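A minimal sketch (hypothetical scores) of flagging outliers beyond 2 standard deviations:

```python
import numpy as np

# hypothetical scores with one extreme value
scores = np.array([10, 12, 11, 13, 12, 11, 40])

z = (scores - scores.mean()) / scores.std()  # standardize each score
print(scores[np.abs(z) > 2])                 # flags the 40
```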
correlation significance
- typically p < 0.05 (less than 5% chance that this result is due to chance alone)
- df = n - 2 (n = number of pairs; subtract 2 for the two variables)
- consult a table: to be significant r must be equal or larger than the corresponding value of df and alpha level
- small sample sizes are prone to producing larger correlations, so the criteria for statistical significance are more stringent (when n = 2, r is always +/- 1)
- if the sample size is large enough, a small r could still be meaningful (in this case, evaluate for practical significance)
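A minimal sketch of the significance test, assuming a hypothetical r = .44 from n = 25 pairs:

```python
import math
from scipy.stats import t

r, n = 0.44, 25
df = n - 2                                 # subtract 2 for the two variables

t_stat = r * math.sqrt(df / (1 - r ** 2))  # convert r to a t statistic
p = 2 * t.sf(abs(t_stat), df)              # two-tailed p value
print(f"t({df}) = {t_stat:.2f}, p = {p:.4f}")  # significant at p < .05
```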
practical significance
- related to meaningful, real-world consequences of the observed correlation
advantages of correlations
- quick and efficient
- often the only method available (for practical or ethical reasons)
- high external validity (reflects natural events being examined)
limitations of correlations
- cannot say why two variables are related (correlations are often misinterpreted by the media which assumes causality)
- low internal validity
- very sensitive to outliers
- directionality and third variable problem
experimental research
- establishing a cause-and-effect relationship between two variables
- manipulation: changing level of the IV to create two or more treatment conditions (allowing us to determine directionality and causes through temporal ordering)
- control: all extraneous variables must be controlled to ensure they can’t be responsible for the change in DV (ensuring they don’t become confounds and produce a third variable problem - internal validity)
- extraneous variables only become confounds if they have an effect on the DV and vary systematically with the IV (only focus on controlling the important variables)
- creating an artificial/unnatural situation
hold constant method
- active method of control
- standardizing environment and procedures means that those variables are held constant for all participants (if they don’t vary, they can’t become confounds)
- often limiting to a range (like ages) for practical reasons
- limits external validity because you won’t be able to generalize beyond this range of values
matching method
- active method of control
- match subjects on pre-existing variables that may be related to differences in DV
- across levels of IV (make sure ages are balanced across treatment conditions or make sure the average ages are the same)
- can also be used to control environmental or time-related factors (counterbalancing)
- can be time consuming and impossible for all extraneous variables
randomization method
- passive method of control
- disrupting any systematic relationship between extraneous and independent variables by distributing extraneous variables across treatment conditions using a random process (random assignment)
- can also be used for environmental variables
- makes it unlikely to see systematic relationships between variables, but not impossible (small samples are more likely to be biased, should create groups of at least 20 participants per condition)
- variables that are especially important should be held constant or matched
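A minimal sketch (hypothetical participant pool) of random assignment to two conditions:

```python
import random

participants = [f"P{i:02d}" for i in range(1, 41)]  # 40 Ps = 20 per condition
random.shuffle(participants)  # disrupts any systematic ordering

treatment = participants[:20]
control = participants[20:]
```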
control conditions
- no-treatment (waitlist control) or placebo control (for showing effects beyond the placebo effect)
- not a necessary component to be considered experimental research (but control of extraneous variables is an essential component)
manipulation check
- measures whether the IV had the intended effect on the participant (how they interpreted/perceived the intervention)
- could be an explicit measure of the IV, or part of the exit questionnaire
- especially important in participant manipulations (was the intervention successful), subtle interventions (if it could go unnoticed by the participant), placebo controls (Ps must believe the placebo is real), simulation of a real-world situation (depends on their perception and acceptance)
conditions for establishing causality
1) time order: IV occurred before the effects in DV
2) co-variation: changes in the IV value must be accompanied by changes in the DV value
3) rationale/explanation: logical and compelling reason for the two variables being related (the mechanism behind the association)
4) non-spuriousness: only the IV caused the changes found in DV; rival explanations must be ruled out
between-groups design
- two or more samples/groups are formed randomly (each group composed of different participants) and assigned randomly to a different condition (level of IV)
- participation in only one level = one score on DV per participant (independent measures)
- lots of individual differences = more variability in the scores and potential confound (significant threat to internal validity)
- uses systematic and non-systematic variance to calculate the F-ratio (treatment index)
- use randomization, matching, holding constant to make sure groups are as similar as possible (distribute characteristics equally between groups, holding constant restricts external validity)
within-groups design
- only one sample/group is formed and each person participates in all conditions, which can be administered sequentially or all at once (values are compared across conditions within participants)
- possibility of time-related and environmental threats to internal validity
four basic elements of the experimental design
1) manipulation: creating conditions (levels of IV) to determine the direction of the effect and to help control the influence of other variables (ensuring that the IV isn't changing with another variable)
2) measurement (not unique to experiments)
3) comparison (not unique to experiments)
4) control: ruling out alternative explanations for changes in DV by not letting extraneous variables become confounds (if the variable affects all conditions equally, it’s just an extraneous variable - if only one condition is affected and it’s possible the variable is affecting the DV, it’s a confound)
categories of extraneous variables
- environmental: testing environment, time of day, etc.
- participant variables: gender, age, personality, IQ, etc.
- time-related variables: history, maturation, instrumentation, testing effects, regression to the mean
ways to control possible confounds
- remove them (not possible for all variables)
- hold them constant (include them in all conditions, can limit external validity)
- use a placebo control (or waitlist): if the experimental method itself is a confound (like delivering medication by injection - inject saline to the control group)
- match them across conditions (can limit external validity): creating balanced groups, averages, counterbalancing
- randomize them: powerful for controlling participant and environmental variables all at once, rather than individually
possible reasons an experiment didn’t work
- IV isn’t sensitive enough (not a wide enough variety of conditions)
- DV isn’t sensitive enough (scale not sensitive)
- IV/DV have floor or ceiling effects
- measurement error: methods you used are prone to error (you can control this)
- insufficient power: not enough participants
- hypothesis is wrong: compare with other studies
threats to internal validity in experiments
- history: a current event affected change in the DV
- maturation: changes in DV due to normal development processes
- statistical regression: subjects came from low- or high-performing groups, and subsequent testing generated scores closer to the mean
- selection: self-selected or randomly assigned?
- experimental attrition: more people dropped out of one condition than the other
- testing: previous testing affected behaviour at later testing (should counterbalance)
- instrumentation: measurement method changed during research
- design contamination: participants changing their behaviour according to their own hypotheses
other threats to external validity in experiments
- unique program features: experimenter in one condition creating a unique environment that isn’t present in the other condition
- effects of selection: was recruitment and assignment to conditions successful
- effects of environment/setting: can the results be replicated in other labs/environments
- effects of history: can the results be replicated in different time periods
external validity strategies in experiments
- simulation: trying to bring the real world into the lab (mundane realism: how close the lab environment is to the real world / experimental realism: only bringing the psychological aspects of the situation to create immersion)
- field studies: bringing the experimental strategy into the real world (examining behaviours that are difficult to replicate in a lab)
- strengths: testing hypotheses in realistic environments
- limitations: field studies make it difficult to control all extraneous factors, simulations are dependent on whether the participant believes the simulation is real
original simulations
- pre-virtual reality
- relied on hypothetical situations as IVs (vivid descriptions) and qualitative self-reports as DVs
current-day simulations
- VR headsets
- using realistic immersive stimuli as IVs and quantitative response measures as DVs
- VR influences emotional responses (will be more similar to the emotional response you might see in the real world)
perils for experimental designs
- much current research is atheoretical, but without theories, hypotheses are made ad hoc which can be illogical or meaningless
- many measurement instruments have not been tested for their reliability and validity and can be incomparable across studies - use pre-validated tasks when available
- sometimes uses inappropriate research designs = lacking internal validity - should conduct pilot tests with small samples to ensure the roles of IV and DV
- conditions or tasks may be inappropriate = threat to external validity (other participants would have responded differently) because you can’t compare across studies or generalize - instead use simple and familiar tasks
- should always conduct manipulation checks
individual differences effect on variability
- F ratio compares between-group differences to within-group variance
- with large individual differences, variance increases which can obscure a significant result
- big differences between groups are good (treatment effect), but big differences within groups are bad (variance)
ways to reduce within-group variance in between-group designs
- standardize procedures, keep environment constant
- limit individual differences by creating a more homogeneous group (reducing variance and threat of confounds, but also limits external validity - instead, use a factorial design)
- random assignment DOES NOT affect within group variance
- a larger sample size can reduce the standard error (but only in proportion to sqrt(n), so you need a very large increase in n)
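A minimal sketch of the sqrt(n) point, assuming a within-group SD of 10: the standard error only halves when the sample quadruples:

```python
import math

sd = 10.0  # assumed within-group standard deviation
for n in (25, 100, 400):
    print(n, sd / math.sqrt(n))  # 25 -> 2.0, 100 -> 1.0, 400 -> 0.5
```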
systematic variance in between-groups
- difference in DV between groups
- composed of treatment effect + experimental error (determine if between-group DV differences are due to only experimental error)
non-systematic variance in between-groups
- any differences between subjects who are treated alike
- scores varying within groups (individual differences occurring by chance)
- source of error to be minimized
- experimental error: chance factors not being controlled
treatment index for between-groups
- F = (between-group variance) / (within-groups variance)
- F = (treatment + experimental error) / experimental error
- F = systematic variance / non-systematic variance
- between-group variance > within-group = large and positive F ratio (good!)
- between-group variance ≤ within-group variance = F ratio near or below 1, no treatment effect beyond error (bad)
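A minimal sketch (hypothetical group scores) of the F ratio via a one-way ANOVA:

```python
import numpy as np
from scipy.stats import f_oneway

# hypothetical DV scores for two independent groups
group1 = np.array([4, 5, 6, 5, 4, 6, 5, 5])
group2 = np.array([7, 8, 8, 9, 7, 8, 9, 8])

F, p = f_oneway(group1, group2)     # between-group / within-group variance
print(f"F = {F:.2f}, p = {p:.4f}")  # large F = effect beyond experimental error
```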
single-factor multiple-group design
- for comparing multiple groups with one IV (single-factor ANOVA)
- provides stronger causal evidence than a two-group design (looking at components individually)
major sources of confounds in between-groups
- as a function of the design
- individual differences: want to make sure that groups are as similar as possible, except the IV (assignment bias: process of assignment produces groups with different characteristics like age = threat to internal validity)
- environmental variables: subjects are only tested once in one set of conditions, those characteristics could differ between groups = extraneous variable can become a confound
limiting confounds in between-groups designs
- randomization: most powerful way to ensure groups are as equal as possible before treatment (spreading them evenly)
random sampling vs. randomization
- sampling: random selection of participants from a larger population
- randomization: assignment of participants to experimental or control groups in a study
free random assignment
- each P has an equal chance of being in any condition (like a coin toss)
- theoretically should lead to equal groups, but no guarantee
- improbable that groups will be perfectly matched, but small differences are random and insignificant (neutralizing nuisance differences and maximizing between-group differences)
- with small samples, no guarantees
4 steps for matching
- identify the variables to be matched (potential confounds)
- measure and rank participants on the variables (pretest)
- segregate subjects into matched pairs
- randomly assign pair-members to conditions
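A minimal sketch of those four steps, assuming hypothetical pretest scores:

```python
import random

# step 2: measured pretest scores on the matching variable
pretest = {"P1": 85, "P2": 62, "P3": 84, "P4": 60,
           "P5": 73, "P6": 74, "P7": 91, "P8": 90}

ranked = sorted(pretest, key=pretest.get)                    # rank on pretest
pairs = [ranked[i:i + 2] for i in range(0, len(ranked), 2)]  # step 3: matched pairs

groups = {"treatment": [], "control": []}
for pair in pairs:
    random.shuffle(pair)  # step 4: random assignment within each pair
    groups["treatment"].append(pair[0])
    groups["control"].append(pair[1])
print(groups)
```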
matching across blocks
- extended to units larger than pairs
- groups of individuals are matched in blocks
- random assignment into groups from blocks
threats to internal validity between-groups
- differential attrition: participants leaving one group at a higher rate than for another group (may be affected by individual differences like motivation or environmental like time of day) = groups are no longer equal
- communication between groups: diffusion (treatment effects spreading from one group to another, or treatment effects masked by shared information); resentful demoralization (perceived inequity between groups)
advantages of between-subjects
- simple designs (scores are independent of each other)
- clean and uncontaminated by other treatment factors (no carryover or time-related)
- less time required for each participant
- establishes causality
disadvantages of between-subjects
- requires many participants (esp. if many treatment conditions)
- difficult to recruit from special populations
- individual and environmental differences
- limited generalization (external validity) when extraneous variables are held constant
- assignment bias, experimenter expectancy, subject expectancy (try to keep participants, experimenter, analyst blind)
when to use between-subjects designs?
- when carryover or comparison effects are expected
- when participants should be using similar anchoring across conditions
- when participants could become sensitized to measurement over time
- to conserve ecological validity when participants would not normally be exposed to all levels of the IV in real life
- if changes in measurement properties/tests over time are expected