METHODS Flashcards
Method definitions, theories, assumptions, limitations, procedures, tips and tricks, etc.
Demand Characteristics
When a participant changes their true response based on how they believe the experimenter wants the study to go.
Expectancy Effects
When the experimenter’s biases cause them (consciously or unconsciously) to sway the experiment to match how they expect it to go.
Regression Discontinuity Design
A type of quasi-experimental design where you create an artificial control group from people who fall close to a particular cutoff score, given that the cutoff score elicits the treatment (e.g., comparing outcomes of students with a grade of 79% to those at 80% who received a scholarship).
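A minimal simulation of the RDD logic in Python (the cutoff, bandwidth, and all numbers are hypothetical, not from the source): units just below and just above the cutoff should be comparable except for treatment, so their outcome difference estimates the treatment effect.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: exam scores and a later outcome; students scoring
# >= 80 receive a scholarship (the "treatment").
scores = rng.uniform(50, 100, 5000)
treated = scores >= 80
true_effect = 0.30
outcome = 2.0 + 0.02 * scores + true_effect * treated + rng.normal(0, 0.2, 5000)

# RDD logic: compare units just below vs. just above the cutoff,
# where assignment is as good as random.
bandwidth = 1.0  # an assumed tuning choice
just_below = outcome[(scores >= 80 - bandwidth) & (scores < 80)]
just_above = outcome[(scores >= 80) & (scores < 80 + bandwidth)]
print("estimated treatment effect:", just_above.mean() - just_below.mean())
```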
What are the 5 types of Quasi-experimental design?
Single group without a control
Multigroup without a pretest
Multigroup with control and pretest
Time series designs
Regression discontinuity
What are the 3 ingredients necessary for a randomized experimental design?
Random assignment
Manipulation of the independent variables
Measurement of the dependent variables
Grounded Theory of Causal Generalization
Shadish, Cook, & Campbell (2002)
1. Surface similarity
2. Ruling out irrelevancies
3. Making discriminations
4. Interpolation & extrapolation
5. Causal explanation
Multicollinearity
When there is a very high (or even perfect) correlation between independent variables; a good indication that the resulting estimates are not reliable. (Edwards, 2008)
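A quick sketch of how multicollinearity is detected in practice (assuming the common variance-inflation-factor rule of thumb; the variables are made up):

```python
import numpy as np

rng = np.random.default_rng(1)

# Two nearly redundant predictors: x2 is x1 plus a little noise.
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)

# Variance inflation factor: VIF_j = 1 / (1 - R^2_j), where R^2_j comes
# from regressing predictor j on the remaining predictors; with only two
# predictors, R^2_j is simply their squared correlation.
r = np.corrcoef(x1, x2)[0, 1]
vif = 1 / (1 - r**2)
print(f"r(x1, x2) = {r:.3f}, VIF = {vif:.1f}")  # VIF >> 10 flags trouble
```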
Construct Validity
the degree of correspondence between the constructs (i.e., latent traits) referenced by a researcher and their empirical realization (i.e., operationalization/measure); in other words, is it measuring what it claims to be measuring? (Stone-Romero, 2011)
Bias
systematic variance in an operational definition that is unrelated to the focal construct (Stone-Romero, 2011)
Unreliability
nonsystematic (random) variance in an operational definition (Stone-Romero, 2011)
Deficiency
when the construct's empirical realization (operationalization/measure) does not fully capture the essence of the construct (Stone-Romero, 2011)
Special Purpose vs. Non-Special Purpose Settings
More appropriate than a field vs. lab distinction.
SP: created for the specific purpose of doing research (e.g., university labs, simulated work settings in an org); designed to allow for the effective (unconfounded) manipulation of one or more independent variables.
NSP: created for purposes other than research (e.g., organizations, classrooms, churches, etc.)
Probability sampling
workers are selected from a population (with N members) in a way that ensures that the probability of selecting a sample of a given size (e.g., n = 10) is equal to the probability of selecting any other sample of the same size; i.e., all people in the population have an equal chance of being selected to participate. Rarely occurs in IO for many obvious reasons.
Stratified random sampling
a researcher specifies strata of employees within an org (e.g., managerial, nonmanagerial) and then randomly selects a specific number of workers from each stratum of interest
multistage cluster sampling
a researcher a) selects a random sample of units in one or more clusters (e.g., private vs. public-sector, profit vs. nonprofit) and then b) randomly selects clusters of orgs from each larger cluster
nonprobability sampling
uses nonrandom strategies for selecting sample members from a target population (e.g., convenience sampling, purposive sampling of heterogeneous instances, systematic sampling, quota sampling)
convenience sampling
selecting a nonrandom sample based on their availability to participate in a study (e.g., intro to psych students)
purposive sampling of heterogeneous instances
selecting a nonrandom sample of members on the basis that the researcher believes that they are diverse in terms of characteristics that might influence a causal relation between variables (Shadish et al., 2002)
systematic sampling
select sample members in a methodical way from lists of assumed population members (e.g., every 10th person in the SIOP directory)
quota sampling
selecting specific numbers of sample members of different types to produce a sample that is roughly representative of the target population
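A small sketch contrasting three of the sampling schemes defined above (stratified, systematic, and quota; the population, strata, and quota sizes are all hypothetical):

```python
import random

random.seed(42)

# Hypothetical directory of 1,000 employees tagged by stratum.
population = [{"id": i, "stratum": "managerial" if i % 5 == 0 else "nonmanagerial"}
              for i in range(1000)]

# Stratified random sampling: a separate random draw from each stratum.
strata = {"managerial": [], "nonmanagerial": []}
for person in population:
    strata[person["stratum"]].append(person)
stratified = (random.sample(strata["managerial"], 10)
              + random.sample(strata["nonmanagerial"], 10))

# Systematic sampling: every 10th person on the assumed population list.
systematic = population[::10]

# Quota sampling: fill a fixed quota of each type nonrandomly
# (here, simply the first ones encountered).
quota = strata["managerial"][:10] + strata["nonmanagerial"][:10]

print(len(stratified), len(systematic), len(quota))
```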
Critiques of rigorous IO research
a) has little to no practical utility
b) does not inspire confidence in the work of scientists
c) distorts important aspects of reality in orgs
d) fails to consider phenomena that are not readily measurable
e) has low relevance to the criteria that are important to practitioners
f) doesn't offer much more than common sense/common knowledge
g) is artificial/irrelevant/trivial
h) has low operational validity (i.e., uses variables that practitioners can't control realistically or ethically)
i) focuses on "proving" the truth of hypotheses rather than on problem solving
j) is untimely (deals with issues that change more rapidly than the knowledge it produces)
k) involves participants who are not representative of the populations to which generalizations are to be made
(Stone-Romero, 2011; Thomas & Tymon, 1982)
Critique of NSP Causal Claims
- harder to control extraneous and confounding variables, even when using randomized experiments or quasi-experiments
- the claim that there is more mundane realism and thus more external validity doesn't always hold up
- typically involve nonrepresentative samples of subjects, settings, and operational definitions of manipulations/measures
- Locke (1986) showed evidence that relations found in SP settings typically hold up in NSP settings
Mundane Realism
Is it consistent with everyday realities (e.g., adding health benefits, vacation time, etc. to simulated orgs even if unrelated to the study)? If a study lacks mundane realism, an effect found in SP settings may not generalize to NSP settings (Stone-Romero, 2011)
Experimental Realism
the manipulations/measures are operationally defined such that they have a sufficient, noticeable, and legitimate impact on study participants to show an effect; if a study lacks this, it may fail to show support for causal connections between variables even when one exists (i.e., why manipulation checks are important) (Stone-Romero, 2011)
3 Major Attributes of Randomized Experimental Designs
1. Manipulation of IVs (gives confidence about temporal precedence: causes preceding effects)
2. Random assignment of units (individuals, teams, orgs) to study conditions (gives confidence that results are not due to confounds, i.e., that conditions are equivalent)
3. Measurement of DVs (outcome measures must be reliable or effects may not be found)
Limitations of Randomized Experimental Designs
1. You just can't manipulate some variables (e.g., sex)
2. Some variables that can be manipulated shouldn't be, on ethical grounds (e.g., mental health)
3. In NSP settings, you may not be able to randomly assign units
4. In NSP settings, it may not be possible to isolate units (spatially), so you can't be positive about treatment effects
5. Participants may refuse to be assigned to experimental conditions on a random basis
6. It may be unethical to withhold beneficial treatments from units (e.g., a control group receiving no training/treatment they should really get)
Threats to experimental design internal validity
Differential attrition across conditions, history, testing, resentful demoralization, and the interaction of two or more threats (Stone-Romero, 2011)
3 Attributes of Quasi-Experimental Designs
1. Manipulation of IVs
2. Nonrandom assignment of units to conditions (why it doesn't hold as much power for making causal inferences)
3. Measurement of assumed DVs (assumed because units are not randomly assigned, so we can't be sure the outcomes are caused by the IV, only assume so)
Nonexperimental Designs
quantitative & qualitative research
Correlational study
studies that consider relations between (among) variables shouldn't be referred to as correlational studies, because correlation is a statistical technique, not an experimental design
Nonexperimental Design Attributes
1. Measurement of assumed IVs (not manipulating them makes them assumed and makes temporal precedence hard to prove)
2. Nonrandom assignment of units to assumed conditions (collecting data from participants with an assumed level of the IV that could have stemmed from various causes)
3. Measurement of assumed DVs (units are not randomly assigned, so we can't be sure the DVs are caused by the IVs)
Limitations of Nonexperimental Designs
You really can't make causal claims/inferences because of the lack of internal validity and the inability to prove temporal precedence or control confounds (you are only able to confidently demonstrate correlation/covariation)
Design vs. Method
Research Design and Statistical Methods are independent; Studies of each type of design can use a variety of statistical methods to analyze the data.
Prediction vs. Causal Claims
Prediction is a statistical term meaning that information on a set of predictor variables can be used to predict the value of a criterion of interest. This does not equate to evidence of a cause-and-effect relationship.
Causal claims
Need to be able to prove temporal precedence and covariation, and to rule out alternative explanations (control for confounds). Internal validity is driven by experimental design and cannot be compensated for with external validity. The use of causal models (e.g., regression, MLM, SEM) does not justify making causal claims/inferences.
Most frequent IO designs
Nonexperimental studies (esp. quantitative studies) are most common, with NSP (field) settings being the most common (Stone-Romero, 2011). Quasi-experimental designs seem to be the least common.
“Ranking” of Validity
If a study does not have internal, construct, and statistical conclusion validity, its external validity is irrelevant.
4 types of Validity
Internal validity (relating to the existence of cause-effect relationships between variables)
Construct validity (correspondence of constructs with their operationalizations)
Statistical conclusion validity (the correctness/confidence/truth of statistical estimates derived from the study)
External validity (the extent to which relations generalize across different settings and populations)
(Stone-Romero, 2011)
Stone-Romero (2011) Suggestions for Minimizing Common Method Variance
1) obtain predictor and criterion data from different sources
2) separate the times at which predictor and criterion variables are measured
3) counterbalance the order in which variables are measured
Threats to Statistical Conclusion Validity
a) low statistical power (e.g., small sample sizes)
b) failing to meet the assumptions of the statistical test
c) conducting a large number of statistical tests using a nominal Type I error rate that is lower than the actual (effective) Type I error rate
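A quick illustration of threat (c), assuming independent tests: with k tests each run at a nominal alpha of .05, the chance of at least one false positive is 1 - (1 - alpha)^k.

```python
# Effective (family-wise) Type I error rate across k independent tests.
alpha = 0.05
for k in (1, 5, 20):
    print(f"{k:>2} tests -> effective alpha = {1 - (1 - alpha) ** k:.3f}")
# 20 tests -> ~0.642, far above the nominal .05 the researcher reports
```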
Statistical significance & practical importance
They are not mutually exclusive. There are "non-significant" effects that are practically important, and there are significant effects whose effect sizes are so small that they are not practically important. See Wasserstein et al. (2019) for more on why p-values/statistical significance are problematic.
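A quick demonstration that statistical significance and practical importance can diverge (simulated data; the effect size is deliberately trivial):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# With a huge sample, even a trivially small correlation is "significant".
n = 1_000_000
x = rng.normal(size=n)
y = 0.01 * x + rng.normal(size=n)  # true r of roughly .01

r, p = stats.pearsonr(x, y)
print(f"r = {r:.4f}, p = {p:.2e}")  # p far below .05, yet r^2 is ~0.0001
```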
Why is MTMM problematic?
1) Assumes traits and methods are uncorrelated
2) No set "rules" for what counts as satisfying its criteria (Stone-Romero, 2011)
Detecting Method Variance
1) MTMM 2) CFA
4 Attributes of Qualitative Research
1. Occurs in natural settings (most of the time)
2. Data are derived from the participants' perspective
3. Should be flexible (i.e., dynamic)
4. Qualitative instruments, observation methods, and modes of analysis are not standardized; the researcher serves as the main research instrument
Typical Purpose of Qualitative Research
Theory generation or Theory elaboration; not as suitable for theory testing (Lee et al., 2011)
3 most common qualitative designs in IO
Case studies
Ethnographies (observations)
In-depth interviews
Grounded Theory approach to Qualitative Research
on a regular, iterative basis it requires: a) hypothesis generation from data, b) testing the hypotheses on new data that may elicit revisions of the original hypotheses, and c) testing the revised hypotheses on new data; this continues until theoretical saturation is reached and no new learnings occur (Lee et al., 2011)
Qualitative vs. Quantitative questions
1. What is occurring? How is it occurring? What constructs should I use to explain it?
2. What is the prevalence of this? Does this generalize?
Qualitative limitations
1. Very time intensive
2. Can be professionally risky (takes away from other research methods, takes so long, harder to publish)
3. Some org scholars claim we have too much theory, but it is possible to test theory through qualitative means as well (Lee et al., 2011)
MTMM Rule for Showing Convergent Validity
The validity diagonal must not equal 0, and its values should be sufficiently large (Campbell & Fiske, 1959)
MTMM Rules for Discriminant Validity
1. The validity diagonal > the adjacent rows/columns in the heterotrait-heteromethod triangles
2. The validity diagonal > the corresponding heterotrait-monomethod triangles
3. The same pattern of trait interrelationships should appear in all heterotrait triangles in both the monomethod and heteromethod blocks
(Campbell & Fiske, 1959)
Calculating Method Variance with MTMM
method variance = heterotrait-monomethod values minus heterotrait-heteromethod values (Campbell & Fiske, 1959)
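A toy worked example of that subtraction (the matrix values are invented for illustration; 2 traits x 2 methods):

```python
import numpy as np

# MTMM correlation matrix, ordered T1M1, T2M1, T1M2, T2M2.
R = np.array([
    [1.00, 0.45, 0.55, 0.20],
    [0.45, 1.00, 0.22, 0.60],
    [0.55, 0.22, 1.00, 0.40],
    [0.20, 0.60, 0.40, 1.00],
])

# Same method, different traits vs. different method AND different trait.
heterotrait_monomethod = (R[0, 1] + R[2, 3]) / 2
heterotrait_heteromethod = (R[0, 3] + R[1, 2]) / 2

# The excess correlation shared by same-method measures is attributed
# to method variance.
print("method variance estimate:",
      heterotrait_monomethod - heterotrait_heteromethod)  # 0.425 - 0.21
```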
MTMM Tips and Tricks
- You can view the heterotrait-monomethod triangles as method factors; if theirs is the highest correlation, method is mostly what's driving it
- The smaller the values in the heterotrait-heteromethod triangles, the more independence there is (whether of the traits or the methods)
- If the heterotrait-heteromethod triangles are close to equivalent to the validity diagonal, that implies there is no discriminant validity
- If the heterotrait-monomethod triangles are about equal to the reliability values, that implies there is no discriminant validity
(Campbell & Fiske, 1959)
MTMM logic summary
Measures of the same trait should correlate more highly with each other than measures of different traits using different methods. Ideally, the validity values should also be higher than the correlations among different traits measured by the same method. **These conditions are rarely met, so in the end it is a subjective judgment about what is "valid enough" (Campbell & Fiske, 1959)
MTMM Assumptions
1. Traits and methods are independent/uncorrelated
2. You are interpreting validities with the reliability of the methods in mind
3. Unreliable methods make the whole procedure a bit pointless
It is concerned with the adequacy of tests as measures of a construct, rather than with the adequacy of the construct itself (Campbell & Fiske, 1959)
Reliability and Validity in relation to agreement
Reliability is the agreement between two efforts to measure the same trait through maximally similar methods. Validity is represented in the agreement between two attempts to measure the same trait through maximally different methods. (Campbell & Fiske, 1959)
Topics/Trends of 1917-1925
- Objectivity and methods to solve important problems
- Cognitive ability tests
- Interest in the exceptional
- beginnings of statistical significance and prediction models
- sources of rating errors (e.g., halo (Thurstone, 1920) and guessing)
- focus on psychometric properties, classification, and test equivalence (Cortina et al., 2017)
Topics/Trends of 1925-1945 (The Depression and WWII)
- Test scoring methods, test form equivalence, and cross-validation
- assessment of non-cognitive testing (e.g., social/emotional intelligence, values, interests)
- limitations of self report measures
- Shift to a hypothetico-deductive method (i.e., hypothesis testing and theory driven studies)
- properties of distributions (e.g., growth curves, Poisson)
- Reflections on the past: too much survey work, need to expand samples of interest (Cortina et al. 2017)
Topics/Trends 1946-1969
- Predictive power of personality measures (esp. MMPI)
- Social desirability and faking
- development of many new measures (OPQ, PAQ)
- literature reviews become common
- science of job attitudes
- statistical significance over effect size (lol, the start of the issue)
- smaller font for method sections (showing the shift of focus to theory at the expense of methods)
- Reflections on the past: we stopped studying the exceptional, decline in practitioner authorship, shortened time perspective for the field (Cortina et al. 2017)
Topic Trends 1970-1989
- Focus on measure development (job characteristics, work values, job involvement)
- IRT
- self vs. supervisor reports,
- # of points and anchors in a rating scale
- increased focus on methods-specific papers
- data-analytic innovations of meta-analysis and SEM (Cortina et al. 2017)
Topic/Trends of 1990-2014
- Methodological plurality (no single approach, no matter its potency, is the right answer for all questions/research)
- Broadening of methodological choices
- Introduction of methods-oriented journals
- Publication of reactive and prescriptive reviews of methodological practices
- Realization that data-analytic approaches do not offer a panacea for research problems
- Subgroup differences
- Levels of analysis, aggregation, and MLM
- Importance of effect sizes
- Restoration of method-section font size (Cortina et al., 2017)
Suggestions for 2017 and Beyond
- Replication of research
- Understanding the exceptional, both good and bad
- Identifying inappropriate methods in the review process (method experts review each paper; pre-review before starting)
- Better research design and measurement
- More specific theories that are testable, rather than overly vague ones that could never be fully validated (Cortina et al., 2017)
Critique of Maximum Likelihood Estimation for Validity
- Only uses sample-derived information in making inferences/decisions about test validity, ignoring all relevant info accumulated in the past (Schmidt & Hunter, 1977)
Bayesian Approach for Validity Generalization
- utilizes sample data and prior knowledge in estimating validity and weighs each in proportion to its information value
- a prior distribution is combined with sample info to create a posterior distribution
- the mean (or mode) of the posterior distribution is taken as the estimate of the parameter of interest (i.e., test validity)
(Schmidt & Hunter, 1977)
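A minimal sketch of that precision-weighted logic (the numbers are hypothetical, and this collapses the full distributional machinery into two point estimates):

```python
# Combine a prior estimate of test validity with a new local study,
# weighting each by its information value (precision = 1 / variance).
prior_mean, prior_var = 0.30, 0.01   # accumulated past evidence
sample_r, sample_var = 0.45, 0.04    # new validity study

w_prior, w_sample = 1 / prior_var, 1 / sample_var
posterior_mean = (w_prior * prior_mean + w_sample * sample_r) / (w_prior + w_sample)
posterior_var = 1 / (w_prior + w_sample)
print(f"posterior validity estimate: {posterior_mean:.3f} (var {posterior_var:.4f})")
# The better-informed prior dominates: the estimate lands nearer 0.30.
```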
Why is Bayesian for Validity a good idea?
- the Bayesian priors used are based on data (i.e., not subjective) and incorporate all past relevant info
- the assumptions made about between-study variance in criterion reliability and range restriction are conservative
- certain sources of error variance in the obtained distribution are not corrected for, making it even more conservative
- corrections made to the mean of the prior for average range-restriction effects are "probably" conservative
- the procedure provides a parsimonious, sophisticated, and technically sound solution to the overarching problem of validity generalization
- the model can be extended to provide an improved method of data analysis and decision making in criterion-related validity studies
- when slightly modified, it provides a tool that can lead to the establishment of general principles about trait-performance relationships (Schmidt & Hunter, 1977)
Correcting for error variance due to sample size
Find the average sample size across published studies, then take 1/(N - 3) to get the estimated variance due to sampling error (Schmidt & Hunter, 1977)
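A sketch of that step (the sample sizes are made up; 1/(N - 3) is the sampling variance of a Fisher z-transformed correlation):

```python
# Average the sample sizes across studies, then estimate the variance
# in observed validities expected from sampling error alone.
sample_sizes = [68, 120, 45, 90, 77]
n_bar = sum(sample_sizes) / len(sample_sizes)
error_variance = 1 / (n_bar - 3)
print(f"mean N = {n_bar:.0f}, sampling-error variance = {error_variance:.4f}")
```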
Notion for correcting for error variance (i.e., reliability and range restriction) in validity generalization
If you take a bunch of validity coefficients, standardize them into z scores, and then subtract the error variance, and the variance of the distribution of the validity coefficients turns out to be 0, that essentially shows there is no true variation in validity; therefore, if we correct for these sources of error up front, we are more likely to find the true validity (Schmidt & Hunter, 1977)
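The same notion as a runnable sketch (illustrative coefficients and Ns; real applications also correct for unreliability and range restriction):

```python
import numpy as np

# z-transform observed validities, subtract the variance expected from
# sampling error, and see whether any "true" variance remains.
rs = np.array([0.28, 0.35, 0.22, 0.31, 0.26])
ns = np.array([68, 120, 45, 90, 77])

zs = np.arctanh(rs)                # Fisher r-to-z
observed_var = zs.var(ddof=1)
error_var = (1 / (ns - 3)).mean()  # expected sampling-error variance
print(f"observed = {observed_var:.4f}, error = {error_var:.4f}, "
      f"residual 'true' variance = {observed_var - error_var:.4f}")
# A residual at or below zero suggests the validity generalizes.
```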
Correction Warning
corrected variance tends to overestimate rather than underestimate (Schmidt & Hunter, 1977)
Common Method Variance
variance that is attributed to the measurement model rather than to the constructs the measures represent (Podsakoff et al., 2003)
Effects of Common Method Variance
Method effects inflate the observed relationship between constructs when the correlation between the methods is higher than the true correlation between the constructs; they deflate the observed relationship when the method correlation is lower than the true construct correlation (Podsakoff et al., 2003)
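A simplified numeric version of that claim, assuming a trait-plus-method factor model with equal loadings and orthogonal trait/method factors (all values hypothetical):

```python
# Observed r blends the true trait correlation with the correlation
# induced by shared method variance.
trait_loading, method_loading = 0.8, 0.5
r_traits = 0.30  # true construct-level correlation

for r_methods in (0.9, 0.0):  # strongly shared vs. unshared method variance
    r_observed = trait_loading**2 * r_traits + method_loading**2 * r_methods
    print(f"method corr {r_methods:.1f} -> observed r = {r_observed:.3f}")
# 0.9 -> 0.417 (inflated); 0.0 -> 0.192 (deflated relative to 0.30)
```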
Common method biases causes
Common rater
Common measurement context
Common item context
Characteristics of the items themselves
(Podsakoff et al., 2003)
Stages of the Response Process
Comprehension
Retrieval
Judgment
Response selection
Response reporting
(Podsakoff et al., 2003)
Comprehension Stage Potential Method Bias
Item ambiguity
Retrieval Stage Potential Method Bias
measurement context, question context, item embeddedness, item intermixing, scale size, priming effects, transient mood states, and item social desirability
Judgment stage potential method biases
consistency motif (in an attempt to increase accuracy in the face of uncertainty), implicit theories, priming effects, item demand characteristics, and item context-induced mood states
Response selection potential method biases
common scale anchors and formats; item context-induced anchoring effects
Response reporting potential method biases
consistency motif (in an attempt to appear rational), leniency bias, acquiescence bias, demand characteristics, and social desirability
Two main ways to control for method biases
a) Study design
b) Statistical controls
Never rely on one or the other alone; a combo of both is best
Common Rater Effects
Consistency motif
Implicit theories (& illusory correlations)
Social desirability
Leniency bias
Acquiescence biases (yea-saying and nay-saying)
Mood states (i.e., PANAS)
Transient mood states
Item Characteristic Effects
Item social desirability
Item demand characteristics
Item ambiguity
Common scale formats
Common scale anchors
Positive and negative item wording
Item Context Effects
- Item priming effects
- item embeddedness
- context-induced moods
- scale length (shorter is better for fatigue/boredom, but may enhance the likelihood of previous items influencing responses)
- Intermixing (or grouping) of items or constructs on the questionnaire
Measurement Context Effects
Predictor and criterion variables measured at the same point in time
Predictor and criterion variables measured in the same location
Predictor and criterion variables measured using the same medium
Critique of Harman's Single-Factor Test
The use of EFA/CFA to see whether most of the variance loads on one factor is problematic because it assumes that, if there is method variance, it would account for most/all of the covariance, which is a large and inappropriate assumption; it also doesn't control for the method variance, just identifies it. *Yet it is still one of the most widely used methods for detecting method bias (Podsakoff et al., 2003)
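What the (problematic) test actually does, sketched with PCA on a correlation matrix as a stand-in for an unrotated factor analysis (data are simulated):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical item responses: 300 respondents x 8 items.
items = rng.normal(size=(300, 8))
R = np.corrcoef(items, rowvar=False)

# Harman's logic: enter all items into one unrotated factor analysis and
# check how much variance the first factor absorbs.
eigvals = np.linalg.eigvalsh(R)[::-1]  # eigenvalues, largest first
print(f"first factor explains {eigvals[0] / eigvals.sum():.1%} of variance")
# A majority share is read (problematically) as evidence of method bias.
```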
Partial Correlation Procedures for Method Bias
Partialling out social desirability as a surrogate for method variance
Partialling out a "marker" variable that is unrelated to the construct of interest
Partialling out a general methods factor
Each approach compares the structural parameters with and without these factors to determine potential effects (Podsakoff et al., 2003)
Controlling for a measured latent methods factor
items are allowed to load on their theoretical constructs and on a latent method factor (with its own measurement component), and the significance of the structural parameters is examined with and without the latent factor; the latent factor is assessed by a surrogate variable (e.g., social desirability) assumed to represent CMV (Podsakoff et al., 2003)
Controlling for an unmeasured latent methods factor
Items are allowed to load on their theoretical constructs and on a latent common method variance factor; structural parameters are compared with and without the methods factor; the variance of the responses is partitioned into trait, method, and random error (Podsakoff et al., 2003)
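A sketch of how such a model is typically specified (lavaan-style syntax inside a Python string; the variable names are hypothetical, and a string like this could be passed to an SEM package such as R's lavaan or Python's semopy):

```python
# Every item loads on its substantive construct AND on one unmeasured
# common method factor; fit the model with and without `method` and
# compare the structural parameters.
model_spec = """
predictor =~ p1 + p2 + p3
criterion =~ c1 + c2 + c3
method    =~ p1 + p2 + p3 + c1 + c2 + c3  # unmeasured CMV factor
method ~~ 0*predictor                      # method orthogonal to traits
method ~~ 0*criterion
criterion ~ predictor                      # structural path of interest
"""
```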
CFA x MTMM
Combines CFA and MTMM to partition variance into trait, method, and random error, permitting researchers to control for both method variance and random error when examining relationships between predictor and criterion variables (Podsakoff et al., 2003)
Correlated Uniqueness Model
Each observed variable is caused by only one trait factor and a measurement error term; there are no method factors; the model accounts for method effects by allowing the error terms of constructs measured by the same method to be correlated (Podsakoff et al., 2003)
Direct Product Model
This model assumes the trait measures interact multiplicatively with the methods of measurement to influence each observed variable, and that the stronger the correlation between the traits, the more the intercorrelation between the traits will be influenced by shared method biases. (Trait and method components can't be separated in the analysis, so you can't control for method factors while testing for relationships, and it does not test for interactions after first controlling for main effects.) (Podsakoff et al., 2003)
Partial Correlation Approaches Disadvantages
Fails to distinguish method bias at the measurement level from method bias at the construct level
Ignores measurement error in the method factor
Only controls for a single source of method bias at a time
Ignores method x trait interactions
- Besides Harman's single-factor solution, these are the worst of the solutions (Podsakoff et al., 2003)