Evaluating Social Interventions – Flashcards
What is ‘programme theory’ and how is it useful in the evaluation of social interventions?
Programme theory: the process by which we expect change to occur; our hypothesized pathway from intervention to outcome
This can be based on theories, empirical evidence, and often our own beliefs regarding what ‘sounds good’ (i.e., face validity)
The programme theory will be, in part, determined based on our understanding of the problem theory: What is the problem, for whom, why is it a problem, and what aspects of the problem are malleable (changeable)?
Clarifies assumptions of how the intervention works
Promotes understanding, encourages debate, and allows for evaluation
Helps evaluators think critically about assumptions and unintended consequences
Mapping the “programme theory” can help to identify key “micro-steps” that (we think) need to be achieved for the programme to have the anticipated outcome – “logic models”
A breakdown at any of the micro-steps may be enough to prevent the programme taking the anticipated effect
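A toy sketch (not from the source; the micro-step names are purely illustrative) of this logic-model idea: the anticipated outcome is reached only if every micro-step in the chain is achieved, so a breakdown at any single step is enough to block it.

```python
# Toy illustration of a logic model as an ordered chain of micro-steps
# (hypothetical step names; a real programme would specify its own).
micro_steps = [
    "intervention delivered as intended",
    "participants attend and engage",
    "knowledge or skills change",
    "behaviour changes",
    "anticipated outcome achieved",
]

def outcome_reached(step_achieved):
    """The outcome is reached only if every micro-step in the chain is achieved."""
    return all(step_achieved.get(step, False) for step in micro_steps)

# A breakdown at a single step (e.g., poor engagement) blocks the anticipated effect.
status = {step: True for step in micro_steps}
status["participants attend and engage"] = False
print(outcome_reached(status))  # False
```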
What are the challenges in conducting mediator and moderator analyses in social intervention research?
Mediator analyses:
o Ask how and why interventions work (mediator = link between)
Moderator analyses:
o Ask for whom interventions work (moderator = alters, modifies effects)
o Are there differential effects for different subgroups?
o Important implications for ‘equity’ of effects; key topic in public health (Petticrew, 2011)
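A minimal sketch of how a moderator analysis is usually operationalised: a treatment-by-subgroup interaction term in a regression of the outcome. The data are simulated and all variable names (treat, low_ses, outcome) are hypothetical; it assumes pandas and statsmodels are available.

```python
# Minimal moderator sketch on simulated data (variable names hypothetical).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treat": rng.integers(0, 2, n),    # randomised allocation (0 = control, 1 = intervention)
    "low_ses": rng.integers(0, 2, n),  # baseline characteristic, the candidate moderator
})
# Simulate an outcome in which the treatment effect is weaker in the low-SES subgroup.
df["outcome"] = 0.5 * df["treat"] - 0.3 * df["treat"] * df["low_ses"] + rng.normal(0, 1, n)

# The coefficient on treat:low_ses estimates the differential (subgroup) effect.
moderation = smf.ols("outcome ~ treat * low_ses", data=df).fit()
print(moderation.summary().tables[1])
```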
Caveats to conducting mediator analyses:
o Even within trial, potential confounding – you haven’t randomised to the mediator, only to the whole package (but, at least there is lots of the mediator happening within the package; can also test a competing explanation)
o Mediator analyses often exploratory, not the main aim of the trial; study often not powered for this – need to proceed with caution
o Longitudinal – but need 3 time points to fully separate intervention, mediator and outcome for somewhat stronger causal inference
Long history of misgivings about moderator analyses in trials:
o “Only one thing is worse than doing subgroup analyses – believing the results” (Richard Peto)
o To prevent ‘cherry picking’ of results we need explicit pre-specification of hypotheses—confirmatory or exploratory—plus rationale (Rothwell, 2005; Wang and Ware, 2013)
o Need pre-registering – ORCHIDS a model for us all
Limitations of analyzing moderators in systematic reviews:
o Power still low: although total Ns may be higher, subgroups are coded at trial level, meaning all variability within trials of participant characteristics is masked (see Thompson and Higgins, 2005, Lancet)
o Cherry picking: risk lower?
o Moderators still confounded (Lipsey, 2003); one meta-analysis attempted to overcome this: Leijten et al., 2013, examined how SES moderator effects were confounded by other risk factors, such as problem severity
Conclusions: for whom do social interventions work and how?
o Secondary analysis of RCTs can answer questions about ‘For whom does it work’ (MODERATORS) and ‘How’ or ‘What are the critical ingredients’ (MEDIATORS)
o Can derive mediator hypotheses from theory, or from qualitative methods; e.g., users’ views on process (Bonell et al., 2012)
o Secondary analyses can be very useful, but important to be aware of their limitations – power and cherry picking, failure to replicate
o Can do systematic reviews of moderators, but NB meta-analyzing subgroups brings new problems—better to pool individual level data to give more power, and make use of full data on variability between individuals
Why are process evaluations necessary when evaluating social interventions?
“In order for evaluations to inform policy and practice, emphasis is needed not only on whether interventions ‘worked’ but on how they were implemented, their causal mechanisms, and how effects differed from one context to another” (Moore et al., 2014)
Process evaluation = study aiming to understand the functioning of an intervention, by examining:
o Implementation: structures, resources and processes of delivery (fidelity, uptake, adaptations, dose, reach)
o Mechanisms of impact: how intervention activities, and participants’ interactions with them, trigger change
o Context: external factors influencing the delivery and functioning of interventions (culture, economic context, infrastructure, etc.)
o Research processes: randomization, spill-over, etc.
Process evaluations are complementary to outcome evaluations – they are not a substitute
Key functions of process evaluation (Moore 2015)
WE NEED PROCESS EVALUATIONS...
To understand (in)effectiveness
o It is not enough to conclude that an intervention ‘works’ or ‘does not work’ – we need to delve deeper: was the programme effective for everyone?
o Part of applying rigorous scientific methods is to allow for replication studies to confirm (or contradict) findings – we cannot do this if we are unsure what was even implemented in the first instance
To avoid Type III Error
o Drawing conclusions about ‘what works’ when the data is not based on implementation of the intended programme
o Need to be certain that the intended programme was actually delivered before subsequently drawing any conclusions about how effective it was (or wasn’t)
To accumulate evidence in systematic reviews
o Risk of treating intervention and control conditions as monoliths
o We need to be aware of and incorporate implementation heterogeneity into our analyses and cannot do this if the information is not collected and reported
o Opens up opportunities to examine differential effects – e.g., moderator analyses for various key subgroups
When should we conduct process evaluations?
o During feasibility studies
o Alongside effectiveness evaluations
o While scaling-up efforts of interventions with evidence to support their effectiveness
Why are randomized controlled trials criticized as an approach for evaluating social interventions?
Definition of Randomized Controlled Trials (RCTs):
o A planned intervention study in which each member of a study population has the same chance of receiving one or more experimental or control treatments
o Randomisation is the only unique feature of RCTs
o Randomisation can be achieved by any procedure that assigns people (or units) to conditions based on chance alone
Equipoise (a key feature of RCTs): Genuine uncertainty about the relative merits of the treatments being compared (Freedman, 1987); unethical otherwise
Easier to satisfy equipoise under certain conditions:
o Demand outstrips supply
o When many participants express no preference
o When a lottery is expected
Important design features – randomization:
o Allocation to the intervention and comparison groups should be unbiased with respect to prognosis and responsiveness to treatment; it is not determined by the investigators, the clinicians/practitioners, or the study participants
o The measured and unmeasured, known and unknown prognostic factors and other characteristics of the participants at the time of randomisation will be, on average, evenly balanced
o Thus, any threats to internal validity are evenly distributed across conditions; alternative explanations for the outcome are implausible, as these cannot be correlated with treatment condition
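A minimal sketch (hypothetical participant IDs) of chance-based allocation: simple randomisation, plus a permuted-block variant that keeps arm sizes roughly equal. The point is only that assignment is determined by chance alone, not by investigators, practitioners, or participants.

```python
# Minimal sketch of chance-based allocation (participant IDs are hypothetical).
import random

def simple_randomise(ids, seed=42):
    """Assign each unit to 'intervention' or 'control' by chance alone."""
    rng = random.Random(seed)
    return {i: rng.choice(["intervention", "control"]) for i in ids}

def block_randomise(ids, block_size=4, seed=42):
    """Permuted-block randomisation: roughly equal arm sizes within each block."""
    rng = random.Random(seed)
    allocation = {}
    for start in range(0, len(ids), block_size):
        block = ids[start:start + block_size]
        arms = (["intervention", "control"] * block_size)[:len(block)]
        rng.shuffle(arms)
        allocation.update(zip(block, arms))
    return allocation

participants = [f"P{i:03d}" for i in range(1, 13)]
print(block_randomise(participants))
```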
Disadvantages of RCTs:
o Requires rigorous control of the allocation process
o Can be long and/or expensive
o May not be ideal for rare conditions or problems with a long latency
o Generalisability – depends on who is recruited and how
o Beware the volunteer – very often not representative of the wider population
o But consider, what is the alternative? Other methods may suffer from same drawbacks (or not)
RCTs frequently criticized for not…
o Opening the ‘black box’
o Being sufficiently informed by theory
o Exploring transferability of evidence
What impact does specification of intervention components have on any subsequent evaluation?
Specifying social interventions:
o What are the interventions components?
o What is the theory of change?
o Can you draw out the programme’s logic model?
Is there a clear plan of intervention (i.e., can it be manualized)?
Intervention components: unique?
o Specific components: Those parts of the treatment unique to the particular programme
o Non-specific components: Those parts of the treatment common to other types of programmes
Intervention components: centrality?
o Core components (also called ‘active ingredients’): Primarily responsible for an intervention’s outcomes
o Allowable components: No evidence to demonstrate causal link with outcomes; still included depending on circumstances
Intervention components: harm?
o Proscribed (prohibited) components: Possibly contribute to negative effects; rarely listed, but sometimes implied when describing intervention
Intervention components: form or function?
o Components as activities (form) – the specific action that occurs (e.g., the action of praising a child; small-group training sessions)
o Components as functions – the targeted mechanism underlying the action (e.g., positive attention and relationship building; normalizing occupational insecurities and reducing isolation)
o In reality, components are often a combination of both: an activity is only relevant to the extent that it fulfills a function
EXAMPLE: STUDIO SCHOOLS
What are Studio Schools (briefly)?
o New model of education in England
o Aim to contextualize learning and make it more practical
o Outcomes are engagement with education and employability of young people (14-19 years old)
Specifying the components of Studio Schools
o Core:
- Project-based learning
- Personal coaching sessions
- Work placements
- Small school environments
- Longer school day and year
o Allowable:
- Opportunities to start a business or project
- Self-study units
- Taught subject lessons
Why is the economic evaluation of interventions and policies important? How does it benefit policy and practice?
More information is better
Multiple ‘good’ options in the face of scarcity of resources
Vertical and horizontal equity
o Vertical equity = different groups with different levels of need should receive different levels of resources
o Horizontal equity = groups with equal levels of need should receive equal levels of resources
Naming alternatives and explicitly considering them:
o …what are the alternatives?
o …what is the perspective?
o …what does the economic evaluation tell us that an ‘educated guess’ won’t? (Drummond et al., 2005)
As much about evaluating policy as setting it
Analysis is always comparative
There should be evidence of effectiveness for the key policy or intervention under examination
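Since the analysis is always comparative, a minimal sketch with purely illustrative numbers: the incremental cost-effectiveness ratio (ICER) of an intervention against an explicitly named alternative such as usual practice.

```python
# Illustrative figures only (not from the source): incremental cost-effectiveness
# ratio (ICER) of a new intervention versus a named alternative ("usual practice").
def icer(cost_new, effect_new, cost_alt, effect_alt):
    """ICER = (C_new - C_alt) / (E_new - E_alt), effects in the chosen unit (e.g., QALYs)."""
    return (cost_new - cost_alt) / (effect_new - effect_alt)

# Roughly 10,000 per additional unit of effect; compare against a willingness-to-pay threshold.
print(icer(cost_new=12_000, effect_new=1.4, cost_alt=9_000, effect_alt=1.1))
```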
How should researchers evaluate the potential harms of social interventions? Discuss with reference to one area of social intervention.
‘First do no harm’ - is it possible to ‘do no harm,’ and how would we know?
o Are good intentions sufficient for effective outcomes?
Does the intervention work or not?
o Wrong question to be asking – should be extended to specify those particular outcomes/populations we are interested in, as compared to the control group
o “Unless social programs are evaluated for potential harm as well as benefit, safety as well as efficacy, the choice of which social programs to use will remain a dangerous guess” (McCord, 2003)
Dark logic models
o Ethical imperative to consider the potential for harm in advance (a priori) and to design our evaluations to appropriately assess for any such iatrogenic outcomes or mechanisms
In her examination of the negative outcomes of the Cambridge-Somerville Youth Study and a host of other crime prevention programmes shown to ultimately cause harm, Joan McCord advocates an approach to assessing social interventions that takes into account more than mere efficacy, also examining safety and possible iatrogenic effects. Her aim is to demonstrate that simply asking whether an intervention ‘works’ fails to adequately capture crucial considerations related to its propensity for harm (McCord, 2003)
Should researchers fail to systematically collect, critically appraise, and appropriately analyze all the available data concerning a question of interest, harm may very well be introduced into subsequent social interventions through sheer ignorance of what presently constitutes best practice
Alongside limitations in the existing state of knowledge, ignorance constitutes a driving factor of ‘unexpected consequences of conduct’ in purposive social action, as defined by Robert K. Merton (Merton, 1936)
By prohibiting the use of a placebo in cases where there already exists a demonstrably effective alternative, equipoise reduces the potential for ‘doing harm’ to members of a trial’s control group
Throughout the life of an intervention, designers and implementors must remain cognizant of the various types of harm that could emerge
Theo Lorenc (2014) proposes the following typology of harm:
o Direct harms, in which the outcomes desired are directly associated with adverse effects
o Psychological harms, in which an intervention yields negative mental health impacts
o Equity harms, in which an intervention worsens existing social inequalities
o Group and social harms, in which harm is generated by the singling out or bringing together of a certain group
o Opportunity harms, in which, by favoring a particular intervention over others, we forfeit claim to any potential benefits associated with alternative courses of action
Ethical obligations, coupled with pragmatic concerns of cost-effectiveness, demonstrate a clear need for formulating dark logic models, regularly conducting and consulting systematic reviews, and maintaining transparency
How is theory used in the evaluation of social interventions? Discuss using examples.
Using theory to help explain behavior…
o Can be called a ‘theory of the problem’
o It helps to understand the nature of the problem, why it might exist and under what conditions
o What are the targets for triggering change?
Using theory to explain changes in behavior…
o Can be called theories of change or theories of action
o Form the basis of evaluation – helping researchers make their assumptions explicit about how the intervention should work
o Related to programme theories, implementation theories and logic models
How interventionists use theory:
o “It should be possible to construct and justify theory-based form of evaluation that complements experiments…It would prompt experimenters to be more thoughtful about how they conceptualise, measure, and analyse intervening processes. It would also remind them of the need to first probe whether an intervention leads to changes in each of the theoretically specified intervening processes…” (Cook, 2000)
o “From Popper’s work, we recognize the necessity to proceed less by seeking to confirm theoretical predictions about causal connections than by seeking to falsify them. For Popper, the process of falsification requires putting our theories into competition with each other” (Cook and Campbell, 1979)
The importance of theory in evaluation:
o RCTs provide relatively simplistic tests of theories – controlled design requires relatively few assumptions (inputs, outcomes)
o This approach to evidence generation is often orientated towards accreditation of interventions rather than tests of causal theories (Bonell, 2012)
o But for social interventions causal pathways may not be straightforward and allocation to interventions may be uncontrolled
o We need to know how interventions work
In the absence of theories: null findings
o Martinson (1974) “What works? Questions and answers about prison reform” The Public Interest, 35, 22-54
o Review of rehabilitative interventions for reducing recidivism
o Widely interpreted as “nothing works” in prison rehabilitation
o Led some to criticize the investment of resources in prisoner rehabilitation
o But led others to criticize the methodological status quo and ask why don’t rehabilitative programmes work?
Key principles: opening the black box
o Is the theory on which the policy/programme is based incorrect?
o Was there implementation failure?
o Was the strength (e.g. dose) of the intervention insufficient?
o Was the measurement of the treatment not sensitive enough?
o Etc…
What is theory-based evaluation?
o Definition #1: “Theory-driven evaluation is…any evaluation strategy or approach that explicitly integrates and uses stakeholder, social science, some combination of, or other types of theories in conceptualizing, designing, conducting, interpreting, and applying an evaluation” (Coryn et al., 2011)
o Definition #2: “It helps to specify not only the what of programme outcomes but also the how and the why. Theory-based evaluation tests the links between what programmes assume their activities are accomplishing and what actually happens at each step along the way” (Weiss, 2000)
What is theory-based evaluation (TBE): summary
o Programmes and policies are based on theories – whether explicit or implicit
o Mapping the “programme theory” can help to identify key “micro-steps” that (we think) need to be achieved for the programme to have the anticipated outcome – “logic models”
o The causal linkage of these micro-steps may be long or short, depending on our existing understanding
o A breakdown at any of the micro-steps may be enough to prevent the programme taking the anticipated effect
o TBE is the process of evaluating programme outcomes whilst simultaneously testing hypotheses about the causal processes
Creating “elaborate theories”
o Conversation between William Cochran and Ronald Fisher – how to move from association to causation
o Fisher: “make your theories elaborate”
o The more elaborate a causal theory the better equipped you are to test the theoretical assumptions against the observational data
o Patterns of agreement (or disagreement) between data and theory can help validate, falsify and optimize a causal theory
EXAMPLE: Limiting the physical availability of alcohol to reduce alcohol-related harm
o “Efforts to control alcohol availability to reduce alcohol-related harms have been based on the view that ‘less is best’; i.e. the less alcohol available the better for public health and safety” (Stockwell and Gruenewald, 2004)
o “Availability theory” – 3 related propositions:
- (1) The greater the availability of alcohol, the higher the average consumption of alcohol
- (2) The higher the average consumption, the greater the number of excessive drinkers
- (3) The greater the number of excessive drinkers, the greater the prevalence of health and social problems
o Prevention might include:
- Placing restrictions on number of premises (e.g., spatial availability)
- Placing restrictions on the times at which alcohol can be sold (e.g., temporal availability)
- Placing restrictions on consumption for population groups (e.g., minimum legal drinking age)
Alcohol licensing reform: The Licensing Act (2003)
o Statutory aims:
- (a) To reduce crime and disorder
- (b) To enhance public safety
- (c) Prevent public nuisance
- (d) Protect children from harm
o Method:
- A range of procedural/bureaucratic changes
- Removal of fixed trading hours
- Option to remain open 24 hours a day
o Humphreys and Eisner (2014) found no real effect – the Licensing Act did not cause bars and nightclubs to change their trading hours a great deal
Without conducting a process evaluation, all outcomes from an intervention trial are meaningless. Discuss this statement.
Process evaluation = study aiming to understand the functioning of an intervention, by examining:
o Implementation: structures, resources and processes of delivery (fidelity, uptake, adaptations, dose, reach)
o Mechanisms of impact: how intervention activities, and participants’ interactions with them, trigger change
o Context: external factors influencing the delivery and functioning of interventions (culture, economic context, infrastructure, etc.)
o Research processes: randomization, spill-over, etc.
Process evaluations are complementary to outcome evaluations – they are not a substitute
“Process evaluations, which explore the way in which the intervention under study is implemented, can provide valuable insight into why an intervention fails or has unexpected consequences, or why a successful intervention works and how it can be optimised. A process evaluation nested inside a trial can be used to assess fidelity and quality of implementation, clarify causal mechanisms, and identify contextual factors associated with variation in outcomes” (Craig et al., 2008)
Updated guidance from the Medical Research Council (MRC) emphasizes the importance of conducting process evaluations within intervention trials, stating these evaluations “can be used to assess fidelity and quality of implementation, clarify causal mechanisms and identify contextual factors associated with variation in outcomes” (Craig et al., 2008)
“An intervention may have limited effects either because of weaknesses in its design or because it is not properly implemented. On the other hand, positive outcomes can sometimes be achieved even when an intervention was not delivered fully as intended. Hence, to begin to enable conclusions about what works, process evaluation will usually aim to capture fidelity (whether the intervention was delivered as intended) and dose (the quantity of intervention implemented). Complex interventions usually undergo some tailoring when implemented in different contexts. Capturing what is delivered in practice, with close reference to the theory of the intervention, can enable evaluators to distinguish between adaptations to make the intervention fit different contexts and changes that undermine intervention fidelity” (Moore et al., 2015)
“Complex interventions work by introducing mechanisms that are sufficiently suited to their context to produce change, while causes of problems targeted by interventions may differ from one context to another. Understanding context is therefore critical in interpreting the findings of a specific evaluation and generalising beyond it. Even where an intervention itself is relatively simple, its interaction with its context may still be highly complex” (Moore et al., 2015)
What is the value of conducting analysis of mediators and moderators in trials of social interventions?
Designers of social interventions aspire to the “gold standard” of randomized controlled trials (RCTs) largely because, through their inherently random allocation of subjects to either a treatment or control group, RCTs maximize the likelihood that any observed effects can be attributed to the intervention in question, the assumption being that the process of randomization distributes potentially confounding variables equally to both the treatment and the control group
However, even when conducting RCTs—an experiment type widely praised for its ability to effectively reduce bias and confounding—we should seek to supplement our confidence that an intervention “works” with an understanding of the underlying mechanisms that together culminate in the resulting effectiveness, crucially entailing the methodical analysis of mediators and moderators
Mediator analyses elucidate how and why an intervention works, identifying the “active therapeutic components” of a treatment, and ultimately allowing us to refine treatments so as to maximize effectiveness and minimize cost
Moderator analyses ask for whom an intervention proves effective, seeking to uncover any differential subgroup effects, some of which may be contributing to the exacerbation of existing social disparities
Comprehending how, why, and for whom an intervention works better elucidates the precise nature of the causal relationship under study, while also carrying vital—and otherwise obscured—implications for our collective knowledge of the ‘equity’ of intervention effects
Insofar as we wish our research to meaningfully inform practice, we must take care to unpack the ‘black box’ that often camouflages the causal processes and active ingredients driving an intervention. We must also devote the time, energy, and resources necessary to establish the existence of any differential subgroup effects prior to the widespread implementation of an ‘effective’ intervention, so as to minimize the risk of unknowingly inflicting harm on a large scale and ultimately maximize efficiency
VALUE OF MEDIATOR ANALYSES
o Operating under the basic assumption that a causal relationship exists between an intervention and an observed outcome—an assumption made reasonable in the context of RCTs by the process of randomization—we can conceptually frame a mediator as an intervening variable on the causal pathway of an intervention, responsible for somehow shaping the relationship between stimulus and response (Baron and Kenny, 1986)
o To understand an intervention’s mediating variables is to understand how and why an intervention produces the effect(s) it does
o By striving to fully appreciate “the mechanisms through which treatments operate,” we find ourselves better equipped to maximize treatment effectiveness, and simultaneously more able to reduce the monetary and human costs associated with a given treatment – “Active therapeutic components could be intensified and refined, whereas inactive or redundant elements could be discarded” (Kraemer, Wilson, Fairburn, and Agras, 2002, p. 878)
o In addition to shedding light on the relative utility of an intervention’s constituent ingredients, mediator analyses provide valuable insight into the very nature of medical disorders and social phenomena, knowledge which is then utilized to develop more acutely targeted treatments
o Notably, when evidence emerged to suggest that cognitive behavioral therapy (CBT) works in treating panic disorders via the eradication of catastrophic thoughts related to bodily changes—a mediating mechanism—the cognitive theory of panic gained greater empirical substantiation (Clark, 1997)
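A minimal sketch, in the spirit of Baron and Kenny (1986), of a product-of-coefficients mediation analysis on simulated data; all variable names are hypothetical and it assumes pandas and statsmodels are available.

```python
# Minimal mediation sketch on simulated data (names hypothetical):
# intervention -> mediator -> outcome, product-of-coefficients style.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 500
treat = rng.integers(0, 2, n)                    # randomised at baseline
mediator = 0.6 * treat + rng.normal(0, 1, n)     # e.g., reduction in catastrophic thoughts
outcome = 0.5 * mediator + 0.1 * treat + rng.normal(0, 1, n)
df = pd.DataFrame({"treat": treat, "mediator": mediator, "outcome": outcome})

a = smf.ols("mediator ~ treat", data=df).fit().params["treat"]               # path a
b = smf.ols("outcome ~ mediator + treat", data=df).fit().params["mediator"]  # path b
total = smf.ols("outcome ~ treat", data=df).fit().params["treat"]            # total effect

print(f"indirect effect (a*b) = {a * b:.2f}, total effect = {total:.2f}")
# Caveat (as elsewhere in these notes): the mediator is not itself randomised, and
# stronger inference needs the intervention, mediator and outcome at three time points.
```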
VALUE OF MODERATOR ANALYSES
o An analysis of moderating variables seeks to resolve the question of for whom an intervention works; in particular, moderator analyses evaluate whether participant outcomes differ in accordance with baseline characteristics such as socioeconomic status, ethnicity, gender, and/or experienced severity of the problem under study
o Differential subgroup effects can manifest in a variety of ways, with an intervention potentially suitable for certain subgroups, less effective for others, and actively harmful to yet another segment of the population
o The field of public health in particular produces a significant body of research primarily concerned with examining the ‘equity’—or lack thereof—of intervention effects (Petticrew, 2011)
o When the incongruous effects of a treatment materialize so as to either disproportionately benefit an already advantaged subgroup or disproportionately harm a traditionally underprivileged group, then the treatment in question may in fact serve to further exacerbate existing social disparities, engendering criticism on a social justice front
o Famously, a preponderance of evidence strongly suggests that media campaigns targeted at reducing cigarette use prove most effective at achieving their aim among the socioeconomically well-off, thereby widening already established inequalities (Lorenc et al., 2013)
o To the extent that we care about the broader societal implications of our interventions, rather than simply whether or not they ‘work’ in the most rudimentary sense, moderator analyses become an invaluable tool
o Moderator analyses also present a means by which to capitalize on the increasing interest among policymakers in promoting more targeted, highly tailored interventions, with the initial investment incurred in developing a well-founded knowledge base of what works for whom seen as more than justified by the resulting ability to maximize efficiency and minimize risk
o The considerable influence wielded by intervention research pertaining to the topic of subgroup analysis extends to “policy decisions around programmatic aims (e.g., Upward Bound), funding decisions (e.g., Even Start), and new initiatives targeting funding towards evidence-based programs (e.g., teen pregnancy and home visitation)” (Supplee et al., 2013, p. 107)
o Research should not exist in a vacuum, and with the present demand for subgroup-specific interventions unlikely to abate in the foreseeable future, moderator analyses represent an especially promising avenue for bridging the gap between research findings and clinical practice
Upon identifying inequitable effects of an intervention through moderator analyses, we can work to better understand why certain subgroups respond more positively than others to a treatment by investigating the potential mediating mechanisms at play
When conducted in conjunction with one another, moderator and mediator analyses can maximize the equity of an intervention without necessarily compromising its effectiveness
Why should researchers undertake a phase of pilot testing when evaluating complex social interventions?
Complex interventions – see the Medical Research Council (MRC) Framework
What makes an intervention complex?
o Number of interacting components within the experimental and control interventions
o Number and difficulty of behaviors required by those delivering or receiving the intervention
o Number of groups or organizational levels targeted by the intervention
o Number and variability of outcomes
o Degree of flexibility or tailoring of intervention permitted
Social interventions tend to be complex:
o Several interacting components; levels of service delivery
o Involvement of stakeholders
o Multiple and variable outcomes, with different measurements – child-parent report, observation/self-report
o Mediators (mechanisms) and moderators at individual, family, community, societal levels
“Best practice is to develop interventions systematically, using the best available evidence and appropriate theory, then to test them using a carefully phased approach, starting with a series of pilot studies targeted at each of the key uncertainties in the design, and moving on to an exploratory and then a definitive evaluation” (Craig et al., 2008)
MRC Framework, stage 2: Feasibility and piloting
o Testing procedures [for the intervention and all research processes]
o Estimating likely resources, recruitment and retention
o Participant acceptability and satisfaction
o May include pre-post study, or small RCT
o Can be qualitative and quantitative
CASE EXAMPLE: WHO ‘Parenting for Lifelong Health’ - Parenting interventions to reduce risk of child maltreatment in low and middle income countries (LMICs)
South Africa: The Sinovuyo Caring Families Programme for Parents of Children Aged 2-9 Years
o To test feasibility using mixed methods:
- Dosage/exposure
- Programme fidelity
- Participant satisfaction, cultural feasibility
- Pilot RCT
o Pilot program delivered to 56 parents in 4 groups
o Parent interviews at participants’ home:
- Random sample (intervention, n=11; control, n=4)
- 1 hour; trained research assistants with interpreter
o Facilitator focus groups at center
- 2.5 hours, conducted by Lachman in English
- Post-program (intervention, n=8; control, n=6)
o Interview protocols with open-ended approach:
- Acceptability of program content, delivery methods
- Changes observed at home by parents
- Training, supervision, and logistical support
What methods can researchers use to identify which components of an intervention are necessary for successful implementation? Why is it essential to do this?
Core components
o Also called ‘active ingredients’
o Primarily responsible for an intervention’s outcomes
IDENTIFYING CORE COMPONENTS
Meta-regression: linking components to intervention effect sizes
o Interventions that DO include component X versus interventions that DO NOT include component X (method used by Kaminski et al., 2008)
o Strengths:
- Based on dozens of studies
- Results do not hinge on single intervention ‘brand’ or trial
- Can include many different types of components
o Limitations:
- No causality: only association between components and outcomes
- Results depend on patterns of combinations of components in existing programmes
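A minimal sketch of the general meta-regression idea (not the exact Kaminski et al., 2008 model): trial-level effect sizes regressed on indicators of whether each component was included, weighted by inverse variance. All trial data and component names are hypothetical.

```python
# Meta-regression sketch with hypothetical trial-level data: effect sizes regressed
# on component indicators, weighted by inverse variance.
import pandas as pd
import statsmodels.formula.api as smf

trials = pd.DataFrame({
    "effect_size":  [0.42, 0.15, 0.30, 0.55, 0.10, 0.38],  # e.g., standardised mean differences
    "se":           [0.10, 0.12, 0.09, 0.15, 0.11, 0.10],
    "has_praise":   [1, 0, 1, 1, 0, 1],                    # does the programme include component X?
    "has_time_out": [0, 0, 1, 1, 1, 0],
})
trials["weight"] = 1 / trials["se"] ** 2

# Coefficients describe associations (not causal effects) between including a
# component and trial effect size, and depend on how components co-occur in practice.
meta_reg = smf.wls("effect_size ~ has_praise + has_time_out",
                   data=trials, weights=trials["weight"]).fit()
print(meta_reg.params)
```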
Microtrials
o Randomized experiments testing effects of brief, focused manipulations designed to influence a risk or protective factor
o Strengths:
- Test causal effects of discrete intervention components on immediate or longer term outcomes
- e.g., effect of parent praise on child compliance (Leijten et al., 2015)
o Limitations:
- Test effects of components outside of their natural context (i.e., outside of regular intervention)
- Test only 1 or 2 components at once, and most interventions have lots of components
- Many test only short term effects (though this is not an inherent property of microtrials)
Factorial experiments
o MOST – The Multiphase Optimization Strategy (developed by Prof Linda Collins at The Methodology Center, Penn State University)
o Based on resource management principles and research methodologies from the field of engineering
o Used for behavioral and bio-behavioral interventions
o “The process of identifying the intervention that provides the highest expected level of effectiveness obtainable…within key constraints imposed by the need for efficiency, economy, and/or scalability” (Collins, 2016)
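A minimal sketch of the factorial logic behind MOST, using simulated data and hypothetical component names: a 2^3 experiment in which every on/off combination of three candidate components is tested, so the main effect of each component can be estimated from a single experiment.

```python
# 2^3 factorial sketch (simulated data; component names hypothetical).
import itertools
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
rows = []
for coaching, placements, project_work in itertools.product([0, 1], repeat=3):
    for _ in range(30):                               # 30 participants per cell
        y = (0.4 * coaching + 0.1 * placements + 0.3 * project_work
             + rng.normal(0, 1))
        rows.append({"coaching": coaching, "placements": placements,
                     "project_work": project_work, "y": y})
df = pd.DataFrame(rows)

# Main effects of each candidate component (interactions could be added to the formula).
screening = smf.ols("y ~ coaching + placements + project_work", data=df).fit()
print(screening.params)
```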
IMPLEMENTATION: NOT JUST WHAT, BUT ALSO HOW
Why evaluate implementation?
o Without understanding of what was implemented, we cannot make meaningful connections between interventions and outcomes
o Avoid Type III Error (i.e., mistaking implementation failure for theory failure)
o Determine heterogeneity within and between studies
o Inform dissemination of evidence-based interventions
What is the difference between an activity and a function and why are these concepts so important in the evaluation of social interventions? Use examples to illustrate your answer.
Intervention components: form or function?
Components as activities (form) – the specific action that occurs
o e.g., the action of praising a child; small-group training sessions
Components as functions – the targeted mechanism underlying the action
o e.g., positive attention and relationship building; normalizing occupational insecurities and reducing isolation
In reality, intervention components are often a combination of both activity and function: an activity is only relevant to the extent that it fulfills a function
An outcome evaluation should never be conducted on an intervention unless there is a process evaluation running parallel to it. Discuss.
See the parallel question above (‘Without conducting a process evaluation, all outcomes from an intervention trial are meaningless’): the definition of process evaluation, its complementarity with outcome evaluations, and the MRC guidance on fidelity, dose, mechanisms and context (Craig et al., 2008; Moore et al., 2015) apply equally here
Analyses of moderators and mediators are potentially misleading. Discuss this statement.
Mediator analyses:
o Ask how and why interventions work (mediator = link between)
Moderator analyses:
o Ask for whom interventions work (moderator = alters, modifies effects)
o Are there differential effects for different subgroups?
- Important implications for ‘equity’ of effects; key topic in public health (Petticrew, 2011)
Caveats to mediator analysis: critical appraisal
o Even within trial, potential confounding – you haven’t randomised to the mediator, only to the whole package (but, at least there is lots of the mediator happening within the package; plus we tested a competing explanation)
o Mediator analyses often exploratory, not the main aim of the trial; study often not powered for this – need to proceed with caution
o Longitudinal – but need 3 time points to fully separate intervention, mediator and outcome for somewhat stronger causal inference
o On the plus side, mediator analyses can provide relatively convincing causal evidence (compared to most other methods), in a design where change is introduced via the intervention – this complements the body of work on the role of parenting from natural longitudinal studies, for example (Incredible Years parenting intervention)
Limitations of meta-analyses of moderator effects in parenting intervention trials:
o Most studies analysed predictors, not moderators
o Doesn’t tell us if some interventions are good at reaching the most distressed – can a very high-quality intervention overcome these differential effects?
o No longer up to date
o Some omitted studies give a different picture: Analyses of moderators in large ‘Incredible Years’ trials found opposite result for some factors (Beauchaine et al., 2005; Baydar et al., 2003, Gardner et al., 2010); also in Early Steps family intervention trial (Gardner et al., 2009; Gardner et al., 2017)
Critical appraisal: what limitations of moderator analyses in RCTs (including Gardner’s)?
o Power: RCTs powered for main effect analyses
o Cherry picking – evidence that reporting bias common in main effect analyses in trials (Dwan et al., 2008) – more so when it comes to secondary analyses?
o Multiple testing? Were the analyses pre-specified?
o Moderators confounded with each other (Lipsey, 2003)
o How to interpret mixed findings?
The continued prevalence of outcome reporting bias in primary (i.e., main effect) analyses of trials, as highlighted by the work of Dr. Kerry Dwan and colleagues (2008), raises concerns that such ‘cherry picking’ of favorable results may abound to an even greater extent in secondary analyses of mediators and moderators
Others note with apprehension the prohibitively low statistical power of a great deal of secondary analyses, finding that “[m]any [major clinical trial] reports put too much emphasis on subgroup analyses that commonly lacked statistical power” (Assmann, Pocock, Enos, and Kasten, 2000)
Mark W. Lipsey (2003) delineates the difficulties that can arise from confounding among moderator variables, a relatively likely turn of events in which the moderators under observation are found to be related to each other as well as to effect sizes
Issues of replicability have also been identified in relation to the secondary analyses of intervention data, with various moderator analyses yielding divergent—and sometimes contradictory—results (Baydar, Reid, and Webster-Stratton, 2003; Beauchaine, Webster-Stratton, and Reid, 2005; Gardner, Hutchings, Bywater, and Whitaker, 2010)
Long history of misgivings about moderator analyses in trials
o “Only one thing is worse than doing subgroup analyses – believing the results”—Richard Peto
o To prevent ‘cherry picking’ of results we need explicit pre-specification of hypotheses – confirmatory or exploratory – plus rationale (Rothwell, 2005; Wang and Ware, 2013)
o Need pre-registering – ORCHIDS a model for us all
What do we normally do when there are lots of data, but with low power and mixed findings?
o Easy – do a good systematic review, and where appropriate, meta-analyze data across trials
o For many questions, yes, but…
Consider limitations of analyzing moderators in systematic reviews
o Power still low: although total Ns may be higher, subgroups are coded at trial level, meaning all variability within trials in participant characteristics is masked (see Thompson and Higgins, 2005, Lancet)
o Cherry picking: risk lower?
o Moderators still confounded (Lipsey, 2003); one meta-analysis attempted to overcome this – Leijten et al., 2013, examined how SES moderator effects were confounded by other risk factors, such as problem severity
Use what we already have…
o Share and pool data from lots of trials, using individual level data
o Many advantages:
- Makes full use of within-trial variability in characteristics
- Greatly increases power for subgroup analyses
- Transparency—help prevent cherry picking
- Wider generalisability across communities, contexts, regions
The risk of ‘cherry picking’ only those results that suit a given narrative can be minimized by greater transparency, achievable in part through the explicit pre-specification of hypotheses, including whether a hypothesis is exploratory or confirmatory in nature, as well as its underlying rationale (Rothwell, 2005)
Pooling data
o Scientific benefits of sharing, collaboration between many investigator teams = better science
o Climate now is right: big push from funders, journals, governments to share data to increase transparency, reduce fraud (NIH, Ben Goldacre, BMJ AllTrials campaign)
o Example: NIMH Collaborative Data Synthesis for Adolescent Depression Trials (Brown et al., 2013)
Pooling data from a number of trials can address the lack of statistical power common to secondary analyses, while also providing for greater generalizability across contexts (Brown et al., 2013; Gardner et al., 2017)
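A minimal sketch of pooling individual-level data across trials (everything simulated; trial count, variable names and effect sizes are hypothetical): stacking participant-level records, absorbing between-trial differences with trial fixed effects, and testing a treatment-by-moderator interaction using the within-trial variability that trial-level subgroup coding masks.

```python
# Individual participant data (IPD) pooling sketch across simulated trials.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
frames = []
for trial_id in range(1, 7):                  # six hypothetical trials
    n = int(rng.integers(120, 250))
    d = pd.DataFrame({
        "trial": trial_id,
        "treat": rng.integers(0, 2, n),
        "severity": rng.normal(0, 1, n),      # individual-level baseline moderator
    })
    d["outcome"] = (0.4 * d["treat"] + 0.15 * d["treat"] * d["severity"]
                    + 0.1 * trial_id + rng.normal(0, 1, n))
    frames.append(d)
pooled = pd.concat(frames, ignore_index=True)

# C(trial) absorbs between-trial differences; treat:severity is the subgroup question,
# now estimated with far more power than in any single trial.
fit = smf.ols("outcome ~ treat * severity + C(trial)", data=pooled).fit()
print(fit.params.filter(like="treat"))
```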
Conclusions: for whom do social interventions work and how?
o Secondary analysis of RCTs can answer questions about ‘For whom does it work’ (MODERATORS) and ‘How’ or ‘What are the critical ingredients’ (MEDIATORS)
o Can derive mediator hypotheses from theory, or from qualitative methods; e.g., users’ views on process (Bonell et al., 2012)
o Secondary analyses can be very useful, but important to be aware of their limitations – power and cherry picking, failure to replicate
o Can do systematic reviews of moderators, but NB meta-analyzing subgroups brings new problems—better to pool individual level data to give more power, and make use of full data on variability between individuals
Those who reject the necessity and/or practicality of conducting thorough mediator and moderator analyses of intervention data commonly cite inadequately powered studies, the potential for ‘cherry picking’ of results (i.e., reporting bias), and a failure of replication as inherent, insurmountable barriers to the completion of these secondary analyses
While concerns surrounding low power, cherry picking, and mixed findings have certainly been borne out in reality (Dwan et al., 2008; Lipsey, 2003), this heightened awareness of the potential limitations of secondary analyses grants us a valuable opportunity to institute measures designed to diminish the impact of foreseeable stumbling blocks. Chief among these is the systematic pooling of individual-level data from a relatively large number of trials, which not only increases power for subgroup analyses, but also greatly enhances transparency and generalizability (Brown et al., 2013; Gardner et al., 2017)
Fundamentally, much of what constitutes ‘good practice’ in conducting secondary analyses mirrors or closely tracks that which is expected in primary analyses, with transparency—and pre-registration—being of paramount importance no matter the level of analysis