Evaluation design I Flashcards
stages of evaluation
formative
process
outcome
The main purpose of evaluation design is to be as confident as possible that any observed changes were caused by the intervention, rather than by chance or other unknown factors
formative eval
Before intervention
Acceptability and feasibility of intervention
Mainly qual, e.g. focus groups, in-depth interviews
formative eval research
Penn et al. (2018)
Penn et al. (2018)
NHS diabetes prevention programme
eval - qual research
behav interventions
programme specification reflected the evidence base - framework for service provision
provides ev based behav intervention for prevention of T2D in high risk adults
process eval
Measures how intervention delivered and received
Mixed qual and quan
Done along the way - make alterations if necessary
process eval research
Sanchez et al. (2017)
Sanchez et al. (2017)
improve understanding of underlying mechanisms that may impact results
prescribe healthy life intervention
moderate-to-good performance on adoption, reach and implementation
outcome eval
Measures whether intervention achieved objectives
Mainly quan
outcome eval research
Ebert et al. (2018)
Ebert et al. (2018)
internet- and mobile-based stress management intervention (RCT)
intervention v control
int = 7 sessions of problem solving and emotion regulation techniques
baseline v 6 months
cost-effective and led to cost savings
stages of evaluation research
Dehar et al. (1993)
Nutbeam (1998)
Dehar et al. (1993)
formative - develop and improve programmes at an early stage
process - info on programme implementation, interpretation of outcomes and guiding future research
Nutbeam (1998)
issues with definition and measurement of outcomes and use of eval methods
most powerful interventions are long-term (LT)
technical problems from scientific rigour and advantages of less well defined content
combine quan and qual
evals tailored to intervention
cause and effect
Want to determine whether a cause-effect r’ship exists between intervention and outcome
logic of causal inference
Under what conditions may we infer that a change in the DV (PA) was really caused by IV (intervention) and not by something else (envs etc.)
What are some of the most plausible rival explanations, and how do we rule them out?
Logic of causal inference research
Rothman and Greenland (2004)
Hill (1965)
Rothman and Greenland (2004)
causality often debated
more general conceptual model required
Hill (1965)
https://journals.sagepub.com/doi/pdf/10.1177/003591576505800503
criteria for inferring causality
temporal r’ship
plausibility
strength of association
dose-response r’ship
reversibility
temporal r’ship
Cause (intervention) must precede effect (increase in PA)
plausibility
Association plausible, and more likely to be causal, if consistent with other knowledge
E.g. reasonable to expect people who receive exercise intervention will increase PA by greater amount than people without intervention
strength of association
Strong association, as measured by effect size or relative risk, more likely to be causal than weak association
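The relative risk mentioned above can be sketched as a simple calculation; the counts below are invented illustration numbers, not taken from any study cited in these cards.

```python
# Minimal sketch of a relative-risk calculation with made-up counts.
def relative_risk(exposed_events, exposed_total, control_events, control_total):
    """Risk of the outcome in the intervention group divided by the risk
    in the control group (> 1 means the outcome is more likely in the
    intervention group; larger values suggest a stronger association)."""
    risk_exposed = exposed_events / exposed_total
    risk_control = control_events / control_total
    return risk_exposed / risk_control

# Hypothetical: 40/100 met a PA target after the intervention vs 20/100 controls
print(relative_risk(40, 100, 20, 100))  # 2.0
```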
dose-response r’ship
Occurs when changes in level of cause associated with changes in prevalence/incidence of effect
The greater the adherence to the exercise intervention (greater dose), the greater the change in fitness
Incremental change in outcome, related to dose of intervention
Within a single intervention, adherence is the measure of dose
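A dose-response relationship can be checked by correlating dose with outcome change; the data below are invented purely to illustrate the idea.

```python
from math import sqrt

# Hypothetical dose-response data: sessions attended (dose) vs change in
# a fitness score. Values are illustration only, not from any cited study.
doses = [2, 4, 6, 8, 10, 12]
fitness_change = [0.5, 1.1, 1.4, 2.0, 2.3, 2.9]

def pearson_r(xs, ys):
    """Pearson correlation; a value near +1 is consistent with a
    dose-response relationship (more dose, more effect)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(round(pearson_r(doses, fitness_change), 3))  # close to +1
```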
reversibility
When removal of cause (intervention) results in return to baseline in outcome (PA), likelihood of association being causal strengthened
Internal validity
High internal val means that diffs observed between groups are related to the intervention tested in the trial
Means for example, that change in PA in study popn attributed to intervention and not to other factors, such as age, sex/social status
IV has to be only thing influencing DV
Extent to which you’re able to say that no other variables, apart from the IV, caused a change in the DV
Control and treatment groups are treated identically apart from the intervention, to rule out other variables influencing results
Internal validity research
Halperin et al. (2015)
Halperin et al. (2015)
internal val - degree of control exerted over confounds to reduce alternative explanations for results
e.g. controlling ex protocols, prior training, nutritional intake, age, gender etc.
less well controlled: instructions given, verbal encouragement, no. of observers and mental fatigue
External validity
Describes extent to which results of an experiment or intervention can be generalised to target/general popn
The extent to which the results of a study are generalisable to other situs/groups
Good sampling needed
- Representative of wider popn
- Large enough to have adequate power
- Exclusion criteria should relate to Q of interest
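The "large enough to have adequate power" point can be made concrete with the standard normal-approximation sample-size formula for comparing two group means; the function and its defaults are my own illustration, not from the cards' cited sources.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of
    means, where effect_size = expected difference / SD (Cohen's d).
    Normal-approximation formula; ignores small-sample corrections."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha=0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. 0.84 for 80% power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

print(sample_size_per_group(0.5))  # medium effect -> roughly 63 per group
```

Smaller expected effects demand much larger samples, which is why exclusion criteria and representativeness have to be balanced against recruitment feasibility.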
External validity research
Leviton (2017)
Leviton (2017)
need greater focus on external val
goal is applicability
methods
- better description of interventions
- combining of stat tools and logic to draw inferences about samples
- sharper definition of theory
- more systematic consultation of practitioners
Validity research
Slack and Draugalis (2001)
Blackman et al. (2013)
Slack and Draugalis (2001)
establishing internal val of a study is based on a logical process
- provided by the report's structure
- methods section describes how threats were minimised
- discussion assesses influence of bias
threats: history, maturation, testing, instrumentation, regression, selection, exp mortality and interaction
cog map used as a guide when addressing threats
- logical description of internal val problems
threats are sources of extraneous variance
external val addressed by delineating inclusion and exclusion criteria, describing subjects in terms of relevant variables and assessing generalizability
Blackman et al. (2013)
mobile health intervention
assessed how well studies inform generalizability
majority RCTs
few studies addressed reach and adoption
research designs need to focus on validity (internal and external)
Types of evaluation design
Experimental
Quasi-experimental
Non-experimental
Experimental
Compares intervention with non-intervention
Uses controls that are randomly assigned
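Random assignment, the defining feature above, can be sketched as a simple 1:1 allocation; the fixed seed is only there to make the example reproducible.

```python
import random

def randomise(participants, seed=None):
    """Shuffle participants and split them evenly into two arms, so that
    known and unknown confounds are distributed by chance alone."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

intervention, control = randomise(range(10), seed=42)
print(intervention, control)
```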
Experimental research
Sanson-Fisher et al. (2007)
Bernal et al. (2018)
Sanson-Fisher et al. (2007)
RCT often has limitations
alternatives investigated
Bernal et al. (2018)
control to minimise confounds
consider confounds before deciding on control
quasi-experimental research
Reeves et al. (2017)
Handley et al. (2011)
Reeves et al. (2017)
created checklist for quasi-exp designs
- clustering of int as aspect of allocation/due to intrinsic nature of the delivery of intervention
- for whom and when outcome data available
- how intervention effect estimated
- principle underlying control for confounding
- how groups formed
- features of the study carried out after it was designed
- variables measured before intervention
Handley et al. (2011)
stepped-wedge and wait-list cross-over design
relevant design features
- creation of cohort over time that collects control data but allows all Ps to receive intervention
- staggered intro of clusters
- multiple data collection points
- one-way cross over into intervention arm
practical considerations
- randomisation v stratification
- training run in phases
- extended time period for overall study completion
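The stepped-wedge features listed above (all clusters eventually treated, staggered introduction, one-way crossover) can be sketched as an allocation grid; this is my own illustration of the design Handley et al. describe, not code from that paper.

```python
def stepped_wedge_schedule(n_clusters, n_periods):
    """Grid of cluster x period allocations: every cluster starts in
    control, then crosses one-way to intervention, staggered one
    cluster per period (first period is all-control)."""
    return [['control' if period <= cluster else 'intervention'
             for period in range(n_periods)]
            for cluster in range(n_clusters)]

for row in stepped_wedge_schedule(3, 4):
    print(row)
```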
non-experimental research
Thomson et al. (2001)
Thomson et al. (2001)
systematic review of exp and non-exp housing intervention studies
able to demonstrate health gains even via non-exp studies
some issues with generalizability
experimental examples
RCT
Pre-post design with randomised control group is one example of RCT
experimental strengths
Can infer causality with highest degree of confidence
experimental challenges
Most resource-intensive
Requires minimising extraneous factors
Challenging to generalise to “real world”
quasi-experimental
Compares intervention with non-intervention
Uses controls or comparison groups that are not randomly assigned
quasi-experimental examples
Pre-post design with a non-randomised comparison group
quasi-experimental strengths
Can be used when you are unable to randomise a control group, but you can still compare across groups and/or time points
quasi-experimental challenges
Diffs between comparison groups may confound
Group selection critical
Moderate confidence in inferring causality
non-experimental
Don’t use control/comparison groups
non-experimental examples
Case control (post-intervention only): retrospectively compares data between intervention and non-intervention groups
Pre-post with no control: data from one group compared before and after training intervention
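The pre-post-with-no-control analysis reduces to comparing each person's before and after scores; the PA values below are invented for illustration.

```python
from statistics import mean

# Hypothetical minutes of PA per day for one group, before and after
# a training intervention (no control group).
pre = [20, 25, 30, 22, 28]
post = [26, 29, 33, 27, 35]

# Average within-person change; without a control group this cannot be
# attributed to the intervention (maturation, history etc. remain rivals).
changes = [after - before for before, after in zip(pre, post)]
print(mean(changes))  # 5
```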
non-experimental strengths
Simple design, used when baseline data and/or comparison groups not available and for descriptive studies
May require least resources to conduct eval
non-experimental challenges
Minimal ability to infer causality
Evaluation in a nutshell (Bauman and Nutbeam, 2014)
see notes