Lecture 7: General Linear Model (GLM) and fMRI Flashcards
Diagram of fMRI data and time series graph - (4)
- A specific voxel is selected, marked with green lines on the right
- After the preprocessing steps have been done, the graph shows the brightness of the selected voxel over time
- The red line in the graph shows what the voxel's signal looks like over time
- Its activity is higher in some conditions than in others - visible as peaks
What is the general idea of the general linear model?
- Brain activity in each voxel is explained as a mixture of responses to each of the conditions in the experiment
Data from an early fMRI experiment on the FFA (one of the first papers arguing the FFA is responsible for processing faces): what does the time series plot show, and what statistical analysis was used? - (6)
- Participants viewed pictures of faces (dark grey) or objects (light grey) in a blocked design experiment; the plot shows a specific voxel in the FFA
- The response to faces is higher than the response to objects
- The aim was to quantify statistically whether the signal during face blocks is higher than the response to objects
- In early experiments the analysis averaged the signal changes in face blocks vs object blocks and compared the two using conventional statistical tests
- This is a slightly imprecise way to analyse the data, as the change in BOLD signal does not follow immediately after the change in condition –> there is a lag before the hemodynamic response kicks in, and it takes a while to evolve
- We can use this information to estimate the signal changes more accurately, and to do so across many different designs –> the GLM
The general linear model (GLM) allows us to
estimate the degree of signal change associated with each condition in the experiment
General linear model is more accurate than
simply averaging
The GLM's use of the (approximate) linearity of the BOLD signal means that
GLM can be extended to a wide variety of experimental designs
What is this graph showing? - (6)
- Blue in one condition
- Green in another condition
- White is rest
- Time series across the x -axis
- fMRI signal is on y-axis
- The data is called X when we talk about it mathematically
Diagram of step 1 of the GLM, in which we specify time periods corresponding to specific task/stimulus conditions
In the first step of the GLM, specifying time periods corresponding to specific task/stimulus conditions, we expect that.. - (2)
- We imagine there is some neural activity when participants are given one or another task
- The neural activity we expect to see in a given (blue) condition at a specific voxel arises at the beginning of the task block, plateaus (stays constant) during the block, and drops to 0 at the end of the block until the next blue block happens
Diagram of second step of GLM: creating hemodynamic regressors (by convolving with a canonical HRF) from the specified time periods corresponding to specific task/stimulus conditions
The second step of the GLM, producing hemodynamic regressors (by convolving with a canonical HRF), means that if this is the neural activity of neurons in a specific voxel in this graph, then we
- expect the following pattern of blood flow changes (modelled BOLD signal changes on the y-axis) to happen at that specific voxel, shown in the second graph
In the second step of the GLM, to create the second graph on the right, the software..
convolves the modelled neural activity with a canonical (typical) hemodynamic response function
The term canonical just means
typical
What does convolution do?
- It is a mathematical operation that multiplies a copy of the canonical HRF by the neural activity at each moment and then adds all the moments together
What does this diagram show? - (2)
- The top shows what the canonical (typical) hemodynamic response function (HRF) looks like, which is the change in BOLD signal we would expect to see for a single moment of neural activity (the middle graph = modelled neural activity)
- Convolution multiplies the canonical HRF by the modelled neural activity for every moment in the experiment and adds the results together
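This multiply-and-add operation can be sketched numerically. The HRF below is a rough double-gamma shape assumed for illustration only - SPM and FSL use their own canonical HRFs - and the boxcar of neural activity is made up:

```python
import numpy as np
from scipy.stats import gamma

# Rough double-gamma HRF: a peak around 5 s and a small late undershoot.
# (An assumption for illustration; not the exact SPM/FSL canonical HRF.)
def hrf(t):
    return gamma.pdf(t, 6) - 0.35 * gamma.pdf(t, 12)

t = np.arange(0, 30, 1.0)        # 30 s of HRF sampled once per second
neural = np.zeros(100)
neural[10:20] = 1.0              # modelled neural activity: a 10 s task block

# Convolution: scale a copy of the HRF by the neural activity at each
# moment, then add all the shifted copies together.
regressor = np.convolve(neural, hrf(t))[:len(neural)]
```

The resulting `regressor` rises sluggishly after the block onset and decays slowly after it ends - exactly the lagged BOLD shape the flashcards describe.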
What does convolution rely on?
the approximate linearity of the BOLD signal
What does HRF stand for?
Hemodynamic response function
Diagram of step 3 of GLM, repeating steps 1 and 2 (specify time periods for each task and produce hemodynamic regressors), which shows.. - (5)
- EV stands for explanatory variable
- We specified time periods corresponding to each stimulus condition (blue and green) and created hemodynamic regressors; EV1 and EV2 show modelled BOLD signal change on the y-axis
- These EVs are used to produce the regressors
- EV1 of the blue condition is used to produce the G1 regressor for the blue condition
- EV2 of the green condition is used to produce the G2 regressor for the green condition
Diagram of step 4 of GLM: fitting the regressors to the fMRI signal of a specific voxel, with the aim of determining how much each regressor (and the leftover bits we can't explain) contributes to the observed pattern of signal change, which means.. - (2)
- Explaining the amount of response of the fMRI signal at a specific voxel (from the data X) to condition 1 in terms of one regressor - G1
- Explaining the amount of response of the fMRI signal at a specific voxel (from the data X) to condition 2 in terms of another regressor - G2
Step 4 of fitting regressors to the data
we also have a constant term, in which - (2)
- The y-axis is in arbitrary units
- We fit a constant term to the whole experiment's data (red) in our regression; it represents how bright the voxel is throughout the experiment, ignoring the changes in brain activity (i.e., the activity of the specific voxel when nothing is happening - no condition)
Step 4 of fitting regressors to data shows.. - (2)
- The best combination of G1 and G2, when added together (black line), looks quite similar to the data of the voxel's activity
- The residuals (the difference between the black and red lines) at each moment in time, added together, are as small as they can be
Summary of our GLM steps - our fitted model is.. - (2)
- Our fitted model, which we call M, is beta coefficient 1 × G1 (blue condition) + beta coefficient 2 × G2 (green condition) + some constant number called B3 + residuals
- The regressors are multiplied by different beta coefficients to enable the best possible fit to the data
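A minimal numerical sketch of this fitting step, with made-up regressors and a simulated voxel time series (the lecture calls the measured time series X; here it is named `data` to avoid clashing with the usual design-matrix notation):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
G1 = rng.random(n)               # hemodynamic regressor, condition 1 (made up)
G2 = rng.random(n)               # hemodynamic regressor, condition 2 (made up)
constant = np.ones(n)            # models the voxel's baseline brightness

# Simulated voxel time series: responds strongly to condition 1, weakly to
# condition 2, on top of a baseline of 100, plus measurement noise.
data = 2.0 * G1 + 0.5 * G2 + 100.0 + rng.normal(0, 0.1, n)

design = np.column_stack([G1, G2, constant])            # the design matrix
betas, *_ = np.linalg.lstsq(design, data, rcond=None)   # least-squares fit
residuals = data - design @ betas                       # the leftover bits
```

The fitted betas should come out close to the true simulated values (2.0, 0.5, 100), and the residuals are the part of the signal the model cannot explain.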
If residuals are huge then means the fitted GLM model is not a good fit
for the data
Steps of the general linear model - (5)
- We begin by specifying time periods of modelled neural activity corresponding to a specific task/stimulus
- From this, we produce the hemodynamic regressors (by convolving the modelled neural activity with the canonical hemodynamic response function)
- This is done for each experimental condition
- We then fit the regressors (G1 and G2) to the fMRI signal of a specific voxel, with the aim of determining how much each regressor (and the leftover bits we can't explain) contributes to the observed pattern of signal change
- We fit a constant to the regression model as well
Parameter estimation is
computing a model's parameter values, such as the beta coefficients, from measured data in order to fit the model to the data (leaving the residuals as the unexplained part)
What is blocked design?
multiple trials from the same condition are presented consecutively.
The block design is insensitive to the
variations in hemodynamic response function across people
The blocked design uses
prolonged blocks (don't use blocks shorter than 6 s, and typically less than 30 s - in between those two is optimal)
In the blocked design, individual trials within short intervals are
indistinguishable (e.g., you can't tell whether a beta value was caused by the last trial in a block that showed a picture or by another trial that showed a different picture)
Blocked design has more power to detect
differences between conditions (e.g., more power to detect differences of beta 1 and beta 2)
Blocked design can lead to
predictability/anticipation
Diagram of block design of modelled neural activity shows
- specified time periods for each block showing pictures of faces, scenes, and scrambled scenes, ordered pseudo-randomly with a space in between each block
Diagram of modelled hemodynamic response function in blocked design shows..
What happens in an event-related design?
Individual brief trials from different conditions are interspersed in a random sequence with gaps of varying duration
With event-related design it is unpredictable so it avoids
anticipation
In an event-related design, individual trials can be distinguished at the
analysis stage (e.g., identify trials where someone responded correctly and compare them with trials where they responded incorrectly, and see the difference in brain activity)
Event-related designs can provide a detailed estimate of
temporal dynamics (changes of the BOLD signal at each voxel over time - not used as much)
An optimal sequence of trials in an event-related design, producing big reliable BOLD signal changes between different conditions, can rival the
block design, but is difficult to construct
The timing and sequence of events has a big impact on the efficiency of an
event-related design
Programs like optseq can be used to optimise the sequence of events in an
event-related design (they generate a large number of sequences and select the most efficient)
Empty periods/null events of varying duration can also be included in
event-related design
Diagram of modelled neural activity over time in an event-related design, with the different conditions of faces, scenes and scrambled faces intermixed
Diagram of modelled hemodynamic response over time in an event-related design, with the different conditions of faces, scenes and scrambled faces intermixed
There is a problem if the regressors are correlated with
each other
If the same regressor were entered twice into the analysis as G1 and G2, then - (2)
it would not be possible to determine a unique solution for the best fitting values of beta 1 and beta 2, or to run the general linear model
we wouldn't know the relative contribution of G1 and G2 to the activity of that specific voxel
You won’t be able to run analysis if your regressors were the
same
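A quick sketch of why identical regressors break the fit: with a duplicated column the design matrix loses rank, so least squares has no unique solution (the regressors here are just random numbers, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
G1 = rng.random(50)
G2 = rng.random(50)
ones = np.ones(50)

design_ok = np.column_stack([G1, G2, ones])
design_bad = np.column_stack([G1, G1, ones])   # same regressor entered twice

rank_ok = np.linalg.matrix_rank(design_ok)     # 3 columns, rank 3: unique betas
rank_bad = np.linalg.matrix_rank(design_bad)   # 3 columns, rank 2: no unique fit
```

Any split of the shared contribution between the two identical columns fits the data equally well, which is exactly why no unique beta 1 and beta 2 exist.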
If a pair of regressors were strongly correlated with one another (i.e., G1 and G2 are similar), then it is difficult to say whether … and in terms of the GLM - (2)
the activity of the voxel is driven by one regressor or the other
it causes the GLM to be unstable, as the answer may be inaccurate or may underestimate the neural activity that is happening
If a pair of regressors are strongly correlated they are said to be
collinear
With collinear regressors it is not clear how the variance in the fMRI signal of a specific voxel should be
attributed to the different regressors
A more subtle problem than collinear regressors is multicollinearity, which is when - (2)
a linear combination of two or more regressors is correlated with another regressor
e.g., G3 is approximately the same as 5 lots of G1 plus 2.3 lots of G2
Ideally all regressors in a GLM should be
orthogonal (uncorrelated)
What is multicollinearity?
when an independent variable is highly correlated with one or more of the other independent variables in a multiple regression equation
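A simple check is to correlate the regressors with each other; the regressors below are hypothetical (real analysis packages report similar collinearity diagnostics):

```python
import numpy as np

rng = np.random.default_rng(2)
G1 = rng.random(200)
G2 = G1 + rng.normal(0, 0.05, 200)   # nearly a copy of G1: collinear
G3 = rng.random(200)                 # unrelated regressor

r_collinear = np.corrcoef(G1, G2)[0, 1]   # close to 1: the GLM will be unstable
r_fine = np.corrcoef(G1, G3)[0, 1]        # near 0: fine, roughly orthogonal
```

If a pairwise correlation comes out high, changing the experimental design (as the next cards suggest) is usually better than trying to fix it at the analysis stage.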
There is software that can check for
collinearity in regressors, or for regressors that are the same
If regressors are correlated, it is good to change the
experimental design
If you don't space things/events out appropriately in an fMRI experiment, this can lead to - (3)
- correlated regressors
- loss of sensitivity in the experimental design - you can't tell the difference in neural activity across conditions
- non-linearity of the BOLD response from putting events too close in time
What does this diagram show? - (3)
- We create an image of the entire brain that summarises the values of beta 1, beta 2, beta 3 and the error for every voxel
- For every voxel in the brain, we carry out the same process of fitting the general linear model and working out the best fitting parameter estimates of beta 1 and beta 2, which, when multiplied by the regressors, leave the smallest possible residual
- Beta coefficients are calculated for every voxel in the participant's brain
In FSL the coefficients beta are called ..
parameter estimate (PE)
In FSL the parameter estimates are combined in linear combinations to - (2)
form contrasts (beta 1 [parameter estimate 1] minus beta 2 [parameter estimate 2])
or, in FSL-speak, contrasts of parameter estimates (COPEs)
Why would you take contrasts of parameter estimates for example when showing pps pictures of faces vs places?
It would show voxels in brain where its response is bigger to faces than places
These images below are called
COPEs (contrasts of parameter estimates) - what we see after the analysis
In this example,
B1 > B2 (values are higher for B1) so…
the image of the COPE (on the right, b1 - b2) will show areas greater than
0, which correspond to parts of the brain where the BOLD signal was greater during condition 1 than condition 2
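Computing a COPE is just per-voxel arithmetic on the fitted betas; a toy sketch with a hypothetical 2×2 "brain" of made-up beta values:

```python
import numpy as np

# Hypothetical per-voxel parameter estimates (beta maps) from the GLM
beta1 = np.array([[1.2, 0.4],
                  [0.9, 0.1]])   # response to condition 1 (e.g., faces)
beta2 = np.array([[0.3, 0.5],
                  [0.2, 0.1]])   # response to condition 2 (e.g., places)

cope = beta1 - beta2             # contrast of parameter estimates (b1 - b2)
faces_gt_places = cope > 0       # voxels where BOLD was greater for condition 1
```

Voxels where the COPE is positive are exactly those whose BOLD signal was greater during condition 1 than condition 2.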
The contrast of parameter estimates is specified according to the
experimental question
What is group analysis?
Combines the data from multiple participants
Whole-brain group analysis demands that the different brains of the participants
are aligned with one another in the same space - called coregistration/normalisation
The issue of aligning the different participants' brains (participants have different brain anatomy)
is an important factor in fMRI analysis
How to align the different brains of the participants? - (3)
- Take each individual participant's timeseries
- Analyse them in the GLM to get a set of beta images for each participant = COPE (b1 - b2)
- Put all the beta images into a combined statistical map
Coregistration/normalisation can be done as a
preprocessing step (SPM) or after first-level analysis (FSL)
In coregistration we use all the participants' structural images (showing a detailed image of the brain), which can be aligned with a
standard brain
We can then ask, for any voxel in the brain, whether the value of the contrast (b1-b2) is significantly
different from 0
For every voxel in the participant's brain we have a mean (and SD) for the
difference (contrast) B1 - B2 - across 18 participants in this case
For each voxel, having a mean (and SD) for the difference beta 1 - beta 2 (B1 - B2),
we expect this under the H0 null hypothesis to be
0
We can use a conventional single-sample t-test to determine the probability (p-value) that the observed difference
B1 - B2 would occur by chance alone
Take the b1 - b2 number from each participant's brain at each voxel and compare it with 0 using a
t-test
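In code, this per-voxel group test is a one-sample t-test against 0. Here the contrast values for a single voxel across 18 participants are simulated (the effect size and spread are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical b1 - b2 contrast values for one voxel, one per participant
copes = rng.normal(0.8, 0.5, 18)      # simulated true group effect of 0.8

# H0: the mean contrast across participants is 0
t_stat, p_value = stats.ttest_1samp(copes, 0.0)
```

The same test is repeated independently at every voxel, which is what creates the multiple-comparisons problem discussed later.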
FSL (FLAME) actually does something slightly more complex (but more accurate) than a single-sample t-test, in which - (2)
- it analyses the residuals in the fitted model to yield separate estimates of within- and between-subject variance, giving a better account of whether an effect is truly reliable across subjects - a mixed effects approach
- the results should then generalise to the wider population
What are good practices for fMRI design? - (6)
- Task/stimuli should evoke the process of interest
- Collect as much data as possible from each participant
- In group studies (between-subject, e.g., misophonia vs control), collect as much data from as many participants as possible - this gives more statistical power, but is difficult as scanning time is limited and the scanner costs money
- Choose stimulus conditions and timing to evoke maximal changes in the process of interest over time (taking account of the hemodynamic response, and spacing events in time)
- Organise the timing of stimuli/tasks so that the processes of interest are minimally correlated over time - you don't want the regressors correlated over time (as discussed), as it is then hard to see differences in the activity of interest
- Obtain behavioural measures that can be related to fMRI activation (behavioural measures validate that participants are doing the task, encourage participants to engage with the stimulus materials in the right way so they engage the process of interest, and reveal more information about how the brain is working in the task)
Diagram below shows
A t statistic is calculated at each voxel in the participant's brain and gives a p-value, which tells us whether it deviates from 0 or not
How do we threshold the t statistic in order to determine whether there is a significant difference in brain activity in given part of brain?
Using thresholds
What does the diagram show on the left? - (4)
- Thresholding the t statistics of voxels at greater than 1.75
- This used p < 0.05 (as used in conventional stats)
- This is uncorrected and does not take account of the fact that there are millions of voxels in the brain
- Some of the voxels that are significant and active on the brain map at the bottom are so by chance alone - some are false positives
Thresholding the t statistic does not change… but determines which.. - (2)
- It does not change the underlying t-statistics, beta values, or contrasts of parameter estimates
- It determines which voxels we show and which voxels we hide - shown in the brain map at the bottom
Example diagram of principled ways of determining the value of t (i.e., the threshold of the t statistic) we should look at - (2)
- Threshold calculated using Random Field Theory
- A tiny set of voxels remains statistically significantly active when using this more conservative approach to choosing thresholds
Diagram of statistical map
At top shows t has to be bigger than 2.10
What do these statistical maps indicate?
- They indicate the probability of a type I error (false positive)
In statistical maps, the lower the probability (p-value) of a type I error, the
brighter or hotter the colour and the higher the value of the t statistic
Statistical maps can have the statistical results shown overlaid on background image of brain like one at top for
anatomical context (often a co-registered structural image = T1 image or standard brain template)
In statistical maps the value typically displayed is actually a statistical parameter such as..
a t statistic or z score, which can be converted to a p-value
A threshold is used on statistical maps to exclude
values corresponding to higher p-values (more likely to be false positives), which are not shown
A Bonferroni correction means dividing the
p-value we will accept (e.g., 0.05) by number of comparisons made
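The arithmetic is a one-liner; 100,000 voxels is just an illustrative whole-brain count, not a figure from the lecture:

```python
alpha = 0.05
n_voxels = 100_000                      # illustrative number of comparisons
bonferroni_alpha = alpha / n_voxels     # per-voxel threshold of 0.0000005
```

With hundreds of thousands of voxels the per-voxel threshold becomes tiny, which is why the next card calls Bonferroni too conservative for fMRI.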
A problem using Bonferroni correction for thresholds of t-statistic is that it is too
conservative
A variety of thresholding methods have been developed to choose an appropriate threshold in whole-brain analyses - (4)
Bonferroni correction (takes account of the number of comparisons)
Gaussian Random Fields FWE voxel correction (takes account of the number of independent comparisons, from the smoothness/lumpiness of the data)
FDR (false-discovery rate) correction
cluster-size correction - common
What does FDR (false-discovery rate) correction do? - (3)
instead of controlling the likelihood of any false positives,
it accepts that a small proportion of suprathreshold voxels will be false discoveries and limits that proportion
it calculates that number and the appropriate threshold to use
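A sketch of the Benjamini-Hochberg procedure that underlies FDR correction, written from the standard textbook definition rather than from any particular neuroimaging package:

```python
import numpy as np

def bh_threshold(p_values, q=0.05):
    """Benjamini-Hochberg: find the largest sorted p-value p(k) with
    p(k) <= (k/m) * q; everything at or below it is declared a discovery."""
    p = np.sort(np.asarray(p_values))
    m = len(p)
    below = p <= (np.arange(1, m + 1) / m) * q
    if not below.any():
        return 0.0                      # nothing survives the correction
    return p[np.nonzero(below)[0][-1]]
```

For example, `bh_threshold([0.01, 0.02, 0.03, 0.5, 0.9])` returns a threshold of 0.03, keeping the first three p-values while limiting the expected proportion of false discoveries to q.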
Cluster-size correction - how does it work? - (4)
it takes account of the contiguity of active voxels - true signals tend to activate not one voxel but several voxels next to each other in a region that is activated during the participants' task
false positives activate randomly scattered voxels
we work out how many voxels we would expect to see next to each other by chance alone
we produce a cluster-size threshold, not showing any activation smaller than a certain number of voxels
Restricting statistical analysis to a specific area is called
region of interest analysis
Region of interest analysis helps with the problem of multiple
comparisons
Instead of analysing all voxels, in region of interest analysis we analyse
specific pre-defined regions based on hypotheses
Region of interest analysis is based on
summary statistics: averaging contrast/beta values within specified voxels in a specific region and doing statistics on those averaged values across all participants, to see if the difference is statistically significant in the labelled part of the brain
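A sketch of this summary-statistic approach with simulated data - the mask location, effect size, and subject count are all made up for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n_subjects, n_voxels = 18, 1000

# Hypothetical per-subject contrast maps (b1 - b2), flattened to 1-D
copes = rng.normal(0.0, 1.0, (n_subjects, n_voxels))
roi_mask = np.zeros(n_voxels, dtype=bool)
roi_mask[100:150] = True                 # pre-defined ROI (e.g., from an atlas)
copes[:, roi_mask] += 0.9                # simulate a real effect inside the ROI

# Average the contrast within the ROI per subject, then one test across subjects
roi_means = copes[:, roi_mask].mean(axis=1)
t_stat, p_value = stats.ttest_1samp(roi_means, 0.0)
```

Because this runs one test instead of one per voxel, no multiple-comparisons correction is needed - provided the mask was defined independently of the data, as the next cards stress.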
Regions we can choose in a Regions of Interest (ROI) analysis - (3)
- Identify anatomical regions using an atlas
- Identify a set of voxels to analyse using a mask
- Look at specific coordinates in the brain and take a spherical region around each coordinate - maybe based on the literature
Anatomical ROIs are selected (via atlas/coordinates) based on
theory or previous data
Functional ROIs can be selected based on a
separate localiser task (an additional task in the experiment) which deliberately makes specific regions of the brain active; we find which voxels are active and then use them to analyse a separate piece of data
ROIs - regions we are interested in must be defined
independently of the data to be analysed
If the ROIs - the regions you are interested in - are defined dependent on the data to be analysed, then the results will be
biased and inflated
Used properly ROI analysis can avoid problems with
multiple comparisons
Problems with ROIs - (2)
- May miss unexpected findings outside the ROI
- Beware a misleading impression of selectivity (e.g., we looked at one area we defined and it shows a big pattern of activity suggesting selectivity for something, but other brain areas we haven't defined may also be involved)
To avoid problems of ROI analysis it is often combined with
whole-brain analysis
Problems for reproducible results in fMRI - (7)
- Issues of false positives
- Low power (typically due to small sample sizes)
- The flexibility of thresholding (correction for multiple comparisons) –> conservative thresholds can potentially hide effects that are there
- Researcher degrees of freedom –> "p-hacking" = there are so many options to select when analysing the data (e.g., corrections for motion, slice timings) that one can choose the option that is favourable –> significant results
- Engaging in HARKing due to strict guidelines on reporting exploratory analysis
- Software errors leading to unexpected bugs in fMRI analysis
- Insufficient study reporting
The analysis of Button et al. highlighted the point that low power in fMRI studies not only reduces the .. but also raises.. - (2)
reduces the likelihood of finding a true result if it exists
also raises the likelihood that any positive result is false (a false positive) and inflates effect sizes
Button et al’s research only considered structural
MRI studies
Poldrack (2017) analysed sample sizes over two decades of fMRI research and showed that - (2)
sample sizes have increased steadily over the past two decades
the median estimated sample size for a single-group fMRI study in 2015 was 28.5
Poldrack's (2017) analysis of sample sizes from two decades of fMRI research also showed that the number of studies with large samples (greater than 100) is increasing rapidly, from
8 in 2012 to 17 in 2015
Poldrack (2017) estimated the standardised effect size that would be required to detect an effect with 80% power and an alpha of 0.05 (standard in most fields) by extracting sample sizes from two decades (20 years) of fMRI studies
(finding the minimum effect size needed in each study for a difference to be statistically significant with 80% probability, given the sample size)
and found… - (3)
- Despite the decreases in these hypothetical required effect sizes over the past 20 years,
- in 2015 the median study was only sufficiently powered to detect relatively large effects of greater than ~0.75
- Given that many studies will be assessing group differences (between-subject), which have less power than within-group comparisons, this is likely an optimistic estimate
The effect size Poldrack found for the median 2015 study, ~0.75, was….. with an example - (4)
much larger than the typical effect sizes observed in task-related BOLD imaging studies
e.g., effect sizes are relatively small even for a wide range of cognitive tasks, even powerful ones like motor tasks
75% of voxels have an effect size smaller than 1
for tasks evoking weaker activation, like gambling, only 10% of voxels demonstrated effect sizes larger than 0.5
What solutions are there for fMRI studies with low power? - (3)
- When possible, all sample sizes should be justified by an a priori power analysis
- When researchers have to use a statistically insufficient sample size due to the limitations of a specific sample (e.g., a rare patient group), they should collect as much data from each individual as possible and present results at the individual rather than group level, use a liberal statistical threshold such as the false discovery rate (FDR) [accepting more false positive results], and restrict analysis to a small number of prior ROIs (based on previous research) or use a functional localiser to identify ROIs for each individual
- Use Bayesian methods for small, underpowered samples
Researchers degree of freedom means
the inherent flexibility involved in the process of designing and conducting a scientific experiment
Researcher degrees of freedom can lead to inflation of.. even when there is no intentional p-hacking and only a single analysis is ever conducted
type 1 errors (false positives)
Whats p-hacking?
the inappropriate manipulation of data analysis to enable a favoured result to be presented as statistically significant.
What did Carp's study (discussed in the Poldrack paper) show in terms of researcher degrees of freedom? - (3)
Carp applied 6,912 analysis workflow options
(using SPM and another software package) to a single data set and quantified the variability in the resulting statistical maps.
This approach revealed that some brain regions exhibited more substantial variation across the different workflows than did other regions.
This issue is not unique to fMRI; for example, similar issues have been raised in genetics
Exploration is key to scientific discovery, but research papers rarely describe it comprehensively, because - (3)
- describing the process of exploration would make the narrative of the paper too complex
- a clean and simple narrative has become an essential component of publication
- instead, researchers engage in HARKing, which hides the number of data-driven choices made in the analysis to make the paper clearer, and strongly overstates the actual evidence for the hypothesis
What does HARKing stand for?
Hypothesising after the results are known
What are the solutions for flexibility in analysis, researcher degrees of freedom, and HARKing? - (4)
- Pre-registration of methods and analysis plans, including planned sample size, the specific analysis tools to be used, specification of predicted outcomes, and definition of any specific ROIs or localiser strategies used for analysis, via Open Science platforms
- Submit a Registered Report, in which hypotheses and methods are reviewed before data collection and the study is guaranteed publication regardless of outcome
- Exploratory analyses (including deviations from planned analyses) should be clearly distinguished from planned analyses in the publication
- Allow flexibility in fMRI design, but require all exploratory analyses to be labelled, and encourage validation of exploratory results in a separate data set
The most common approach to neuroimaging analysis
involves mass univariate testing in which a separate
hypothesis test is performed for each voxel. In such an
approach, the ….. if there is no correction for multiple tests
false positive rate will be inflated
What was the dead salmon study?
'Activation' was detected in the brain of a dead salmon but
disappeared when the proper corrections for multiple
comparisons were performed.
Poldrack (2017) showed that analysing data wrongly, through a combination of failure to adequately correct for multiple comparisons and ROI analysis, gives misleading results - (2):
- A univariate analysis assessing the correlation between activation in each voxel and a simulated behavioural regressor across participants came out significant
- There was a cluster of false-positive activation in the superior temporal gyrus
There is a problem, according to Poldrack (2017), in which researchers apply correction approaches inconsistently:
many researchers freely combine different approaches and thresholds in ways that produce a high number of undocumented researcher degrees of freedom,
rendering reported P values uninterpretable.
Even though well-established and validated methods for correction of multiple comparisons have been developed, like FWE and FDR, most well-established methods
can produce inflated type 1 error rates in certain settings, e.g., when the cluster-forming threshold is too low
What did Poldrack (2017) find when investigating corrections for multiple comparisons by extracting a large sample of articles? - (2)
- Only 3 of the papers presented fully uncorrected results, suggesting reporting corrections for multiple comparisons is now the standard
- 9 papers used FSL or SPM to perform the primary analysis and then used AlphaSim to correct for multiple comparisons - not a standard method (researchers might engage in analytic p-hacking), and AlphaSim inflates type 1 error
What are the solutions for multiple comparison corrections? - (3)
- To balance type I and type II error rates, use a dual approach of reporting whole-brain corrected results while sharing the unthresholded statistical map
- Any use of non-standard methods for correction of multiple comparisons (e.g., using tools from different packages) should be justified explicitly
- Alternatively, one can abandon the mass univariate approach and use multivariate methods - MVPA
As the complexity of a software program increases,
the likelihood of undiscovered bugs quickly approaches
certainty - including in fMRI analysis
Why is there undiscovered bugs in fMRI analysis? - (2)
Most fMRI researchers use one of several open-source analysis packages for pre-processing and statistical analyses;
many additional analyses require custom programs.
Because most researchers writing custom code are not
trained in software engineering, there is insufficient
attention to good software-development practices that
could help to catch and prevent errors.
A 15-year-old bug was discovered in the AFNI program which
slightly inflated type 1 error rates
The discovery of these errors in fMRI analysis led the developers to perform a code review and include software tests to reduce the
likelihood of remaining errors.
What are the solutions for bugs in fMRI analysis? - (6)
- Researchers should choose to solve problems using existing software tools instead of re-implementing the same method in custom code
- Researchers should learn and implement good programming practices, including software testing and validation (e.g., comparing code with an existing implementation or using simulated data)
- Custom analysis code should be shared on manuscript submission
- Journal reviewers should evaluate code in addition to the manuscript itself in some journals
- Journal reviewers should request that the code is made publicly available so others can evaluate it
- Researchers need sufficient training in how to conduct the analysis method they are using
There are few examples of direct replications in the field of neuroimaging, which reflects both the
expense of fMRI studies and the emphasis of
most top journals on novelty rather than informativeness.
Although there are many basic results that are
clearly replicable (for example, the presence of activity
in the ventral temporal cortex that is selective for faces
over scenes, or systematic correlations within functional networks in the resting state), the replicability
of
weaker and less neurobiologically established effects
(for example, group differences and between-subject
correlations) is nowhere near as certain.
What is the solution to the lack of independent replications in fMRI? - (2)
- The neuroimaging community should acknowledge replications as scientifically important research outcomes
- There is a replication award for neuroimaging