Week 3: GLM part 2 + Experimental designs Flashcards

1
Q

What we can control in experimental designs

A

What we present and when

2
Q

Main effects

A

The effect of a single factor (condition), collapsing over the other(s). For example, testing whether red stimuli lead to different activity levels than green stimuli (regardless of shape) would be a test of the main effect of color

3
Q

Factorial design

A

Factorial designs are designs in which each event (e.g., stimulus) may be represented by a combination of different conditions. For example, you could show images of squares and circles (condition 1: shape) which may be either green or red (condition 2: color)

4
Q

In neuroscience, most hypotheses are…

A

directional (e.g., red > green)

5
Q

Parametric design

A

So far, we have discussed only designs with categorical conditions, such as “male vs. female faces” and “circles vs. squares”. The independent variables in your experimental design, however, do not have to be categorical! They can be continuous or ordinal, meaning that a particular variable may take different values (or “weights”) across trials. Designs involving continuously varying properties are often called parametric designs or parametric modulation. One hypothesis you might be interested in is whether there are voxels/brain regions whose response is modulated by the reward magnitude (e.g., higher activity for larger rewards, or vice versa). In parametric designs, we create two regressors for every parametric modulation: one for the unmodulated response and one for the modulated response.

6
Q

To obtain large effects (i.e., large t-values), we need three things

A
  1. a large response/effect (i.e., a large beta)
  2. an efficient design or, in other words, low design variance (this can be arranged prior to collecting data)
  3. low noise/unexplained variance (this can be dealt with during preprocessing)
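As a worked equation (a sketch in notation consistent with the formula cards later in this deck), the three ingredients combine in the t-statistic:

t = \frac{\hat{\beta}}{\mathrm{SE}(\hat{\beta})} = \frac{\hat{\beta}}{\sqrt{\hat{\sigma}^{2} \cdot c\,(X^{T}X)^{-1}c^{T}}}

where \hat{\sigma}^{2} is the noise (unexplained variance) and c\,(X^{T}X)^{-1}c^{T} is the design variance: a larger beta, lower noise, or lower design variance each yield a larger t-value.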
7
Q

Efficiency is the inverse of…

A

the design variance (i.e., high efficiency = low design variance)

8
Q

Design variance is…

A

the part of the beta’s standard error that is caused by the design matrix (X)

9
Q

Researchers do not need to acquire (fMRI) data to calculate the efficiency of their design (X). Why?

A

We do not need to acquire data to calculate the efficiency because the formula for efficiency relies only on the stimuli, their onsets, and their ordering. In other words, the efficiency formula relies only on X (the design matrix), and not on y (the actual signal)

10
Q

We want high variance within our predictors because…

A

…we want to base our model on a wide range of values that represent the whole population!

11
Q

Reason about this

A

The culprit is the design variance! Given that the effect (the beta for IQ) is about the same for the two models and the MSE is higher for the high-variance model, the logical conclusion is that the design variance of the high-variance model must be much lower.

12
Q

Detection

A

A blocked design groups trials of the same condition together in blocks, while an event-related design presents trials in a completely random sequence. Note that designs can of course also be a “mixture” of blocked and event-related (e.g., largely random with some “blocks” in between).

So, if we’re interested in detection (i.e., the amplitude of the response), what should we choose? Well, the answer is simple: blocked designs.

This is because blocked designs (almost always) have lower design variance, due to:

  • lower covariance (“correlation”) between predictors
  • higher variance (“spread”) of the predictors

13
Q

Block designs

A
  • Bigger response difference between baseline and experimental blocks > but we are flooding the subject with the same condition, so the psychological nature of the prediction is weaker
  • Similar events are grouped together
  • A two-condition block design with 16-20 s blocks maximizes power > but not always applicable!
  • If we are interested in detection (which we almost always are), blocked designs are better because:
    > Lower covariance (no risk of overlap as in event-related designs)
    > Higher predictor variance = more “spread”
14
Q

Event-related designs

A
  • Better!! (ignore what it says in the notebook!)
  • Events are mixed together
  • Good signal-to-noise ratio
    • jitter (semi-random ISI) = good for statistical efficiency (jitter = higher efficiency because we randomise the ISI)
15
Q

Psychological principles (1)

Stimulus predictability

A
  • Influences psychological state
  • e.g., go/no-go task: the predictability of the no-go stimulus determines how hard it is to not respond (event related better than block in this case)
16
Q

Psychological principles (2)

Time on task

A
  • We can only image what subjects are doing, so they should be doing what we want them to do as much as possible
  • Recognition time ~ 250 ms!! > remember that we recognise objects very fast, so the stimulus should be presented for <= 250 ms
17
Q

Psychological principles (3)

Participant strategy

A
  • Different stimulus configurations afford different strategies (e.g., the Stroop task)
  • Compatible trials vs. incompatible trials = compare these two to study cognitive control (e.g., people respond more slowly and less accurately on incompatible trials)
  • For the Stroop task, it makes more sense to use a blocked design
18
Q

Psychological principles (4)

Temporal precision of psychological manipulation

A
  • What we expect from subjects should fit with what subjects can do
  • e.g., sad vs. happy memories > blocked/event-related designs are harder here; people cannot switch between emotions that fast
  • Solution: single epoch design (e.g., fixation baseline > emotion induction > emotional state > recovery induction > fixation baseline): this is viable because it does not present an emotion twice; emotions will not mean the same thing twice in a row!
19
Q

Psychological principles (5)

Unintended brain activity

A
  • Brain imaging can capture all kinds of mental processes
  • e.g., spatial attention shifting
20
Q

Overview

Experimental design

A
  • Design: how many independent variables will be manipulated
  • Trials: how are events organized (event-related, blocked, rapid, etc.)
  • These designs are all handled by the GLM! (BUT there are other, newer analysis approaches that the GLM cannot fully handle: e.g., mediation, connectivity, classification/prediction, RSA)
21
Q

Kinds of designs (1)

Subtraction/pure insertion

A
  • We can compare more complex conditions to simpler ones by subtracting the activity of the simpler condition; what is left over must be due to the complex condition.
  • The goal is to isolate a single neural process
  • Assumes that neural processes can be summed linearly
  • Assumes that the neural processes associated with each task are NOT interacting with each other

Issues: interaction with context & pure insertion are often violated

22
Q

Kinds of designs (2)

Multiple subtraction

A
  • e.g., (task A – task B – task C)
  • Can avoid issues with pure insertion (the context issue)
  • Useful for increasing the specificity of the conclusions you can draw from your results
23
Q

Kinds of designs (3)

Factorial

A
  • an approach that characterizes the interaction between processes
  • interaction effect = factor 1 × factor 2 (e.g., gender × expression): does the effect of one factor depend on the level of the other?
24
Q

Kinds of designs (4)

Parametric

A
  • Designs involving continuously varying properties (different parametric values) > continuous independent variables (vs. discrete, as in factorial designs)
  • Key takeaway: does the BOLD signal seem to increase as the intensity of the stimulus increases? > increases our confidence that there is a relationship
  • Two regressors (we assume that our design affects the voxel response in two ways):
    > Unmodulated response > a response to the task independent of the parametric value
    > PREDICTOR: stick predictor (1s and 0s)
    > Modulated response > a response to the task dependent on the parametric value
    > PREDICTOR: mean-subtracted parametric value (e.g., reward magnitude); we mean-center to decorrelate the modulated from the unmodulated predictor

Very likely, the betas for the modulated predictor will have a greater effect on the voxel activity than the unmodulated predictor

Example: One hypothesis you might be interested in is whether there are voxels/brain regions whose response is modulated by the reward magnitude (e.g., higher activity for larger rewards, or vice versa)
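A minimal numpy sketch of how these two regressors could be built (the number of volumes, onsets, and reward values are hypothetical, and convolution with an HRF is omitted):

import numpy as np

n_vols = 100                               # hypothetical number of volumes
onsets = [10, 30, 55, 80]                  # hypothetical trial onsets (in volumes)
rewards = np.array([1.0, 5.0, 2.0, 8.0])   # hypothetical reward magnitude per trial

unmodulated = np.zeros(n_vols)
modulated = np.zeros(n_vols)
unmodulated[onsets] = 1                        # stick predictor: 1 at every trial onset
modulated[onsets] = rewards - rewards.mean()   # mean-centered parametric values

# In practice, both predictors are then convolved with an HRF
# and entered as two columns of the design matrix X.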

25
Q

IMPORTANT trade-off

A
  • Only one comparison/condition
    > More power, but less generalizability
    > Stick with this one at the beginning of studying a new field
  • Many comparisons/conditions
    > Less power, but more generalizability
26
Q

Estimation vs. detection

A

Detection: you want to know whether different conditions activate a voxel differently > Is there signal? > magnitude/amplitude!

Estimation: You want to investigate how different conditions influence voxel activity not only by investigating the “amplitude” parameter of the HRF, but also parameters relating to other properties of the shape of the HRF (like width, lag, strength of undershoot, etc.). > What is the shape of the fMRI signal?

27
Q

Detection properties

A
  • Better with blocked designs
  • Estimates only ONE beta parameter for the stimulus regressor (reflecting activation/deactivation)
  • Usually employs canonical HRF convolution
28
Q

Estimation properties

A
  • Better with event-related designs
  • By employing FIR models, it estimates a beta for each time point in a set window after stimulus onset (reflecting activation/deactivation)
  • Usually employs FIR (finite impulse response) models
29
Q

FIR models

A
  • finite impulse response model
  • less biased than the canonical HRF, but more variable
  • makes no assumptions about the shape of the HRF
  • estimates a beta for each time point in a set window: more computationally expensive; by contrast, if our prior assumption about the shape is correct, a shape-based model is very efficient, otherwise it is very inefficient
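A minimal numpy sketch of an FIR design matrix for one condition (the number of volumes, onsets, and window length are hypothetical; real implementations also handle sub-TR onsets and multiple conditions):

import numpy as np

n_vols = 60               # hypothetical number of volumes
onsets = [5, 20, 35, 50]  # hypothetical trial onsets (in volumes)
window = 10               # model the response over 10 time points after each onset

X_fir = np.zeros((n_vols, window))
for onset in onsets:
    for delay in range(window):
        if onset + delay < n_vols:
            X_fir[onset + delay, delay] = 1   # one stick regressor per post-stimulus delay

# Each of the 10 columns gets its own beta, so the estimated betas together
# trace out the HRF shape without assuming a canonical form.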
30
Q

Constrained basis set

A
  • Can contain any number of parts/functions
  • Lowers the number of regressors (compared to FIR), BUT…
  • …is constrained to shapes that are reasonable for the HRF
  • so it is like an in-between of the canonical HRF and FIR models
31
Q

Advantages/Disadvantages of the canonical HRF

A

ADV:
* simple analysis
* one parameter per predictor
* easily interpretable

DIS:
* strong assumptions about the HRF shape
* biased if the assumed shape of the BOLD response is incorrect

32
Q

Advantages/Disadvantages of FIR (= unbiased basis set)

A

ADV:
* not biased towards a specific shape; weak assumptions about shape
* allows testing of hypotheses about specific HRF parameters
* can model the width, height and time delay of the BOLD response quite well

DIS:
* less powerful (because we estimate more parameters)
* makes group analysis more difficult (because every subject has many more betas)

33
Q

Limitations of BOLD linearity

A
  • The BOLD response to short stimuli is stronger than predicted
  • Apparently it is not possible to deliver only ‘some’ oxygen
  • Stimuli shown for less than 1-2 seconds behave as if they lasted 1-2 seconds
  • The BOLD response saturates at a particular moment
  • Longer stimulation does not further add to the response
  • Modeled by including a maximum response
34
Q

Bias/Variance equal…

A

Validity/Reliability

35
Q

Validity

A

internal/external/construct
* internal: is the observed effect due to our manipulation?
* external: do our results generalize?
* construct: are we measuring what we want to measure?

36
Q

Reliability

A

How robust is the data? (task, design, contrast, thresholding can all affect reliability)

37
Q

Efficiency

A

The chunk of variance we can look at

38
Q

T-statistic

A

We use it to evaluate whether a beta parameter differs significantly from 0 / whether a voxel activates/deactivates significantly in response to one or more experimental factors

39
Q

Uncertainty

A

Given the parameters we want to test, we need to evaluate them in the context of their uncertainty (their standard error)

40
Q

The raw value of the beta parameter(s) and the MSE are dependent on the…

A

…scale of the variables (we can rescale either X, or y!)

Therefore, how can we interpret the effects of our predictors?

41
Q

How can we get interpretable parameters?

A

By dividing the raw betas by a value that quantifies how well our model describes the data

The effect (beta_hat) divided by our uncertainty in the effect (SE)

42
Q

If the effect is large, but the uncertainty is also large (bad model fit), then we get a [low/high] t-value

A

low

43
Q

If the effect is large, and the uncertainty low (good model fit), then we get a [low/high] t-value

A

high

44
Q

Standard error (SE) of the model comprises two components

A

  1. the unexplained variance of the model (noise)
  2. the design variance (the variance of the parameter estimate due to the design)

45
Q

The SE(beta_hat) equals the…

A

sqrt(var(beta_hat)) — the standard error is the square root of the variance of beta_hat

46
Q

SE(beta_hat) = … (formula)

A

sqrt(noise * design variance)

They are multiplied (they scale each other), so both should be small for a high t-value
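A minimal numpy sketch tying the pieces together: estimate the betas with OLS, compute the noise term and the design variance, and form the t-value for a contrast (the data are simulated purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
N = 100
X = np.column_stack([np.ones(N), rng.normal(size=N)])   # intercept + one predictor
y = X @ np.array([2.0, 0.5]) + rng.normal(size=N)        # simulated signal
c = np.array([0.0, 1.0])                                 # test the predictor vs. baseline

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
residuals = y - X @ beta_hat
sigma_hat2 = residuals @ residuals / (N - X.shape[1])    # noise: SSE / df
design_var = c @ np.linalg.inv(X.T @ X) @ c              # design variance for this contrast
t_value = (c @ beta_hat) / np.sqrt(sigma_hat2 * design_var)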

47
Q

Design variance can be attenuated by controlling the…; noise can be attenuated by controlling the … .

A

experimental design; preprocessing

48
Q

Noise formula (sigma^2) and predictor variance formula (var(x))

A
  1. sum of squared errors / df
  2. sum of squared differences between each x sample and the mean of x / df
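In formula form (a sketch; the degrees-of-freedom conventions are my assumption and may differ slightly from the course notation, with N observations and P predictors):

\hat{\sigma}^{2} = \frac{\sum_{i=1}^{N}(y_i - \hat{y}_i)^{2}}{N - P}
\qquad
\mathrm{var}(x) = \frac{\sum_{i=1}^{N}(x_i - \bar{x})^{2}}{N - 1}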
49
Q

Efficiency

A

inverse of design variance; 1/design variance

50
Q

What causes a low design variance (and hence a highly efficient design)?

A
  1. predictors should have high variance (they should vary a lot relative to their mean)
  2. predictors should not have high covariance (they should not correlate with each other a lot)
51
Q

Extended formula for efficiency

A

var(Xj) + var(Xk) − 2 · cov(Xj, Xk)

We want the sum of the variances to be high compared to the (scaled) covariance

52
Q

T/F: the absolute value of efficiency is not interpretable, but we can interpret it relative to other designs

A

True

53
Q

The matrix version for the efficiency formula

A

design variance = c(X.T X)^(-1) c.T, so efficiency = 1 / [c(X.T X)^(-1) c.T]

This works both for testing betas against each other and for testing one beta against baseline
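A minimal numpy sketch of this formula as a function, comparing two hypothetical designs for the same contrast (remember that only relative efficiency values are interpretable, as the previous card notes):

import numpy as np

def efficiency(X, c):
    # efficiency = 1 / (c (X^T X)^-1 c^T), i.e., the inverse of the design variance
    c = np.asarray(c, dtype=float)
    return 1.0 / (c @ np.linalg.inv(X.T @ X) @ c)

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
X_uncorr = np.column_stack([x1, rng.normal(size=50)])           # uncorrelated predictors
X_corr = np.column_stack([x1, x1 + 0.2 * rng.normal(size=50)])  # highly correlated predictors

c = [1, -1]                      # contrast: predictor 1 vs. predictor 2
print(efficiency(X_uncorr, c))   # higher efficiency (low covariance)
print(efficiency(X_corr, c))     # much lower efficiency (high covariance)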

54
Q

The effect of predictor variance on the design variance/efficiency

A

The more the values of a variable deviate from its mean on average, the more variance the variable has; higher predictor variance lowers the design variance and thus increases efficiency

55
Q

The effect of predictor covariance on the design variance/efficiency

A
  • The higher the covariance, the lower the design efficiency
  • High covariance between predictors causes uncertainty in the estimation of our beta estimates
  • If the predictors are correlated, the GLM “doesn’t know” which (de)activation it should assign to which predictor; this results in a higher design variance
56
Q

P-value on t-stat

A

area under the curve of a t-distribution associated with our t-value and more extreme values (week 2)

57
Q

Types of t-tests

A
  • Using a contrast between a parameter and baseline
    > Does my predictor have any effect on my dependent variable?
  • Using a contrast between parameters
    > Does predictor 1 have a larger effect on my dependent variable than predictor 2?
  • Why do we use contrasts?
    > Not only because they are “elegant”, but also because they allow us to specify the exact hypothesis we are interested in

58
Q

F-tests

A

Used to calculate a significance value for multiple contrasts

Example: “Does the presentation of any circle (regardless of colour) activate a voxel compared to baseline?”

59
Q

How can we maximize efficiency?

A
  1. Come up with different design matrices
  2. Make sure we are maximising the correct kind of efficiency; if we care about the HRF shape, we want to maximise efficiency for ESTIMATION!
60
Q

I didn’t know this about contrasts (but it makes sense)

A

When we are comparing between conditions, the contrast weights (the 1s or any other values) should sum to 0; this does not apply when we are comparing against baseline.

Example: [0, -1, -1, -1] is interpreted as: do these predictors, on average, show negative activity (deactivation)? (The opposite would apply if all the weights were positive.)

61
Q

ISI (interstimulus interval)

A
  • Usually, researchers vary the ISI from trial to trial > jittering
  • Drawing ISIs randomly from a normal distribution > this may yield more efficient designs by reducing covariance and increasing predictor variance
  • NEVERTHELESS, jittering does not always improve design efficiency; but by injecting randomness, it allows for a larger variety of designs, which also includes designs that happen to be more efficient than fixed-ISI designs
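A minimal sketch of generating jittered onsets (the distribution parameters and minimum ISI are hypothetical; in practice you would generate many candidate designs and keep the most efficient one):

import numpy as np

rng = np.random.default_rng(42)
n_trials = 30
mean_isi, sd_isi, min_isi = 4.0, 1.5, 2.0   # hypothetical values, in seconds

# draw ISIs from a normal distribution and enforce a minimum ISI
isis = np.clip(rng.normal(loc=mean_isi, scale=sd_isi, size=n_trials), min_isi, None)
onsets = np.cumsum(isis)                    # trial onset times in seconds

# one could now build a design matrix from these onsets, compute its efficiency,
# and repeat the procedure many times to pick the most efficient design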
62
Q

Double dissociation vs. Separate modifiability

A

DD assumes that two tasks (A and B) both activate two areas, but task A activates brain area 1 more than B does, while B activates brain area 2 more than A does; SM states that A (but not B) activates brain area 1, and vice versa for brain area 2.

63
Q

Connection between simple subtraction and pure insertion

A

Applying the subtraction method assumes that pure insertion holds: that is, a mental process can be inserted into a stream of processes without affecting other processes