Lecture 11: Group Level Analyses and Statistics Flashcards
In most studies, we want to be able to say something general about our whole sample of participants

Group level analyses in sensor space
To do this we can
average across individuals
This helps us to
reduce noise and get a clearer picture of the brain’s response, and visualise our effects

We also conduct, in group level analyses in sensor space,
statistical tests to make comparisons, e.g., between conditions of an experiment.
We can do this with the different types of results we have, both in sensor space, e.g., below..
Group level analyses in sensor space are easier than group level analyses in source space as
sensor 1 is the same sensor across participants
We can do group level analyses in (2)
- Sensor space
- Source space
We can do group level analyses in source space
Diagram of group level analyses in source space across participants - (3)
- Activity map at a single frequency (e.g., alpha)
- Activity map at a single timepoint
- ROI time course
We can do group analyses in source space where we can also
pull out ROI (Scout) time courses and do statistics on these values
For group level analyses, source data must be transformed into
a shared MNI space
Transforming MEG source data into group space is useful as we are
able to average source localised data across participants in a common coordinate space (e.g., MNI or a group-averaged brain)
How does transforming MEG source data into group space work?
Works by inflating each hemisphere and then aligning them to a template (easier to align spheres than folding patterns)
Transforming MEG source data into group space allows us to do
group-level visualisation and statistics in source space
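A minimal sketch of this morphing step, assuming MNE-Python and a FreeSurfer subjects directory; the subject name and file names are hypothetical placeholders, not from the lecture:

```python
# Sketch only: morph one participant's source estimate onto the fsaverage
# template so it can be averaged with other participants. Paths and subject
# names are placeholder assumptions.
import mne

subjects_dir = "/path/to/freesurfer/subjects"        # hypothetical
stc = mne.read_source_estimate("sub-01_task-meg")    # hypothetical per-participant estimate

# The morph uses the inflated/spherical surface registration to align this
# participant's cortex to the fsaverage template, as described above.
morph = mne.compute_source_morph(stc, subject_from="sub-01",
                                 subject_to="fsaverage",
                                 subjects_dir=subjects_dir)
stc_group = morph.apply(stc)   # now on fsaverage vertices, comparable across participants
```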
For group level statistics, our statistical tests might be - (2)
- Parametric
- Non-parametric
In parametric tests there are, in terms of assumptions,
stronger assumptions, including normally distributed data
Parametric tests for group level statistics - (2)
Usually t-tests
Statistical significance is then calculated based on the distribution of the test statistic
Group level statistics can be non-parametric, which have fewer assumptions - (3)
Including non-parametric versions of standard tests, e.g., the Mann-Whitney U-test
Safer as neuroimaging data may not be normally distributed (especially when doing many tests)
But less power
For group level statistics, instead of standard parametric or non-parametric tests, we can use a resampling-based approach - (2)
Includes permutation and boot-strapping methods
Avoids assumptions about the form of the data, and is therefore non-parametric and quite robust
In resampling-based approaches for group-level statistics - (7)
Say we want to compare values for conditions A and B
We calculate our test statistic as normal, e.g., a t-statistic
But then we build a null distribution using our data
On each resampling iteration, we scramble the condition (group) labels of the data, and recalculate the test statistic
The distribution of these resampled statistics will still center on 0 (i.e., no difference) but doesn’t have to be parametric (normally distributed)
We calculate a p-value by comparing our original t-statistic to the distribution of the resampled t-statistics
Increasingly popular but much slower to run
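A minimal sketch of such a resampling test, using NumPy with made-up example data (not the lecture’s own code); here the scrambling is done by randomly swapping which condition counts as A or B within each participant:

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(0.6, 0.2, size=20)   # made-up values, one per participant, condition A
b = rng.normal(0.5, 0.2, size=20)   # condition B

def t_paired(diff):
    # Paired t-statistic for a vector of within-participant differences
    return diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))

observed = t_paired(a - b)

# Build the null distribution: on each iteration, randomly swap the A/B labels
# within each participant (equivalent to flipping the sign of the difference)
# and recalculate the test statistic.
n_perm = 5000
null = np.empty(n_perm)
for i in range(n_perm):
    signs = rng.choice([-1, 1], size=len(a))
    null[i] = t_paired(signs * (a - b))

# Two-tailed p-value: how often a resampled statistic is at least as extreme
# as the observed one.
p = np.mean(np.abs(null) >= np.abs(observed))
```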
How do we apply group-level statistics?
The test is repeated independently at each time point/sensor/location/frequency
E.g., when comparing the time course of activity in two conditions, or testing a single condition - (3)
In each participant, subtract the time courses for each condition
Compare the difference to 0 with a t-test at each time point at the group level
For a single condition, simply compare to 0 without a subtraction (see the sketch below)
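A minimal sketch of this mass-univariate approach with NumPy/SciPy; the array shape (participants × time points) and the data are assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Per-participant difference time courses (condition A minus condition B),
# shape (n_participants, n_time_points); made-up data for illustration.
diff = rng.normal(0.0, 1.0, size=(20, 1500))

# One-sample t-test against 0 at every time point, across participants.
t_vals, p_vals = stats.ttest_1samp(diff, popmean=0.0, axis=0)
significant = p_vals < 0.05   # uncorrected - see the correction methods below
```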
How to apply group-level statistics
The test is repeated independently at each time point/sensor/location/frequency
E.g., for sensors - (3), the same as at each time point
Same for topographies/current density maps across the whole head
Calculate difference and compare to 0 or compare single condition to 0
Do t-tests across the group at each sensor/vertex
How do we apply group level statistics for multivariate analyses - (2)
For multivariate analyses, decoding accuracy over time would be tested with a t-test vs. chance at each time point
Time-frequency plots in two conditions would be compared with a t-test at each time and frequency pair
The problem with neuroimaging data - (7)
Neuroimaging data are ‘big’ – they involve measurements at many spatial locations and points in time
In an MEG study we might easily have something like 248 sensors, 1,500 time points, and perhaps six frequency bands, or 50 or 100 different frequencies
If we multiply these together, we are quickly making millions of comparisons!
That is 248 × 1,500 = 372,000 t-tests in an event-related sensor space analysis
Potentially even more in source space (e.g., 15,000 vertices * 1,500 time points * 6 frequency bands)
And that’s just for a single condition – we have three conditions and three contrasts between them
Could be 6 contrasts * 15,000 vertices * 1,500 time points * 50 frequencies…
MEG multiple comparison problem - (7)
The more tests we run, the greater our chances of getting a false positive (also called a Type 1 error)
This is where we get a significant effect, even though there is no true difference
This likelihood is known as the familywise error rate, given by: FWER = 1 − (1 − ⍺)^m
Where ⍺ is the threshold for significance (e.g., 0.05) and m is the number of tests
Even with just 100 tests, the FWER approaches 1 (as shown in the lecture plot)
Clearly, we will end up with some false positives if we have 1000 or 1 million comparisons
We have no way to tell which effects are real and which are errors
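A quick numerical check of the FWER formula:

```python
import numpy as np

alpha = 0.05
m = np.array([1, 10, 100, 372_000])        # numbers of tests
fwer = 1 - (1 - alpha) ** m
# -> [0.05, 0.40, 0.99, 1.0 (to machine precision)]: with hundreds of tests
# or more, a false positive is essentially guaranteed.
```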
How to solve the MEG multiple comparison problem - (3)
Bonferroni correction (Adjust the threshold for significance)
False discovery rate correction (Accept that we will make errors, but control their rate)
Cluster correction (Take into account correlations across space/time/frequency)
In Bonferroni correction - (6)
We adjust the threshold for significance (⍺) by dividing it by the number of tests
If we have 8 tests, ⍺ = 0.05/8 = 0.0063
Count a test as significant if its p-value is lower than this
This keeps the familywise error rate at or below ⍺ (i.e., still 5% likelihood of a false positive)
Very conservative
Dramatically reduces our ability to detect true effects (aka power) so we need a very large sample size
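A minimal Bonferroni sketch with made-up p-values:

```python
import numpy as np

p_vals = np.array([0.001, 0.008, 0.04, 0.20])   # made-up p-values from 4 tests
alpha = 0.05
m = len(p_vals)

# Compare each p-value to the adjusted threshold alpha / m ...
significant = p_vals < alpha / m                 # here: [True, True, False, False]
# ... or, equivalently, adjust the p-values and keep the usual threshold.
p_adjusted = np.minimum(p_vals * m, 1.0)
```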
In false discovery rate (FDR) - (5)
Less conservative – more power
The familywise error rate is the chance of any false positives across all tests
The false discovery rate is the proportion of false positives expected across all significant tests
We fix the FDR at a known level (e.g., 0.05), meaning we accept that there will be some number of false positives
But we don’t let them get out of hand, e.g., 5% of all our significant results might be false, but no more
How does FDR work? - (5)
We rank order the list of p-values from all tests
Set the FDR, e.g., 0.05, and use it to make different significance thresholds for the lowest p-value, the second lowest, etc.
Compare each p-value to the significance threshold for that list position
Our smallest p-values are judged against a harsher threshold
Our biggest p-values are judged against a laxer threshold
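This ranked-threshold procedure is the Benjamini-Hochberg method; a minimal sketch with NumPy and made-up p-values:

```python
import numpy as np

def fdr_bh(p_vals, q=0.05):
    """Return a boolean mask of the tests that survive FDR correction."""
    p = np.asarray(p_vals)
    m = p.size
    order = np.argsort(p)
    # Rank-specific thresholds: the k-th smallest p-value is compared to
    # (k / m) * q, so the smallest p-values face the harshest threshold.
    thresholds = (np.arange(1, m + 1) / m) * q
    below = p[order] <= thresholds
    passed = np.zeros(m, dtype=bool)
    if below.any():
        k_max = np.max(np.where(below)[0])   # largest rank that passes
        passed[order[:k_max + 1]] = True     # keep everything up to that rank
    return passed

mask = fdr_bh([0.001, 0.008, 0.039, 0.041, 0.20])   # -> first two survive
```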
In MEG, cluster correction is different from FDR and Bonferroni correction, as those assume something that is not the case - (3)
They assume our different tests are independent, and they are not (at all!)
Real effects are likely to extend over several contiguous time points, frequencies, sensors, or brain locations
Also our sampling is arbitrary – we can sample the cortex with 1500 vertices or 150,000 (same goes for time and frequency)
In MEG, cluster correction - (4)
- Cluster correction identifies ‘clusters’ = contiguous samples that are all individually significant without correction
- Controls the false positive rate at this cluster (not individual test) level
- Takes into account the correlations across space, time and frequency inherent in MEG signals
- Gives better statistical power
Steps of cluster correction - identifying clusters - (2)
- Perform a test at each sample (usually a t-test)
- Identify clusters of adjacent significant tests
After identifying clusters in MEG cluster correction, we need to
choose significant clusters
Steps of choosing significant clusters - (4)
- In each cluster, sum the t-statistic across all samples
- Find the cluster with the largest summed t-statistic
- Generate a null distribution (the blue histogram) by resampling the largest cluster’s data with random condition labels – on each resample recalculate the summed t-statistic
- Compare each cluster to this null distribution. Those with a summed t-statistic falling outside the 95% limits are retained as significant
Summary of all steps of MEG cluster correction - (6)
- Perform a test at each sample (usually a t-test)
- Identify clusters of adjacent significant tests
- In each cluster, sum the t-statistic across all samples
- Find the cluster with the largest summed t-statistic
- Generate a null distribution (the blue histogram) by resampling the largest cluster’s data with random condition labels – on each resample recalculate the summed t-statistic
- Compare each cluster to this null distribution. Those with a summed t-statistic falling outside the 95% limits are retained as significant
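A simplified sketch of a 1D cluster-based permutation test with NumPy/SciPy, for a set of per-participant difference time courses. This follows the common "max cluster sum" variant (it permutes the whole dataset and records the largest summed statistic on each iteration, rather than resampling only the largest cluster's data), so it differs slightly from the exact recipe above; full implementations exist in toolboxes such as MNE-Python and FieldTrip:

```python
import numpy as np
from scipy import stats

def clusters_1d(t_vals, threshold):
    """Indices of contiguous runs of time points where |t| exceeds the threshold."""
    sig = np.abs(t_vals) > threshold
    clusters, current = [], []
    for i, s in enumerate(sig):
        if s:
            current.append(i)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    return clusters

def cluster_test(diff, n_perm=1000, alpha=0.05, seed=0):
    """diff: (n_participants, n_times) array of condition differences."""
    rng = np.random.default_rng(seed)
    n = diff.shape[0]
    threshold = stats.t.ppf(1 - alpha / 2, df=n - 1)   # cluster-forming threshold

    # 1-2) Test at each sample and identify clusters of adjacent significant tests
    t_obs, _ = stats.ttest_1samp(diff, 0.0, axis=0)
    obs_clusters = clusters_1d(t_obs, threshold)
    obs_sums = [np.abs(t_obs[c]).sum() for c in obs_clusters]

    # 3-5) Null distribution of the largest summed t-statistic under random
    # sign flips (i.e., shuffled condition labels within participants)
    null_max = np.zeros(n_perm)
    for i in range(n_perm):
        flipped = diff * rng.choice([-1, 1], size=(n, 1))
        t_perm, _ = stats.ttest_1samp(flipped, 0.0, axis=0)
        sums = [np.abs(t_perm[c]).sum() for c in clusters_1d(t_perm, threshold)]
        null_max[i] = max(sums) if sums else 0.0

    # 6) Keep clusters whose summed statistic beats the 95th percentile of the null
    cutoff = np.quantile(null_max, 1 - alpha)
    return [c for c, s in zip(obs_clusters, obs_sums) if s > cutoff]
```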
Pros of cluster correction - (3)
+ Maintains statistical power and doesn’t ‘shrink’ clusters
+ Takes correlations across time/space/frequency into account
+ Non-parametric
Cons of cluster correction - (2)
- It is quite slow because of the resampling
- Can’t make strong claims about the edges of a cluster, e.g., the significant start and end times in a time course (as these wouldn’t necessarily be significant on their own after correction), i.e., clusters might ‘grow’
Summary of group level analyses and statistics - (6)
Get group averages to visualise MEG results
Run simple statistical tests e.g., t-tests to analyse MEG data
But run lots of them - one at each time point/ sensor/ location/ frequency
Need to address the multiple comparisons problem
We can change the alpha level or use cluster correction
Also use ROIs, test a priori hypotheses, pre-register, etc.
Which of the following statements about source estimation is FALSE?
A. Sensor space analyses do not require estimating a forward model
B. Timing of changes in brain activity can only be determined in sensor space
C. Source space analyses provide information about where in the brain activity is occurring
D. Inverse model is harder to compute than forward model
B - timing of changes in brain activity can be determined in both sensor and source space analyses
While sensor space analysis provides information from the sensors directly, source space analysis localizes brain activity and can also reveal temporal dynamics. Both methods can provide insights into the timing of brain activity changes, albeit from different perspectives.
Why do we digitise the participant’s head shape before collecting MEG data?
A. To build a detailed 3D model of the participant’s head
B. To check that the head is the right size to fit in the MEG scanner
C. To calculate the location of the participant’s head in the helmet
D. To use in coregistering the data with structural MRI scans
D.
The false discovery rate is:
A. The proportion of false positives across all tests
B. The proportion of true positives across all tests
C. The proportion of false positives across all significant tests
D. The proportion of true positives across all significant tests
C.
In the cluster correction method considered in class, how is the largest cluster defined?
A. The chosen cluster has the largest absolute value of the summed test statistic
B. The chosen cluster has the longest duration and/or largest spatial extent
C. The chosen cluster contains the largest single value of the test statistic
D. The chosen cluster has the smallest summed p-value
A.