Data Analysis Flashcards

1
Q

What is spike sorting?

A

When the activity of a large number of neurons is monitored simultaneously, spike sorting works out which neuron fired when.

It answers the question: how do we separate the spikes of a single neuron from those of other neurons and from noise?

2
Q

How many neurons does EEG capture?

A

millions

3
Q

what is the regional domain of EEG?

A

3-5cm

4
Q

Two types of recording for single cell?

A
  • Wide-band continuous recording: the signal is recorded continuously
  • Filtered, spike-triggered recording: an action potential crossing the threshold triggers the recording

5
Q

what is High Pass Filtering?

A

Allowing high frequencies to pass through and discarding lower frequencies

6
Q

why high pass filter?

A
  • The local field potential (the group activity of hundreds to thousands of neurons) is primarily at low frequencies, so for spike detection it is treated as noise and removed.
  • Spikes are at higher frequencies.
  • So use a high-pass filter (800-1000 Hz cutoff is good)
  • Preferably a non-causal filter (a causal filter changes the signal slightly and might produce phase distortions)
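A minimal sketch of the non-causal idea, assuming a 30 kHz sampling rate and a simple single-pole filter (both are illustrative choices, not from the cards): running a causal filter forward and then backward over the recording cancels its phase shift.

```python
import math

def highpass_once(x, cutoff_hz, fs_hz):
    """Single-pole (first-order) causal high-pass filter."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs_hz
    a = rc / (rc + dt)
    y = [0.0] * len(x)
    for n in range(1, len(x)):
        y[n] = a * (y[n - 1] + x[n] - x[n - 1])
    return y

def highpass_zero_phase(x, cutoff_hz, fs_hz):
    """Filter forward, then backward, so the two phase shifts cancel."""
    forward = highpass_once(x, cutoff_hz, fs_hz)
    backward = highpass_once(forward[::-1], cutoff_hz, fs_hz)
    return backward[::-1]

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

fs = 30000.0  # assumed sampling rate in Hz
n = 6000
slow = [math.sin(2 * math.pi * 10 * i / fs) for i in range(n)]    # LFP-like
fast = [math.sin(2 * math.pi * 3000 * i / fs) for i in range(n)]  # spike band

mid = slice(1500, 4500)  # ignore the edge transients
slow_gain = rms(highpass_zero_phase(slow, 800.0, fs)[mid]) / rms(slow[mid])
fast_gain = rms(highpass_zero_phase(fast, 800.0, fs)[mid]) / rms(fast[mid])
```

With the 800 Hz cutoff from the card, the 10 Hz component should be almost entirely removed while the 3 kHz component is largely preserved. The backward pass needs the whole recording, which is why this only works offline.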

7
Q

how do we tell what is a spike and what is not?

A

• A voltage threshold is most commonly used

• Check the quality of detection
– Visually inspect the superimposed spikes
– Study the interspike interval histogram

• Then cut an epoch around each peak to get the individual spike waveforms

8
Q

Aids to understanding what is and what is not one single neuron?

A

No interspike interval shorter than 1 ms (the spike refractory period)

  • this condition is only applicable once the spikes are sorted
  • but it indicates that if two spikes come in succession less than 1 ms apart, they have to come from two different sources
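The refractory-period check can be sketched as follows. The function name and the example spike times are made up for illustration.

```python
def refractory_violations(spike_times_ms, refractory_ms=1.0):
    """Fraction of interspike intervals shorter than the refractory period."""
    times = sorted(spike_times_ms)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0
    return sum(1 for isi in isis if isi < refractory_ms) / len(isis)

clean_unit = [0.0, 12.5, 30.1, 44.0, 61.7]   # plausible single neuron
mixed_unit = [0.0, 0.4, 12.5, 13.0, 30.1]    # two sources merged into one cluster

clean_fraction = refractory_violations(clean_unit)
mixed_fraction = refractory_violations(mixed_unit)
```

A cluster with a clearly non-zero fraction of violations is unlikely to be a single neuron.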
9
Q

The choice of threshold is a trade-off between

A

a) too high a threshold: real spikes are missed (Type II error)

b) too low a threshold: noise is detected as spikes (Type I error)

10
Q

how many standard deviations away to distinguish spikes?

A

5

But you can also apply a modified threshold based on the median rather than the standard deviation … this is more realistic
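A minimal sketch of a median-based threshold. The card does not give the formula, so the estimate used here, noise SD = median(|x|) / 0.6745, is an assumption (it is a common robust choice, because large spikes barely move the median). The toy trace and spike positions are invented.

```python
import random
import statistics

def detect_spikes(signal, k=5.0):
    """Return (threshold, indices of local maxima above the threshold)."""
    # Assumption: robust noise SD from the median of the absolute signal.
    sigma = statistics.median([abs(v) for v in signal]) / 0.6745
    threshold = k * sigma
    peaks = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] >= signal[i - 1]
        and signal[i] > signal[i + 1]
    ]
    return threshold, peaks

random.seed(0)
trace = [random.gauss(0.0, 1.0) for _ in range(2000)]  # unit-variance noise
true_spikes = [300, 900, 1500]
for i in true_spikes:
    trace[i] += 20.0  # add large artificial spikes

threshold, peaks = detect_spikes(trace)
```

Here `k=5.0` mirrors the "5 standard deviations" answer above. Lowering it trades missed spikes for false detections.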

11
Q

What is Feature Analysis?

A
  • Two clear action potentials that have roughly the same height but are different in shape.
  • If the shape could be characterized, we could use this information to classify each spike.
12
Q

How do we characterize the shape?

A

Extract the features

Each time a neuron fires, its spike has the same features, like a person with a distinct voice. So we can characterise the spikes and separate them.

13
Q

What are the main features?

A

Peak amplitude

peak-to-peak amplitude

peak width

energy etc.
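These features can be read directly off a spike waveform. A small sketch, where the width definition (time spent at or above half the peak) and the example waveform are assumptions chosen for illustration:

```python
def spike_features(waveform, dt_ms):
    """Simple shape features of one spike waveform sampled every dt_ms."""
    peak = max(waveform)
    trough = min(waveform)
    half_peak = peak / 2.0
    # Assumed width definition: samples at or above half the peak value.
    width_samples = sum(1 for v in waveform if v >= half_peak)
    return {
        "peak": peak,
        "peak_to_peak": peak - trough,
        "width_ms": width_samples * dt_ms,
        "energy": sum(v * v for v in waveform),
    }

waveform = [0.0, 1.0, 4.0, 8.0, 4.0, 0.0, -3.0, -1.0, 0.0]
features = spike_features(waveform, dt_ms=0.05)
```

Each spike then becomes one point in feature space, which is what the clustering step works on.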

14
Q

What are more advanced methods for spike sorting?

A

Principal components analysis (cousin of factor analysis)

Wavelet transform

15
Q

How does Principal components analysis work?

A

The idea behind PCA is to find an ordered set of orthogonal basis vectors that capture the directions of largest variance in the data
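A minimal sketch of finding the first of those basis vectors. Power iteration on the covariance matrix is used here only because it needs no libraries; it is one of several ways to compute principal components, and the toy data are invented.

```python
import math

def first_principal_component(rows, iterations=100):
    """Direction of largest variance, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] + [0.0] * (d - 1)  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Points spread along the diagonal, with a small wiggle across it.
data = []
for i in range(11):
    t = i - 5
    e = 0.1 if i % 2 == 0 else -0.1
    data.append((t + e, t - e))

pc1 = first_principal_component(data)
```

For this data the first component should point along the diagonal, roughly (0.707, 0.707). Projecting each spike onto the first few components gives a compact feature set.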

16
Q

How is Wavelet transform better?

A

Wavelet transform based feature extraction

- Localized properties (i.e. spike shape details) are emphasized
- More adaptive than PCA

17
Q

Which feature is used to discriminate spikes?

A

Select those features that best separate the different clusters of spike shapes.

A wavelet coefficient -or any other feature- that is good for distinguishing different spike shapes should have a multimodal distribution, unless there is only one cluster.

18
Q

What is Clustering?

A

Group spikes with similar features into clusters, corresponding to different neurons

19
Q

What is manual clustering?

A

Manual clustering

– Drawing polygons in 2-D projections of the spike features

20
Q

issue with manual clustering?

A

– Prone to error, subjective bias, time consuming, not practical for high dimensional data

21
Q

what is an alternative clustering approach?

A

Nearest neighbor k-means clustering

22
Q

features of Nearest neighbor k-means clustering?

A

– Define the cluster locations as the mean of data within that cluster
– Cluster membership is defined by Euclidean distance
– This defines a set of implicit decision boundaries separating the clusters
– Works well when the clusters are separated but not otherwise
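The points above can be sketched as a tiny k-means loop. The starting centroids are passed in by hand here to keep the example deterministic (an illustrative choice), and the feature points are invented.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]  # Euclidean
            clusters[distances.index(min(distances))].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    labels = [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]
    return centroids, labels

# Two well-separated groups of spikes in a 2-D feature space.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.8, 5.2)]
centroids, labels = kmeans(points, centroids=[points[0], points[3]])
```

Because membership depends only on distance to the nearest mean, overlapping or elongated clusters get split along straight boundaries, which is the weakness noted in the last point.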

23
Q

name a Distribution free approach?

A

Super paramagnetic clustering

24
Q

main points for Super paramagnetic clustering?

A
  • Groups the spikes into clusters as a function of a single parameter, ‘the temperature’.
  • The approach is based on tuning this temperature
  • For low temperature, the data are grouped together to a single cluster, and for high temperature, the data are dispersed to too many clusters
  • For a middle range temperature, corresponding to super paramagnetic regime, the data are optimally sorted into a few clusters but with large membership
25
Q

what is Rate Estimation?

A

How many times a neuron fires per second

Most widely used estimate of neural activity = Average firing rate over a time interval

26
Q

Rate Estimation is usually expressed by a….

A

Peri-Stimulus Time Histogram (PSTH)

27
Q

How to make a PSTH?

A

A PSTH shows event-related neural firing:

  • Align the spike sequences with stimulus onset (or any other event), which is repeated n times
  • Divide the stimulus period S into N bins of size D
  • Count the number of spikes (k_i) within each bin across all trials
  • Compute the histogram
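The steps above can be sketched as follows, assuming spike times are already expressed in ms relative to stimulus onset. The trial data are invented.

```python
def psth(trials, window_ms, bin_ms):
    """Spike counts per bin over all trials, plus firing rate in Hz."""
    n_bins = int(window_ms // bin_ms)
    counts = [0] * n_bins
    for spike_times in trials:            # one list of spike times per trial
        for t in spike_times:
            index = int(t // bin_ms)
            if 0 <= index < n_bins:
                counts[index] += 1
    bin_s = bin_ms / 1000.0
    rates_hz = [c / (len(trials) * bin_s) for c in counts]
    return counts, rates_hz

trials = [
    [12.0, 55.0, 130.0],
    [20.0, 60.0, 65.0],
    [5.0, 70.0, 180.0],
]
counts, rates_hz = psth(trials, window_ms=200.0, bin_ms=50.0)
```

Dividing each count by the number of trials and the bin width turns the histogram into an average firing rate.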

28
Q

what is Inter-Spike Interval (ISI)?

A

The time between successive spikes. It is about temporal coding: information is carried by when the neuron fires, not by how many times.

Find the individual firings and calculate the time between them.

Neuronal firing is considered to follow a Poisson distribution: a renewal process.

Traditionally, the ISI sequence is assumed to be random. This is the assumption behind rate coding. If so, the sequence would have no structure.

29
Q

How do we calculate the temporal variability of the firing sequence, i.e. characterise its statistical nature?

A

Fano Factor (FF)

30
Q

What is the Fano Factor (FF)?

A

• An index of the variability of the firing sequence

Defined as the variance of the number of spikes divided by the mean number of spikes for a given time window

If there is no history in the firing sequence, it is random: the outcome doesn't depend on previous outcomes. If this is the case, then we expect the Fano factor to be close to 1
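A minimal sketch of the definition, comparing a perfectly regular spike train with a simulated Poisson one. The rates, window length and random seed are arbitrary illustrative choices.

```python
import random
import statistics

def fano_factor(spike_times, window, duration):
    """Variance of the spike count divided by its mean, over equal windows."""
    n_windows = int(duration // window)
    counts = [0] * n_windows
    for t in spike_times:
        index = int(t // window)
        if 0 <= index < n_windows:
            counts[index] += 1
    return statistics.variance(counts) / statistics.mean(counts)

# Clock-like firing: one spike every 10 ms.
regular = [5.0 + 10.0 * i for i in range(100)]
ff_regular = fano_factor(regular, window=100.0, duration=1000.0)

# Memoryless (Poisson) firing at the same mean rate.
random.seed(0)
poisson, t = [], 0.0
while t < 20000.0:
    t += random.expovariate(0.1)  # mean ISI of 10 ms
    poisson.append(t)
ff_poisson = fano_factor(poisson, window=100.0, duration=20000.0)
```

The regular train gives a Fano factor of 0, while the Poisson train should land near 1.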

31
Q

however… there has been evidence to suggest ….

A

fractal properties within neurons (memory) if this is true then Fano Factor >1

32
Q

in long range processes …. xxxx xxxxx matters

A

past history

33
Q

These don't follow a Gaussian distribution, but a …

A

Lévy distribution

34
Q

There are more ways of quantifying these long-range properties of neurons. Name 4

A
  • Fluctuation Analysis
  • Multi-scale analysis
  • Statistical Control
  • Simple analysis of moments
35
Q

Fluctuation Analysis

A

(Slide with red, green and blue traces.) Take random noise and accumulate it to get Brownian noise, which has fractal properties (a long-range process)

36
Q

Detrended Fluctuation Analysis (DFA)

A

slide 33

The deviation of α from 0.5 characterizes the strength of non-random fluctuations

A statistical approach to finding out whether there is any pattern in the sequence: does it depend on something in the past (is there structure in the data)?

1) do spike sorting
2) compute the ISI sequence
3) calculate the cumulative sum (simple accumulated integration) of the sequence (blue line)
4) divide the integrated sequence into smaller windows (boxes) and fit a regression line in each, keeping the error (red line)
5) plot the log of the size of the error against the log of the size of the boxes
6) calculate the slope
7) the slope gives you the scaling exponent α

if α is near 0.5 = random
if α is below 0.5 (towards 0) = anti-correlated
if α is 1 = fractal (infinite history)
Long-range correlated (LRC) sequence: 0.5 < α < 1
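The procedure above can be sketched in plain Python. The box sizes, series length and seed are arbitrary choices, and the input here is simulated noise rather than a real ISI sequence.

```python
import math
import random

def linear_fit(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def dfa_exponent(series, box_sizes):
    """Scaling exponent alpha from detrended fluctuation analysis."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:                      # cumulative sum of deviations
        total += v - mean
        profile.append(total)
    log_n, log_f = [], []
    for n in box_sizes:
        xs = list(range(n))
        n_boxes = len(profile) // n
        squared_error = 0.0
        for b in range(n_boxes):          # detrend each box with a line
            segment = profile[b * n:(b + 1) * n]
            slope, intercept = linear_fit(xs, segment)
            squared_error += sum(
                (y - (slope * x + intercept)) ** 2
                for x, y in zip(xs, segment)
            )
        fluctuation = math.sqrt(squared_error / (n_boxes * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))
    alpha, _ = linear_fit(log_n, log_f)   # slope of the log-log plot
    return alpha

random.seed(1)
white = [random.gauss(0.0, 1.0) for _ in range(4096)]
brown, running = [], 0.0
for v in white:                           # accumulated (Brownian) noise
    running += v
    brown.append(running)

boxes = [8, 16, 32, 64, 128]
alpha_white = dfa_exponent(white, boxes)
alpha_brown = dfa_exponent(brown, boxes)
```

White noise should give an exponent close to 0.5. The accumulated (Brownian) version should give a clearly larger value, around 1.5.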

37
Q

how to test statistically

A

randomisation test

38
Q

what is Multi-Scale Entropy (MSE)?

A

MEMORY in firing sequence of NEURONS (this bit is so cool)

If there is a long-lasting influence from the distant past, then in a small window of time you might see some information in what look like random fluctuations.

If you look at a longer time scale, you might not see anything,

but the more of the past you incorporate into your viewpoint, the more interesting structure you find.

So look into:
• Construct coarse-grained sequences:
(average points 1 & 2, then 2 & 3, and so on)

• Compute the entropy value (Sample Entropy) for each coarse-grained sequence
(the amount of information, bounded between order and disorder)

• The profile of the entropy as a function of the coarse-graining length scale (t) is called multi-scale entropy

If the process has structure at the global level, then we have long-term memory. If it only has structure at the local level, then there is no structure (past history) at the global level.

Fractal properties = however far you look into it, the past still brings more information
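A minimal sketch of the multi-scale entropy profile. Two assumptions to flag: the coarse-graining here averages non-overlapping windows (the commonly published variant, whereas the card's example reads as overlapping pairs), and the sample-entropy parameters `m=2` and `r=0.2` are conventional defaults rather than values from the cards.

```python
import math
import random
import statistics

def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n_windows = len(series) // scale
    return [
        sum(series[i * scale:(i + 1) * scale]) / scale
        for i in range(n_windows)
    ]

def sample_entropy(series, tolerance, m=2):
    """-ln(A/B): B counts template matches of length m, A of length m + 1."""
    n = len(series)

    def count_matches(length):
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if all(abs(a - b) <= tolerance
                       for a, b in zip(templates[i], templates[j])):
                    count += 1
        return count

    return -math.log(count_matches(m + 1) / count_matches(m))

def multiscale_entropy(series, scales, r=0.2):
    # Tolerance is fixed from the original series and reused at every scale.
    tolerance = r * statistics.pstdev(series)
    return [sample_entropy(coarse_grain(series, s), tolerance) for s in scales]

random.seed(2)
white = [random.gauss(0.0, 1.0) for _ in range(600)]
profile = multiscale_entropy(white, scales=[1, 2, 4])
```

White noise has no memory, so its entropy should fall as the scale grows. A sequence with long-range structure would be expected to keep its entropy up across scales.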

39
Q

Analysis based on Moments means…

A

For processes with long-range correlation:

– Slow convergence of statistical moments
– More irregularities in the statistical moments
Otherwise (for a renewal process), quicker convergence and steady-state behaviour of the statistical moments

Like a sanity check:

Gaussian: with more data, the variance gets lower

but

if the process is not Gaussian: the more data, the more it changes

40
Q

But the temporal dimension mentioned hitherto is not…….

A

a suitable substrate for memory storage

41
Q

what is an alternative ?

A

space

42
Q

how would one show the relationship with space?

A

by linking neurons found with Detrended fluctuation analysis or Multi-Scale Entropy

i.e. spatially correlated - a connected network

43
Q

they found a

A

‘spatial - temporal’ coupling

in action potential and dendritic connections

44
Q

what is a long-range power law correlation?

A

the hallmark of self-organised criticality (non-randomness emerges from random initial conditions and random input)

– spatially extended system
– local interactions
– slowly driven
– dissipative and non-linear
– flexible to alterations
– robust against initial conditions

(HIPPOCAMPUS EXHIBITS ALL OF THESE)

45
Q

now we switch to ERP analysis. How do we quantify a waveform?

A

measuring

1) amplitude
2) latency

46
Q

three main routes to discriminate amplitude?

A

– Peak amplitude

– Mean amplitude

– Peak-to-Peak (not each peak individually - difference from one peak to another)

47
Q

three main features of latency?

A

– Peak latency
– Onset latency (when a component starts to occur)
– Fractional area latency

Component measurement and subsequent quantifications are done usually at single subject level for the purpose of subsequent statistical analysis

48
Q

what is the peak amplitude approach?

A

• = Maximum amplitude in a chosen time window.
• Problems
– Prone to (high-frequency) noise contamination
– Estimation depends on the number of trials used for averaging
– Easily influenced by an overlapping component at the border of the measurement window
– Essentially a nonlinear measure, prohibiting direct comparison between two grand averages

(Take the maximum value in the time window. The more trials the better, since averaging works against noise. The peak of the grand average will not be the same as the average of the individual peaks: the measure is non-linear, so the order of operations matters.)

49
Q

what is the Mean amplitude approach?

A

• = Mean potential in a chosen time window
• If the sum of the potential in a time window is computed, it is called an area amplitude measure.
• Advantages
– A narrower time window can be considered
– Less sensitive to high-frequency noise, because a range of time points is used rather than a single time point
– Allows comparisons with different numbers of trials
– It is a linear measure

(Choose a window and calculate the average amplitude. The advantages are that it is a linear measure and less sensitive to noise.)

50
Q

What do both peak and mean amplitude need?

A

Baselines

• Both peak and mean amplitude measures are with respect to a baseline (assumed to be a zero potential)

• Therefore, choice of baseline is an important issue in ERP research
– Use the average of 0-200 ms pre-stimulus period as the baseline
– Too short a baseline introduces noise, and too long a baseline introduces “signal”
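A small sketch of baseline correction followed by the peak and mean amplitude measures. The ERP values, the 100 ms sampling step and the measurement window are invented for illustration.

```python
def baseline_correct(erp, times_ms, start_ms=-200, end_ms=0):
    """Subtract the mean of the pre-stimulus period from every sample."""
    baseline = [v for v, t in zip(erp, times_ms) if start_ms <= t < end_ms]
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in erp]

def window(erp, times_ms, start_ms, end_ms):
    return [v for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms]

def peak_amplitude(erp, times_ms, start_ms, end_ms):
    return max(window(erp, times_ms, start_ms, end_ms))

def mean_amplitude(erp, times_ms, start_ms, end_ms):
    values = window(erp, times_ms, start_ms, end_ms)
    return sum(values) / len(values)

times_ms = list(range(-200, 600, 100))         # -200 ... 500 ms
erp = [2.0, 2.0, 2.0, 4.0, 8.0, 6.0, 3.0, 2.0]  # toy averaged waveform

corrected = baseline_correct(erp, times_ms)
peak = peak_amplitude(corrected, times_ms, 100, 300)
mean = mean_amplitude(corrected, times_ms, 100, 300)
```

Both measures shift by the same amount if the baseline changes, which is why the choice of baseline period matters.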

51
Q

what is Peak-to-Peak Amplitude Measure?

A

• Measure a peak relative to an adjacent peak (or trough) in the waveform
• Advantages
– Useful for overlapping components
– Less prone to noise
• Disadvantages
– Difficult to interpret (i.e. for (X1-Y1) > (X2-Y2) we cannot be sure if it is due to change in X, or in Y or in both)

52
Q

what is peak latency ?

A

• The time (latency) at which the component reaches its maximum (or minimum)
• Shares all limitations with peak amplitude
• Suggested tips
– Filter out high frequency noise before calculation
– Use a local peak measure (a point is not considered a peak unless 3-5 sample points on each side of it have smaller values)
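The local peak rule can be sketched like this. The waveform is invented so that a rising edge at the border of the window is larger than the true local peak.

```python
def local_peak_latency(erp, times_ms, start_ms, end_ms, neighbours=3):
    """Latency of the largest point that exceeds `neighbours` samples each side."""
    best = None
    for i in range(neighbours, len(erp) - neighbours):
        if not start_ms <= times_ms[i] <= end_ms:
            continue
        sides = erp[i - neighbours:i] + erp[i + 1:i + neighbours + 1]
        if all(erp[i] > v for v in sides):
            if best is None or erp[i] > erp[best]:
                best = i
    return None if best is None else times_ms[best]

def simple_peak_latency(erp, times_ms, start_ms, end_ms):
    """Latency of the plain maximum inside the window."""
    candidates = [(v, t) for v, t in zip(erp, times_ms)
                  if start_ms <= t <= end_ms]
    return max(candidates)[1]

erp = [0, 1, 2, 3, 5, 3, 2, 1, 2, 4, 6, 8, 10]
times_ms = [4 * i for i in range(len(erp))]

local = local_peak_latency(erp, times_ms, start_ms=8, end_ms=40)
simple = simple_peak_latency(erp, times_ms, start_ms=8, end_ms=40)
```

The plain maximum lands on the window border, where a later component is still rising, while the local peak rule returns the real peak inside the window.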

53
Q

what is Onset Latency (OL)?

A

• Relevant for response-locked averaged components (e.g., to find when a preparatory motor activity has started)
• Different ways to find onset latency
– Choose an absolute threshold (in µV); the OL is the time point at which the threshold is first exceeded
– Instead of an absolute threshold, choose an adaptive threshold (e.g. 2 standard deviations above baseline)
– Fit two regression lines, one to the baseline and one to the signal; the OL is their intersection
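A minimal sketch of the adaptive-threshold variant, assuming samples before 0 ms form the baseline. The waveform values are invented.

```python
import statistics

def onset_latency(erp, times_ms, n_std=2.0):
    """First post-zero time point exceeding baseline mean + n_std * SD."""
    baseline = [v for v, t in zip(erp, times_ms) if t < 0]
    threshold = statistics.mean(baseline) + n_std * statistics.stdev(baseline)
    for v, t in zip(erp, times_ms):
        if t >= 0 and v > threshold:
            return t
    return None  # threshold never exceeded

times_ms = list(range(-50, 60, 10))  # -50 ... 50 ms
erp = [0.1, -0.1, 0.1, -0.1, 0.0,    # baseline samples
       0.0, 0.1, 0.15, 0.9, 1.5, 2.0]

onset = onset_latency(erp, times_ms)
```

A noisier baseline raises the threshold automatically, which is the advantage over a fixed absolute value.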

54
Q

what is Exploratory ERP Analysis?

A

essentially running individual t tests to find regions of interest

defining the temporal difference as ‘cognitively meaningful’ at around 20-40ms

issues with multiple comparisons

55
Q

what is Global Field Power?

A

All channels in one go, together (32 / 64): the temporal variation of the spatial standard deviation

• Reflects the time profile of the spatial standard deviation
• ERPs with peaks and troughs and steep gradients yield high GFP, and flat fields yield small GFP

• A reference-free measure, helpful in detecting latencies of interest

A good data compression technique: it shows where in the time window you find stronger differences
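Global Field Power can be sketched in a few lines, assuming the data are stored as one list of samples per channel. The three-channel example is invented.

```python
import statistics

def global_field_power(data):
    """Standard deviation across channels at each time point.

    `data[channel][time]` holds the potential of one electrode over time.
    """
    return [statistics.pstdev(samples) for samples in zip(*data)]

data = [
    [0.0, 1.0, 2.0],    # channel 1
    [0.0, -1.0, -2.0],  # channel 2
    [0.0, 0.0, 0.0],    # channel 3
]
gfp = global_field_power(data)
```

A flat field (first time point) gives zero, and the value grows as the channels diverge. Peaks in this single trace point to latencies worth examining.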

56
Q

now switching to oscillations … An oscillatory signal is characterised by?

A

its frequency (f), amplitude (A) and phase(θ)

x(t) = A sin(2πft + θ)

  • f = number of cycles per unit time
  • A = magnitude of displacement
  • θ = initial displacement at t = 0
57
Q

3 ways of representing oscillations?

A
  • Time-domain: how does the signal change over time
  • Frequency-domain: how much of the signal lies within each given frequency band over a range of frequencies
  • Power spectrum (Fourier spectrum analysis: a histogram of the frequencies in the signal)
58
Q

Issues with frequency-only analysis of oscillations?

A
  • All time information is lost in the frequency domain
  • What if the amplitude and phase are not constant over time?
  • e.g. There might be an event-related increase in amplitude for a certain frequency
59
Q

what provides some information for all aspects?

A

Time-Frequency Representation

  • There is no spectral information in the time-domain representation
  • There is no temporal information in the frequency-domain representation
  • A Time-Frequency Representation (TFR) of a signal provides some temporal information and some spectral information
  • Typically, there is a trade-off between specificity in time and specificity in frequency
60
Q

how are Time-Frequency Representations produced?

A

through wavelets

61
Q

What is a wavelet?

A

A complex Morlet wavelet is an oscillatory signal, localised in time by a Gaussian.

It gives time-resolved information about how different oscillations fluctuate over time

62
Q

how is a Time-Frequency Representation produced?

A

The TFR of a signal is obtained by calculating a 'moving average' of the similarity between a wavelet and this signal

This gives time-resolved information about how different oscillations fluctuate over time

63
Q

what is Event Related Synchronization / deSynchronization?

A

Used before wavelets (not as good)

  • Changes in spectral power
  • Decrease of band power = desynchronization
  • Increase of band power = synchronization
64
Q

why Event Related Synchronization / deSynchronization?

A

Gives us information about oscillations

Gives individual evoked (anytime) and induced oscillations

Power profile in relation to events/stimuli

The ERP is only evoked: it assumes the brain responds in an identical way on every trial, so all information about induced oscillations is lost. The wavelet approach is powerful at capturing this.

65
Q

key parts of evoked oscillations?

A

Evoked Oscillations

  1. Compute ERP by averaging across trials
  2. Apply wavelet on ERP
66
Q

total oscillations?

A

Total Oscillations

  1. Apply wavelet on each trial
  2. Average the wavelet-transformed trials

total = evoked + induced
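A minimal sketch of why the order of averaging and wavelet transform matters. It assumes a complex Morlet wavelet with 5 cycles, a 200 Hz sampling rate, and 40 simulated trials containing a 10 Hz oscillation whose phase differs on every trial (a purely induced response). All of these values are illustrative.

```python
import cmath
import math

def morlet_power(signal, freq_hz, fs_hz, n_cycles=5):
    """Power at the centre of `signal` from one complex Morlet wavelet."""
    sigma_t = n_cycles / (2 * math.pi * freq_hz)  # width of the Gaussian
    centre = len(signal) // 2
    total, norm = 0j, 0.0
    for i, v in enumerate(signal):
        t = (i - centre) / fs_hz
        wavelet = (math.exp(-t * t / (2 * sigma_t ** 2))
                   * cmath.exp(2j * math.pi * freq_hz * t))
        total += v * wavelet.conjugate()
        norm += abs(wavelet)
    return abs(total / norm) ** 2

fs, freq, n_samples, n_trials = 200.0, 10.0, 201, 40
trials = [
    [math.sin(2 * math.pi * freq * i / fs + 2 * math.pi * k / n_trials)
     for i in range(n_samples)]
    for k in range(n_trials)              # a different phase on each trial
]

# Evoked: average the trials first (ERP), then apply the wavelet.
erp = [sum(samples) / n_trials for samples in zip(*trials)]
evoked_power = morlet_power(erp, freq, fs)

# Total: apply the wavelet to each trial, then average the power.
total_power = sum(morlet_power(trial, freq, fs) for trial in trials) / n_trials
```

Because the phases cancel in the average, the evoked power should be close to zero even though every single trial contains a strong 10 Hz oscillation. The total power keeps it.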

67
Q

connectivity is the fundamental ….

A

the language of the brain

68
Q

Three Broad Approaches for analysing EEG/ERP/MEG Signals

A

1) synchrony (linear and non-linear)
2) oscillations
3) ERP

69
Q

IMPORTANT (joy said in exam) - What are the 4 linear methods he mentions? do mnemonic first

A

LIC HER EARCOME HERE IN RANGE OF MULTIPLE MODS

70
Q

LIC HER EARCOME HERE IN RANGE OF MULTIPLE MODS?

A

• Linear Correlation

• Coherence
– Magnitude squared coherence
– Partial coherence

• Granger causality

• Multivariate modeling
– Directed transfer function
– Partial directed coherence

71
Q

IMPORTANT (joy said in exam) - What are the 4 NON-linear methods he mentions? MNEMONIC 1ST?

A

GAY CORAL REEF FORMS THE ORIGINAL SYNTH PHASE…..GENERALLY SHIT

72
Q

GAY CORAL REEF FORMS THE ORIGINAL SYNTH PHASE…..GENERALLY SHIT?

A

• Nonlinear correlation

• Information Theory
– Mutual information
– Transfer entropy

• Phase Synchrony
– Hilbert + Shannon
– Mean phase coherence
– Wavelet

• Generalized Synchrony
– Similarity index & families
– Mixed predictability
– Cross prediction