week 5 - efficient coding III Flashcards
What is the ECH? (two parts)
a group of neurons should encode as much information as possible OR remove as much redundancy as possible
What is the equation for ECH? Maximize…?
What does each component mean?
Maximize I(S;f(S))
I = mutual information
S = signal
f(S) = tuning curves to be optimized
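A minimal sketch of the quantity being maximised, assuming a discrete signal S and a response R = f(S) described by a joint probability table. The function name `mutual_information` and both example tables are illustrative assumptions, not taken from the lecture.

```python
import math

def mutual_information(joint):
    """Mutual information I(S;R) in bits from a joint probability table.

    joint[i][j] is P(S = i, R = j); all entries together must sum to 1.
    """
    p_s = [sum(row) for row in joint]          # marginal P(S)
    p_r = [sum(col) for col in zip(*joint)]    # marginal P(R)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_s[i] * p_r[j]))
    return mi

# Hypothetical example: two equally likely signals.
informative = [[0.5, 0.0], [0.0, 0.5]]        # response identifies the signal
uninformative = [[0.25, 0.25], [0.25, 0.25]]  # response independent of signal

mi_informative = mutual_information(informative)      # 1 bit
mi_uninformative = mutual_information(uninformative)  # 0 bits
```

In this toy case, a response that tracks the signal perfectly carries 1 bit, while a response that ignores the signal carries 0 bits.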
What are the efficient coding (EC) parts of Whitening and ICA?
Whitening
EC = decorrelating pixels/data
ICA
EC = demixing to recover independent components
What model does ICA build upon?
ICA builds on the whitening model
What are the steps in the process of whitening? (2 steps)
Whitening:
- Fit a Gaussian distribution (captures the correlations between neighbouring pixels)
- Decorrelate the pixels (EC part)
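The two whitening steps can be sketched for just two neighbouring "pixels". This is a minimal sketch assuming PCA whitening (rotate onto the principal axes, then rescale each axis to unit variance); the helper names and the toy data are hypothetical.

```python
import math
import random

def covariance_2d(xs, ys):
    """Return (var_x, cov_xy, var_y) of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mx) ** 2 for x in xs) / n
    var_y = sum((y - my) ** 2 for y in ys) / n
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return var_x, cov_xy, var_y

def whiten_2d(xs, ys):
    """PCA-whiten two variables: rotate onto principal axes, then rescale."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a, b, c = covariance_2d(xs, ys)  # step 1: second-order (Gaussian) statistics
    # Rotation angle that diagonalises the covariance matrix [[a, b], [b, c]].
    theta = 0.5 * math.atan2(2 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Variances along the two principal axes (the eigenvalues).
    lam1 = a * cos_t**2 + 2 * b * sin_t * cos_t + c * sin_t**2
    lam2 = a * sin_t**2 - 2 * b * sin_t * cos_t + c * cos_t**2
    out_x, out_y = [], []
    for x, y in zip(xs, ys):  # step 2: decorrelate and normalise
        dx, dy = x - mx, y - my
        out_x.append((cos_t * dx + sin_t * dy) / math.sqrt(lam1))
        out_y.append((-sin_t * dx + cos_t * dy) / math.sqrt(lam2))
    return out_x, out_y

# Toy data: two pixels that share a common component, so they are correlated.
rng = random.Random(0)
shared = [rng.gauss(0, 1) for _ in range(2000)]
pixel_a = [s + 0.3 * rng.gauss(0, 1) for s in shared]
pixel_b = [s + 0.3 * rng.gauss(0, 1) for s in shared]

_, cov_before, _ = covariance_2d(pixel_a, pixel_b)
white_a, white_b = whiten_2d(pixel_a, pixel_b)
var_a, cov_after, var_b = covariance_2d(white_a, white_b)
```

After whitening, the covariance between the two outputs is zero up to rounding error and each output has unit variance, which is the decorrelation the card calls the EC part.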
What are the steps in the process of ICA? (brief)
- fit a more complex model (built from non-Gaussian components)
- mix the components (the mixture looks more Gaussian)
- demix to recover the independent components (EC part) -> non-Gaussian again
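The "mixing makes it Gauss" step can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sources are Laplace-distributed (one common heavy-tailed choice), the mixture is a plain sum, and excess kurtosis (0 for a Gaussian) is used as the measure of non-Gaussianity. It does not perform the demixing itself.

```python
import random

def excess_kurtosis(samples):
    """Fourth standardised moment minus 3, so a Gaussian scores about 0."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    fourth = sum((x - mean) ** 4 for x in samples) / n
    return fourth / var**2 - 3.0

rng = random.Random(2)

def laplace_sample():
    # The difference of two exponential draws follows a Laplace distribution.
    return rng.expovariate(1.0) - rng.expovariate(1.0)

source_1 = [laplace_sample() for _ in range(50000)]
source_2 = [laplace_sample() for _ in range(50000)]
mixture = [a + b for a, b in zip(source_1, source_2)]  # hypothetical mixing

k_source = excess_kurtosis(source_1)  # theoretical value for Laplace: 3
k_mixture = excess_kurtosis(mixture)  # theoretical value for this sum: 1.5
```

The mixture has lower excess kurtosis than either source, i.e. it sits closer to a Gaussian. Demixing works in the opposite direction, looking for projections that are as non-Gaussian as possible.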
What is the problem with local codes? (assign one neuron to a concept)
What is the problem with dense codes? (assign one concept to many neurons)
What is the solution to these two problems ^?
- local codes: if that one neuron dies, is the concept forgotten? In reality it is not, e.g. grandma
- dense codes: very robust, but would cost a lot of energy
- use sparse, distributed codes
What are the benefits of sparse, distributed codes?
maximise memory storage while also saving energy
What is kurtosis?
What does kurtosis describe?
a statistical term which describes the shape of the probability distribution curve
it describes the ‘tailedness’, the presence of outliers and the shape of the peak
What do probability distributions with positive kurtosis look like?
sharp peak
heavier tails/more outliers
What is positive kurtosis aka?
super-Gaussian
leptokurtic
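A minimal sketch of how kurtosis can be estimated from samples, assuming the common "excess kurtosis" convention in which a Gaussian scores 0 and positive values mean super-Gaussian. The Laplace distribution is used here as an example of a sharply peaked, heavy-tailed distribution; the variable names are illustrative.

```python
import random

def excess_kurtosis(samples):
    """Fourth standardised moment minus 3 (Gaussian is about 0)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    fourth = sum((x - mean) ** 4 for x in samples) / n
    return fourth / var**2 - 3.0

rng = random.Random(1)
gaussian = [rng.gauss(0, 1) for _ in range(50000)]
# Laplace samples, built as the difference of two exponential draws.
laplace = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(50000)]

k_gauss = excess_kurtosis(gaussian)   # close to 0
k_laplace = excess_kurtosis(laplace)  # clearly positive (theoretical value 3)
```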
What is the equation for the encoding model?
What does each component mean?
r = Ws
r=neural responses
W=weighted receptive fields
s=natural image pixels
What is the decoding model?
s = W^(-1) r
W^(-1) = the inverse of W (W to the power of minus 1)
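A small worked example of encoding and decoding, assuming just two pixels and two neurons so that W is a 2x2 matrix. The filter values in `W` and the helper names are hypothetical; the point is only that decoding with the inverse of W recovers the original pixels.

```python
def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; fails if the matrix is singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("W is singular, so the responses cannot be decoded")
    return [[d / det, -b / det], [-c / det, a / det]]

W = [[1.0, -1.0], [1.0, 1.0]]      # hypothetical filters: difference and sum
s = [0.8, 0.6]                     # two pixel intensities
r = matvec(W, s)                   # encoding: r = W s
s_hat = matvec(inverse_2x2(W), r)  # decoding: s = W^(-1) r
```

Decoding only works when W is invertible, which is why the sketch raises an error for a singular matrix.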
What is the equation for the sparse coding model?
What does each term mean?
E = -[preserve information] - λ[sparseness]
preserve information = the error term
In the sparse coding model equation, what does the preserve information term mean mathematically?
preserve information = mean squared error
(this is the reconstruction error)
In the sparse coding model equation, what type of function represents the sparseness term?
sparseness = a function that penalizes NON-zero values
e.g. the absolute value f(x) = |x| (turns any negative values into positive values)
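The two terms can be combined in a short sketch. It is written as a cost to be minimised (mean squared reconstruction error plus λ times the summed absolute responses), so lower error and fewer non-zero responses both reduce it. That sign convention, the function name `sparse_coding_cost` and the toy basis are assumptions made for illustration.

```python
def sparse_coding_cost(image, basis, responses, lam):
    """Mean squared reconstruction error plus lam * sum of |response|.

    basis[k] is the k-th basis function, with the same length as image.
    """
    n_pixels = len(image)
    reconstruction = [
        sum(r * b[i] for r, b in zip(responses, basis))
        for i in range(n_pixels)
    ]
    mse = sum((x - y) ** 2 for x, y in zip(image, reconstruction)) / n_pixels
    sparseness_penalty = sum(abs(r) for r in responses)  # f(x) = |x|
    return mse + lam * sparseness_penalty

image = [1.0, 0.0]
basis = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # hypothetical overcomplete basis
sparse_code = [1.0, 0.0, 0.0]   # one active unit
dense_code = [0.5, -0.5, 1.0]   # three active units, same reconstruction

cost_sparse = sparse_coding_cost(image, basis, sparse_code, lam=0.1)
cost_dense = sparse_coding_cost(image, basis, dense_code, lam=0.1)
```

Both codes reconstruct the image perfectly, so only the sparseness term separates them, and the code with a single active unit has the lower cost.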
What do sparse filters look like?
What characteristics do they have?
like receptive fields in primary visual cortex V1
they are localised, orientation-specific and Gabor-like
What other type of filters do sparse filters look like?
like ICA filters
Which two filters look like V1 receptive fields?
sparse filters
ICA filters
What is the difference between the data/images used in ICA and in sparse coding?
ICA = mix of independent components with non-Gaussian stats
Sparse = has super-Gaussian response statistics
What does the sparse filter do?
maximise sparseness
Will the neural response properties of ICA be Gaussian?
NO! non-Gaussian (recover independent non-Gaussian components)
as it is the task of the brain/neural response to demix the signal
Why is it desirable to have super-Gaussian statistics for the neural response in sparse coding?
because super-Gaussian stats maximise information even when there are energy constraints in the nervous system
What is a criticism of the definition for sparse coding?
it is a bit vague: Are neurons’ responses SPARSE across population or time?
To criticise sparse coding, how is it overly simple?
the brain is more complicated than binary networks eg. brain has inhibitory and excitatory neurons
What is a criticism of sparse coding?
it focusses on memory storage as the limiting factor of the brain but maybe generalization is more important to focus on
How do neural responses react to natural sounds (birds, waves) and speech in humans?
neural responses are specifically adapted/tuned to natural sounds, but very similar tuning properties emerge for speech sounds!
Potentially, what does the strength of cortical magnification depend on?
What areas have high cortical magnification?
strength of cortical magnification depends on neural resource limits
Strong magnification: particular sensory region receives an exceptionally large number of neurons in cortex RELATIVE to its physical size or sensory receptor count
-fovea, fingertips
How can efficient coding be applied to dynamic problems in the human body?
problem: olfactory receptors can regenerate
efficient coding can be applied to this dynamic problem
How does the ECH run into problems when applying it to behaviour (in humans)?
EC maximises ALL information indiscriminately. However, this is not realistic, as some stimuli are more behaviourally relevant than others. For example, a human would react more to a tiger than to a flower.
Thus ECH is not representative of natural human behaviour
What is reverse efficient coding?
-calculate (presumed) stimulus statistics from neural responses
-this is instead of calculating the optimal neural responses from stimulus statistics
What are the assumptions needed for reverse efficient coding?
you can do reverse EC assuming that these stimuli are encoded efficiently
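One way the reverse direction could be sketched: assume (this is not stated on the cards) that an efficient code uses every response level equally often, so the response curve follows the cumulative distribution of the stimulus. Under that assumption, the slope of the response curve gives the presumed stimulus density. The logistic curve and all names below are hypothetical.

```python
import math

def response(stimulus):
    """Hypothetical sigmoidal response curve, normalised to the range 0..1."""
    return 1.0 / (1.0 + math.exp(-stimulus))

def presumed_density(stimulus, step=1e-4):
    """Slope of the response curve, read as the presumed stimulus density."""
    return (response(stimulus + step) - response(stimulus - step)) / (2 * step)

grid = [x / 10 for x in range(-80, 81)]  # stimulus values from -8 to 8
density = [presumed_density(x) for x in grid]

most_likely = grid[density.index(max(density))]  # where the curve is steepest
total_mass = sum(density) * 0.1                  # Riemann sum, close to 1
```

Under this reading, the steepest part of the curve marks the stimulus values presumed to be most common, and the flat ends mark rare ones.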
What does the curve look like for reverse efficient coding when the stimulus is on the x-axis and the neural response is on the y-axis? What does it show?
sigmoidal - as stimuli increase then