Week 3 - Efficient Coding I Flashcards
What is R replaced by in the efficient coding hypothesis?
R = f(S)
where f is the set of tuning curves to be optimized
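A minimal sketch of what a tuning curve f could look like; the sigmoid shape, gain, and threshold are illustrative assumptions, not the course's specific choice:

```python
import numpy as np

# Toy tuning curve f mapping a stimulus s to a response r = f(s).
# The sigmoid form and its parameters are illustrative assumptions.
def f(s, gain=4.0, threshold=0.5):
    return 1.0 / (1.0 + np.exp(-gain * (s - threshold)))

s = np.linspace(0.0, 1.0, 5)   # a range of stimulus values
r = f(s)                       # the corresponding neural responses
```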
What are the two parts of the efficient coding hypothesis?
- a group of neurons should ENcode as much information as possible
- a group of neurons should remove as much redundancy as possible
In the equation R = f(S),
What is f?
How does f arise?
What SHOULD f be?
- f, as measured, is a descriptive account of the tuning curves
- f arises from development/learning
- f should be explained by a normative approach: derived from an optimality principle (here, maximizing encoded information), not merely described
How is information theory involved in the efficient coding hypothesis? What is the equation?
mutual information is the quantity to be maximized: I(S;R)
where
I = mutual information
S = signal
R = neural responses: to be optimised
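A minimal sketch of computing I(S;R) for a discrete stimulus and response; the joint probability table is a made-up illustration:

```python
import numpy as np

def mutual_information(p_joint):
    """I(S;R) = sum over s,r of p(s,r) * log2(p(s,r) / (p(s)*p(r))), in bits."""
    p_s = p_joint.sum(axis=1, keepdims=True)  # marginal distribution of S
    p_r = p_joint.sum(axis=0, keepdims=True)  # marginal distribution of R
    mask = p_joint > 0                        # skip zero entries (0*log0 = 0)
    return np.sum(p_joint[mask] * np.log2(p_joint[mask] / (p_s * p_r)[mask]))

# A response perfectly matched to a binary stimulus carries 1 bit:
p = np.array([[0.5, 0.0],
              [0.0, 0.5]])
print(mutual_information(p))  # -> 1.0
```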
Why are natural images assigned a high probability in our own natural image space?
because natural images follow spatial and structural regularities that make them predictable
Why are non-natural images assigned a low probability in our own natural image space?
non-natural images contain lots of random noise and uncorrelated pixels, making them harder to predict and encode
What is the chance of generating a natural image from a generic image space (i.e. not the one in our brains)? Why?
a vanishingly small probability: the vast majority of possible images are just noise
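A tiny sketch of sampling from such a generic image space, assuming independent uniform pixels; essentially every sample looks like static:

```python
import numpy as np

# Sample one "image" uniformly from generic image space: each pixel is
# independent and uniform, so the result is static, not a natural scene.
img = np.random.default_rng(4).random((128, 128))
```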
Is it better to use higher-order statistics to generate a natural image from an image space?
best to use only up to second-order statistics: higher orders become computationally intractable, as the amount of data and processing required grows rapidly with the order
In a NATURAL image, are pixels independent of each other?
Which statistical process treats pixels as independent from each other?
- no, there are correlations between pixels
- first-order statistics treat pixels as independent
What evidence suggests that pixels are not independent of each other?
- if you delete a fraction of the pixels (e.g. 1%, or even 40%), you can still more or less see the picture, because the remaining pixels predict the missing ones -> thus pixels aren't independent
What do second-order statistics tell you about pixel correlation in natural images?
pixels are NOT independent of each other:
x and x+1 (adjacent pixels) -> strong positive correlation
x and x+2 -> positive correlation, but weaker
x and x+3 -> weaker still (see the sketch below)
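A minimal sketch of measuring this decay; smoothed noise stands in for a real photograph (loading an actual natural image as a 2-D array would show the same trend):

```python
import numpy as np

# Stand-in for a natural image: noise smoothed along rows creates the
# kind of local correlations that real photographs also have.
rng = np.random.default_rng(0)
noise = rng.standard_normal((256, 256))
kernel = np.ones(9) / 9.0
img = np.apply_along_axis(lambda row: np.convolve(row, kernel, mode="same"),
                          1, noise)

# Correlation between pixel x and pixel x+d decays as d grows:
for d in (1, 2, 3):
    a = img[:, :-d].ravel()
    b = img[:, d:].ravel()
    print(f"offset {d}: correlation {np.corrcoef(a, b)[0, 1]:.2f}")
```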
What is the caveat of using second-order statistics to generate natural images?
it is a GOOD method except at sharp borders between light and dark (high contrast), e.g. a dark tree stump against a white sky: edges carry higher-order structure that second-order statistics cannot capture
Are second- or first-order statistics better at generating natural images?
second order
How does the Fourier domain relate to natural image generation?
Fourier decomposition reveals a negative relationship between spatial frequency and power:
- high frequency = pixels changing in brightness very rapidly (as you move from pixel to pixel)
- x-axis = spatial frequency
- y-axis = power = how strongly that frequency occurs in natural images
- natural images are dominated by slow changes / low frequencies
This falling curve is called a POWER LAW, and it shows up in nature a lot.
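A hedged sketch of using this power law to synthesize an image: build a Fourier spectrum whose amplitude falls as 1/f (power ~ 1/f^2, a commonly cited approximation for natural images), give it random phases, and invert:

```python
import numpy as np

n = 256
fx = np.fft.fftfreq(n)[:, None]
fy = np.fft.fftfreq(n)[None, :]
f = np.sqrt(fx**2 + fy**2)          # spatial frequency of each component
f[0, 0] = 1.0                       # avoid division by zero at DC

rng = np.random.default_rng(1)
phases = np.exp(2j * np.pi * rng.random((n, n)))
spectrum = phases / f               # amplitude ~ 1/f, so power ~ 1/f^2
img = np.real(np.fft.ifft2(spectrum))
# img has the smooth, cloud-like structure typical of 1/f noise
```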
Which probability distribution is used to help generate natural images? How does the distribution do this?
Gaussian/Normal
by capturing the correlations between pixels in natural images (in its covariance matrix)
What is the simplified equation for the Gaussian image model?
p(x) = N(x | μ, Σ)
where
N = normal distribution
x = vector of pixel intensities
μ = vector of mean pixel intensities
Σ = covariance matrix describing correlations between pairs of pixels
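A minimal sketch of sampling from this model for a 1-D strip of pixels; the exponentially decaying covariance is an illustrative assumption standing in for measured natural-image statistics:

```python
import numpy as np

n_pixels = 64
mu = np.full(n_pixels, 0.5)  # mean intensity of each pixel

# Covariance: correlation between pixels falls off with their distance
# (the decay scale of 5 pixels is an arbitrary illustrative choice).
dist = np.abs(np.arange(n_pixels)[:, None] - np.arange(n_pixels)[None, :])
Sigma = 0.1 * np.exp(-dist / 5.0)

rng = np.random.default_rng(2)
x = rng.multivariate_normal(mu, Sigma)  # one sampled "image" strip
```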
What shape does a Gaussian image model produce in a graph of neighbouring pixel brightnesses, x1 (x-axis) vs x2 (y-axis)?
an ellipse tilted along the diagonal, showing the positive correlation between neighbouring pixels
How do we maximize information when generating natural images?
by assuming that natural stimuli come from this simple distribution (per the efficient coding hypothesis)
Using the efficient coding hypothesis, how do we maximize information from a LINEAR Gaussian image model?
for a normal distribution, the information measures simplify:
Entropy -> Variance
Redundancy -> Correlation
thus we must maximize variance & minimize correlation to satisfy the efficient coding hypothesis
In a linear Gaussian image model, when entropy is simplified to variance, is a fat or a skinny bell curve of a normal distribution better for efficient coding?
FAT:
high uncertainty, high information, high entropy -> high variance (thus maximizing information: better for the efficient coding hypothesis)
SKINNY:
low variance -> low entropy -> low information (worse for efficient coding)
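A short sketch making the fat-vs-skinny point concrete, using the standard formula for the differential entropy of a Gaussian (in bits):

```python
import numpy as np

# Differential entropy of N(mu, sigma^2): h = 0.5 * log2(2*pi*e*sigma^2).
# Entropy grows with variance, so the fat curve carries more information.
def gaussian_entropy_bits(sigma):
    return 0.5 * np.log2(2 * np.pi * np.e * sigma**2)

print(gaussian_entropy_bits(2.0))  # fat curve:    ~3.05 bits
print(gaussian_entropy_bits(0.5))  # skinny curve: ~1.05 bits
```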
Why do we want to minimise the correlation between neurons in a group of neurons?
because redundancy maps onto correlation in the Gaussian model: minimizing correlation removes redundancy, which is what efficient coding requires
Overall, what would be the most efficient strategy to maximize information in a LINEAR Gaussian image model? why?
Take correlated pixel inputs, and transform these inputs into DEcorrelated neural activity
because decorrelation removes redundancy, which is exactly what efficient coding demands
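A hedged sketch of that strategy using PCA whitening as the decorrelating transform (one standard choice; the course may have a different transform in mind):

```python
import numpy as np

# Correlated "pixel" inputs: two channels mixed together.
rng = np.random.default_rng(3)
z = rng.standard_normal((2, 10_000))
mix = np.array([[1.0, 0.8],
                [0.8, 1.0]])
x = mix @ z                             # correlated inputs

# PCA whitening: rotate into the eigenbasis of the covariance and
# rescale each axis to unit variance.
eigvals, eigvecs = np.linalg.eigh(np.cov(x))
W = np.diag(eigvals**-0.5) @ eigvecs.T  # whitening transform
r = W @ x                               # decorrelated "neural" responses

print(np.round(np.cov(r), 2))           # ~identity: redundancy removed
```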