Bashivan Flashcards
What does the classical neuroscience approach allow us to do?
It allows us to CLASSIFY neurons according to:
1. Morphology → what do the dendritic trees look like?
2. Function → more distinct functions deeper in the visual pathway
3. Firing pattern → bursts, frequency
Etc.
*Every knob is a feature (many many knobs)
What is one of the important uses of functional classifications of neurons?
(Classic approach)
Entorhinal cortex → Grid cells vs Border cells
Hippocampus:
- Place cells (being in a single spot in the room)
- Object-vector cells (respond to a single object in that room)
- Splitter cells (respond to being in a specific location but only when about to turn left)
What are advantages and disadvantages of the classic approach?
Advantages:
- Describes what individual neurons contribute to the computation
Disadvantages:
- Typically function is considered in specific setting → would have to consider ALL possible settings
- Small populations of neurons are considered → limited amount of cells you can record from
- Circuits and mechanisms have to be deduced based on intuition from a very limited amount of cell information
What is the difference between deep learning and machine learning?
Machine learning:
Input → Feature extraction by a domain expert → Classification → Output
*Domain expert defines key features on top of which classification fits
Deep learning → Feature extraction + Classification → Output
*The network itself establishes the features and classifications directly from the data (in the learning process)
What are the “4 knobs” of deep learning?
(4 components on which the design is focused)
- Architecture
- Learning objective (cost function)
- Learning rule
- Dataset
What are the different types of architecture a neural network can take?
1-2 are best for image inputs
1. Multilayer Perceptrons:
Every circle/neuron of 1 layer is connected to every neuron of both adjacent layers (not connected to others in the same layer)
- min 3 layers: Input → Hidden → Output
- Large number of weight parameters need to be trained
2. Convolutional Neural Network
1 nose detector goes through the whole picture (convolution), no need for different nose detectors for different areas (check for specific patterns over all the image → extract features)
3. Recurrent Neural Network
- Accepts input from outside + generates its own input
- Sequential output fed back into the network
4. Transformers
- Used for ChatGPT, etc.
*Parameters = connections between neurons
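The MLP connectivity above can be sketched in a few lines; this is an illustrative toy (2 inputs, 3 hidden units, 1 output, made-up weights), not any particular network from the lecture. Every hidden unit takes a weighted sum of ALL inputs, which is why the parameter count blows up for image-sized inputs:

```python
import math

def mlp_forward(x, w_hidden, w_out):
    """Forward pass of a minimal 3-layer MLP: input -> hidden -> output.

    Each hidden unit receives a weighted sum of every input (full
    connectivity between adjacent layers), then a sigmoid nonlinearity.
    """
    h = [1 / (1 + math.exp(-sum(wi * xi for wi, xi in zip(w_row, x))))
         for w_row in w_hidden]
    # Output layer: weighted sum of the hidden activations
    return [sum(wi * hi for wi, hi in zip(w_row, h)) for w_row in w_out]

# Toy example: 2 inputs, 3 hidden units, 1 output -> 2*3 + 3*1 = 9 weights
x = [0.5, -1.0]
w_hidden = [[0.1, 0.4], [-0.3, 0.2], [0.7, -0.6]]   # 3 x 2 weight matrix
w_out = [[0.5, -0.2, 0.9]]                          # 1 x 3 weight matrix
print(mlp_forward(x, w_hidden, w_out))
```

Even this toy net needs 9 weights; a CNN avoids that growth by sliding one small filter (the "nose detector") over every image position instead of learning separate weights per location.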
What are the different cost functions of neural networks in deep learning?
*These are ways to learn/change parameters to improve the output
1. Unsupervised objective functions
- NO teacher
- For cross-modal consistency (read → write, Hear → talk) ~ generative consistency
- For future predictions (transformers are trained to do this) → predict an image of a car moving in 2 secs, predict the next word in a sentence
- May fail to discover properties of the world that are statistically weak but important for survival (need supervised learning for that)
2. Supervised objective functions
- Teacher that tells if right or wrong output
→ Object recognition, object detection, source localization (of sound)
- The network will change its parameters based on the feedback from the teacher to maximize the odds of outputting the right answer next time
3. Reward-based objective learning
- Agent → action → Environment → reward/state → Agent …
- 2 interactions (with environment and with reward)
- Agent tries to maximize the reward
*What cost functions does brain optimize?
*What do cost functions look like in the brain?
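A minimal sketch of the supervised case: the "teacher" supplies the right answer, and the cost is just how far the network's output is from it (mean squared error here; the numbers are invented). Lowering this cost is what "changing parameters based on feedback" means:

```python
def mse_cost(predictions, targets):
    """Supervised cost: average squared gap between the network's output
    and the teacher-provided correct answer. Lower cost = better parameters."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Teacher says the right answer is 1.0 ("object present")
early = mse_cost([0.2], [1.0])   # large error early in training
late = mse_cost([0.9], [1.0])    # small error after learning
print(early, late)
```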
What are 3 ways of representing costs of neural networks in the brain?
- Genes → each neuron's genes encode what it needs to do
- Cost-encoding Neural net (smaller networks):
- Output layer explicitly computes the error (tries to satisfy this)
- Task-performing Neural net (larger networks):
- Implicit encoding of cost
- Cost embedded in task performance (ex: vision, decision-making)
What do we know about learning rules in the brain neural network?
*For the synapses that are potentiated
1. Changes in size of synaptic connections
2. Perforations caused by LTP
3. Multiple spine boutons (1 pre for multiple post)
To relate to neural networks:
Before training → all nodes have the same weight (all connected equally, parameters)
Training modulates the weight of different parameters/connections
W_trained = W_initial + ΔW
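The W_trained = W_initial + ΔW relation can be sketched as a single training step; plain gradient descent and the learning rate used here are illustrative stand-ins for whatever learning rule actually supplies ΔW:

```python
def update_weights(w_initial, gradients, learning_rate=0.1):
    """One training step: W_trained = W_initial + dW, where the learning
    rule (here, gradient descent) sets dW = -learning_rate * gradient of
    the cost with respect to each connection weight."""
    delta_w = [-learning_rate * g for g in gradients]
    return [w + dw for w, dw in zip(w_initial, delta_w)]

# Before training all connections have the same weight; the feedback
# (gradients) then modulates each one differently.
w_initial = [0.5, 0.5, 0.5]
gradients = [2.0, -1.0, 0.0]      # cost gradient per connection (invented)
w_trained = update_weights(w_initial, gradients)
print(w_trained)                   # weights are no longer uniform
```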
What are types of datasets?
- Images → Image Net (millions)
- Video games (tens)
- Hundreds of billions pages in text → Common Crawl (to train language models)
What are the main takeaways of the DL approach that differ from the classical approach?
- No unit typing → units have ubiquitous functionality
- Emergent properties → unit’s functional diversity emerges through learning (no need to specify every circuit by hand)
- Distributed processing → groups of units are orchestrated to facilitate internalized or externally-imposed objectives
- Behaviour is not focused on single unit, focused on global network function and performance
Which future questions could the DL framework allow us to resolve?
- Investigate relations between neurons across regions
- How representations come to be, how neural networks give rise to them
- How do we recognize faces? (can't test experimentally)
- Teleological explanation for the existence of representations
- Explain why things exist in the brain → behaviours, anatomical features, evolutionary pressures
Why is object recognition such a challenge for the brain?
*Difficult computational problem
1. Have to consider the infinite number of ways an object can be presented to us and be able to recognize it every time → different sizes, angles, colors, backgrounds, etc.
- Extrapolate that problem from 1 object → to the thousands of objects we can identify
*Ventral visual pathway solves this problem
How long does it take for the brain to discriminate different objects?
Information gets to IT cortex ~ 100ms → this area discriminates different objects
~ 40ms → LGN
~ 50ms → V1
+ ~10ms for each cortical area higher up:
V1 → V2 → V4 → PIT → CIT → AIT
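The ~100ms figure for IT follows from the per-area delays above; a quick check of the arithmetic (the flat ~10ms increment per area is the approximation stated in the notes):

```python
# Approximate arrival times along the ventral stream (ms), per the notes:
# LGN ~40 ms, V1 ~50 ms, then roughly +10 ms per cortical area.
latency = {"LGN": 40}
areas = ["V1", "V2", "V4", "PIT", "CIT", "AIT"]
for i, area in enumerate(areas):
    latency[area] = 50 + 10 * i
print(latency)  # AIT (part of IT cortex) comes out at ~100 ms
```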
What are the first cells in the visual system to communicate via Action Potentials?
Retinal ganglion cells
~1.5 million in monkeys
Photoreceptors and Bipolar cells do not send APs
What are pinwheels in V1?
They are points in the orientation map around which, if you move in a small circle, you encounter selectivity for all orientations
What different regions are found in IT?
- Big region specific for faces
- Other region selective for body parts (EBA = extrastriate body area)
- Region selective for scenes (external or internal landmarks) → responses are anti-correlated with face selective regions responses
What is population coding?
It is a way of looking at groups of neurons and their activity instead of single neurons
- Imagine an N-dimensional space → recording from N neurons simultaneously
- Each point/vector = 1 stimulus placed according to how much response it induces in each of the N neurons
- This N-dimensional space contains 1 line (manifold) for each object, with the infinite points on this line corresponding to all possible representations of that object
The ventral stream transforms these lines from being very curved → more linear as you go higher up in cortical areas
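A minimal sketch of the population-coding idea: each stimulus becomes one point in N-dimensional space, with one coordinate per simultaneously recorded neuron. The firing rates below are invented for illustration (N = 4):

```python
# Each stimulus -> a vector of responses from N simultaneously recorded
# neurons, i.e. one point in N-dimensional space.
responses = {
    "face_frontal": [12.0, 3.0, 8.0, 1.0],   # firing rates (Hz), invented
    "face_profile": [10.0, 4.0, 7.5, 1.5],   # same object, different view
    "car":          [2.0, 11.0, 1.0, 9.0],
}

def distance(a, b):
    """Euclidean distance between two population response vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Two views of the same face sit closer together in population space than
# either does to the car: the geometry of the population, not any single
# neuron, carries object identity.
same = distance(responses["face_frontal"], responses["face_profile"])
diff = distance(responses["face_frontal"], responses["car"])
print(same < diff)
```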
For what type of information do you have to look at groups of neurons to learn about? (not found in single neuron’s activity)
3D scale, Z-axis rotation, height, width, perimeter
*Responses to these properties increase as we go up the ventral stream
→ They are category-orthogonal object properties (don’t fit in neuron categories)