Deep Learning - Dr Bashivan Flashcards
why should you not include every detail in a neural model?
- more difficult to interpret
- lower feasibility
- more difficult optimization
what is the current practical sweet spot for amount of detail integration?
deep neural nets
what are the upside and downside of verbal explanations?
- easy to communicate!
- has a narrow bandwidth :/
what are the upside and downside of quantitative explanations (code)?
- easily transferable, easy to communicate, can answer questions without costly experiments
- requires coding literacy
what is the classic approach for studying neuroscience?
identify and characterize individual elements in the brain (bottom-up approach)
what is the difference between machine learning and deep learning?
machine learning: first figure out a template/feature of what you are looking for, then classify
deep learning: feature extraction and classification are learned at the same time
why is the classic approach for studying the brain not so efficient?
only considers one or a few tasks at a time, and only a few neurons
give an example of the classic approach
surround modulation and two-interval discrimination
what components is the deep learning framework based on?
- architecture
- learning objective (cost functions)
- learning rule
- dataset (secondary axis)
what are the 3 principles of the holistic deep learning approach?
- units have ubiquitous functionality
- units’ function diversity comes from autonomous learning
- groups of units are orchestrated to facilitate internalized or external objectives
name 2 static architecture models
- multilayer perceptrons
- convolutional neural network
what is a multilayer perceptron?
each unit in a layer is connected to all the units in the previous and following layers
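A minimal sketch of the full connectivity described in the card above, in plain Python. The layer sizes, the hand-picked weights, and the ReLU nonlinearity are illustrative assumptions and are not taken from the course.

```python
def dense_layer(inputs, weights, biases):
    """Fully connected layer: every output unit reads every input unit."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU nonlinearity (assumed)
    return outputs


# Hypothetical 3 -> 2 -> 1 network; weights are hand-picked, not learned.
hidden_w = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]  # one row per hidden unit
hidden_b = [0.0, -1.0]
output_w = [[1.0, 2.0]]                         # one row per output unit
output_b = [0.0]

x = [2.0, 1.0, 1.0]
hidden = dense_layer(x, hidden_w, hidden_b)
y = dense_layer(hidden, output_w, output_b)
print(hidden, y)
```

Each row of a weight list has one entry per unit in the previous layer, which is what "connected to all the units" means in practice.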
what is a convolutional neural network?
units are locally connected to subgroups of units
name the 2 dynamic architecture models
- recurrent neural network
- transformers
what is a recurrent neural network?
internal memory gets updated based on observations
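A minimal sketch of the memory update described in the card above, using a single recurrent unit. The tanh nonlinearity, the scalar weights, and the function names are assumptions chosen for illustration.

```python
import math


def rnn_step(hidden, observation, w_hidden, w_input):
    """New memory depends on the old memory and the current observation."""
    return math.tanh(w_hidden * hidden + w_input * observation)


def run_rnn(observations, w_hidden=0.5, w_input=1.0):
    hidden = 0.0  # memory starts empty
    history = []
    for obs in observations:
        hidden = rnn_step(hidden, obs, w_hidden, w_input)
        history.append(hidden)
    return history


# One input pulse followed by silence: the memory decays but does not vanish at once.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

The state stays above zero for the steps after the input has gone, because each step feeds the previous state back in.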
what are the 3 types of cost-function strategies?
unsupervised, supervised, reward-based
what are unsupervised objective (cost) functions?
- learn from observations, model reproduces what it sees: prediction errors, continuity, sparsity
- has generative consistency: wake-sleep algorithm, generative neural networks
give an example (analogy) of unsupervised objective functions
finishing someone’s sentence
what is a downside of unsupervised learning algorithms?
it may fail to discover properties of the world that are statistically weak but important for survival
how can we solve the problem of unsupervised objective functions not discovering essential properties?
supervised objective functions
give examples of supervised objective functions
object recognition
object detection
source localization
what are reward-based cost functions?
agents try to maximize reward
how are costs encoded in the brain vs in a neural net?
brain: genes
neural net:
- cost-encoding neural net (small)
- task-performing neural net (large)
what are the 3 learning rules you can use?
following a gradient, not following a gradient, partially following a gradient
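A minimal sketch of the first option, "following a gradient", as plain gradient descent on a one-weight model. The squared-error cost, the toy data, the learning rate, and the function names are assumptions for illustration only.

```python
def cost(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient(w, xs, ys):
    """Derivative of the cost with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.1, steps=50, w=0.0):
    for _ in range(steps):
        w -= lr * gradient(w, xs, ys)  # step against the gradient
    return w


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated by y = 2x
w = gradient_descent(xs, ys)
print(w, cost(w, xs, ys))
```

A rule that does not follow a gradient would replace the update line with something else, for example trying random changes to `w` and keeping those that lower the cost.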
why do we think prefrontal cortex neurons continue to fire during the delay period despite no stimulus?
to keep the novel information in mind
what did they find after making monkeys perform an oculomotor delayed response task?
during the delay period, neurons fire or are inhibited selectively to the cue location
what fraction of PFC neurons shows excitation or inhibition during the delay?
1/3
name the 4 toolkits for model testing
- behavioral agreement
- agreement with neural data
- in silico electrophysiology
- developmental agreement
what is the behavioral agreement toolkit?
quantifying the behavioral similarity between our network and an animal doing the same task
what is the agreement with neural data toolkit?
comparing how the model and the animal solve the task
what are 2 different ways of testing agreement with neural data?
- representational similarity analysis: comparing patterns of responses using a dissimilarity matrix
- encoding model: compare a neuron with a model unit
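A minimal sketch of representational similarity analysis in plain Python. It assumes "1 minus Pearson correlation" as the dissimilarity measure and compares the two matrices by correlating their upper triangles. The response values are made up for illustration and are not real recordings.

```python
def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def rdm(responses):
    """responses[i] is the response pattern to stimulus i; entry (i, j) is 1 - r."""
    n = len(responses)
    return [[1.0 - pearson(responses[i], responses[j]) for j in range(n)]
            for i in range(n)]


def upper_triangle(matrix):
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rsa_score(responses_a, responses_b):
    """Correlate the two dissimilarity matrices (off-diagonal entries only)."""
    return pearson(upper_triangle(rdm(responses_a)),
                   upper_triangle(rdm(responses_b)))


# 4 stimuli; 3 hypothetical neurons vs 4 hypothetical model units.
neural = [[1.0, 2.0, 3.0], [1.1, 2.1, 2.9], [3.0, 1.0, 2.0], [2.9, 1.2, 2.1]]
model = [[0.1, 0.5, 0.9, 0.4], [0.2, 0.5, 0.8, 0.4],
         [0.9, 0.1, 0.3, 0.7], [0.8, 0.2, 0.3, 0.6]]
score = rsa_score(neural, model)
print(score)
```

The two systems can have different numbers of units, because only the stimulus-by-stimulus matrices are compared.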
what are the 3 types of in silico electrophysiology?
- lesion studies
- decoding (find how a characteristic of the task is encoded)
- selectivity profile
what is the developmental agreement toolkit?
performing previous analyses at different stages of learning
explain the artificial neuron model
each dendrite works as an input channel
-> the soma weights each input and assigns a value to each -> if the weighted sum reaches threshold, the neuron starts spiking
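A minimal sketch of the threshold unit described in the card above. The weights and threshold are hand-picked assumptions that make the unit behave like a logical AND.

```python
def artificial_neuron(inputs, weights, threshold):
    """Return 1 (spike) if the weighted sum of inputs reaches the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# Hypothetical settings: two input channels, equal weights, threshold of 2.
weights = [1.0, 1.0]
threshold = 2.0
outputs = [artificial_neuron([a, b], weights, threshold)
           for a in (0, 1) for b in (0, 1)]
print(outputs)
```

With these settings the unit spikes only when both inputs are active.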
what was the first neural net?
multi-layer perceptron: extended artificial neuron model into interconnected layers of neurons
what was the limitation of the multi-layer perceptron?
it only considers a limited part of the visual field (units are only connected to the units around the center of the previous layer)
convolutional neural networks allow…
patterns to be distinguished regardless of their spatial location
what is a convolutional neural network?
the network learns a kernel (convolution filter) for a specific feature, applies it across the input, and recognizes that feature wherever it appears
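A minimal sketch of why sliding one kernel over the input makes detection independent of location, using a 1D signal in plain Python. In a real convolutional network the kernel is learned; here it is hand-set, and the signals are made up for illustration.

```python
def slide_kernel(signal, kernel):
    """Dot product of the kernel with every window of the signal."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]


kernel = [1.0, 2.0, 1.0]                  # hand-set "bump" detector (assumed)
left = [0, 1, 2, 1, 0, 0, 0, 0]           # bump near the start
right = [0, 0, 0, 0, 1, 2, 1, 0]          # same bump, shifted by 3 positions

resp_left = slide_kernel(left, kernel)
resp_right = slide_kernel(right, kernel)
print(resp_left, resp_right)
```

The peak response has the same strength in both cases; only its position moves with the pattern.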
what is alexnet?
convolutional neural network with 9 layers of convolution, pooling, nonlinearity, normalization
how was alexnet trained?
supervised training with imagenet dataset
what did they find in the first layer of alexnet?
patterns useful for image recognition, similar to those found in V1
what did they use the representational dissimilarity matrix for?
to compare the responses of V4 neurons, IT neurons, and the last CNN layer to 8 different categories of objects
the 3rd and 4th layers of alexnet CNN corresponded to what macaque brain areas?
3rd layer = V4
4th layer = IT
name 3 ways alexnet was trained with unsupervised learning
- deep cluster: groups its inputs in clusters
- instance discrimination: discriminates between pairs of observations from memory
- contrastive learning: learns to respond similarly to different variations of the same observation
do neural networks have a spatial map?
no, selectivity of the units is completely random. no topography
the size of the weight between 2 units is scaled by what?
the physical distance between those 2 units
as you go higher in the visual pathway hierarchy, what happens to topography?
increased topography organized by categories (more organization)
what are the 2 proposed topographic models (models for the learning objective)?
- wiring cost minimization (distance between neurons)
- spatial cost function (spreading of neurons)
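A minimal sketch of one way a wiring cost could be written: each connection is penalized by its weight magnitude times the physical distance between the two units. This formula, the 1D unit positions, and the function name are assumptions for illustration and are not taken from the course.

```python
def wiring_cost(weights, positions_in, positions_out):
    """Sum of |weight| * distance over all connections (assumed form)."""
    cost = 0.0
    for i, row in enumerate(weights):
        for j, w in enumerate(row):
            distance = abs(positions_out[i] - positions_in[j])
            cost += abs(w) * distance
    return cost


# Hypothetical layout: 3 input and 3 output units placed on a line.
positions_in = [0.0, 1.0, 2.0]
positions_out = [0.0, 1.0, 2.0]

local = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]       # nearby units connected
long_range = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]  # distant units connected

print(wiring_cost(local, positions_in, positions_out))
print(wiring_cost(long_range, positions_in, positions_out))
```

Adding a term like this to the learning objective favors strong weights between nearby units, since distant connections cost more for the same weight.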
what does the correlation between response similarity and cortical distance show?
spatial loss hypothesis encourages local correlations