Neural network models Flashcards
What is a feedforward neural network?
data flows in one direction from input to output through hidden layers made up of units/neurons, each connected to every unit in the previous layer via weighted connections
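A minimal sketch of one forward pass in NumPy; the layer sizes and the choice of sigmoid are illustrative assumptions, not from the cards:

```python
import numpy as np

# Minimal sketch of one forward pass: input -> hidden -> output.
rng = np.random.default_rng(0)
x = rng.normal(size=3)            # 3 input features
W1 = rng.normal(size=(4, 3))      # hidden layer: 4 units, each fully connected to the input
W2 = rng.normal(size=(1, 4))      # output layer: 1 unit connected to every hidden unit

sigmoid = lambda z: 1 / (1 + np.exp(-z))
hidden = sigmoid(W1 @ x)          # weighted sums, then non-linearity
output = sigmoid(W2 @ hidden)     # data only ever flows forward
print(output)
```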
What is a perceptron?
single-layer feedforward neural network: takes multiple inputs, computes a weighted sum (multiply each input by its weight), produces a single output
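A toy perceptron sketch; the weights and bias here are hand-picked to implement logical AND, purely for illustration:

```python
import numpy as np

# Toy perceptron: weighted sum of inputs, then a step threshold.
def perceptron(inputs, weights, bias):
    weighted_sum = np.dot(inputs, weights) + bias
    return 1 if weighted_sum > 0 else 0   # single output

weights = np.array([1.0, 1.0])
bias = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(np.array(x), weights, bias))  # fires only for (1, 1)
```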
What is gradient descent?
optimisation algorithm to minimise error/loss; adjusts weights and other parameters in the direction that leads to the greatest decrease in the error, like descending a hill by taking small steps in the steepest direction
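A minimal sketch on a made-up 1-D loss, (w - 3)^2, whose minimum we know is at w = 3; start point and learning rate are arbitrary:

```python
# Gradient descent on a 1-D "hill": loss(w) = (w - 3)**2, gradient = 2*(w - 3).
w = 0.0
learning_rate = 0.1
for step in range(50):
    gradient = 2 * (w - 3)         # slope of the loss at the current w
    w -= learning_rate * gradient  # small step in the steepest-descent direction
print(w)  # converges towards 3, the minimum of the loss
```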
Explain parametric versus non-parametric models
parametric models are a fixed size: fixed number of inputs and outputs, learn a set of parameters to map inputs onto outputs. non-parametric models grow as the data grows, encoding values like a table or big list
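A rough illustration of the contrast, assuming a linear fit as the parametric model and 1-nearest-neighbour lookup as the non-parametric one:

```python
import numpy as np

# Parametric: a linear model keeps a fixed set of parameters (w, b) whatever the data size.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])
w, b = np.polyfit(X, y, 1)          # two numbers, no matter how many samples

# Non-parametric: 1-nearest-neighbour stores the whole table, which grows with the data.
def nearest_neighbour(x_new):
    return y[np.argmin(np.abs(X - x_new))]

print(w * 2.5 + b, nearest_neighbour(2.5))
```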
Explain linear versus non-linear problems/networks
linear problems can be solved with a straight line, non-linear problems need a curve. a linear activation function's output is directly proportional to its input, but a non-linear one's doesn't have to be. E.g. a step function is like a light switch but a sigmoid is like a volume knob
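A quick sketch of the light switch vs volume knob contrast (the sample points are arbitrary):

```python
import numpy as np

# Step function: hard on/off, like a light switch.
def step(z):
    return np.where(z > 0, 1.0, 0.0)

# Sigmoid: smooth, graded output, like a volume knob.
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.linspace(-5, 5, 5)
print(step(z))     # [0. 0. 0. 1. 1.]  (taking step(0) = 0 here)
print(sigmoid(z))  # values rise smoothly from ~0 to ~1
```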
What is translation invariance?
ability to recognise objects regardless of spatial location, scale or orientation, achieved in the visual system and CNNs by hierarchical processing, spatial pooling and feature detection
Why do regular neural networks lack translation invariance? Why are CNNs better?
FNNs are fully connected and lack weight sharing - they learn patterns independently at each spatial location. CNNs have local connectivity and weight sharing - convolutional layers where each neuron is connected to a local region of the input and the same filters are applied across different locations
What is a CNN?
type of FNN specialised for processing visual data/images, distinguished by the addition of convolutional hidden layers - local filters convolve across the image matrix to produce a feature map, and each layer builds on the features of the last to extract more complex features in a hierarchical structure
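A naive convolution sketch (strictly cross-correlation, as in most CNN libraries); the toy image and edge filter are made up:

```python
import numpy as np

# Slide a local filter across the image, recording one response per position.
def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            feature_map[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return feature_map

# Illustrative vertical-edge filter on a toy 5x5 image.
image = np.zeros((5, 5)); image[:, 2:] = 1.0
kernel = np.array([[1, -1], [1, -1]])
print(convolve2d(image, kernel))  # strongest responses mark the edge location
```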
Explain dimensionality
number of features in the input data, number of neurons in each layer. more dimensionality = more complex representations but also more computational cost. activation functions can increase it by introducing non-linearity, pooling reduces it
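A minimal max-pooling sketch showing the dimensionality reduction (the 4x4 feature map is arbitrary):

```python
import numpy as np

# 2x2 max pooling: keep the strongest response in each local patch,
# halving each spatial dimension.
def max_pool_2x2(feature_map):
    h, w = feature_map.shape
    return feature_map[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.arange(16.0).reshape(4, 4)
print(max_pool_2x2(fm))   # 2x2 output: [[ 5.,  7.], [13., 15.]]
```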
Explain adversarial networks
networks trained to find images that CNNs classify incorrectly; incrementally adjusts an image until it maximally resembles an image of a different class, but without losing its original class label
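A rough sketch of the incremental-adjustment idea on a toy linear classifier; the weights, input and step size are all made up, and real attacks use the network's actual gradients:

```python
import numpy as np

# Nudge the "image" a little at a time in the direction that most lowers
# the score of its correct class.
rng = np.random.default_rng(1)
w = rng.normal(size=10)        # toy classifier weights: score = w . x
x = rng.normal(size=10)        # "image" the classifier currently scores for one class
step = 0.05
for _ in range(100):
    x -= step * np.sign(w)     # tiny step against the correct class's score
print(w @ x)                   # the score is driven across the decision boundary
```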
Why are CNNs a good model of the ventral but not the dorsal stream?
can classify objects like the “what” pathway, but are usually for static images and can’t do motion like the “where” pathway, and don’t understand how to act on objects like the “how” pathway
Explain spatial and temporal hierarchies
how processing is organised in the brain - spatial goes from simple features like lines up to complex objects, temporal goes from shorter to longer temporal windows. CNNs have spatial but not temporal hierarchies
Why are CNNs a good model for the visual system?
image processing, hierarchical feature extraction, local filters as an analogue of retinal receptive fields, translation invariance, multiclass classification from a probability distribution over classes, pooling to reduce dimensionality
What are the limitations of CNNs as a model for the visual system?
CNNs are feedforward only but the visual system has feedback loops from higher cognitive areas; they still struggle with generalisation as they are limited to their training data, are vulnerable to adversarial methods, do only spatial not temporal processing, and model only the ventral not the dorsal stream
Explain delay period activity
activity in the area representing what you are trying to remember, while you are trying to remember it - cells fire persistently in a spatially selective fashion, thought to be the substrate for short-term integration/working memory. seen in the dorsolateral prefrontal cortex of the macaque
Explain temporal integration in the visual system
normative models using log likelihood suggest the accuracy of a decision grows with the number of samples, as noise averages out to zero over time. also descriptive, as humans and monkeys perform better at tasks like judging the direction of motion of a dot cloud under longer durations, and lateral parietal cortex activity reflects the adding up of information/evidence for a particular response
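A toy simulation of the normative claim: averaging more noisy samples of a weak signal raises accuracy. Signal strength, noise level and trial counts are arbitrary assumptions:

```python
import numpy as np

# Decide "positive vs negative" from the running average of noisy samples.
rng = np.random.default_rng(0)
signal, noise_sd, trials = 0.1, 1.0, 10000
for n_samples in (1, 10, 100):
    samples = rng.normal(signal, noise_sd, size=(trials, n_samples))
    accuracy = np.mean(samples.mean(axis=1) > 0)
    print(n_samples, accuracy)   # accuracy climbs as noise averages out
```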
Contrast recurrent models and RNNs
recurrent models are models where information loops back on itself, e.g. drift diffusion (linear) or Wang’s model (non-linear, where one of the responses wins a race to the decision threshold). RNNs are neural networks built on the same recurrent principle but with freely trainable parameters
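A minimal drift-diffusion sketch; the drift rate, noise and threshold values are illustrative, not from any fitted model:

```python
import numpy as np

# Evidence drifts towards the correct response, noise is added each step,
# and a decision fires when a threshold is crossed.
rng = np.random.default_rng(2)
drift, noise_sd, threshold = 0.1, 1.0, 10.0
evidence, t = 0.0, 0
while abs(evidence) < threshold:
    evidence += drift + rng.normal(0, noise_sd)
    t += 1
print("decision:", "A" if evidence > 0 else "B", "after", t, "steps")
```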
What is an RNN?
process sequential data by maintaining an internal memory. interconnected layers of neurons, but each neuron is also connected to itself. the feedback loop allows information to persist from one time step to the next
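A minimal RNN sketch; the weight matrices are random and the sizes arbitrary, just to show the hidden state feeding back:

```python
import numpy as np

# The hidden state feeds back into itself at every time step,
# so earlier inputs keep influencing later outputs.
rng = np.random.default_rng(3)
W_in  = rng.normal(size=(4, 2)) * 0.5   # input -> hidden
W_rec = rng.normal(size=(4, 4)) * 0.5   # hidden -> hidden (the recurrent loop)
h = np.zeros(4)                          # internal memory, starts empty
for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 0.0])]:
    h = np.tanh(W_in @ x + W_rec @ h)    # new state depends on input AND previous state
    print(h)
```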
Why are RNNs a good model of memory/visual system?
process data over time, so able to do action selection based on past and present data, not just present. the activity of hidden units in an RNN resembles the mixed selectivity of neural data for a dot motion stimulus
What are the limitations of RNNs as a model of memory/visual system?
computationally costly, and backpropagation through time is biologically implausible
What is the issue with temporal credit assignment?
difficulty assigning credit for outcomes that may occur several steps after the input that provoked them, e.g. working out who infected you when you become ill with a virus that has a long incubation period
What is backpropagation?
method of updating weights in a neural network. the error (difference between output value and target value) is passed backwards through the network, and an optimisation algorithm such as gradient descent is used to adjust the weights to minimise the error
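A sketch of one backprop step on a single sigmoid neuron with squared error; all values are made up:

```python
import numpy as np

# Chain rule: dE/dw = dE/dout * dout/dsum * dsum/dw.
x = np.array([0.5, -1.0])
w = np.array([0.1, 0.2])
target, lr = 1.0, 0.5

out = 1 / (1 + np.exp(-(w @ x)))     # forward pass
error = out - target                  # dE/dout for E = 0.5 * (out - target)**2
grad = error * out * (1 - out) * x    # error passed backwards through the neuron
w -= lr * grad                        # gradient descent update
print(out, w)
```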
What is backpropagation through time?
have to propagate back through time steps as well as layers, adjusting weights at each time step based on how they contributed to the error at the end - unfolding the network. but can’t unfold all of it, so there has to be a cutoff/truncation window
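A sketch of the truncation idea on a scalar recurrence h_t = w*h_{t-1} + x_t, unfolding only the last few steps; the numbers are illustrative:

```python
# Truncated BPTT: the full gradient of the final loss w.r.t. w involves every
# time step; truncation only unfolds the last `window` steps.
w, window = 0.9, 3
xs = [1.0, 0.5, -0.2, 0.3, 0.8]

# forward pass, storing states
hs = [0.0]
for x in xs:
    hs.append(w * hs[-1] + x)
loss_grad = 2 * (hs[-1] - 1.0)   # dE/dh_T for E = (h_T - 1)**2

# backward pass, unfolded only `window` steps back in time
grad_w, dh = 0.0, loss_grad
for t in range(len(xs), len(xs) - window, -1):
    grad_w += dh * hs[t - 1]     # contribution of step t's use of the weight
    dh *= w                      # propagate the error one step further back
print(grad_w)
```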
Explain these parts of a neural network:
weights
activation function
bias term
weights = random numbers adjusted by the learning rule to reduce error; weights roughly represent the relative importance/contribution of each input to the output.
activation function = converts the weighted sum into an output; determines whether the neuron should be activated or not based on a threshold.
bias term = controls the position of the decision threshold for the activation function; can be set to a certain value if you know what you need/are testing, or initialised and then updated by the network similarly to the weights
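A single neuron putting the three parts together; the values are arbitrary:

```python
import numpy as np

x = np.array([0.2, 0.7, -0.4])          # inputs
weights = np.array([0.5, -0.3, 0.8])    # relative contribution of each input
bias = 0.1                              # shifts where the threshold sits

weighted_sum = weights @ x + bias
activation = 1 / (1 + np.exp(-weighted_sum))   # sigmoid activation function
print(weighted_sum, activation)
```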