lecture 6 - human-like AI Flashcards
deep learning
allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction
backpropagation
indicates how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer
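a minimal numpy sketch of this idea (the layer sizes, data, and learning rate below are arbitrary assumptions): the forward pass computes each layer's representation from the previous layer's, and the backward pass propagates the loss gradient to indicate how each layer's parameters should change

```python
import numpy as np

# toy sketch of backpropagation in a 2-layer network
# (layer sizes, data, and learning rate are arbitrary assumptions)
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))          # 4 inputs, 3 features
y = rng.normal(size=(4, 1))          # regression targets
W1 = 0.1 * rng.normal(size=(3, 5))   # layer-1 parameters
W2 = 0.1 * rng.normal(size=(5, 1))   # layer-2 parameters

for step in range(100):
    # forward pass: each layer computes its representation
    # from the representation in the previous layer
    h = np.tanh(x @ W1)              # hidden representation
    y_hat = h @ W2                   # output
    loss = ((y_hat - y) ** 2).mean()

    # backward pass: the chain rule indicates how the parameters
    # of each layer should change to reduce the loss
    grad_out = 2 * (y_hat - y) / len(y)
    grad_W2 = h.T @ grad_out
    grad_h = grad_out @ W2.T
    grad_W1 = x.T @ (grad_h * (1 - h ** 2))  # tanh'(z) = 1 - tanh(z)^2

    # gradient descent update of the internal parameters
    W1 -= 0.1 * grad_W1
    W2 -= 0.1 * grad_W2
```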
deep convolutional network
- filtering in a CNN serves the purpose of extracting meaningful features from the input data
- convolution of the filters with the input enables the network to reduce dimensionality, achieve spatial invariance, and capture increasingly complex features as the network deepens (see the sketch below)
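a minimal numpy sketch of one convolution + pooling stage; the edge-detector filter values are assumed for illustration, since a real CNN learns its filters from data:

```python
import numpy as np

# toy sketch of one CNN stage: convolve a filter to extract a feature
# map, then max-pool to reduce dimensionality. the 3x3 filter values
# are an assumed vertical-edge detector; real CNNs learn their filters.
image = np.zeros((8, 8))
image[:, 4:] = 1.0                      # left half dark, right half bright

edge_filter = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])

# 'valid' convolution (cross-correlation, as in most DL libraries)
feature_map = np.zeros((6, 6))
for i in range(6):
    for j in range(6):
        feature_map[i, j] = np.sum(image[i:i + 3, j:j + 3] * edge_filter)

# 2x2 max pooling halves each spatial dimension
# (dimensionality reduction + some spatial invariance)
pooled = feature_map.reshape(3, 2, 3, 2).max(axis=(1, 3))
print(feature_map.shape, pooled.shape)  # (6, 6) (3, 3)
```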
parallels of CNN architecture with the visual system
- increasing RF size going up the model hierarchy (sketch after this card)
- simple cells: respond to specific orientations of stimuli in preferred image locations, reacting strongly to bars and edges
- complex cells: integrate input from multiple simple cells, enabling broader spatial response and feature detection across the image
- mimics the representational geometry of the inferior temporal cortex: CNNs develop a representational structure similar to that of IT, which suggests deeper layers of DNNs can approximate the human brain’s representational geometry for visual processing
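the first point (growing RF size) can be made concrete with a small sketch; the (kernel, stride) layer specs below are assumed, not taken from any particular network:

```python
# toy sketch: receptive field (RF) growth with depth in a CNN.
# the (kernel, stride) specs are assumed, not from a published network.
layers = [(3, 1), (3, 1), (2, 2), (3, 1), (2, 2)]

rf, jump = 1, 1                 # RF size and inter-unit spacing, in input pixels
for depth, (k, s) in enumerate(layers, start=1):
    rf += (k - 1) * jump        # standard RF recursion
    jump *= s
    print(f"layer {depth}: RF = {rf} x {rf} input pixels")
```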
biologically inspired ML systems
- they approach ‘human level’ accuracy in a variety of domains, and even predict human brain activity.
- this raises the possibility that such systems represent the world like we do
major flaws of DNNs (Heaven et al)
- brittleness:
- they are highly sensitive to changes in input data, which can cause them to make mistakes that humans would not typically make.
- changes such as added noise or rotation of a learned object
- this makes them vulnerable to attacks, as demonstrated by adversarial examples
- shortcut learning:
- DNNs overrely on pattern recognition, but lack deeper understanding of the world.
- they focus on incidental features, such as texture, background, or locations of objects, rather than the key attributes humans would learn.
- they have no good model of how to pick out what matters, so an irrelevant change in the input features can throw them off
- this hampers transfer of learning (toy illustration after this card)
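a toy illustration of shortcut learning on synthetic data; the feature setup is assumed and a logistic regression stands in for a DNN, but the failure mode is the same: the model latches onto the incidental feature and collapses once it stops correlating with the label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# synthetic demo of shortcut learning (all values assumed). feature 0 is
# the true object feature; feature 1 is an incidental one (think
# background texture) that correlates with the label during training only.
rng = np.random.default_rng(0)
n = 1000
label = rng.integers(0, 2, n)
true_feat = label + rng.normal(scale=2.0, size=n)   # weak real signal
shortcut = label + rng.normal(scale=0.1, size=n)    # strong incidental signal
X_train = np.column_stack([true_feat, shortcut])

clf = LogisticRegression().fit(X_train, label)

# at test time the incidental feature no longer correlates with the label
test_label = rng.integers(0, 2, n)
X_test = np.column_stack([test_label + rng.normal(scale=2.0, size=n),
                          rng.normal(scale=0.1, size=n)])
print("train acc:", clf.score(X_train, label))      # near perfect
print("test acc :", clf.score(X_test, test_label))  # collapses to near chance
```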
adversarial examples
- minimal noise added to input data can lead to significant errors, such as misreading a stop sign as a speed limit sign
- humans would see the two images as identical, whereas DNNs see them as utterly different
- this could be dangerous, as attackers could exploit powerful AI systems, e.g., via agents with adversarial policies (see the FGSM sketch below)
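a minimal sketch of the fast gradient sign method (FGSM), one standard recipe for constructing such adversarial examples; the toy model and epsilon below are assumed stand-ins for a real trained classifier:

```python
import torch
import torch.nn.functional as F

# toy sketch of the fast gradient sign method (FGSM). the untrained
# linear model and the epsilon value are assumed stand-ins; with a real
# trained classifier, x_adv is typically misclassified even though it
# looks identical to x.
torch.manual_seed(0)
model = torch.nn.Linear(784, 10)             # stand-in for an image classifier
x = torch.rand(1, 784, requires_grad=True)   # "image" flattened to a vector
y = torch.tensor([3])                        # true class label

loss = F.cross_entropy(model(x), y)
loss.backward()                              # gradient of loss w.r.t. the input

eps = 0.05                                   # perturbation budget (assumed)
# nudge every pixel slightly in the direction that increases the loss
x_adv = (x + eps * x.grad.sign()).clamp(0, 1).detach()

print("clean prediction:", model(x).argmax().item())
print("adv prediction  :", model(x_adv).argmax().item())
```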
hampered transfer of learning in DNNs
- reliance on shortcut learning hinders the ability to generalize knowledge from one task or domain to another
- we see this in the way that AIs can be made to lose games by adding one or two random pixels to the screen
do DNNs represent the world the same way we do?
- no, they see but they do not perceive. this means that they do not understand the world in the way we do
- they are brittle
- they use shortcut learning
learning levels of CNNs
- CNNs learn at superficial levels, not abstract levels
- this explains why they excel at recognizing surface-level patterns like color and texture, but struggle with abstract concepts like sameness and relations between objects
human vs AI number of training examples
- DNNs need far more training examples than humans.
- extensive training lets them perform very specific tasks, but with little transfer of learning
- their performance is very much constrained by the training material (within-distribution)
ethical issue of AI
- if the training data is not representative of the larger world, algorithms can produce biased outputs
problem with lack of understanding/generalizability of models
- AI models - especially in healthcare - often fail to generalize beyond the data they were trained on
- this lack of generalizability points to the fact that these AI systems do not truly understand the task at hand
use of AI in cognitive neuroscience: goal of computational neuroscience
to find mechanistic explanations of how the nervous system processes information to give rise to cognitive function and behavior
use of AI in cognitive neuroscience: biological plausibility of models
- DNNs abstract away too much from biological reality to be of use as models for neuroscience
- e.g., some DNNs have 150 layers; the ventral stream doesn't
- DNNs are often purely feed-forward, whereas the brain also relies on recurrent processing
- these models are built to optimize performance, rather than psychological or biological plausibility
- they are still a black box:
- DNNs merely replace one impenetrably complex system with another
how important is biological plausibility at the implementation level
- from an engineering standpoint, the focus is on what works. the objective is to build systems that perform tasks effectively and efficiently, regardless of whether they mimic biological processes
- therefore, biological plausibility should be a guide, not a strict requirement.
- i.e., how closely AI mimics human brain function can serve as a guide in the design of systems.
do NNs learn the same way babies do
- no
- they need far more training examples
- extensive training lets them perform very specific tasks with little transfer of learning, so their performance is constrained by the training material.
- they learn at superficial levels, but struggle with understanding abstract concepts
marr’s three levels of analysis
- framework for understanding how systems (both biological brains and AI) process information
- computation - why (problem):
- focuses on what problem the system is trying to solve: why the system exists and what the goal of the computation is.
- in the bird analogy this is flight: the system needs to solve the problem of how to move through the air
- algorithm - what (rules):
- examines what rules or processes are used to solve the problem. it explores specific algorithms or steps the system follows
- in the bird analogy this is the flapping of the wings and coordinated movements that enable flight
- implementation - how (physical):
- concerns the physical substrate or hardware that implements the system. i.e., how is this computation physically realized
- in the bird analogy this refers to the feathers, muscles, and wings that physically allow the bird to flap and fly
Marr’s main idea
- ‘trying to understand perception by understanding neurons is like trying to understand a bird’s flight by studying only feathers. it just cannot be done.’
- emphasizes that understanding complex systems requires more than looking at the physical implementation
- understanding perception and cognition requires studying the computation (the goal) and algorithm level (the processes involved).
top-down approach to modelling
- aims to capture cognitive functions at the algorithmic level
- disregards biological implementation so as to focus on decomposing the information processing underlying task performance into its algorithmic components
- higher cognitive fidelity, lower biological fidelity, as they focus on replicating behavior or cognitive tasks
bottom-up approach to modelling
- aims first to capture characteristics of biological neural networks, such as action potentials and interactions among multiple compartments of single neurons
- this approach disregards cognitive function so as to focus on understanding the emergent dynamics of a small part of the brain, such as cortical columns and areas, and to reproduce biological network phenomena, such as oscillations
- higher biological fidelity, lower cognitive fidelity because they aim to replicate brain physiology rather than high-level cognitive functions
synergy between top-down and bottom-up approach
- usually there is a trade-off between cognitive and biological fidelity
- this can turn into synergy when:
- cognitive constraints help clarify biological function
- biology inspires models that explain cognitive feats (i.e., improving cognitive models)
- this synergy aims to answer the common goal of explaining how the brain gives rise to the mind
artificial neural networks (ANNs)
computational models inspired by the structure and function of biological networks to model behavioral and neural data