lecture 6 - human-like AI Flashcards

1
Q

deep learning

A

allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction

2
Q

backpropagation

A

indicates how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer
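a minimal sketch of the idea in plain Python/NumPy (illustrative only, not from the lecture): a tiny two-layer network on toy data, where the error is propagated backwards through the chain rule and each layer's parameters are updated from the signal passed down by the layer above

```python
# Minimal backpropagation sketch for a two-layer network (toy data, illustrative).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))                      # toy inputs
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)  # toy binary targets

W1 = rng.normal(scale=0.1, size=(3, 8))           # layer 1 parameters
W2 = rng.normal(scale=0.1, size=(8, 1))           # layer 2 parameters
lr = 0.5

for step in range(200):
    # forward pass: each layer computes its representation from the previous one
    h = np.tanh(X @ W1)                           # hidden representation
    p = 1 / (1 + np.exp(-(h @ W2)))               # output (sigmoid)
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    # backward pass: propagate the error signal layer by layer (chain rule)
    dlogits = (p - y) / len(X)                    # gradient at the output layer
    dW2 = h.T @ dlogits
    dh = dlogits @ W2.T * (1 - h ** 2)            # gradient flowing into layer 1
    dW1 = X.T @ dh

    # update internal parameters in the direction that reduces the loss
    W1 -= lr * dW1
    W2 -= lr * dW2

print(f"final loss: {loss:.3f}")
```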

3
Q

deep convolutional network

A
  • filtering in a CNN serves the purpose of extracting meaningful features from the input data
  • convolution of the filters with the input enables the network to reduce dimensionality, achieve spatial invariance, and capture increasingly complex features as the network deepens (see the sketch below)
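a minimal sketch of what a single filter does (illustrative, not the lecture's code): a hand-written vertical-edge filter convolved over a toy image, followed by max pooling, which shrinks the feature map and adds some tolerance to position

```python
# Minimal convolution + pooling sketch (toy image, hand-written edge filter).
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (really cross-correlation, as in most CNNs)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keeps the strongest response per patch."""
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.zeros((8, 8))
image[:, 4:] = 1.0                        # right half bright: a vertical edge
edge_filter = np.array([[-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0],
                        [-1.0, 0.0, 1.0]])  # responds to vertical edges

fmap = conv2d(image, edge_filter)         # feature map: strong where the edge is
pooled = max_pool(fmap)                   # smaller, more position-tolerant map
print(fmap.shape, pooled.shape)           # (6, 6) -> (3, 3): dimensionality reduced
```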
4
Q

parallels of CNN architecture with the visual system

A
  1. increasing receptive field (RF) size going up the model hierarchy (see the sketch below)
  • simple cells: respond to specific orientations of stimuli in preferred image locations, reacting strongly to bars and edges
  • complex cells: integrate input from multiple simple cells, enabling broader spatial response and feature detection across the image
  2. mimics the representational geometry of the inferior temporal cortex (IT): CNNs develop a similar representational structure to the IT, which suggests deeper layers of DNNs can approximate the human brain's representational geometry for visual processing
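a small illustration of point 1 (a toy calculation, not from the lecture; the layer stack is a made-up example): how the effective receptive field of a unit grows as convolution and pooling layers are stacked

```python
# Effective receptive field (RF) growth across stacked layers.
# For layer i with kernel size k_i and stride s_i, the standard recurrence is:
#   rf_i = rf_{i-1} + (k_i - 1) * prod(s_1 .. s_{i-1})

def effective_receptive_field(layers):
    """layers: list of (kernel_size, stride) tuples, ordered from the input upwards."""
    rf, jump = 1, 1            # RF of a single input pixel, and the running stride product
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# hypothetical stack: three 3x3 conv layers, each followed by stride-2 pooling
stack = [(3, 1), (2, 2), (3, 1), (2, 2), (3, 1), (2, 2)]
print(effective_receptive_field(stack))   # deeper units "see" a much larger image region
```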
5
Q

biologically inspired ML systems

A
  • they approach ‘human level’ accuracy in a variety of domains, and even predict human brain activity.
  • this raises the possibility that such systems represent the world like we do
6
Q

major flaws of DNNs (Heaven et al)

A
  1. brittleness:
  • they are highly sensitive to changes in input data, which can cause them to make mistakes that humans would not typically make
  • changes such as added noise or rotation of a learned object
  • this makes them vulnerable to attacks, as demonstrated by adversarial examples
  2. shortcut learning:
  • DNNs over-rely on pattern recognition, but lack a deeper understanding of the world
  • they focus on incidental features, such as texture, background, or locations of objects, rather than the key attributes humans would learn
  • they have no good model of how to pick out what matters, so an irrelevant change in input features can throw them off
  • this hampers transfer of learning
7
Q

adversarial examples

A
  • minimal noise added to input data can lead to significant errors, such as misreading a stop sign as a speed limit sign
  • humans would see the two images as identical, whereas DNNs see them as utterly different
  • this could be dangerous as hackers could take over powerful AI systems, e.g., with agents that have adversarial policies
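a minimal sketch of the mechanism (toy logistic classifier with hypothetical weights, not the lecture's stop-sign example): a fast-gradient-sign style perturbation that is tiny per element yet flips the prediction

```python
# Adversarial-perturbation sketch against a toy logistic "classifier".
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=100)                       # weights of a hypothetical trained classifier

def predict(x):
    return 1 / (1 + np.exp(-(x @ w)))          # probability of the "correct" class

x = rng.normal(size=100)
x = x - (x @ w - 2.0) * w / (w @ w)            # adjust x so the clean logit is exactly +2
print("clean prediction:", predict(x))         # ~0.88: confidently correct

# for logistic loss with true label 1, dL/dx = -(1 - p) * w, so the FGSM step
# x + eps * sign(dL/dx) reduces to x - eps * sign(w)
eps = 0.05                                     # tiny per-element perturbation budget
x_adv = x - eps * np.sign(w)
print("adversarial prediction:", predict(x_adv))        # ~0.12: now misclassified
print("max change per element:", np.max(np.abs(x_adv - x)))  # equals eps: barely visible
```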
8
Q

hampered transfer of learning in DNNs

A
  • reliance on shortcut learning hinders the ability to generalize knowledge from one task or domain to another
  • we see this in the way that AIs can be made to lose games by adding one or two random pixels to the screen
9
Q

do DNNs represent the world the same way we do?

A
  • no, they see but they do not perceive. this means that they do not understand the world in the way we do
  1. they are brittle
  2. they use shortcut learning
10
Q

learning levels of CNNs

A
  • CNNs learn at superficial levels, not abstract levels
  • this explains the limitation that they excel at recognizing surface-level patterns like color and texture, but struggle with understanding abstract concepts like sameness and relationships between objects
11
Q

human vs AI number of training examples

A
  • DNNs need far more training examples than humans
  • even then, training only enables them to perform very specific tasks, with little transfer of learning
  • their performance is very much constrained by the training material (within-distribution)
12
Q

ethical issue of AI

A
  • if the training data is not representative of the larger world, algorithms can produce biased outputs
13
Q

problem with lack of understanding/generalizability of models

A
  • AI models - especially in healthcare - often fail to generalize beyond the data they were trained on
  • this lack of generalizability points to the fact that these AI systems do not truly understand the task at hand
14
Q

use of AI in cognitive neuroscience: goal of computational neuroscience

A

to find mechanistic explanations of how the nervous system processes information to give rise to cognitive function and behavior

15
Q

use of AI in cognitive neuroscience: biological plausibility of models

A
  1. DNNs abstract away too much from biological reality to be of use as models for neuroscience
  • e.g., some DNNs have 150 layers, whereas the ventral stream does not
  • DNNs are often feed-forward; what about recurrent processing?
  2. these models are built to optimize performance, rather than psychological or biological plausibility
  3. they are still a black box
  • DNNs merely exchange one impenetrably complex system for another
16
Q

how important is biological plausibility at the implementation level

A
  • from an engineering standpoint, the focus is on what works. the objective is to build systems that perform tasks effectively and efficiently, regardless of whether they mimic biological processes
  • therefore, biological plausibility should be a guide, not a strict requirement.
  • i.e., how closely AI mimics human brain function can serve as a guide in the design of systems.
17
Q

do NNs learn the same way babies do

A
  • no
  1. they need more training trials
  2. training allows them to perform very specific tasks with little transfer of learning, so their performance is constrained by the training material.
  3. they learn at superficial levels, but struggle with understanding abstract concepts
18
Q

marr’s three levels of analysis

A
  • framework for understanding how systems (both biological brains and AI) process information
  1. computation - why (problem):
  • focuses on what problem the system is trying to solve, i.e., why this system exists and what the goal of the computation is
  • in the bird analogy this is flight: the system needs to solve the problem of how to move through the air
  2. algorithm - what (rules):
  • examines what rules or processes are used to solve the problem; it explores the specific algorithms or steps the system follows
  • in the bird analogy this is the flapping of the wings and the coordinated movements that enable flight
  3. implementation - how (physical):
  • concerns the physical substrate or hardware that implements the system, i.e., how the computation is physically realized
  • in the bird analogy this refers to the feathers, muscles, and wings that physically allow the bird to flap and fly
19
Q

Marr’s main idea

A
  • ‘trying to understand perception by understanding neurons is like trying to understand a bird’s flight by studying only feathers. it just cannot be done.’
  • emphasizes that understanding complex systems requires more than looking at the physical implementation
  • understanding perception and cognition requires studying the computation (the goal) and algorithm level (the processes involved).
20
Q

top-down approach to modelling

A
  • aims to capture cognitive functions at the algorithmic level
  • disregards biological implementation so as to focus on decomposing the information processing underlying task performance into its algorithmic components
  • higher cognitive fidelity, lower biological fidelity, as they focus on replicating behavior or cognitive tasks
21
Q

bottom-up approach to modelling

A
  • aims first to capture characteristics of biological neural networks, such as action potentials and interactions among multiple compartments of single neurons
  • this approach disregards cognitive function so as to focus on understanding the emergent dynamics of a small part of the brain, such as cortical columns and areas, and to reproduce biological network phenomena, such as oscillations
  • higher biological fidelity, lower cognitive fidelity because they aim to replicate brain physiology rather than high-level cognitive functions
22
Q

synergy between top-down and bottom-up approach

A
  • usually there is a trade-off between cognitive and biological fidelity
  • this can turn into synergy when:
  1. cognitive constraints help clarify biological function
  2. biology inspires models that explain cognitive feats (i.e., improving cognitive models)
  • this synergy aims to answer the common goal of explaining how the brain gives rise to the mind
23
Q

artificial neural networks (ANNs)

A

computational models inspired by the structure and function of biological networks to model behavioral and neural data

24
Q

neuroconnectionism

A
  • seeks to align ANNs with biological principles to create better tools for understanding how the brain processes information
  • i.e., using ANNs inspired by biology to model behavioral and neural data
25
Q

criticisms of neuroconnectionism

A
  • ANNs inspired by biology to model behavioral and neural data fail to account for basic cognitive functions
  • this means that they do not fully capture the biological or cognitive complexity of real brains
  • however, arguing about the successes and failures of current ANNs is not the right approach for evaluating their potential in neuroscience. the success of a scientific program should not be judged on individual results but on its capacity to generate new insights and understanding over time (i.e., falsifiable theories about brain computation)
26
Q

how DNNs can generate novel predictions about brains

A
  • different brain regions represent different visual categories, which appear as response clusters corresponding to specific categories
  • DNNs used for image classification develop analogous clusters corresponding to specific categories (e.g., bodies, faces, etc.)
  • this demonstrates how DNNs can be used to predict and simulate how the brain processes various visual categories
  • this helps identify novel brain regions that may play previously unknown roles in processing visual stimuli, demonstrating how ML models can inspire novel biological predictions (see the sketch below)
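a rough sketch of the idea with made-up data (the activations, categories, and choice of k-means are all assumptions for illustration): cluster a deep layer's activations and check whether the clusters line up with visual categories

```python
# Clustering DNN-layer activations by visual category (hypothetical data).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
categories = ["faces", "bodies", "scenes"]

# stand-in for deep-layer activations: 30 images per category, 64 features each,
# generated around a category-specific prototype
prototypes = rng.normal(size=(3, 64))
acts = np.vstack([prototypes[i] + 0.3 * rng.normal(size=(30, 64)) for i in range(3)])
labels = np.repeat(categories, 30)

clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(acts)

# check how well the discovered clusters line up with the true categories
for c in range(3):
    members, counts = np.unique(labels[clusters == c], return_counts=True)
    print(f"cluster {c}:", dict(zip(members, counts)))
```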
27
Q

are brains necessary to understand cognition

A
  • brains can provide a blueprint for cognitive models, but exact replication of them is not necessary if AI can replicate outcomes.
  • it is important to find synergy between top-down and bottom-up approaches to modelling to find a balance between good cognitive and biological fidelity so we can explain cognitive processes
28
Q

dark solutions

A
  • paradigm in AI that goes beyond deep learning methods
  • key idea: perception involves more than just visible information (like pixels in an image); it builds on latent factors - referred to as 'dark matter' - beyond the visible
  • this paradigm therefore suggests that AI needs to move beyond simply recognizing what is seen (what & where) and start addressing more complex questions like ‘why’ and ‘how’ things happen, which are key components of human-like common sense
  • goal: to develop AI that understands underlying, invisible elements of perception and can reason about the world in a more human-like, intuitive manner
29
Q

how can we simulate human intelligence best in artificial systems?

A
  1. embedding hard-coded rules in DNNs
  2. more variable training in richer 3D environments
  3. learning from less data
  4. supplementing basic pattern-matching with reasoning abilities
  • also, active inference in AI models
30
Q

DL improvement: embedding hard-coded rules in DNNs

A
  • this is a hybrid approach that combines deep learning (learned from data) with symbolic AI (explicit, predefined rules)
  • it uses top-down learning to advance AI: built-in, pre-existing knowledge rather than purely bottom-up, data-driven approaches
  • this way, more structured knowledge is integrated into AI systems, which can help them understand and reason about the physical world more like humans do (see the sketch below)
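a loose sketch of the hybrid idea (the labels, size rules, and thresholds are invented purely for illustration, not a specific system from the lecture): a learned detector's output is checked against hard-coded physical knowledge

```python
# Hybrid deep-learning + symbolic-rules sketch (all values hypothetical).
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float     # produced by a (hypothetical) deep-learning model
    height_m: float       # estimated size of the detected object

# hard-coded, predefined knowledge: rough physical size ranges per category
SIZE_RULES = {
    "cat": (0.15, 0.5),
    "car": (1.2, 2.2),
    "person": (0.5, 2.3),
}

def apply_rules(det: Detection) -> Detection:
    """Symbolic check layered on top of the learned prediction."""
    low, high = SIZE_RULES.get(det.label, (0.0, float("inf")))
    if not (low <= det.height_m <= high):
        # the pattern-matcher's guess contradicts built-in physical knowledge
        return Detection("unknown", det.confidence * 0.1, det.height_m)
    return det

# the DNN is very confident this 8-metre-tall object is a cat; the rules disagree
print(apply_rules(Detection("cat", 0.97, 8.0)))
print(apply_rules(Detection("car", 0.80, 1.6)))
```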
31
Q

DL improvement: more variable training in richer 3D environments

A
  • training AI systems in diverse, complex 3D environments where they can interact and explore
  • allows them to learn more about actions and their consequences
  • this improves their ability to understand and respond to the world
32
Q

DL improvement: learning from less data

A
  • instead of overwriting what has been learned, we need to develop systems that can retain learned knowledge, while efficiently learning new information
  • perceptual learning, in both humans and AI, benefits significantly from prior knowledge and is central for robust perception
  • current NNs perform similarly to young children, who struggle more with complex recognition tasks due to local biases
33
Q

DL improvement: supplementing basic pattern-matching with reasoning abilities

A
  • give DNNs the ability to create their own algorithms for reasoning
  • this moves beyond just recognizing patterns to actually reasoning through problems
  • this would bring AI systems closer to human-like intelligence
34
Q

active inference in AI vs. passive AI models

A
  • active inference suggests that the brain acts as a generative model, where sensory data is used for both recognition and controlling & predicting the consequences of actions
  • unlike passive models, active inference involves agents interacting with their environment, constantly testing predictions against real-world experiences
  • this approach is essential for creating AI systems that can understand and act in the world, rather than just processing sensory input without feedback from actions
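a very loose sketch of the perceive-act loop (a toy thermostat-like agent; the numbers and update rules are illustrative assumptions, not a full active-inference model)

```python
# Toy perceive-predict-act loop in the spirit of active inference.
import numpy as np

rng = np.random.default_rng(3)
true_temp = 15.0          # hidden state of the environment
preferred_temp = 21.0     # the agent's preferred (expected) observation
belief = 18.0             # agent's current estimate of the hidden state
learning_rate = 0.3

for step in range(10):
    prediction = belief                           # generative model: observation = state
    observation = true_temp + rng.normal(0, 0.5)  # noisy sensory input
    error = observation - prediction

    # perception: update the belief to reduce prediction error
    belief += learning_rate * error

    # action: change the world so observations move toward what is preferred
    true_temp += 0.5 * np.sign(preferred_temp - belief)

print(f"belief={belief:.1f}, true temperature={true_temp:.1f}")
```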
35
Q

LLMs: transformer model

A
  1. embedding layer: converts sequence of words into numeric pattern
  2. attention layer: computes interactions between words to capture their relationships and context
  3. traditional NN: produces an output that represents a new pattern of numbers, corresponding to the “meaning” of the input.
  • the embedding layer and attention layer form the transformer block
  • inside LLMs there are multiple transformer blocks
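a minimal sketch of one such block in NumPy (heavily simplified: single attention head, no positional encoding, no layer normalization; all sizes and weights are toy values)

```python
# Toy transformer block: embedding -> attention -> feed-forward network.
import numpy as np

rng = np.random.default_rng(4)
vocab, d = 50, 16                                 # tiny vocabulary and embedding size

E = rng.normal(scale=0.1, size=(vocab, d))        # embedding layer
Wq, Wk, Wv = (rng.normal(scale=0.1, size=(d, d)) for _ in range(3))
W1 = rng.normal(scale=0.1, size=(d, 4 * d))       # feed-forward ("traditional NN") weights
W2 = rng.normal(scale=0.1, size=(4 * d, d))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def transformer_block(x):
    # attention layer: every position looks at every other position for context
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    attn = softmax(q @ k.T / np.sqrt(d)) @ v
    x = x + attn                                  # residual connection
    # feed-forward network refines each position's representation
    return x + np.maximum(0, x @ W1) @ W2

tokens = np.array([3, 17, 42, 7])                 # a toy input sequence of word ids
x = E[tokens]                                     # embedding: words -> numeric patterns
for _ in range(2):                                # LLMs stack many such blocks
    x = transformer_block(x)
print(x.shape)                                    # (4, 16): one refined vector per word
```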
36
Q

what does a transformer block do

A

each block further processes its input to refine various aspects of 'meaning', building a deeper contextual understanding

37
Q

final output of the transformer network

A
  • a probability distribution over a large vocabulary of potential next words (around 50k words)
  • the model calculates the probability of each word being the next in the sequence based on the context it has learned
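a tiny worked example of that last step (the vocabulary is shrunk to five made-up words and the logits are invented): scores from the final layer are turned into next-word probabilities with a softmax

```python
# Softmax over a toy vocabulary to get next-word probabilities.
import numpy as np

vocab = ["cat", "sat", "mat", "the", "on"]
logits = np.array([1.2, 4.5, 0.3, 2.0, 3.1])    # hypothetical scores from the last layer

probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax: scores -> probabilities summing to 1

for word, p in sorted(zip(vocab, probs), key=lambda t: -t[1]):
    print(f"{word:>4}: {p:.2f}")                 # "sat" is the most likely next word
```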
38
Q

can LLMs display human-like intelligence - arguments for

A
  • they can reason, develop theory of mind, and show early signs of intelligence akin to the cognitive abilities of a 6-year-old
  • with further scaling, LLMs can become superintelligent, or even a potential threat to humanity
39
Q

can LLMs display human-like intelligence - arguments against

A
  • current conclusions are premature, as LLMs are fundamentally just next token predictors without true understanding
  • i.e., they are cultural technologies that statistically summarize human knowledge found on the internet
  • example: chatGPT makes mistakes when asked about numbers without the letter “e,” indicating it doesn’t fully understand the task but merely follows patterns in the data
  • example: BLINK benchmark tests multimodal LLMs (language + vision models) on 14 visual perception tasks. Human accuracy is nearly 96%, but even the best LLMs perform only slightly better than random guessing.
40
Q

Mahowald et al.: dissociate language from thought in LLMs

A
  • suggests that while LLMs can produce coherent language, their internal processes might lack genuine understanding or cognition
  • language is therefore not the same as thought
41
Q

Mahowald et al.: formal vs. functional competence

A
  • a formal language system in isolation (like LLMs) is useless/limited if it doesn't integrate with cognitive functions beyond language, such as perception and action
  • the capacities required to use language to do things in the world are distinct from formal competence and depend crucially on non-linguistic cognition
42
Q

Mahowald et al.: LLMs

A
  • although LLMs are good at formal competence, their performance on functional competence tasks remains spotty
  • being good at language does not equate to being good at thinking
43
Q

Mahowald et al.: human-like language in AI

A
  • would require models to master both formal and functional competence
  • may involve building modular systems with distinct processes for each type of competence, similar to how the human brain operates (separate mechanisms for language and cognition)
  • architectural modularity: building modularity into the architecture of the system
  • emergent modularity: naturally inducing modularity through the training process, both through training data and the objective function
44
Q

formal linguistic competence

A
  • knowledge of linguistic rules and patterns
  • i.e., getting the form of the language right to be able to produce and comprehend language
  • phonology, morphology, lexical semantics, syntax
45
Q

functional linguistic competence

A
  • understanding and using language to do things in real-world contexts
  • formal reasoning, world knowledge, situation modeling, social reasoning
  • LLMs struggle with this
46
Q

formal reasoning

A
  • logic, math, planning
  • ex (a reasoning error of the kind LLMs produce): fourteen birds were sitting on a tree. three left, one joined. there are now eleven birds (correct answer: 14 - 3 + 1 = 12)
47
Q

world knowledge

A
  • facts, concepts, common sense
  • ex (a world-knowledge error): the trophy did not fit in the suitcase because the trophy was too small (not fitting implies the trophy was too big)
48
Q

situation modeling

A
  • e.g., discourse coherence, narrative structure
  • ex (an incoherent continuation): sally doesn't own a dog. the dog is black
49
Q

social reasoning

A
  • e.g., pragmatics, theory of mind
  • ex (a theory-of-mind error): lu put the toy in the box and left. bo secretly moved it to the closet. lu now thinks the toy is in the closet (lu should still believe it is in the box)
50
Q

better tests for LLMs

A
  • current tests are designed for humans, which may not accurately test the decision-making, reasoning, or cognitive abilities of LLMs
  • simulation is not the same as actual understanding or cognitive instantiation
  • we need better tests to evaluate cognitive abilities of AI models
51
Q

fallacies to watch out for

A
  1. narrow intelligence is on a continuum with general intelligence
  2. easy things are easy, and hard things are hard
  3. the lure of wishful mnemonics
  4. intelligence is all in the brain
52
Q

fallacy: narrow intelligence is on a continuum with general intelligence

A
  • misconception that narrow AI will eventually evolve into general AI when scaled up
  • though the complexity of narrow models can be increased, their diversity remains narrow
  • so they improve in their specific task performance, but are confined to their narrow domains without transfer of knowledge to different tasks
53
Q

narrow AI

A
  • application specific/task limited
  • fixed domain models provided by programmers
  • learns from thousands of labeled examples
  • reflexive tasks with no understanding
  • knowledge does not transfer to other domains or tasks
  • today’s AI
54
Q

general AI

A
  • performs general (human) intelligent action
  • self-learns and reasons within its operating environment
  • learns from few examples and/or from unstructured data
  • full range of human cognitive abilities
  • leverages knowledge transfer to new domains and tasks
  • possibly the future of AI
55
Q

what is intelligence

A
  • has transfer learning and the ability to apply learned skills to new, unseen problems at its core
  • increasing the size of models alone therefore won’t lead to AGI
56
Q

fallacy: easy things are easy and hard things are hard

A

moravec’s paradox: tasks requiring high-level reasoning are easy for computers, while tasks that seem simple for humans are difficult for machines to replicate

57
Q

fallacy: the lure of wishful mnemonics

A
  • oversimplifying AI concepts by giving them wishful/convenient names
  • can lead to misunderstandings or exaggerated expectations of AI capabilities, and anthropomorphizing technology
58
Q

fallacy: intelligence is all in the brain

A
  • assumes that intelligence is purely a product of the brain’s raw computing power and that by simply increasing computational resources, machines will become as intelligent as humans
  • this oversimplifies the complexity of human intelligence, which is not only about computational power but also involves interaction with the environment, embodied cognition, and other factors beyond just brain activity.