Cognition Flashcards
features of interactive activation models
- Hierarchical representation units
  - Sensory features
  - Segments (letters/phonemes)
  - Words
- Interactive activation
- Evidence that these models can do optimal Bayesian inference.
- Hand-wired models. No learning.
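A minimal numeric sketch of these features, with a toy three-word lexicon: hand-wired weights, bottom-up letter-to-word excitation, top-down word-to-letter feedback, and lateral inhibition among words, with no learning. All rates, weights, and the lexicon are illustrative assumptions, and letter position is ignored for brevity.

```python
import numpy as np

# Hand-wired toy model: letter units feed words (bottom-up), words feed
# letters back (top-down), and words inhibit each other (lateral inhibition).
words = ["cat", "car", "can"]
letters = sorted(set("".join(words)))                  # one unit per letter
idx = {c: i for i, c in enumerate(letters)}

# Letter i excites word j iff the letter occurs in the word (no learning).
W = np.array([[1.0 if c in w else 0.0 for w in words] for c in letters])

letter_act = np.zeros(len(letters))
word_act = np.zeros(len(words))
for c in "cat":                                        # clamp sensory evidence
    letter_act[idx[c]] = 1.0

for _ in range(10):
    bottom_up = W.T @ letter_act                       # evidence from letters
    inhibition = word_act.sum() - word_act             # other words' activity
    word_act = np.clip(word_act + 0.1 * (bottom_up - 1.5 * inhibition), 0, 1)
    letter_act = np.clip(letter_act + 0.05 * (W @ word_act), 0, 1)  # top-down

winner = words[int(word_act.argmax())]                 # "cat"
```

The interaction is visible in the loop: evidence flows up, the winning word feeds support back down to its letters, and competitors are suppressed.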
inference problem in generative models
Determine the state of the hidden variables given the input.
Given a sensory input, which causal variables generated it?
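A worked toy version of this inference problem, with a hand-specified binary generative model (the weights and the observed input are illustrative): hidden causes h generate sensory features v, and enumerating Bayes' rule recovers p(h | v).

```python
import itertools
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# W[i, j]: influence of hidden cause i on sensory feature j (hand-specified).
W = np.array([[ 4.0, 4.0, -4.0],    # cause 0 produces features 0 and 1
              [-4.0, 4.0,  4.0]])   # cause 1 produces features 1 and 2
v = np.array([1.0, 1.0, 0.0])       # observed sensory input

# Enumerate all hidden states; p(h | v) is proportional to p(v | h) p(h).
posterior = {}
for bits in itertools.product([0, 1], repeat=2):
    h = np.array(bits, dtype=float)
    p = sigmoid(h @ W)                                 # p(feature on | h)
    p_v_given_h = np.prod(np.where(v == 1, p, 1 - p))
    posterior[bits] = p_v_given_h * 0.25               # flat prior, 1/4 each
total = sum(posterior.values())
posterior = {h: p / total for h, p in posterior.items()}

best = max(posterior, key=posterior.get)               # (1, 0): cause 0 alone
```

The input [1, 1, 0] is best explained by cause 0 acting alone, and the posterior concentrates there.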
learning problem for generative models
How to adjust the weights so that the hidden variables generate the observed sensory data.
Learning a generative model is learning a causal model of the sensory input.
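A minimal sketch of this learning problem on a tiny restricted Boltzmann machine, using one-step contrastive divergence (CD-1). Mean-field probabilities stand in for samples, biases are omitted, and all sizes, rates, and the two-pattern dataset are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]], dtype=float)   # two "causes" to discover
W = 0.1 * rng.standard_normal((6, 2))                # 6 visible, 2 hidden units
lr = 0.2

def reconstruction_error(W):
    return np.mean([(sigmoid(W @ sigmoid(v @ W)) - v) ** 2 for v in data])

before = reconstruction_error(W)
for _ in range(300):
    for v0 in data:
        h0 = sigmoid(v0 @ W)          # bottom-up: infer hidden causes
        v1 = sigmoid(W @ h0)          # top-down: reconstruct the data
        h1 = sigmoid(v1 @ W)
        W += lr * (np.outer(v0, h0) - np.outer(v1, h1))   # CD-1 update
after = reconstruction_error(W)
```

After training, the hidden units regenerate the observed patterns more faithfully than at initialization, which is the sense in which the weights now encode a causal model of the data.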
transfer learning (from generative models)
- A generative model of sensory data should be able to transfer to other tasks.
- Learns robust features that can be used elsewhere.
- Recognition models don’t support this kind of transfer because they were trained for labelling and discrimination.
claim by Zorzi, Testolin, and Stoianov (2013)
- Deep generative models add a learning dimension to interactive activation models, so we get
- Hierarchical representations
- Top down and bottom up information
    - Structured probabilistic cognition
- Learning
- Bridge gap between process-level PDP theory and problem-level structured Bayesian theory
basic structure of a deep generative network
- Input layer
- Hidden layers that compress the observed data into progressively more abstract feature detectors
- Large hidden layer on top
- Unfold/unravel compressed hidden representations into abstract classes and categories
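The structure above, sketched as a stack of weight matrices (all layer sizes are illustrative): recognition runs bottom-up through compressive hidden layers, and generation unfolds the large top layer back into data space through the transposed weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
# Input layer -> compressive hidden layers -> large hidden layer on top.
sizes = [784, 500, 200, 1000]
weights = [0.01 * rng.standard_normal((a, b)) for a, b in zip(sizes, sizes[1:])]

def recognize(v):
    """Bottom-up pass: progressively more abstract feature detectors."""
    for W in weights:
        v = sigmoid(v @ W)
    return v

def generate(h):
    """Top-down pass: unfold the top representation back into data space."""
    for W in reversed(weights):
        h = sigmoid(h @ W.T)
    return h

top = recognize(rng.random(784))      # abstract code, shape (1000,)
fantasy = generate(top)               # reconstructed input, shape (784,)
```

Untrained random weights are used only to show the shapes of the two passes; in a real deep generative network each layer's weights would be learned, e.g. greedily as stacked RBMs.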
Hinton’s cognitive connections on RBMs (2007)
- How might the wake-sleep algorithm be implemented cortically?
- RBMs don’t have lateral connections. How might models be augmented to capture lateral inhibition?
- Deep hidden units are still kind of a black box, but at least with generative models, we can study what kinds of data are generated by certain hidden features.
- Top-down/bottom-up are plausible because:
- Some cortical regions reciprocally connected
- Hallucinations, dreaming, top-down disambiguation
advantages of connectionism over symbolic AI
- Context sensitivity
- Content sensitivity
- Quasiregularity
- Gradual learning
- Learnability
- Graceful degradation
- Biological inspiration
Context Sensitivity (PDP vs Old AI)
- Outcomes constrained by multiple sources of information
- Modularity doesn’t allow for contextual processing effects (e.g., the word superiority effect)
Content sensitivity (PDP vs Old AI)
- Semantic content can support processing.
- The birdwatcher saw the bird with binoculars.
- The bird saw the birdwatcher with binoculars.
- But we still like structure too and can parse meaningless sentences.
Quasiregularity (PDP vs Old AI)
- Rule systems always have exceptions.
- But exceptions have some of the main regularities.
- E.g., past-tense exceptions still end in /d/ or /t/ (kept, told).
- Hard to draw a line between regular and irregular patterns.
- Want to be able to take advantage of varying degrees of regularity.
Gradual learning (PDP vs Old AI)
- Rule-learning predicts discontinuous development. A-ha moments.
- Cognitive development is not so sudden or abrupt.
- Periods of relative stability give way to “unstable, probabilistic, and graded patterns of change”.
- Use of rule might be influenced by frequency or regularity.
Learnability (PDP vs Old AI)
- How can we have innate knowledge about rules for reading when we didn’t evolve as readers?
- We see more tendencies than universals.
Graceful degradation (PDP vs Old AI)
- Deficits in a skill after traumatic brain injury (TBI) are graded and probabilistic.
- Performance may be sensitive to frequency, familiarity, or regularity.
Biological inspiration (PDP vs Old AI)
- The implementation-doesn’t-matter argument holds for computers, but brains are different.
- I would say we need to allow for leaky abstractions
- Implementation details (neurons, etc.) leak into higher levels of abstraction