Object Recognition Flashcards
• What is “object recognition”?
• Neural mechanisms of object recognition
• Computational challenges of object recognition
• Failures in object recognition
• Category-specific recognition (face recognition)
Object Recognition
What’s involved? (visual object recognition)
– Intact vision (acuity)
– Ability to perceive shapes (perception)
– Discrimination
– Identification
– Recognition (memory)
– Naming (language)
– Object meaning/function (often motor)
• Where can it fail?
Object Recognition Deficits
• “Agnosia” = “failure of knowing”
• Visual agnosia = restricted to visual domain
– Can’t recognize objects visually
– Can still recognize objects using other senses (like touch)
– Visual acuity, memory, language, intellect intact
• (More on different types of agnosias later)
Brain areas important for Object Recognition:
– Occipital Lobe
– Fusiform Gyrus (a face region)
– Parahippocampal Area
– Superior Temporal Sulcus (a face region)
– Posterior Parietal
– Lateral Occipital & Posterior Inferior Temporal
– Anterior Inferior Temporal
The two visual processing streams
Dorsal Pathway (Where)
Ventral Pathway (What)
V1 and Posterior Parietal Cortex
Evidence for 2 streams
• Monkey lesions (Ungerleider, Mishkin et al., 1982):
– Trained monkeys on 2 tasks: Landmark task and Object task
– Lesioned parietal or temporal cortex; retested
• Human lesion patients
• Neuroimaging: PET study
– Compare a & b:
- Object Task (same)
- Position Task (different)
The Dorsal and Ventral Streams
• “What” & “Where”
– How separate?
Where is information combined?
• Converge in frontal cortex
• Connections between streams
• Reality: both “what” & “where” information are found in both streams; the functions of the streams are not totally separate
• Other interpretations:
– “What” & “How”
– Identification vs action
Patient D.F.
Lesion to ventral “what” stream
“ashtray”
“long, black, & thin”
Visual identification impaired, but visually-guided action intact.
Not a visual acuity deficit or a naming deficit
LOC Lesions
Lateral Occipital Cortex (LOC)
There is greater fMRI activity in LOC for images of intact objects than for scrambled pictures of the same objects.
The pattern of fMRI activity in LOC can differentiate between objects.
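A minimal sketch of what "the pattern of activity can differentiate between objects" could mean in practice, using purely synthetic data rather than real fMRI. Everything here is an assumption made for illustration: the object names ("face", "chair"), the number of voxels, the Gaussian noise model, and the choice of a correlation-to-template classifier. It is not the analysis used in any particular study.

```python
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def simulate_trials(prototype, n_trials, noise_sd, rng):
    """Synthetic 'voxel patterns': the object's prototype plus Gaussian noise."""
    return [[v + rng.gauss(0, noise_sd) for v in prototype]
            for _ in range(n_trials)]

def mean_pattern(trials):
    """Average the trials voxel by voxel to get one template per object."""
    return [sum(col) / len(col) for col in zip(*trials)]

def decoding_accuracy(n_voxels=50, n_trials=20, noise_sd=0.5, seed=0):
    """Train templates on one set of trials, then label held-out trials
    by whichever template they correlate with most strongly."""
    rng = random.Random(seed)
    # Hypothetical objects, each with its own made-up activity pattern.
    prototypes = {name: [rng.gauss(0, 1) for _ in range(n_voxels)]
                  for name in ("face", "chair")}
    train = {name: simulate_trials(p, n_trials, noise_sd, rng)
             for name, p in prototypes.items()}
    test = {name: simulate_trials(p, n_trials, noise_sd, rng)
            for name, p in prototypes.items()}
    templates = {name: mean_pattern(trials) for name, trials in train.items()}

    correct = total = 0
    for true_name, trials in test.items():
        for trial in trials:
            guess = max(templates,
                        key=lambda name: pearson(trial, templates[name]))
            correct += guess == true_name
            total += 1
    return correct / total

if __name__ == "__main__":
    print("decoding accuracy:", decoding_accuracy())
```

If the patterns carried no object information, accuracy would hover around chance (0.5 for two objects); accuracy well above chance is what "differentiating between objects" amounts to in this toy setting.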
Computational challenges
• Variability in sensory information
(pink elephant, plaid apple)
– Object constancy (cars in different positions in space, differences in color due to shadows, knowing that something is there even if you can’t see it)
– Ambiguous information/bi-stable percepts
(like the picture in which people perceive either an old woman or a young woman)
- View-dependent or view-invariant recognition?
- Other types of invariance (location, size, viewpoint, lighting, etc.)
• Shape encoding / binding of parts
– how?
(three lines composing a triangle or an arrow)
• How are objects represented at neural level?
– Hierarchy of Coding Hypothesis
– “Grandmother cells” vs ensemble coding
Ventral Pathway
The ventral stream begins with V1, goes through visual area V2, then through visual area V4, and to the inferior temporal cortex. The ventral stream, sometimes called the “What Pathway”, is associated with form recognition and object representation. It is also associated with storage of long-term memory.
Dorsal Pathway
The dorsal stream begins with V1, goes through visual area V2, then to the dorsomedial area and visual area MT (also known as V5), and on to the posterior parietal cortex. The dorsal stream, sometimes called the “Where Pathway” or “How Pathway”, is associated with motion, representation of object locations, and control of the eyes and arms, especially when visual information is used to guide saccades or reaching.
Hierarchy of Coding Hypothesis
Features → Conjunction of Features → Component Shapes → Object
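As a toy illustration of the hierarchy, and of the earlier binding example (three lines composing a triangle or an arrow), the sketch below passes the same kind of features through two further levels. The representation is an assumption made for illustration only: features are line segments given as endpoint pairs, a "conjunction" is a point where two or more segments meet, and the function names and example coordinates are hypothetical. It is not a model of how neurons actually do this.

```python
from collections import Counter

# Level 1 (features): line segments, each a pair of (x, y) endpoints.
TRIANGLE = [((0, 0), (4, 0)), ((4, 0), (2, 3)), ((2, 3), (0, 0))]
ARROW = [((0, 0), (4, 0)), ((4, 0), (3, 1)), ((4, 0), (3, -1))]

def junctions(segments):
    """Level 2 (conjunctions of features): endpoints shared by two or
    more segments, mapped to the number of segments meeting there."""
    counts = Counter(point for segment in segments for point in segment)
    return {point: n for point, n in counts.items() if n >= 2}

def classify(segments):
    """Level 3 (component shape): name the shape from how its parts are
    bound together, not from the parts alone."""
    meets = sorted(junctions(segments).values())
    if len(segments) == 3 and meets == [2, 2, 2]:
        return "triangle"   # three corners, two lines meeting at each
    if len(segments) == 3 and meets == [3]:
        return "arrow"      # all three lines meet at the tip
    return "unknown"

if __name__ == "__main__":
    print(classify(TRIANGLE))  # triangle
    print(classify(ARROW))     # arrow
```

Both inputs contain exactly three lines; only the way the lines are joined differs, which is why a stage that codes conjunctions is needed between raw features and shapes.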
“Grandmother cells” vs ensemble coding
Ultra-selective hypothesis:
Single neuron for “your grandmother”
- What if that cell dies?
- How do we perceive novel objects?
- How does it adapt over time (e.g., as grandma gets old)?
- Do we have enough neurons to represent every single object we might encounter?
Alternative:
Ensemble coding
- Collective activation of many neurons
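Two of the objections to grandmother cells (what if that cell dies, and are there enough neurons) can be made concrete with a small sketch. The codebooks below are invented for illustration: neuron indices, object names, and the "largest overlap wins" decoding rule are all assumptions, not a claim about real cortical coding.

```python
from math import comb

# Hypothetical toy codes: each object maps to the set of neurons that fire for it.
GRANDMOTHER_CODE = {"grandmother": {0}, "apple": {1}, "car": {2}}
ENSEMBLE_CODE = {
    "grandmother": {0, 1, 2, 3},
    "apple": {2, 3, 4, 5},
    "car": {4, 5, 6, 7},
}

def respond(obj, codebook, dead=frozenset()):
    """Neurons that fire for obj, excluding any that have died."""
    return codebook[obj] - set(dead)

def decode(active, codebook):
    """Return the object whose stored pattern overlaps most with the
    active neurons, or None if nothing overlaps at all."""
    best, best_overlap = None, 0
    for name, pattern in codebook.items():
        overlap = len(active & pattern)
        if overlap > best_overlap:
            best, best_overlap = name, overlap
    return best

def capacity_one_cell_per_object(n_neurons):
    """One dedicated neuron per object: capacity grows linearly."""
    return n_neurons

def capacity_ensemble(n_neurons, k_active):
    """Distinct patterns with exactly k_active of n_neurons firing."""
    return comb(n_neurons, k_active)

if __name__ == "__main__":
    dead = {0}  # neuron 0 dies
    print(decode(respond("grandmother", GRANDMOTHER_CODE, dead), GRANDMOTHER_CODE))  # None
    print(decode(respond("grandmother", ENSEMBLE_CODE, dead), ENSEMBLE_CODE))        # grandmother
    print(capacity_one_cell_per_object(8), capacity_ensemble(8, 4))                  # 8 70
```

In this toy setting, losing one neuron erases "grandmother" under the single-cell code but only weakens her pattern under the ensemble code, and eight neurons can label 8 objects one-per-cell versus 70 distinct four-neuron patterns.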
– 2 visual processing streams (what & where/how)
– Grandmother cells vs ensemble coding
What is the evidence for “what” information being processed in the ventral stream?
What is and isn’t impaired in visual agnosia?