Object Recognition Flashcards
Ventral visual processing stream
Processes information related to object recognition
Inferotemporal cortex:
PIT: Posterior Inferotemporal cortex
CIT: Central Inferotemporal cortex
AIT: Anterior Inferotemporal cortex
- They interact back and forth to aid perception
- Expectations can influence what you see (cognitive processes can influence what you perceive)
Posterior Inferotemporal Cortex
Responds to relatively simple stimuli
Cells further along the ventral stream respond to more complex and specific stimuli
Receptive fields
Area of space that a cell is sensitive to
Becomes larger as we move further along the ventral stream
A larger receptive field allows an object to be identified regardless of size or location
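The invariance point above can be illustrated with a toy model (hypothetical, not a real neural simulation): a feature detector that pools over a larger "receptive field" responds the same no matter where the stimulus falls.

```python
import numpy as np

def detector_response(image, receptive_field):
    """Max response of a feature detector within its receptive field
    (a slice of the 1-D image). Illustrative sketch only."""
    r0, r1 = receptive_field
    return image[r0:r1].max()

# A 1-D "image" with a bright spot (the object) at two positions
obj_left = np.array([0, 1, 0, 0, 0, 0, 0, 0])
obj_right = np.array([0, 0, 0, 0, 0, 1, 0, 0])

small_rf = (0, 2)   # early visual area: small receptive field
large_rf = (0, 8)   # anterior IT: large field covering most of space

# Small receptive field: response changes when the object moves
print(detector_response(obj_left, small_rf))   # 1 - object inside the field
print(detector_response(obj_right, small_rf))  # 0 - object outside the field

# Large receptive field: same response regardless of position
print(detector_response(obj_left, large_rf))   # 1
print(detector_response(obj_right, large_rf))  # 1
```

The growing receptive fields along PIT → CIT → AIT play the same role as the larger pooling window here.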
Visual Agnosia
Inability to recognize objects in the visual modality, despite intact elementary vision
- Two subtypes: Apperceptive and Associative
Apperceptive Agnosia
Cannot form a mental impression of something
Sensory information is processed, but the brain cannot put the details together in a meaningful way
- Caused by damage to the occipital lobe and associated areas (more visual regions, beginning of ventral stream)
Associative Agnosia
Basic visual information can be integrated and put together to form a meaningful whole
That particular whole cannot be linked to stored knowledge
- Caused by damage to the occipitotemporal regions of both hemispheres and subadjacent white matter
- Damage occurs further along the ventral stream than in apperceptive agnosia
Prosopagnosia
Agnosia specific to faces
- “selective inability to differentiate between or recognize different faces”
- Damage to the ventral stream in the right hemisphere
–> individuals can determine basic information (i.e. “this is a face”)
- Developmental (congenital) prosopagnosia is thought to be present in ~2% of the general population
- The anterior temporal lobe (near the end of the ventral stream) shows reduced activation in response to images of faces
Word vs. Face recognition
Faces: right hemisphere
Words: left hemisphere
Sparse Coding
Theory that a small, specific group of cells responds to the presence of a given object/stimulus
Grandmother Cell theory
Extreme version of sparse coding
- the idea that there is a specific cell in the brain that, for example, fires only when your grandmother is present
- thought to be incorrect
Population Coding
Theory that the pattern of activity across a large population of cells codes for individual objects
Extreme: every cell fires for every object
How do we explain cell function in object recognition using theories?
Likely something between sparse coding and population coding
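The two coding schemes above can be contrasted with a minimal numeric sketch (hypothetical activity values, not real neural data): under sparse coding only a few cells are active for each object, while under population coding the identity lies in a graded pattern spread across the whole population.

```python
import numpy as np

# Eight model "cells"; each object is coded by their activity vector

# Sparse coding: a small, specific subset of cells fires per object
sparse_codes = {
    "face":  np.array([1, 1, 0, 0, 0, 0, 0, 0]),
    "house": np.array([0, 0, 1, 1, 0, 0, 0, 0]),
}

# Population coding: graded activity across all cells; identity lies
# in the overall pattern, not in which particular cells fire
population_codes = {
    "face":  np.array([0.9, 0.7, 0.4, 0.2, 0.5, 0.3, 0.6, 0.15]),
    "house": np.array([0.2, 0.4, 0.8, 0.9, 0.3, 0.6, 0.15, 0.5]),
}

def active_fraction(code, threshold=0.1):
    """Fraction of cells responding above threshold."""
    return (code > threshold).mean()

print(active_fraction(sparse_codes["face"]))      # 0.25 - few cells active
print(active_fraction(population_codes["face"]))  # 1.0  - all cells active
```

The grandmother-cell idea is the sparse scheme pushed to one active cell per object; the brain likely sits somewhere between these two extremes.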
Invariance in recognition
Humans can identify objects under many different conditions
2 ideas:
- Form-cue invariance: the brain's categorization is constant regardless of the form of the cue that represents the object
- Perceptual constancy: the ability to recognize objects from different angles, lighting, size, position, etc.
Both suggest that, at some level, our mental representation of objects is abstract/conceptual
Lateral Occipital Complex (LOC)
Involved in form-cue invariance and perceptual constancy
May be the stage in visual processing where abstract shape representations are formed
- Supports recognition of objects despite differing conditions
Are neural representations viewpoint-dependent or viewpoint-independent?
There seems to be some amount of viewpoint dependency
- can recognize objects regardless of viewpoint, but more familiar viewpoints cause faster and more accurate recognition
It’s a bit of both
» some ventral stream cells change depending on orientation, and others respond the same regardless of angle
> recognition likely depends on comparison with stored abstract representations in the brain
Viewpoint dependence arises in the earlier stages of the ventral stream, and viewpoint independence in later stages
Hemispheric Differences in Processing
Distinct Neural mechanisms involved in object recognition:
- Left ventral stream: analyzing the parts of an object
- Right ventral stream: analyzing whole forms/the configuration of parts
Inversion Effect
Recognition is poorer when objects are viewed upside-down
» suggests we rely on how features are put together/ related to each other
»»> especially important for categories with which we have more expertise
Nonlocal binding
Model for how whole shapes are recognized
- whole object is represented by the co-activation of cells that represent the parts of an object in different locations
- no separate unit represents the whole; the whole is perceived when these units are all firing together
Conjunctive Encoding
Model for how whole shapes are recognized
- Lower-level regions represent features and send output to higher-level regions representing the shapes that result from the joining of those features
- separate unit represents the whole, beyond just the ones that represent the parts
Evidence leans towards this model
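The contrast between the two models can be sketched in a few lines (hypothetical feature names and a made-up "mug" unit, purely for illustration): under conjunctive encoding a separate higher-level unit codes the whole, firing only when its constituent feature units all fire.

```python
# Lower-level feature units (hypothetical feature inventory)
FEATURES = ("curve", "vertical_line", "horizontal_line")

def feature_layer(stimulus_features):
    """Which lower-level feature units fire for this stimulus."""
    return {f: (f in stimulus_features) for f in FEATURES}

def mug_unit(feature_activity):
    """Conjunctive encoding: a dedicated higher-level unit for the whole
    shape, active only when its constituent features are co-active."""
    return feature_activity["curve"] and feature_activity["vertical_line"]

act = feature_layer({"curve", "vertical_line"})
print(mug_unit(act))  # True - a separate unit signals the whole

# Under nonlocal binding, by contrast, there would be no mug_unit at all:
# the "mug" percept would just be the co-activation of the feature units,
# with no additional unit standing for the conjunction.
```

The extra `mug_unit`, beyond the part-representing units, is exactly what distinguishes conjunctive encoding from nonlocal binding.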
Object recognition based on tactile and auditory modalities
Humans can also recognize objects by touch and sound
- Similarities in brain organization emerge across modalities:
»> Early cortical areas code basic features
»> Higher-level areas organize sensations into representations of recognizable objects
Auditory Agnosia
The inability to recognize the meaning of sounds
Somatosensory agnosia/ tactile agnosia
Inability to recognize an object by touch
What vs. Where across modalities
All sensory modalities must distinguish between what and where
There is a distinction between "what" and "where" pathways in the brain that seems to be a basic organizational feature for all modalities.
»> ventral (what) and dorsal (where) streams for the visual system