Chapter 4: Recognizing Visual Objects Flashcards
Tasks involved in perceiving objects
-Determining the boundaries of an object
>Separating out which changes in luminance and colour are intrinsic to the object and which are due to external lighting conditions (shadows)
-Potential ambiguity of an object seen from one viewpoint (variable views)
-Determining the “true” shape of an object when some of its contours are hidden (image clutter)
-Need to be able to recognize an incredibly large number of objects in the world (object variety)
Perceptual Organization
(lower level processes)
-Represent edges
-Figure-ground (assign border ownership)
-Grouping
-Interpolation of missing edges
Object Recognition
(higher level process)
Match perceptual representation to those stored in memory
Edge detection
Mexican hat: the receptive-field profile of a ganglion cell. A small hat carries the fine, high spatial frequency info; a big hat carries the coarse, low spatial frequency info.
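A minimal sketch of the Mexican hat idea, assuming the profile is modelled as a difference of Gaussians (a common approximation, not something these notes specify). The function names, the sigma values, and the 1.6 centre/surround ratio are illustrative choices. It filters a 1-D luminance step with a small hat and a big hat.

```python
import math

def gaussian(x, sigma):
    """Value of a zero-mean Gaussian with the given sigma at position x."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

def mexican_hat(sigma, ratio=1.6):
    """Difference-of-Gaussians kernel: narrow centre minus wider surround."""
    half_width = int(math.ceil(4 * sigma * ratio))
    xs = range(-half_width, half_width + 1)
    center = [gaussian(x, sigma) for x in xs]
    surround = [gaussian(x, sigma * ratio) for x in xs]
    # Normalise both parts so the kernel sums to zero:
    # uniform light then produces no response.
    cs, ss = sum(center), sum(surround)
    return [c / cs - s / ss for c, s in zip(center, surround)]

def filter_signal(signal, kernel):
    """Apply the kernel at every position, clamping indices at the borders."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), len(signal) - 1)
            total += w * signal[j]
        out.append(total)
    return out

def active_width(response, threshold=0.01):
    """Number of positions where the response is clearly non-zero."""
    return sum(1 for r in response if abs(r) > threshold)

# Dark region followed by a light region: one edge between index 49 and 50.
luminance = [0.0] * 50 + [1.0] * 50
small_hat = filter_signal(luminance, mexican_hat(sigma=1.0))  # fine scale
big_hat = filter_signal(luminance, mexican_hat(sigma=4.0))    # coarse scale

print(active_width(small_hat), active_width(big_hat))
```

Both hats stay silent over uniform light and respond only around the edge, with the response changing sign across it. The small hat's response is confined to a few positions (fine detail); the big hat's response is spread over many more positions (coarse detail).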
Border ownership:
When the edges are appropriately assigned, the region in front ‘owns’ the border. If the assignment is flipped, we don’t know what’s figure and what’s ground. The amount of light in the field is identical, but the firing rate for light on dark is higher. Activation in V2 is higher when one circle with overlapping triangles is followed by a different configuration than when it is followed by the same one (strong response to a change in border ownership).
Grouping/ Synchronicity
By size, proximity, colour, movement, orientation, symmetry. A single bar of light across a receptive field gives high synchrony between neurons, two separate bars moving in the same direction give weak synchrony, and bars moving in different directions give no synchrony.
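A toy sketch of how synchrony between two neurons could be quantified, assuming it is measured as the correlation between binary spike trains. Everything here is hypothetical: the shared-drive model, the firing rate, and the `shared_weight` values for the three conditions are picked only to illustrate high, weak, and no synchrony. They are not measured data.

```python
import random

def spike_train(shared, shared_weight, rng, rate=0.2):
    """Binary spike train: at each time step, copy the shared drive with
    probability shared_weight, otherwise fire independently at the given rate."""
    return [s if rng.random() < shared_weight else int(rng.random() < rate)
            for s in shared]

def correlation(a, b):
    """Pearson correlation between two equal-length spike trains."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
shared_drive = [int(rng.random() < 0.2) for _ in range(5000)]

# Hypothetical coupling strengths for the three stimulus conditions.
conditions = {"single bar": 0.9, "two bars, same direction": 0.5, "different directions": 0.0}
synchrony = {}
for name, weight in conditions.items():
    neuron_a = spike_train(shared_drive, weight, rng)
    neuron_b = spike_train(shared_drive, weight, rng)
    synchrony[name] = correlation(neuron_a, neuron_b)
    print(name, round(synchrony[name], 2))
```

The more the two neurons are driven by one common stimulus, the more their spikes coincide, which is the sense in which synchrony can signal that two edges belong to the same object.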
Interpolation with bars
We perceive the long light bar as having passed over the object, so firing is high since we “fill in the edge”; but if the bar is blocked off, firing is low since we “see” nothing pass over.
Prototype matching
(structural description theories)
Viewpoint invariant representation
Biederman (recognition by components), Marr
Based on primitives, called 3D “geons”
These are fundamental 3-dimensional shapes that are easily discriminable under a wide variety of conditions
Defines parts and the spatial relation between them
In-category distinctions difficult.
-We’d be a lot worse at object recognition if this were the case
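A minimal sketch of a structural description, assuming an object is stored only as a set of parts plus the spatial relations between them. The class name, part names, and relation labels are illustrative, not Biederman's notation. The mug and pail contrast (same parts, different relation) is a standard example for recognition by components.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StructuralDescription:
    geons: frozenset      # which primitive parts are present
    relations: frozenset  # (part, relation, part) triples

def describe(geons, relations):
    """Build a description; note that no viewpoint is stored anywhere."""
    return StructuralDescription(frozenset(geons), frozenset(relations))

mug = describe({"cylinder", "curved_handle"},
               {("curved_handle", "side_of", "cylinder")})
pail = describe({"cylinder", "curved_handle"},
                {("curved_handle", "top_of", "cylinder")})

# Two different mugs reduce to exactly the same parts and relations.
my_mug = describe({"cylinder", "curved_handle"},
                  {("curved_handle", "side_of", "cylinder")})
your_mug = describe({"cylinder", "curved_handle"},
                    {("curved_handle", "side_of", "cylinder")})

print(mug == pail)         # between-category: relation differs
print(my_mug == your_mug)  # in-category: descriptions are identical
```

Because the description ignores viewpoint, it is the same from any angle, but it also cannot tell two members of the same category apart, which is the in-category problem noted above.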
Template matching
Viewpoint-dependent representation (if the orientation changes, recognition suddenly gets much worse). Therefore it works okay but needs lots of stored representations for each object (unless a population code is used: then not more neurons, just more patterns of activity across neurons).
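A minimal sketch of viewpoint-dependent template matching, assuming an image is a small binary grid and the match score is the fraction of cells that agree with a stored template. The grid, the scoring rule, and the function names are illustrative assumptions.

```python
def rotate90(img):
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def match_score(img, template):
    """Fraction of cells where the image and the template agree."""
    cells = [a == b for row_i, row_t in zip(img, template)
             for a, b in zip(row_i, row_t)]
    return sum(cells) / len(cells)

def recognize(img, stored_views):
    """Best match over every stored view of the object."""
    return max(match_score(img, view) for view in stored_views)

L_SHAPE = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

rotated = rotate90(L_SHAPE)

# One stored view: fine for the trained orientation, worse after rotation.
print(match_score(L_SHAPE, L_SHAPE))
print(match_score(rotated, L_SHAPE))

# Storing one template per orientation restores recognition.
all_views = [L_SHAPE]
for _ in range(3):
    all_views.append(rotate90(all_views[-1]))
print(recognize(rotated, all_views))
```

The cost is the number of stored views per object. With a population code those views would be different activity patterns over the same units rather than extra units: n on/off units allow up to 2**n distinct patterns.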
V4 characteristics
- gets input from all cells
- big receptive field
- increase complexity of what cells can respond to
- contours (the role that a shape is playing in the overall object)
- straight or curved
- preferred location at a certain angular position
Inferotemporal Cortex
-respond to shapes (combo of contours)
-FFA is here
-increased receptive field (entire visual field)
-increased complexity
-Includes a number of subdivisions/channels/modules in which neurons show greatest responses to particular classes of objects (FFA)
-modular organization (however activation is also distributed throughout)
Hierarchical Coding
- In areas lower down (like V2), location is more important [NB for representation of the world]
- info is organized hierarchically in ITC
- different aspects of a stimulus are processed by different anatomical areas, but all must work together
- top is more vague (animate vs inanimate) and lower is finer processing (like FFA or ob?), even finer at the bottom (is it the eyes/lips of the face?)