Chapter 4b: Object Recognition Flashcards
Picture
WHAT do you see in this image?
At the level of the retina, you “see” an array of point-lights bouncing off the page and exciting your rods and cones.
In early visual brain areas, you “see” a collection of oriented lines and a collage of red, green, yellow, and blue color patches.
But your response to this question was almost certainly not “light” or “lines” or “colors”; what we all perceive in this scene are “toys.”
The ability to organize visual sensations into coherent objects and then assign meaningful category labels to these objects is in many ways the ultimate accomplishment of vision.
The problem of object recognition:
What do you see?
Picture 1: Picture of the front of a red house.
Picture 2: Abstract watercolor painting of a house.
Picture 3: Picture of the same house as in Picture 1, viewed from the side.
The problem of object recognition:
The pictures were just a bunch of pixels on a screen, but in each case you perceived a house.
How did you recognize all 3 images as depicting a house?
How did you recognize the 1st and 3rd images as depicting the same house, but from different viewpoints?
How does your visual system move from points of light, like pixels, to whole entities in the world, like houses?
Processes in object recognition:
- Determine features present in image (“Low-level vision”)
- Group features into objects (“Middle vision”)
- Match perceived representations to encoded representations (“High-level vision”)
Object Recognition Challenges
How do we match a sensation to a memory?
How is it possible to recognize objects from different vantage points when their optical projections can vary so dramatically?
* Pictures of teapots from different angles
Naïve template theory
Object Recognition Theory
The proposal that the visual system recognizes objects by matching the neural representation of the image with a stored representation of the same “shape” in the brain.
That is, maintain a memory of many different views for each object we need to recognize.
“Pandemonium” Oliver Selfridge (1959)
“Lock-and-key” representations: bar codes
Problem: You would need too many templates!
* Example of all the different A fonts.
Very many templates would be required to recognize the different ways that the letter A can be represented.
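The template problem can be made concrete with a minimal sketch. It assumes tiny 3x3 binary "images" and a hypothetical pixel-agreement score; the names `TEMPLATES`, `match_score`, and `recognize` are made up for illustration and do not describe how the brain or any particular model stores templates. A rotated letter stands in for any variation in font or view.

```python
# Hypothetical 3x3 binary images: 1 = ink, 0 = background.
TEMPLATES = {
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
}


def match_score(image, template):
    """Fraction of pixels on which the image and the template agree."""
    pairs = [
        (p, q)
        for image_row, template_row in zip(image, template)
        for p, q in zip(image_row, template_row)
    ]
    return sum(p == q for p, q in pairs) / len(pairs)


def recognize(image, templates=TEMPLATES, threshold=1.0):
    """Return the label of the best-matching template, or None if no
    template reaches the threshold (a strict lock-and-key match)."""
    best_label, best_score = max(
        ((label, match_score(image, t)) for label, t in templates.items()),
        key=lambda item: item[1],
    )
    return best_label if best_score >= threshold else None


upright_t = ((1, 1, 1),
             (0, 1, 0),
             (0, 1, 0))

# The same T rotated 90 degrees clockwise.
rotated_t = ((0, 0, 1),
             (1, 1, 1),
             (0, 0, 1))

print(recognize(upright_t))  # matches the stored "T" template
print(recognize(rotated_t))  # no stored template fits this variant
```

In this sketch the only way to recognize the rotated T is to store one more template for it, and the same holds for every new font, size, or viewpoint.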
Describe this object
When asked to describe a novel object, observers typically do so by identifying different parts.
Structural description theory
Object Recognition Theory
A description of an object in terms of the nature of its constituent parts and the relationships between those parts.
I.e., exploit the properties that distinguish most objects from one another yet remain relatively stable over changes in view.
“Generalized Cones” David Marr (1977)
“Recognition-by-Components” Biederman (1987)
Object recognition by components
Biederman (1987)
Objects are defined as configurations of qualitatively distinct parts called Geons.
Geons are defined by configurations of non-accidental properties.
Geons
Qualitatively distinct parts that are configured into objects.
Defined by configurations of non-accidental properties.
Each type of geon is defined by a particular configuration of non-accidental properties.
(cone, cylinder, block, etc.)
Geons are distinguished by their non-accidental properties
the number of straight and curved edges
which edges are parallel to one another
the number of vertices of each type
the presence of symmetries
Meaning in the Edges
Non-accidental features provide clues to object structure
T junctions mostly signal OCCLUSION (One object in front of another)
Y and ARROW junctions signal corners (and not occlusion) most of the time.
These rules FAIL when viewing objects from ACCIDENTAL VIEWPOINTS (which is why we can sometimes form the wrong representation of an image).
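The junction rules above can be written as a simple lookup. This is only a sketch of the heuristic as stated in these notes; the names `JUNCTION_CUES`, `interpret_junction`, and `count_cues` are hypothetical, and the table encodes "most of the time" cues, not guarantees.

```python
# Heuristic cues taken from the notes above; accidental viewpoints can violate them.
JUNCTION_CUES = {
    "t": "occlusion (one object in front of another)",
    "y": "corner",
    "arrow": "corner",
}


def interpret_junction(junction_type):
    """Return the most likely interpretation of a junction label, or 'unknown'."""
    return JUNCTION_CUES.get(junction_type.strip().lower(), "unknown")


def count_cues(junction_types):
    """Tally the interpretations of all junctions found in one image."""
    tally = {}
    for junction_type in junction_types:
        cue = interpret_junction(junction_type)
        tally[cue] = tally.get(cue, 0) + 1
    return tally


print(count_cues(["Y", "arrow", "T"]))
```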
Objects
Each type of object is defined by a particular configuration of geons.
Objects = Cup, telephone, suitcase, etc.
Geons = Cone, cylinder, block, etc.
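A structural description can be sketched as a set of parts plus the relations between them. The example below uses the commonly cited cup versus pail contrast (same parts, different attachment); the class `StructuralDescription`, the `MEMORY` table, and the part and relation names are all hypothetical labels chosen for illustration, not terms from the theory itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralDescription:
    geons: frozenset      # names of the parts
    relations: frozenset  # (part_a, relation, part_b) triples


# Stored descriptions: identical parts, different configuration.
MEMORY = {
    "cup": StructuralDescription(
        geons=frozenset({"cylinder", "curved handle"}),
        relations=frozenset({("curved handle", "side-attached", "cylinder")}),
    ),
    "pail": StructuralDescription(
        geons=frozenset({"cylinder", "curved handle"}),
        relations=frozenset({("curved handle", "top-attached", "cylinder")}),
    ),
}


def recognize(description, memory=MEMORY):
    """Return the name of the stored object whose description matches, else None."""
    for name, stored in memory.items():
        if stored == description:
            return name
    return None


# A description extracted from some image; note that viewpoint never appears in it.
seen = StructuralDescription(
    geons=frozenset({"curved handle", "cylinder"}),
    relations=frozenset({("curved handle", "top-attached", "cylinder")}),
)
print(recognize(seen))
```

Because the description records parts and relations but no viewing angle, a single stored entry per object is enough in this sketch, which is the contrast with template matching.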
Prediction of Recognition by Components:
Deletion of contours in an image should have the greatest effect on recognition performance if it masks non-accidental properties or geons.
Geon Theory Task
Subjects are presented with an intact or contour-deleted object and are asked to name it as quickly as possible.
Recognition performance is more severely impaired by vertex deletion than by midsection deletion.
Recognition performance is more severely impaired by geon deletion than by midsection deletion.
Some evidence suggests that object recognition is only possible for viewpoints that are close to those that were observed during training.
This is the opposite of what Recognition by Components predicts.