Lecture 3, Object Recognition Flashcards
What is object recognition?
Perception of objects is different for humans and computers.
For humans –> perception of familiar items
For computers –> perception of familiar patterns
Why is object recognition difficult?
- Environment contains hundreds of overlapping objects
- Objects vary in position (translation), orientation (rotation), size and colour, so recognition has to be invariant to these changes
- Objects can vary in the visual scene, e.g., partial occlusions and presence of other objects
- Intra-class variation -> the same kind of object takes different forms, e.g., different types of chairs
- Only part of an object may be visible
- Viewpoint variation -> an object may be harder to recognise from one viewpoint than from another
What are theories of 2D pattern matching?
Template theories, prototype theories, feature theories, structural descriptions
What are template theories?
A miniature copy (template) of every known pattern is stored in LTM - the stimulus is compared with the templates in memory, and the template with the greatest overlap is taken as the match
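A minimal sketch of matching by overlap, assuming patterns are tiny binary grids and that "overlap" means the number of cells on which stimulus and template agree. The grids and the names TEMPLATES, overlap and recognise are hypothetical, chosen only for illustration.

```python
# Each stored template is a 3x3 grid: '#' is a filled cell, '.' an empty one.
TEMPLATES = {
    "T": ["###",
          ".#.",
          ".#."],
    "L": ["#..",
          "#..",
          "###"],
    "I": [".#.",
          ".#.",
          ".#."],
}


def overlap(stimulus, template):
    """Count the cells where the stimulus and the template agree."""
    return sum(
        s == t
        for s_row, t_row in zip(stimulus, template)
        for s, t in zip(s_row, t_row)
    )


def recognise(stimulus, templates=TEMPLATES):
    """Return the label of the stored template with the greatest overlap."""
    return max(templates, key=lambda label: overlap(stimulus, templates[label]))


# A T with its bottom cell missing: an imperfect match, but still closest to "T".
noisy_t = ["###",
           ".#.",
           "..."]
best = recognise(noisy_t)
```

The comparison is cell by cell, so it only works when the stimulus has the same size, position and orientation as the stored grid.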
What are some problems with template theories?
- Imperfect matches
- Cannot account for the flexibility of pattern recognition system
- Comparison requires the stimulus to have the same orientation, size and position as the template
What are prototype theories?
Modification of template matching (flexible templates) - a prototype possesses the average of each individual characteristic of the patterns it stands for. No match is perfect –> a criterion for what counts as a match is missing
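A minimal sketch of a prototype as the average of each characteristic, assuming every pattern can be reduced to a tuple of numbers. The exemplar values, the distance measure and the criterion value are all hypothetical; the theory itself does not say what the criterion should be.

```python
from math import dist


def prototype(exemplars):
    """Average each characteristic across the exemplars."""
    n = len(exemplars)
    return tuple(sum(values) / n for values in zip(*exemplars))


def matches(stimulus, proto, criterion):
    """Accept the stimulus if it lies within `criterion` of the prototype.

    The criterion is an assumed free parameter, not something the theory supplies.
    """
    return dist(stimulus, proto) <= criterion


# Made-up exemplars described by two characteristics each.
exemplars = [(2.0, 4.0), (4.0, 4.0), (3.0, 1.0)]
proto = prototype(exemplars)  # the average pattern; not one of the exemplars itself

close_enough = matches((3.0, 2.5), proto, criterion=1.0)
too_far = matches((6.0, 3.0), proto, criterion=1.0)
```

Whether a stimulus counts as a match depends entirely on the value chosen for the criterion.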
Is there any supporting evidence for prototype theories?
Documented by Franks and Bransford:
–> Presented objects based on prototypes
–> Prototype not shown
–> Participants were confident they had seen the prototype
–> Suggests that a prototype had been formed
HOWEVER, it is difficult to say how a prototype is arrived at in the first place
What are feature theories?
A pattern consists of a set of features or attributes, e.g., letter A = 2 straight lines and a connecting bar. However, the relationship between the features also needs to be known, e.g., do / \ - on their own make an A?
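A minimal sketch of why a bare feature list is not enough, assuming each letter is stored as a count of its features. The feature names and the inventory LETTER_FEATURES are hypothetical.

```python
from collections import Counter

# Hypothetical feature inventory: which features each letter contains, and how many.
LETTER_FEATURES = {
    "A": Counter({"oblique line": 2, "horizontal bar": 1}),
    "T": Counter({"vertical line": 1, "horizontal bar": 1}),
    "L": Counter({"vertical line": 1, "horizontal bar": 1}),
}


def candidates(features):
    """Return every letter whose stored features equal the observed ones."""
    return sorted(
        letter for letter, stored in LETTER_FEATURES.items() if stored == features
    )


observed = Counter({"vertical line": 1, "horizontal bar": 1})
possible = candidates(observed)  # both T and L fit: the features alone cannot decide
```

T and L contain the same features here and differ only in how those features are arranged.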
What are structural descriptions?
Describe the nature of the components and the structural arrangement of these parts, i.e., what an object is composed of and how its parts relate to one another. e.g., Capital letter T = 2 parts; 1 horizontal, 1 vertical; vertical supports horizontal; vertical bisects horizontal.
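A minimal sketch of a structural description as parts plus relations, using the capital T example. The class name, the (subject, relation, object) encoding and the description given for L are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructuralDescription:
    parts: frozenset      # the components of the pattern
    relations: frozenset  # (subject, relation, object) triples linking the parts


CAPITAL_T = StructuralDescription(
    parts=frozenset({"horizontal", "vertical"}),
    relations=frozenset({
        ("vertical", "supports", "horizontal"),
        ("vertical", "bisects", "horizontal"),
    }),
)

# Hypothetical description of L: same two parts, different arrangement.
CAPITAL_L = StructuralDescription(
    parts=frozenset({"horizontal", "vertical"}),
    relations=frozenset({("vertical", "meets end of", "horizontal")}),
)

same_parts = CAPITAL_T.parts == CAPITAL_L.parts  # True: identical components
same_pattern = CAPITAL_T == CAPITAL_L            # False: the relations differ
```

Two patterns with identical parts come out as different descriptions once the relations are included.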
What is 3D object recognition?
The input to the visual system must be interpreted as coherent structures, segregated from one another and from the background. It must then be processed to give a description which can be matched to the descriptions of visual objects stored in memory.
What is Marr & Nishihara’s theory of 3D object recognition?
Objects are made up of cylinders; the relationship between the cylinders must be specified (structural description). Structural relationships are expressed by a hierarchical organisation of cylinders. Each cylinder has an axis, and the way in which other cylinders are joined to it is expressed as coordinates.
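A minimal sketch of a hierarchical organisation of cylinders, assuming each cylinder stores where it joins its parent's axis and at what angle. The field names and every numeric value below are made up for illustration and are not taken from the theory.

```python
from dataclasses import dataclass, field


@dataclass
class Cylinder:
    name: str
    attach_at: float = 0.0   # assumed coordinate: position along the parent's axis (0 to 1)
    angle_deg: float = 0.0   # assumed coordinate: angle between this axis and the parent's
    parts: list = field(default_factory=list)  # smaller cylinders joined to this one

    def walk(self, depth=0):
        """Yield (depth, cylinder) pairs from coarse to fine."""
        yield depth, self
        for part in self.parts:
            yield from part.walk(depth + 1)


# Illustrative hierarchy: whole body -> limbs -> limb segments.
human = Cylinder("body", parts=[
    Cylinder("head", attach_at=1.0),
    Cylinder("arm", attach_at=0.8, angle_deg=160.0, parts=[
        Cylinder("forearm", attach_at=1.0, parts=[
            Cylinder("hand", attach_at=1.0),
        ]),
    ]),
    Cylinder("leg", attach_at=0.0, angle_deg=180.0),
])

outline = [(depth, cylinder.name) for depth, cylinder in human.walk()]
```

Each level describes its parts relative to its own axis, so the description of the arm does not change when the body as a whole moves.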
What are the limitations of Marr and Nishihara’s theory of 3D object recognition?
–> Difficult to think about how to break down all objects into a series of cylinders
–> May work better for biological entities
What is Biederman’s theory of 3D object recognition?
Provided an alternative to Marr & Nishihara’s theory.
- Objects are composed of basic shapes
- GEONS -> Geometrical ions
- Approx 36 different shapes
- Viewpoint invariant theory
- Relationship between geons can be described structurally
What are some examples of the structural relationships set out by Biederman?
Relative size, verticality, centring, relative size of surfaces at join.
What did Biederman believe concave parts of objects were helpful for?
Segmenting visual image into parts