Weeks 3 & 4 - 'What' Pathway (Karen Lander) Flashcards
How do humans perform object recognition?
By perceiving familiar items, using between-category and within-category discrimination
What is between-category discrimination?
Telling apart a table and a chair (they belong to different object categories)
What is within-category discrimination (intra-class variation)?
Telling apart two different chairs (they belong to the same object category)
How do computers perform object recognition?
Perception of familiar patterns
Why is object recognition difficult?
- The environment contains hundreds of overlapping objects
- Yet perceptual experience is of structured, coherent objects which we can recognise, use and usually name
- The apparent size and shape of an object do not change despite large variations in the retinal image
What variability in the environment makes object recognition difficult?
- The object can appear in different areas of the field of view
- Can appear in different rotations
- Can appear at different sizes
- Can appear as different colours (when other tones overlap)
- The presence of other objects can occlude the object of focus
What is viewpoint variation as a challenge for recognition?
- Objects can be viewed from different angles and viewpoints
- However, we can still normally recognise the object
What is the template theory for 2D pattern matching?
- We possess a mini template of a pattern (A for example) in long-term memory (LTM)
- We have variations of different patterns (B for example)
- We match the pattern to memory when we see something
- When the match is good enough, we have recognition (greatest overlap)
What are 3 issues of the template theory for 2D pattern recognition?
- Problem with imperfect matches
- Cannot account for the flexibility and complexity of the pattern recognition system
- Comparison requires identical orientation, size, position of template to stimuli
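A toy sketch of the template account and its main weakness may help here. It assumes patterns are small grids of on/off cells, measures overlap as the cells shared by both patterns divided by the cells used by either, and uses an arbitrary criterion of 0.8. The names `overlap` and `recognise` and the letter T grids are illustrative choices, not part of the theory.

```python
# Toy illustration only: grids, overlap measure and criterion are assumptions.
TEMPLATE_T = [
    "#####",
    "..#..",
    "..#..",
    "..#..",
    "..#..",
]

# The same T moved down by one row (the bottom of the stem falls off the grid).
SHIFTED_T = [
    ".....",
    "#####",
    "..#..",
    "..#..",
    "..#..",
]


def on_cells(pattern):
    """Return the set of (row, column) positions that are switched on."""
    return {
        (r, c)
        for r, row in enumerate(pattern)
        for c, ch in enumerate(row)
        if ch == "#"
    }


def overlap(stimulus, template):
    """Shared on-cells divided by all on-cells in either pattern."""
    s, t = on_cells(stimulus), on_cells(template)
    union = s | t
    return len(s & t) / len(union) if union else 1.0


def recognise(stimulus, template, criterion=0.8):
    """Recognition occurs when the overlap reaches the criterion."""
    return overlap(stimulus, template) >= criterion


print(recognise(TEMPLATE_T, TEMPLATE_T))  # identical position
print(recognise(SHIFTED_T, TEMPLATE_T))   # same shape, different position
```

In this sketch the identical pattern passes the criterion, while the same T moved down one row shares only 4 of 13 cells with the template and is rejected, which is the position problem in miniature.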
What is the prototype theory of 2D pattern matching?
- We do not have many different templates
- Instead, we have a stored average template for each individual characteristic
- We modify the templates for matching
- No match is perfect; a criterion for matching is needed
What is Franks and Bransford’s evidence for prototype theories of 2D pattern matching?
METHOD
- Present participants with visual patterns
- These patterns were centred around a prototype
- The prototype was not shown
RESULTS
- When asked how confident they were that they had seen each pattern before, participants were most certain they had seen the prototype, even though it had never been shown
- Suggests the existence of prototypes
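The prototype idea, and the pattern of results above, can be illustrated with a toy sketch. It assumes each pattern is a short list of numbers, that the stored prototype is the average of the patterns seen, and that a match needs a distance within an arbitrary criterion of 1.5. The data are invented for illustration and are not taken from the study.

```python
import math

# Invented study patterns: each is a distortion of an unseen prototype.
exemplars = [
    [1.0, 4.0, 2.0],
    [3.0, 2.0, 2.0],
    [2.0, 3.0, 5.0],
    [2.0, 3.0, -1.0],
]


def average(patterns):
    """Average the patterns position by position to form a prototype."""
    n = len(patterns)
    return [sum(values) / n for values in zip(*patterns)]


def distance(a, b):
    """Euclidean distance between two patterns (smaller = more familiar)."""
    return math.dist(a, b)


def matches(stimulus, prototype, criterion=1.5):
    """No match is perfect, so accept anything within the criterion."""
    return distance(stimulus, prototype) <= criterion


prototype = average(exemplars)  # never presented during study

# The unseen prototype is closer to the stored average than any studied pattern.
print(distance(prototype, prototype))
print([round(distance(e, prototype), 2) for e in exemplars])
print(matches([2.0, 3.5, 2.0], prototype))  # a new, similar pattern
```

On this account the prototype feels most familiar because it sits exactly at the stored average, even though it is not among the studied patterns.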
What is the feature theory for 2D pattern matching?
- Patterns consist of a set of features or attributes
- For example: the letter A has two straight lines and a connecting crossbar
What are structural descriptions as an explanation of 2D pattern matching?
- We need to describe the nature of the components of a configuration and the structural arrangement of these parts
- For example, the letters A and H have similar features (two lines and a crossbar) but differ in how those features are arranged
- Therefore we need to know the relationship between the features rather than just the features themselves.
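The contrast between a feature list and a structural description can be sketched in a few lines. The encodings below (feature names, part labels and relation labels) are hypothetical and chosen only to make the point.

```python
from collections import Counter

# Hypothetical feature lists: what parts are present, with no arrangement.
FEATURES = {
    "A": Counter({"straight line": 2, "crossbar": 1}),
    "H": Counter({"straight line": 2, "crossbar": 1}),
}

# Hypothetical structural descriptions: (part, part, relation) triples.
STRUCTURE = {
    "A": {
        ("line 1", "line 2", "meet at the top"),
        ("crossbar", "line 1", "joins midpoint"),
        ("crossbar", "line 2", "joins midpoint"),
    },
    "H": {
        ("line 1", "line 2", "parallel"),
        ("crossbar", "line 1", "joins midpoint"),
        ("crossbar", "line 2", "joins midpoint"),
    },
}


def same_features(x, y):
    """True when two letters have the same parts in the same numbers."""
    return FEATURES[x] == FEATURES[y]


def same_structure(x, y):
    """True when the parts are also arranged in the same way."""
    return STRUCTURE[x] == STRUCTURE[y]


print(same_features("A", "H"))   # feature lists cannot tell them apart
print(same_structure("A", "H"))  # the relations between parts can
```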
How is 3D object recognition similar to 2D pattern recognition?
- We must interpret input to the visual system as coherent structures
How is 3D object recognition different from 2D pattern recognition?
- We need to segregate the object from the background early in 3D
- In 3D, the input must be processed to give a description which can then be matched to the descriptions of visual objects stored in memory
What are 4 encompassing questions about object recognition?
- What features are used in the structural description (primitives)
- How is the relationship between these elements specified?
- How is the overall description invariant across views (descriptions same for different people)?
- What about viewpoint dependence?
What is Marr and Nishihara’s cylinder explanation of how people recognise 3D objects?
- Objects are composed of cylinders, and the relationship between the cylinders is the structural description
- Structural relations of an object are explained by a hierarchical organisation of cylinders
- Describing the position of each cylinder relative to its own axis gives a description which is invariant across viewpoints
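A hierarchy of cylinders can be written down as a simple tree. The sketch below is one possible encoding, not Marr and Nishihara's own notation: the `Cylinder` class, the `attach_at` and `angle_deg` fields and all the numbers are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cylinder:
    name: str
    length: float            # length along the cylinder's axis (arbitrary units)
    attach_at: float = 0.0   # assumed: join point as a fraction of the parent's axis
    angle_deg: float = 0.0   # assumed: angle between this axis and the parent's axis
    parts: List["Cylinder"] = field(default_factory=list)


def describe(cyl, depth=0):
    """List the hierarchy, indenting each level of sub-parts."""
    lines = ["  " * depth + cyl.name]
    for part in cyl.parts:
        lines.extend(describe(part, depth + 1))
    return lines


def count(cyl):
    """Count this cylinder plus every cylinder nested beneath it."""
    return 1 + sum(count(part) for part in cyl.parts)


# Part of a human figure, built from the smallest part upwards.
hand = Cylinder("hand", 1.0, attach_at=1.0)
forearm = Cylinder("forearm", 3.0, attach_at=1.0, angle_deg=20.0, parts=[hand])
upper_arm = Cylinder("upper arm", 3.0, attach_at=0.9, angle_deg=160.0, parts=[forearm])
torso = Cylinder("torso", 6.0, parts=[upper_arm])

print("\n".join(describe(torso)))
```

Nothing in this description refers to the viewer or the retinal image, only to relations between parts, which is why the same description holds from any viewpoint.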
What is an issue with Marr and Nishihara’s cylinder approach to explaining 3D object recognition?
- Doesn’t work well for everything
- For example a scrunched piece of paper cannot be broken down into cylinders
- We need a more flexible approach
What is Biederman’s recognition-by-components theory for 3D object recognition?
- Objects are composed of 36 basic shapes
- These shapes are called GEONS (geometrical ions)
- We can make any object as a combination of GEONS depending on how they are organised
- Breaking an object into a series of its constituent GEONs reveals what the structural description of the object is.
What type of theory is Biederman’s recognition-by-components theory?
Viewpoint invariant theory
Which elements of an object’s contour are helpful in segmenting visual images into parts, as suggested by Biederman?
- Concave parts
- Areas of concavity are helpful for segregating GEONs
What are the non-accidental properties of GEONs which aid recognition of an object?
- Curvature- points on a curve
- Parallel- Set of points in parallel
- Co-termination- Edges terminating in a common point
- Symmetry- versus asymmetry
- Co-linearity- points in a straight line
What is an example of how non-accidental properties can aid recognition of an object?
- A cylinder possesses curved edges and two parallel edges connecting the curved edges
- Regularities in the visual image are thought to reflect actual (non-accidental) regularities in the world.
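Several of the properties listed above are simple geometric relations between edges in the 2D image, so they can be checked directly on coordinates. The sketch below assumes edges are given as pairs of (x, y) end points; the function names and the tolerance are illustrative choices.

```python
def cross(o, a, b):
    """2D cross product of the vectors o->a and o->b (zero if aligned)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def collinear(points, tol=1e-9):
    """Co-linearity: every point lies on the line through the first two."""
    first, second = points[0], points[1]
    return all(abs(cross(first, second, p)) <= tol for p in points[2:])


def parallel(seg1, seg2, tol=1e-9):
    """Parallelism: the two edge directions have a zero cross product."""
    (x1, y1), (x2, y2) = seg1
    (x3, y3), (x4, y4) = seg2
    return abs((x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)) <= tol


def co_terminate(seg1, seg2):
    """Co-termination: the two edges share an end point."""
    return bool(set(seg1) & set(seg2))


left_edge = ((0, 0), (0, 2))
right_edge = ((1, 0), (1, 2))
top_edge = ((0, 2), (1, 2))

print(parallel(left_edge, right_edge))   # the two sides of a cylinder
print(co_terminate(left_edge, top_edge))  # edges meeting at a corner
print(collinear([(0, 0), (1, 1), (2, 2)]))
```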
What is the chain of events that occur when detecting 3D objects according to Biederman? (4)
- Edge extraction (or contour extraction)
- Detection of non-accidental properties and parsing of regions of concavity
- Determination of components
- Matching of components to object representations
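The final matching stage can be sketched as a lookup against stored descriptions. The example assumes the earlier stages have already produced a list of GEONs and the relation between them; the cup and pail entries, the labels and the `candidates` function are hypothetical encodings used only for illustration.

```python
# Hypothetical stored object representations: same GEONs, different arrangement.
OBJECTS = {
    "cup": {
        "geons": {"cylinder", "curved handle"},
        "relation": "handle attached to side",
    },
    "pail": {
        "geons": {"cylinder", "curved handle"},
        "relation": "handle attached to top",
    },
}


def candidates(detected_geons, relation=None):
    """Return stored objects consistent with the detected components.

    An object stays a candidate if it contains every detected GEON and,
    when a relation was recovered, if that relation matches too.
    """
    detected = set(detected_geons)
    return [
        name
        for name, description in OBJECTS.items()
        if detected <= description["geons"]
        and (relation is None or relation == description["relation"])
    ]


print(candidates(["cylinder", "curved handle"], "handle attached to side"))
print(candidates(["cylinder", "curved handle"], "handle attached to top"))
print(candidates(["cylinder"]))  # one component only: the match is ambiguous
```

With both components and their relation the match is unique; with a single component more than one stored object remains possible.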
According to Biederman, the removal of what would make object identification harder?
Edges at areas of concavity
What is some supporting evidence for Biederman’s model where edges were deleted in different areas?
METHOD
- Biederman deleted edges either at points where they could easily be reinstated or at points where they were difficult to determine
- Stimuli were presented for 100, 200 or 750 ms with 25%, 45% or 65% of contours removed
RESULTS
- The same amount of contour was removed from two versions of each image: at points of concavity in one version and away from these points in the other
- When edges were removed at points of concavity, recognition was slow and inaccurate (the objects were non-recognisable)
- When the concavities remained, recognition was relatively good and the shapes were recognisable
How does deleting components from an image affect the ability to recognise it (Biederman)?
- Deletion of a component affects the matching stage, reducing the number of components available to match
- Midsegment deletion also makes it more difficult to determine components
What are the results of the experiment conducted by Biederman on the removal of components versus midsegment elements?
- At brief exposures to the stimuli (65 ms), partial objects are better recognised
- At longer exposures (200 ms), midsegment deletion led to fewer errors
What support for Biederman’s theory comes from Vogels et al.?
- They found some cortical neurons in monkeys sensitive to GEONs
- They assessed the response of individual neurons in the inferior temporal cortex to a change in GEON or a change in the size of an object
- Some neurons responded more to GEON changes, providing support for GEONs