object recognition Flashcards
what is object recognition
perception of familiar items (memory)
why is object recognition so difficult
environments contain hundreds of overlapping objects
yet perceptual experience is of structured, coherent objects which we can recognise, use and usually name
apparent size and shape of object does not change despite large variations in retinal image
-our understanding of the object doesn’t change
examples of object variability
translation invariance
rotation invariance
size invariance (could be bigger/smaller or closer/further)
colour
partial occlusion and presence of other objects
intra-class variation
all chairs vary widely, but we still know that they are all chairs
we can recognise them when only part of an object is visible
viewpoint variation
we can recognise an object from many different view points
objects may be easier to recognise from some view points
template theory
mini copy or template in long term memory of all known patterns
multiple templates are held in memory
compare the stimulus with templates in memory, looking for the one with the greatest overlap, until a match is found
good match to template = recognition
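The matching step can be sketched as a toy program. This is only an illustration, not a claim about how the visual system works: the 3x3 binary grids, the two stored templates and the 0.8 threshold are all assumptions made for the example.

```python
# Toy template matching: patterns are small binary grids (1 = ink, 0 = blank).
# The stored templates and the 0.8 threshold are illustrative assumptions.

TEMPLATES = {
    "T": ["111",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
}

def overlap(stimulus, template):
    """Proportion of cells on which stimulus and template agree."""
    cells = [(s, t)
             for s_row, t_row in zip(stimulus, template)
             for s, t in zip(s_row, t_row)]
    return sum(s == t for s, t in cells) / len(cells)

def recognise(stimulus, templates=TEMPLATES, threshold=0.8):
    """Return the name of the best-overlapping template, or None if no match is good enough."""
    best = max(templates, key=lambda name: overlap(stimulus, templates[name]))
    return best if overlap(stimulus, templates[best]) >= threshold else None

noisy_t = ["111",
           "010",
           "011"]          # a T with one extra cell filled in
print(recognise(noisy_t))  # T (8 of 9 cells agree with the T template)
```

The threshold makes the open question "how good a match is good enough?" explicit: a stimulus that overlaps poorly with every template returns None.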
questions about template theory
normalisation?
numerous templates?
what type of template?
how does this work for complex patterns?
how good a match is good enough?
what if an object has no template match?
critique of the template theory
problem of imperfect matches
cannot account for the flexibility of pattern recognition system
comparison requires the stimulus to have the same orientation, size and position as the template
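A small sketch of why position matters for a rigid template. The grids and the overlap measure are assumptions chosen for illustration: the same vertical bar, moved one column to the right, agrees with its own template on only a third of the cells.

```python
def overlap(a, b):
    """Proportion of cells on which two equally sized binary grids agree."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)

template      = ["100", "100", "100"]   # vertical bar in the left column
same_position = ["100", "100", "100"]   # identical stimulus
shifted       = ["010", "010", "010"]   # same bar, one column to the right

print(overlap(template, same_position))  # 1.0
print(overlap(template, shifted) < 0.5)  # True: the shifted bar is a poor match
```

Without some normalisation step, or a separate template for every position, size and orientation, the shifted bar would not be recognised.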
prototype theories
modification of the template matching theory (flexible templates)
possess average of each individual characteristic (prototype)
no match is perfect; a criterion for matching is needed
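A minimal sketch of the prototype idea, assuming each item can be summarised by a few numeric characteristics. The two features (height and width in cm), the example chairs and the criterion of 10 are hypothetical values, not taken from any study.

```python
from math import dist  # Euclidean distance, Python 3.8+

def prototype(exemplars):
    """Average each characteristic across the exemplars."""
    n = len(exemplars)
    return tuple(sum(values) / n for values in zip(*exemplars))

def matches(item, proto, criterion):
    """An item counts as a match if it is close enough to the prototype."""
    return dist(item, proto) <= criterion

# Hypothetical chairs described as (height, width)
chairs = [(80, 40), (100, 60), (120, 50)]
chair_prototype = prototype(chairs)

print(chair_prototype)                              # (100.0, 50.0)
print(matches((105, 52), chair_prototype, 10))      # True
print(matches((160, 50), chair_prototype, 10))      # False
```

The prototype here is not identical to any stored exemplar, and no item matches it perfectly, which is why a criterion is needed.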
Franks and Bransford (1971)
evidence for the prototype theory
presented items based on prototypes
prototype not shown
yet participants were confident that they had seen the prototype
suggests the existence of prototypes
however prototypes cannot account for all objects/patterns
feature theories
pattern consists of a set of features or attributes
need to know the relationship between features
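A short sketch of why a list of features is not enough on its own. The feature names and relation labels are invented for the example: T and L contain the same two features and are only told apart once the relationship between them is included.

```python
# Each letter: (set of features, relationship between them).
# Feature names and relation labels are illustrative assumptions.
LETTERS = {
    "T": ({"vertical line", "horizontal line"}, "horizontal on top of vertical"),
    "L": ({"vertical line", "horizontal line"}, "horizontal at foot of vertical"),
    "V": ({"left oblique", "right oblique"}, "obliques meet at bottom"),
}

def by_features(observed):
    """Letters whose feature set equals the observed features."""
    return sorted(name for name, (feats, _) in LETTERS.items() if feats == observed)

def by_features_and_relation(observed, relation):
    """Letters that match on both the features and their arrangement."""
    return sorted(name for name, (feats, rel) in LETTERS.items()
                  if feats == observed and rel == relation)

seen = {"vertical line", "horizontal line"}
print(by_features(seen))                                                # ['L', 'T']
print(by_features_and_relation(seen, "horizontal on top of vertical"))  # ['T']
```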
structural description
describe the nature of components of a configuration and the structural arrangement of these parts
2D pattern matching theories
template theories
prototype theories
feature theories
structural description
basic steps of 3D object recognition
early image processing: must first interpret input to the visual system as coherent structures, segregated from one another and from the background
the input must then be processed to give a description, which can then be matched to the descriptions of visual objects stored in memory
three questions for object recognition
what elements are used in the description? (primitives)
how is the relationship between these elements specified?
how is the overall description invariant across views?
Marr and Nishihara (1978)
objects are made of cylinders, must specify relationship between cylinders to give a structural description
expressed structural relations by a hierarchical organisation of cylinders
each cylinder has an axis, and the way in which other cylinders are joined to it is expressed as coordinates
each cylinder is described relative to its own axis, resulting in a description which is invariant across viewpoints
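The hierarchy can be sketched as a small data structure. This is a loose 2D simplification, not Marr and Nishihara's actual model: the `Cylinder` class, the part names, and the `position` and `angle` fields are assumptions made for illustration. Each part stores only where and at what angle it joins the cylinder above it, so the stored description does not depend on where the viewer stands, while the orientations that appear in the image do.

```python
from dataclasses import dataclass, field

@dataclass
class Cylinder:
    name: str
    position: float = 0.0   # attachment point along the parent's axis (0 to 1)
    angle: float = 0.0      # degrees, relative to the parent's axis
    parts: list = field(default_factory=list)

def describe(cyl, parent=None):
    """Object-centred description: each row is relative to the parent cylinder."""
    rows = [(cyl.name, parent, cyl.position, cyl.angle)]
    for part in cyl.parts:
        rows += describe(part, cyl.name)
    return rows

def image_angles(cyl, base_angle):
    """Viewer-centred orientation of each axis in the image for a given view rotation."""
    absolute = (base_angle + cyl.angle) % 360
    angles = {cyl.name: absolute}
    for part in cyl.parts:
        angles.update(image_angles(part, absolute))
    return angles

body = Cylinder("torso", parts=[
    Cylinder("arm", position=0.9, angle=90,
             parts=[Cylinder("forearm", position=1.0, angle=30)]),
    Cylinder("leg", position=0.0, angle=170),
])

print(image_angles(body, 0))    # orientations seen from one viewpoint
print(image_angles(body, 45))   # every orientation changes with the viewpoint
print(describe(body))           # the stored description never mentions the viewpoint
```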
Biederman (1987, 1989) recognition by components theory
objects comprised of basic shapes
geons = geometrical ions
-blocks, cylinders, arcs, wedges
-approximately 36 different volumetric shapes
-viewpoint invariant theory
relationship between geons allows for object identification
small number of structural relationships
concave parts of an object's contour helpful in segmenting visual image into parts
geons specified in terms of ‘non-accidental’ properties
regularities in the visual image thought to reflect actual (non-accidental) regularities in the world
according to the model, forms of degradation which disrupt the basis for identifying geons should make the object more difficult to recognise
Biederman structural relationships
relative size
verticality
centring
relative size of surfaces at joint
non-accidental properties of geons
curvature - points on a curve
parallel - set of points in parallel
co-termination - edges terminating in a common point
symmetry - versus asymmetry
co-linearity - points in a straight line
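Three of these properties can be checked on 2D image coordinates with simple geometry. The sketch below is only an illustration of what each property means, not part of Biederman's model: the function names, the point format `(x, y)` and the tolerance are assumptions.

```python
def cross(o, a, b):
    """2D cross product of vectors o->a and o->b (zero when the three points line up)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def collinear(p, q, r, tol=1e-9):
    """Co-linearity: three image points lie on one straight line."""
    return abs(cross(p, q, r)) <= tol

def parallel(seg1, seg2, tol=1e-9):
    """Parallelism: two edges, each given as (start, end), have the same direction."""
    (a, b), (c, d) = seg1, seg2
    return abs((b[0] - a[0]) * (d[1] - c[1]) - (b[1] - a[1]) * (d[0] - c[0])) <= tol

def co_terminate(segments):
    """Co-termination: all edges share a common end point."""
    common = set(segments[0])
    for seg in segments[1:]:
        common &= set(seg)
    return bool(common)

print(collinear((0, 0), (1, 1), (2, 2)))                     # True
print(parallel(((0, 0), (2, 0)), ((0, 1), (5, 1))))          # True
print(co_terminate([((0, 0), (1, 0)), ((0, 0), (0, 1)),
                    ((0, 0), (1, 1))]))                      # True: three edges meet at (0, 0)
```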
Biederman (1987)
slow and inaccurate with ‘non recognisable’ stimuli, but relatively good with recognisable ones
points of concavity easier to recognise
deletion of component affects matching stage - reducing the number of components to match to
-at brief exposures (65ms) partial objects better recognised
midsegment deletion makes it more difficult to determine components
-at longer exposures (200ms) midsegment deletion led to fewer errors
Biederman support
Vogels, Biederman, Bar and Lorincz (2001) found some cortical neurons in monkeys sensitive to geons
assessed the response of neurons in the inferior temporal cortex to change in geon or change in size of object
some neurons responded more to geon changes, providing support for geons
evaluation of Biederman
flexible and comprehensive system for describing objects, but why 36 geons?
experimental results are consistent with the model but don’t provide a critical test
doesn’t explain how descriptions are matched to those stored (how does object recognition occur?)
advantages of Biederman
recognises the importance of the arrangements of the parts
parsimonious: small set of primitive shapes
-structural description of relationships emphasised
disadvantages of Biederman
structure is not always the key to recognition
-texture not considered
which geons?
doesn’t account for within category discrimination
de-emphasises the role played by context in object recognition
simplifies the contribution of viewpoint-dependence
-doesn’t explain why viewpoint may affect ease of recognition
viewpoint dependent theory
assume that changes in viewpoint reduce the speed and/or accuracy of object recognition
evidence suggests sometimes viewpoint invariant mechanisms are used, and other times viewpoint dependent mechanisms are used
viewpoint dependent mechanisms are more important for within category discriminations
Vanrie et al. (2002)
viewpoint dependent = complex within category decisions
viewpoint invariant = easy categorical decisions
Tarr and Hayward (2018)
object representations are neither viewpoint-dependent nor viewpoint-invariant
issues with object recognition theories
any theory of object recognition must address the binding problem = how do we integrate different kinds of information to produce object recognition?
when presented with several objects how do we decide which features belong to which objects?
beyond recognition
once a structural description of an object is formed, it must be matched to stored representations
if there is a match then object is ‘recognised’
evaluation of Humphreys et al. (1988)
oversimplification? later processes may start before earlier ones have been completed, so the stages may be more integrated
general support for model from patients with object recognition difficulties - associative agnosia