3 - Object Recognition Flashcards
What is Object Recognition for humans
For humans: perception of familiar items.
is perception of objects different for humans than for computers
yes
what is object recognition for computers
For computers: perception of familiar patterns.
why is object recognition difficult
- Environment contains hundreds of overlapping objects.
- Yet perceptual experience is of structured, coherent objects which we can recognise, use and usually name.
• Apparent size and shape of an object do not change despite large variations in the retinal image.
give 6 examples of variability
- translation invariance
- rotation invariance
- size invariance
- colour
- partial occlusion
- presence of other objects
what is intra-class variation in object recognition
Variation among members of the same class, e.g. to recognise a chair we must cope with variation in what a chair looks like.
what is viewpoint variation in object recognition
The same object seen from different viewpoints, including recognition from unusual views.
what is template theory an example of
2D pattern matching
what happens in template theory
- A mini copy or template of every known pattern is held in LTM.
- See something new, determine what it looks like, then compare it against the templates stored in LTM to find a match.
- Open questions: is normalisation needed? Are numerous templates needed?
- Real-life examples: bar codes, fingerprints.
what's the problem with template theory
– Problem of imperfect matches
– Cannot account for the flexibility of pattern recognition system
– Comparison requires identical orientation, size, position of template to stimuli
– Looseness in the argument: more complex patterns are tricky to handle
– How many templates are needed? Fonts, capitals, small letters: where do we stop?
what is template theory
– Multiple templates are held in memory
– Compare stimuli to templates in memory for one with greatest overlap until a match is found
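The matching step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 5x5 grids, the `TEMPLATES` dictionary and the `overlap` score are hypothetical stand-ins for patterns held in LTM, not part of the theory itself.

```python
# Hypothetical 5x5 binary templates standing in for patterns stored in LTM.
TEMPLATES = {
    "T": ["#####",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
}

def overlap(stimulus, template):
    """Fraction of cells on which the stimulus and the template agree."""
    cells = [(s, t) for s_row, t_row in zip(stimulus, template)
             for s, t in zip(s_row, t_row)]
    return sum(s == t for s, t in cells) / len(cells)

def recognise(stimulus, templates=TEMPLATES):
    """Return the name and score of the template with the greatest overlap."""
    scores = {name: overlap(stimulus, tpl) for name, tpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# A 'T' shifted one column to the right: same shape, different position.
shifted_t = [".####",
             "...#.",
             "...#.",
             "...#.",
             "...#."]

print(recognise(TEMPLATES["T"]))  # perfect match with the stored template
print(recognise(shifted_t))       # overlap falls below 1.0 after a small shift
```

The shifted stimulus no longer overlaps perfectly with its own template, which is the sense in which comparison depends on identical orientation, size and position.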
what is prototype theory part of
2D pattern matching
what is prototype theory
- Modification of template matching (flexible templates)
- Possesses the average of each individual characteristic
- No match is perfect; a criterion for matching is needed
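A rough sketch of the "average plus criterion" idea, assuming each exemplar can be written as a tuple of numbers. The chair measurements, the distance measure and the criterion value of 10.0 are all hypothetical choices made for illustration.

```python
from statistics import mean

def build_prototype(exemplars):
    """Average each characteristic across the exemplars."""
    return [mean(values) for values in zip(*exemplars)]

def distance(pattern, prototype):
    """Mean absolute difference across characteristics."""
    return mean(abs(p - q) for p, q in zip(pattern, prototype))

def matches(pattern, prototype, criterion=10.0):
    """A pattern counts as a match if it lies within the criterion distance."""
    return distance(pattern, prototype) <= criterion

# Hypothetical exemplars described by (height, width, number of legs).
chairs = [(90, 45, 4), (100, 50, 4), (80, 40, 3), (110, 55, 5)]
chair_prototype = build_prototype(chairs)

print(chair_prototype)                           # the averaged "chair"
print(matches((90, 45, 4), chair_prototype))     # close enough to count
print(matches((75, 150, 4), chair_prototype))    # too far from the average
```

Note that the prototype is not identical to any exemplar that was actually presented, and that no exemplar matches it perfectly, so a criterion is required.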
what is the evidence for prototype
Franks & Bransford (1971)
– Presented objects based on prototypes
– Prototype itself not shown
– Yet participants were confident they had seen the prototype
– Suggests existence of prototypes
what is feature theory a part of
2D pattern matching
what is feature theory
- A pattern consists of a set of features or attributes.
- A = 2 straight lines & a connecting cross bar.
- But we also need to know the relationship between features: the features / \ - on their own do not guarantee an A.
- Define the letter A by the features and attributes that make up a letter A.
what is structural description an example of
2D pattern matching
what is structural description
“..describe the nature of the components of a configuration and the structural arrangement of these parts” (Bruce & Green, 1990)
- Capital letter T = 2 parts; 1 horizontal ; 1 vertical; vertical supports horizontal; vertical bisects horizontal.
- Know the features and the relationships between the features: what are the constituent parts and how are they organised?
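The difference between a bare feature list and a structural description can be sketched as a small data structure. The relation labels used here (for example `meets_left_end_of`) are a hypothetical vocabulary invented for this illustration.

```python
# Hypothetical vocabulary of parts and relations, chosen for illustration.
LETTER_T = {
    "parts": frozenset({"horizontal", "vertical"}),
    "relations": frozenset({
        ("vertical", "supports", "horizontal"),
        ("vertical", "bisects", "horizontal"),
    }),
}

LETTER_L = {
    "parts": frozenset({"horizontal", "vertical"}),
    "relations": frozenset({
        ("vertical", "meets_left_end_of", "horizontal"),
        ("horizontal", "below", "vertical"),
    }),
}

def same_features(a, b):
    """Feature list alone: compare only the set of parts."""
    return a["parts"] == b["parts"]

def same_structure(a, b):
    """Structural description: parts and their arrangement must both agree."""
    return a["parts"] == b["parts"] and a["relations"] == b["relations"]

print(same_features(LETTER_T, LETTER_L))   # True: both have one of each line
print(same_structure(LETTER_T, LETTER_L))  # False: the arrangement differs
```

T and L share the same parts, so only the relations between the parts tell them apart.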
what is 3D object recognition
Similar but more complex process:
• First, input to the visual system must be interpreted as coherent structures, segregated from one another and from the background (early image processing).
• The input must then be processed to give a description, which can be matched to the descriptions of visual objects stored in memory.
• In short: isolate one object, create a structural description, match it to familiar objects stored in LTM.
what happens in Marr's computational approach
primal sketch
2 1/2 D sketch
3-D representation
what is a primal sketch
2-D description includes
changes in light intensity, edges, contours, blobs
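As a very rough illustration of "changes in light intensity" marking edges, the sketch below flags positions in a single row of pixel values where brightness jumps sharply. The threshold and the pixel values are hypothetical, and this simple differencing is a much cruder stand-in for the filtering in Marr's actual account.

```python
def find_edges(intensities, threshold=50):
    """Return indices i where intensity changes sharply between i and i + 1."""
    return [i for i in range(len(intensities) - 1)
            if abs(intensities[i + 1] - intensities[i]) >= threshold]

# One row of a hypothetical greyscale image: dark background, bright object.
row = [10, 12, 11, 200, 205, 198, 15, 12]
edges = find_edges(row)

print(edges)  # positions where the object's left and right boundaries fall
```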
what happens in 2 1/2 D sketch
Includes information about depth, motion, shading. Representation is observer-centered
what happens in 3-D representation
a representation of objects and their relationships
observer-independent
what are the 4 questions in object recognition
(1) What elements are used in the description (primitives)? E.g. components, features, geons
(2) How is the relationship between these elements specified?
(3) How is the overall description invariant across views? I.e. how do we recognise objects from different views
(4) What about viewpoint dependence? We are better at recognising objects from viewpoints we usually see
what is required when objects are described as collections of cylinders
Objects composed of cylinders: the relationship between the cylinders must be specified, which gives a structural description.
The features and components are located in a coordinate-type system.
what did Marr & Nishihara (1978) propose
Marr & Nishihara (1978) expressed structural relations by a hierarchical organisation of cylinders.
- Each cylinder has an axis, and the way in which other cylinders are joined to it is expressed as coordinates.
how does the hierarchical organisation of cylinders work
The position of each cylinder is described relative to the object's own axis (that of the main cylinder) rather than relative to the observer, resulting in a description which is invariant across viewpoints.
Each cylinder stands in a relationship to the main cylinder; its coordinates are worked out and matched to what is held in LTM.
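Why object-centred coordinates give a viewpoint-invariant description can be checked numerically. The sketch below is a 2-D simplification with hypothetical coordinates: a part is located along and across the main cylinder's axis, and the whole figure is then rotated to mimic a change of view.

```python
import math

def rotate(point, angle_deg):
    """Rotate a 2-D point about the origin (a stand-in for a change of view)."""
    a = math.radians(angle_deg)
    x, y = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def object_centred(part, main_start, main_end):
    """Express a part's position along and across the main cylinder's axis."""
    ax, ay = main_end[0] - main_start[0], main_end[1] - main_start[1]
    length = math.hypot(ax, ay)
    ux, uy = ax / length, ay / length            # unit vector along the axis
    px, py = part[0] - main_start[0], part[1] - main_start[1]
    along = px * ux + py * uy                    # distance along the axis
    across = -px * uy + py * ux                  # signed distance from the axis
    return (round(along, 6), round(across, 6))

# Hypothetical figure: torso as the main cylinder, arm attached near the top.
torso_start, torso_end, arm = (0.0, 0.0), (0.0, 4.0), (1.0, 3.0)

upright = object_centred(arm, torso_start, torso_end)
tilted = object_centred(rotate(arm, 30),
                        rotate(torso_start, 30), rotate(torso_end, 30))

print(upright, tilted)  # the object-centred description is the same in both
```

The image coordinates of the arm change when the figure is rotated, but its position relative to the torso axis does not.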
what did Biederman suggest
Biederman (1987; 1989) provided an alternative model to Marr & Nishihara (1978):
• Biederman’s Recognition-by-components theory: Objects composed of basic shapes
- GEONS = ‘geometrical ions’
- blocks, cylinders, arcs, wedges
- approximately 36 different volumetric shapes
- Viewpoint invariant theory
what are GEONS
geometrical ions
components and features
what are the structural relationships according to Biederman
relative size - bigger or smaller
verticality - one thing on top of another
centring - centred or offset
relative size of surfaces at join - joined at short or long side
what is the viewpoint-invariant theory of recognition
recognition using:
3D component parts e.g. 36 GEONS
structural relations between the parts
i.e. the geons and the relationships between them are matched to what is stored in LTM
what does Biederman say about concave parts of objects
Concave parts of an object’s contour helpful in segmenting visual image into parts.
which ‘non-accidental’ properties are geons specified in terms of
- Curvature - points on a curve
- Parallel - set of points in parallel
- Co-termination - edges terminating in a common point
- Symmetry - versus asymmetry
- Co-linearity - points in a straight line
what does Biederman say about cylinders
A cylinder possesses curved edges and two parallel edges connecting the curved edges.
These are the properties that define a cylinder.
what does Biederman say about regularities in the visual image
Regularities in the visual image are thought to reflect actual (non-accidental) regularities in the world. E.g. 2D symmetry in the visual image indicates symmetry in the 3D object.
E.g. a bucket and a mug consist of the same constituent geons but in different relationships.
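The bucket and mug example can be written out as a tiny data structure. The sketch assumes, for illustration only, that both objects are a cylinder plus an arc and that they differ in where the arc is attached; the `Relation` fields and the `LTM` dictionary are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Relation:
    geon_a: str
    geon_b: str
    relative_size: str   # e.g. "larger", "smaller"
    verticality: str     # e.g. "above", "below", "beside"

@dataclass(frozen=True)
class ObjectDescription:
    geons: frozenset
    relations: frozenset

# Assumed descriptions: same geons, different arrangement of the handle.
MUG = ObjectDescription(
    geons=frozenset({"cylinder", "arc"}),
    relations=frozenset({Relation("arc", "cylinder", "smaller", "beside")}),
)
BUCKET = ObjectDescription(
    geons=frozenset({"cylinder", "arc"}),
    relations=frozenset({Relation("arc", "cylinder", "smaller", "above")}),
)

def recognise(description, memory):
    """Return names of stored objects whose geons and relations both match."""
    return [name for name, stored in memory.items() if stored == description]

LTM = {"mug": MUG, "bucket": BUCKET}

print(MUG.geons == BUCKET.geons)   # True: identical sets of geons
print(recognise(BUCKET, LTM))      # only the bucket matches
```

Because the two descriptions share their geons, recognition here depends entirely on the structural relations.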
how do we see the object according to Biederman
edge extraction (see the object)
detection of non-accidental properties
parsing at regions of concavity (separating one geon from another)
determination of components
matching of components to object representations (the description created is matched to stored descriptions in LTM)
what makes objects more difficult to recognise according to Biederman
According to the model, forms of degradation which disrupt the basis for identifying geons should make objects more difficult to recognise.
If points of concavity are taken away (where one geon stops and the next starts), the geons become more difficult to identify.
- Biederman (1987) deleted edges either at points where they could easily be reinstated or at points where they were difficult to determine.
- Stimuli were presented for 100, 200 or 750 msec with 25%, 45% or 65% of contours removed.
what is curvature
points on a curve
what is parallel
set of points in parallel
what is co-termination
edges terminating to a common point
what is symmetry
versus asymmetry
what is co-linearity
points in a straight line
what happened when Biederman deleted edges at points where they were easily reinstated or difficult to determine
Participants were slow and inaccurate when the deleted edges were difficult to determine, but relatively good when they were easily reinstated.
When points of concavity are taken away, objects are difficult to recognise because they are harder to break into constituent parts.
Biederman's stimuli
what are component deletion and midsegment deletion
Deletion of components affects the matching stage by reducing the number of components available to match.
Midsegment deletion makes it more difficult to determine the components.
what did Biederman find
At brief exposures (65 ms), partial objects were better recognised.
At longer exposures (200 ms), midsegment deletion led to fewer errors.
what is the support for Biederman's idea
Vogels, Biederman, Bar & Lorincz (2001) found some cortical neurons in monkeys sensitive to geons.
- Assessed response of individual neurons in the inferior temporal cortex to change in geon or change in size of object.
- Some neurons responded more to geon changes, providing support for geons.
what is the evaluation of Biederman
- Flexible & comprehensive system for describing objects. But why 36 geons?
- Experimental results are consistent with the model but do not provide a critical test.
- Doesn’t explain how descriptions are matched to those stored: after creating a structural description, an adequate match must be made in order to recognise the object.
what are the advantages of Biederman's model
– Recognizes the importance of the arrangement of the parts
– Parsimonious: small set of primitive shapes
what are the disadvantages of Biederman's model
– Structure is not always key to recognition: peach vs. nectarine cannot be told apart on the basis of edge information
– Which geons?
– Within-category discrimination (which chair?)
– De-emphasises the role played by context in object recognition: top-down context information affects later stages of object recognition
– Simplifies the contribution of viewpoint dependence: the model creates invariant descriptions, yet objects are easier to recognise from familiar viewpoints
what theory did Biederman put forward
recognition by components theory
what is viewpoint-dependent theory
‘object representations are collections of views that depict the appearance of objects from specific viewpoints’ (Tarr & Bulthoff, 1995)
what do viewpoint-invariance theories say
According to viewpoint-invariance theories (Biederman, 1987), ease of object recognition is not affected by the observer’s viewpoint.
what do viewpoint dependent theories say
Viewpoint-dependent theories (Tarr, 1995; Tarr & Bulthoff, 1995; 1998) assume that changes in viewpoint reduce the speed and/or accuracy of object recognition.
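One way to picture a "collection of views" account is a cost that grows with the rotation needed to reach the nearest stored view. The sketch below is a toy under loud assumptions: the stored angles are hypothetical, and the linear relation between angular distance and cost is chosen purely for illustration, not taken from the cited papers.

```python
def angular_distance(a, b):
    """Smallest rotation in degrees between two viewing angles."""
    diff = abs(a - b) % 360
    return min(diff, 360 - diff)

def recognition_cost(view_angle, stored_views):
    """Cost grows with distance from the nearest stored view (assumed linear)."""
    return min(angular_distance(view_angle, v) for v in stored_views)

# Hypothetical stored views of a familiar object, in degrees.
stored = [0, 90]

for angle in (0, 45, 180, 350):
    print(angle, recognition_cost(angle, stored))
```

A familiar view costs nothing, while an unusual view far from any stored one costs the most, which mirrors the predicted drop in speed or accuracy.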
what does evidence suggest about viewpoint
Evidence suggests that viewpoint-invariant mechanisms are used at some times in object recognition and viewpoint-dependent mechanisms at others.
when is viewpoint dependence more important
Viewpoint-dependent mechanisms matter most when the task requires complex within-category decisions; viewpoint-invariant mechanisms are used for easy categorical decisions.
what type of decisions are viewpoint-dependent mechanisms used for
complex within-category decisions
what type of decisions are viewpoint-invariant mechanisms used for
easy categorical decisions
what did Vanrie et al. (2002) say
‘The key question is no longer if object recognition is viewpoint-dependent or viewpoint independent, but rather when, i.e. under which circumstances.’
what do Tarr and Hayward (2018) say
‘object representations are neither viewpoint-dependent nor viewpoint-invariant’
what is the binding problem
how do we integrate different kinds of information to produce object recognition?
what problem must any theory of recognition address
the binding problem
how are objects recognised
Once a structural description of an object is formed, it must be matched to stored representations.
- If there is a match then the object is ‘recognised’.
- Several models specify the stages in the ‘recognition’ process.
what did Humphreys say about object recognition
structural description
semantic representation
name representation
what is structural description
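The stage at which what is seen is matched to a stored structural description (e.g. recognising a face by matching it to a stored face). If processing is stuck here, nothing gets through to the later stages: an agnosic patient can see objects but cannot create a structural description, so no match is made and the object is not recognised.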
what is semantic representation
Knowledge about the object: where you know it from, how you use it, and the context in which you see it.
what is name representation
Remembering the object's name.
is there support for this model of object recognition
General support for the model comes from patients with object recognition difficulties, e.g. associative agnosia.
- For example Patient HJA; Patient JB
- Agnosic patients are unable to recognise objects
why is the stage model of object recognition an oversimplification
Because ‘later’ processes may start before earlier ones have been completed.
Processing does not simply get stuck at one stage: it is not necessarily all or nothing.
which model did Humphreys propose
the cascade model
what is the cascade model
Structural, semantic and name stages interact
- both within and between stages.
- Makes different predictions about how subjects will perform in an object naming task.
- Problems at one stage will have a ‘knock-on’ effect on the other stages.
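The contrast between cascaded and strictly sequential stages can be illustrated with a toy simulation. This is not the model as published: the update rule, the `rate` of 0.25 and the `threshold` of 1.0 are hypothetical values chosen so that the arithmetic is easy to follow.

```python
def simulate(steps, cascade, rate=0.25, threshold=1.0):
    """Return the time step at which each stage first receives activation.

    cascade=True: a stage passes on whatever activation it has so far.
    cascade=False: a stage passes activation on only after reaching threshold.
    """
    stages = ["structural", "semantic", "name"]
    activation = {s: 0.0 for s in stages}
    first_active = {}
    for t in range(1, steps + 1):
        previous = dict(activation)            # update all stages in parallel
        inputs = {"structural": 1.0}           # visual input drives stage one
        for earlier, later in zip(stages, stages[1:]):
            if cascade:
                inputs[later] = previous[earlier]
            else:
                inputs[later] = 1.0 if previous[earlier] >= threshold else 0.0
        for s in stages:
            activation[s] = min(threshold, previous[s] + rate * inputs[s])
            if activation[s] > 0 and s not in first_active:
                first_active[s] = t
    return first_active

cascade_onsets = simulate(steps=20, cascade=True)
discrete_onsets = simulate(steps=20, cascade=False)

print(cascade_onsets)   # later stages start before earlier ones finish
print(discrete_onsets)  # each stage waits for the previous one to complete
```

In the cascaded run the name stage begins to receive activation while the structural stage is still building up, so weak or noisy activation early on is passed downstream, which is one way to read the knock-on effect.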
cascade model (stages in order)
structural description system -> semantic representation -> name representations -> name
activation flows across and between stages, around the whole system
is there evidence for the cascade model
Anecdotal and empirical evidence for a separation of structural, semantic and naming processes in recognition.
is there evidence for stages operating in cascade rather than independently
Humphreys et al. (1988) propose that processing across these stages operates in cascade rather than independently.
– E.g. Patient JB: naming visually confusable objects (birds, animals) had knock-on effects, making it more difficult to identify their category.
what is agnosia
Failure of knowledge or recognition = “agnosia”. (visual agnosia)
what happens in visual agnosia
In visual agnosias, feature processing and memory remain intact, and recognition deficits are limited to the visual modality.
Alertness, attention, intelligence and language are unaffected.
Other sensory modalities (touch, smell) may substitute for vision in allowing objects to be recognized, i.e. the patient cannot recognise an object visually but can identify it by touch.
what are the tasks in BORB to test agnosia
(A) Association match
(B) Item match
(C) Foreshortened match
(D) Object decision
(E) Minimal feature match
what is apperceptive agnosia
Apperceptive agnosia (lateral) – problems with early processing (shape extraction). In vision, all objects look the same.
what happens in apperceptive agnosia
- Perceptual deficit that affects visual representations directly
- Components of the visual percept are picked up but cannot be integrated
- Effects may be graded
- Often affected: recognition of unusual views of objects
what is the diagnosis of apperceptive agnosia
ability to recognise degraded stimuli is impaired
what are the tests for apperceptive agnosia
- Would have trouble with a drawing of a chair with missing contours when it is presented to them
- Would have trouble recognising a chair from a certain (e.g. unusual) perspective
what is associative agnosia (bilateral)
Problems with later processing (recognition): the structural description cannot be mapped onto what is held in memory.
Visual representations are intact, but cannot be accessed or used in recognition.
Lack of information about the percept.
“Normal percepts stripped of their meaning” (Teuber)
how is associative agnosia tested
Patients do well on perceptual tests (degraded images, image segmentation), but cannot access names (“naming”) or other information (“recognition”) about objects. Agnosics fail to experience familiarity with the stimulus.
When given names of objects, they can (generally) give accurate verbal descriptions.
Associative agnosics can copy drawings of objects but cannot name them (evidence for intactness of perceptual representations).
what happens when agnosia is restricted to specific categories
Specific deficits in recognizing living versus non-living things.
Warrington and Shallice (1984): patients with bilateral temporal lobe damage showed loss of knowledge about living things (failures in visual identification and verbal knowledge).
Their interpretation: distinction between knowledge domains – functional significance (vase-jug) versus sensory properties (strawberry-raspberry).
what does Damasio say about inanimate objects
Many inanimate objects are manipulated by humans in characteristic ways.
Interpretation: inanimate objects will tend to evoke kinesthetic representations.
what did Gaffan and Heywood (1993) say about agnosia related to specific categories
Presented images (line drawings) of animate and inanimate objects to normal humans and normal monkeys, tachistoscopically (20 ms). Both participant groups made more errors in identifying animate than inanimate objects.
Interpretation: Living things are more similar to each other than non-living things - “category-specific agnosia”