C4 recognition Flashcards
Recognition in the wider context of cognition/definition
Recognition is the process through which a set of basic sensory descriptions (2½D) of an object are turned into a 3D description that matches what has been seen before (“re-cognizance” = “re-knowing”), irrespective of the angle it’s seen from.
This process involves:
Converting the sensory stimuli to an internal representation and storing this;
Comparing what is sensed (seen/heard/etc.) to what has been experienced before;
Identifying what is perceived, irrespective of its orientation (object-centred description).
Humphreys and Bruce proposed a five-step model of recognition.
Humphreys and Bruce five-step model of recognition
- Early visual processing: similar to Marr’s raw and full primal sketches
- Viewpoint-dependent object descriptions: similar to Marr’s 2½D sketch
- Perceptual classification: “what kind of thing is it?” e.g. it’s a book
- Semantic classification/categorisation: “what particular thing is it?” e.g. it’s a heavy blue paperback
- Naming: “which thing is it?” e.g. it’s the DD303 course book
Types of recognition - Object and face recognition
Two different types of recognition:
• Between-category: “What” something is, e.g. it’s a person, not a vehicle
• Within-category: “What its name is”, e.g. it’s Sigmund Freud
Object recognition and face recognition are considered separately:
• Within-category recognition is used more often for people than for objects (“It’s an orange, it (probably) doesn’t have a name”);
• An individual face can change due to time, emotion etc.
In studying face recognition a distinction is made between familiar faces and unfamiliar faces:
Pike et al. found that:
• People can often identify poor representations of famous faces;
• Our recognition accuracy for unfamiliar faces (e.g. in an identity parade) is poor.
Recognising someone’s face appears to use different cognitive processes to recognising the emotion they’re displaying (Young et al.)
Types of recognition - Active processing - recognizing objects by touch
Recognition is an active process (Gibson): even in visual recognition we actively engage with environmental stimuli (e.g. by visually scanning a Penrose triangle).
This is even more obvious in the process of recognising objects by touch:
• The brain and touch receptors in the skin form a feedback system where the pressure we apply is regulated by the brain based on the sensory information generated from tactile exploration of an object;
• Information from stretch receptors in muscles enables the location of limbs to be calculated (kinesthesis);
• The sense of proprioception enables the relative location of body parts to be calculated (“how far is my finger from my nose with my eyes closed?”);
• All this haptic information can be used to generate a mental image of an object.
Lederman and Klatzky (1990) showed that humans use consistent exploratory procedures when examining objects, such as enclosing them, stroking the texture, pressing to gauge hardness etc.
While haptic perception is useful for recognising objects by weight, hardness etc., visual perception can operate at greater distances.
Types of recognition - Recognizing two-dimensional objects
Recognising 2D images may use different cognitive processes to 3D-object recognition.
Different types of theory have been proposed to explain 2D recognition:
Types of recognition - Recognizing two-dimensional objects - Pattern matching theories
The sensed image is compared to a range of templates in memory until a match is found
This seems an unlikely explanation of 2D recognition as it would require either very generic templates or a large number of templates to handle the enormous variety of similar patterns (e.g. of the letter “R”)
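The weakness of pattern matching can be illustrated with a minimal sketch. The 3x3 grids and the `match_template` function below are hypothetical, chosen only to show that an exact template comparison fails as soon as the pattern varies slightly.

```python
# Hypothetical 3x3 binary "images": 1 = ink, 0 = blank.
TEMPLATES = {
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
}


def match_template(image):
    """Return the name of the stored template identical to the image, or None."""
    for name, template in TEMPLATES.items():
        if image == template:
            return name
    return None


upright_l = ((1, 0, 0), (1, 0, 0), (1, 1, 1))
# The same letter drawn with a shorter foot: still an "L" to a human reader.
short_l = ((1, 0, 0), (1, 0, 0), (1, 1, 0))

print(match_template(upright_l))  # L
print(match_template(short_l))    # None
```

Recognising the second "L" would need another template, which is the "large number of templates" problem described above.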
Types of recognition - Recognizing two-dimensional objects - Feature recognition theories
The key features of the image are extracted and compared to internal representations until a match is found
This is more general than pattern matching, so it offers a better explanation;
However, ambiguity is problematic: e.g. “a line and a curve” could describe a D, G, P or Q.
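The ambiguity problem can be shown with a small sketch. The feature lists and the `match_features` function are assumptions made for illustration; real feature theories use richer feature inventories.

```python
from collections import Counter

# Hypothetical feature lists for a few letters (illustrative only).
LETTER_FEATURES = {
    "D": ["line", "curve"],
    "G": ["line", "curve"],
    "P": ["line", "curve"],
    "Q": ["line", "curve"],
    "L": ["line", "line"],
    "O": ["curve"],
}


def match_features(features):
    """Return every letter whose stored features equal the extracted ones, ignoring order."""
    wanted = Counter(features)
    return sorted(
        letter
        for letter, stored in LETTER_FEATURES.items()
        if Counter(stored) == wanted
    )


print(match_features(["curve", "line"]))  # ['D', 'G', 'P', 'Q']
print(match_features(["line", "line"]))   # ['L']
```

A bare list of features says nothing about how the features are arranged, so four letters match the same input.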
Types of recognition - Recognizing two-dimensional objects - Structural description theories
Structural descriptions comprising the key features and how they are organised in relation to each other are compared to internal representations until a match is found
Appears to cope with variety and ambiguity;
Can be described in human and computer language;
Also works with recognition of 3D versions of 2D objects, e.g. a 3D letter “L” has the same structural descriptions irrespective of orientation.
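As a rough sketch of how a structural description might be written "in computer language", the example below stores features together with the relation between them. The relation names and the `match_structure` function are invented for this illustration and are not taken from any particular theory.

```python
# Hypothetical structural descriptions: (feature, relation, feature) triples.
STRUCTURES = {
    "D": {("curve", "joins both ends of", "line")},
    "P": {("curve", "joins top half of", "line")},
    "L": {("line", "meets at right angle at one end", "line")},
}


def match_structure(relations):
    """Return the letters whose stored relations equal the observed relations."""
    wanted = set(relations)
    return sorted(
        letter for letter, stored in STRUCTURES.items() if stored == wanted
    )


print(match_structure([("curve", "joins top half of", "line")]))   # ['P']
print(match_structure([("curve", "joins both ends of", "line")]))  # ['D']
```

Because the relations describe how the parts join rather than where they appear to the viewer, the description of the "L" stays the same however the letter is turned.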
Types of recognition - Object-centred vs viewer-centred descriptions
Because 3D objects can be rotated or viewed from many different angles, 3D recognition cannot be explained by theories such as pattern matching or feature recognition that don’t consider the relative location of object parts.
The description that is tested against prior knowledge (and the internal representations in which that knowledge is stored) must be object-centred (i.e. describe the object generally) rather than viewer-centred; otherwise the object could only be recognised from one angle of view.
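The difference between the two kinds of description can be sketched numerically. In this illustration (an assumption, not part of any theory above) the viewer-centred description is the raw coordinates of an object's parts, and the object-centred description is stood in for by the distances between those parts, which do not change when the object is rotated.

```python
import math
from itertools import combinations


def rotate_z(point, degrees):
    """Rotate a 3D point about the vertical (z) axis."""
    x, y, z = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def viewer_centred(points):
    """Coordinates as they appear from one fixed viewpoint (rounded)."""
    return [tuple(round(c, 6) for c in p) for p in points]


def object_centred(points):
    """Sorted distances between every pair of parts: a property of the object itself."""
    return sorted(round(math.dist(p, q), 6) for p, q in combinations(points, 2))


l_shape = [(0, 0, 0), (0, 2, 0), (1, 0, 0)]    # corner, top of stem, end of foot
turned = [rotate_z(p, 90) for p in l_shape]     # the same object seen from another angle

print(viewer_centred(l_shape) == viewer_centred(turned))  # False
print(object_centred(l_shape) == object_centred(turned))  # True
```

A match against stored knowledge succeeds from any angle only with the second kind of description.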
Recognizing three-dimensional objects - Marr and Nishihara’s theory
Marr and Nishihara suggested three-dimensional objects are recognised by breaking them up into generalised cones.
(A generalised cone is any round-sided 3D solid with the same cross-sectional shape (not necessarily size) throughout its length. For example cutting across a cone or vase shape produces circular cross-sections)
This analysis allows any object to be described within a canonical coordinate frame, i.e. the process can be used to describe all objects in the same standard way.
They proposed a multi-step process:
Recognizing three-dimensional objects - Marr and Nishihara’s theory - step one
Step 1: Derive the object shape
Identify the central axis of the object using information from the 2½D sketch;
Work out what shape would result if the silhouette or contour generator of the object was rotated around the central axis (e.g. a rectangular silhouette rotated around its central axis would produce a cylinder, a triangle would make a cone);
Whether this mental process yields an accurate conclusion about what the 3D object is like depends on three assumptions:
• each point on the silhouette matches only one point on the 3D object;
• points near each other in the 2D image are near each other in the 3D object;
• points on the silhouette all lie in the same plane.
If any of these assumptions does not hold, the object may be incorrectly recognised.
Example: a hexagonal prism viewed end-on and a cube viewed edge-on and tilted forward have the same silhouette.
The cube violates assumption 3: the points marked “a” on the hexagon are coplanar, while the points (a, b, c) on the outline of the cube lie in three different planes. Consequently, a cube may be misrecognised as a hexagonal prism.
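The rotation described in Step 1 can be checked with a short numerical sketch. It assumes the silhouette is given as a half-width at each height along the central axis; the function names are hypothetical. Rotating a rectangle should sweep out a cylinder and rotating a triangle should sweep out a cone, so the swept volumes should match the standard formulas for those solids.

```python
import math


def revolve_volume(half_width_at, height, slices=10_000):
    """Approximate the volume swept out by rotating a silhouette about its central axis.

    half_width_at(h) is the distance from the axis to the silhouette edge at height h.
    The solid is summed as a stack of thin discs.
    """
    dh = height / slices
    return sum(
        math.pi * half_width_at((i + 0.5) * dh) ** 2 * dh for i in range(slices)
    )


def rectangle(h):
    """Rectangular silhouette: constant half-width of 1."""
    return 1.0


def triangle(h):
    """Triangular silhouette: half-width 1 at the base, narrowing to 0 at height 2."""
    return 1.0 - h / 2.0


swept_rectangle = revolve_volume(rectangle, 2.0)
swept_triangle = revolve_volume(triangle, 2.0)

cylinder = math.pi * 1.0 ** 2 * 2.0      # pi * r^2 * h
cone = math.pi * 1.0 ** 2 * 2.0 / 3.0    # pi * r^2 * h / 3

print(math.isclose(swept_rectangle, cylinder, rel_tol=1e-6))  # True
print(math.isclose(swept_triangle, cone, rel_tol=1e-6))       # True
```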
Recognizing three-dimensional objects - Marr and Nishihara’s theory - step two
Step 2: Locate the object’s component axis/axes and derive a 3D description
Work out the areas of concavity (where the silhouette ‘bends in’);
Divide the object into component parts (primitives) by joining the areas of concavity;
Find an axis for each primitive;
Link all the primitives to form a 3D description by working out how each of their axes relates to the horizontal axis of the object.
Simple object outlines (e.g. a circle) don’t have any areas of concavity so the axis of symmetry is used instead.
Representing complex objects as a hierarchy of primitives allows for general recognition (“It’s a person”) as well as capturing detail (“Four limbs hang off their body, each ends in fingers/toes, and a head sticks out the top”).
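Finding the areas of concavity can be sketched for a silhouette given as a polygon. This is a simplified stand-in, not Marr and Nishihara's own procedure: it assumes the outline is listed counter-clockwise and uses the sign of a cross product to detect where the outline "bends in". The function name and example shapes are hypothetical.

```python
def concave_vertices(polygon):
    """Return the vertices where a counter-clockwise outline bends inwards."""
    n = len(polygon)
    found = []
    for i in range(n):
        ax, ay = polygon[i - 1]           # previous vertex (wraps around)
        bx, by = polygon[i]               # current vertex
        cx, cy = polygon[(i + 1) % n]     # next vertex
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross < 0:                     # right turn on a counter-clockwise outline
            found.append(polygon[i])
    return found


# An L-shaped silhouette and a square, both listed counter-clockwise.
l_silhouette = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 3), (0, 3)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

print(concave_vertices(l_silhouette))  # [(1, 1)]
print(concave_vertices(square))        # []
```

The single concavity in the L-shape marks where it could be divided into two primitives (stem and foot); the square, like the circle mentioned above, has no concavity to split on.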
Recognizing three-dimensional objects - Marr and Nishihara’s theory - step three
Step 3: Compare the 3D description to a mental catalogue of objects to find a match
The 3D description is compared against a mental catalogue of 3D models of all previously seen objects;
This catalogue is hierarchical, with more detail at each level;
If a match is found, the process stops and the object is recognised.
Recognition does not depend on viewing angle as the description and model entries are all 3D representations.
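A hierarchical catalogue can be sketched as nested dictionaries, with coarse categories at the top and more detail at each level. The catalogue contents, the labels, and the `recognise` function are all hypothetical, and a real 3D description would be far richer than a list of labels.

```python
# Hypothetical catalogue: each level adds detail to the one above it.
CATALOGUE = {
    "cylinder": {
        "limbed": {
            "biped": {"human": {}, "ape": {}},
            "quadruped": {"horse": {}, "giraffe": {}},
        },
    },
}


def recognise(description, catalogue=CATALOGUE):
    """Follow a coarse-to-fine description down the catalogue.

    Returns the labels that matched, stopping at the first level with no match.
    """
    matched = []
    level = catalogue
    for label in description:
        if label not in level:
            break
        matched.append(label)
        level = level[label]
    return matched


print(recognise(["cylinder", "limbed", "biped", "human"]))
# ['cylinder', 'limbed', 'biped', 'human']
print(recognise(["cylinder", "limbed", "biped", "robot"]))
# ['cylinder', 'limbed', 'biped']
```

The second call shows general recognition without a detailed match: the object is matched as a limbed biped even though no specific entry is found at the most detailed level.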
Recognizing three-dimensional objects - Marr and Nishihara’s theory - evidence for
Marr and Nishihara’s claim that locating the central axis is critical to recognition is supported by evidence:
Lawson and Humphreys (1996) showed that recognition was adversely affected in line drawings that were rotated so that their major axis was foreshortened (i.e. rotated towards the observer), possibly because foreshortening made the axis difficult to locate;
Warrington and Taylor (1978) found patients with right-hemisphere focal lesions had difficulty recognising objects presented from an unusual viewpoint, or confirming that two photos were of the same item if one showed an unusual view;
• They may have been unable to convert the 2D image to a 3D object-centred representation;
• Features that were important to identification may have been obscured by the rotation.
Humphreys and Riddoch (1984) used foreshortened images and others of the same objects where features were hidden; the foreshortened ones were recognised less often, suggesting that major axis identification is important to forming the 3D model.
The theory also explains why an object is misinterpreted if its contour generator is misidentified (Step 1).
Recognizing three-dimensional objects - Marr and Nishihara’s theory - evidence against
Within-category discrimination is hard to explain because the conversion of an object to generalised cones should map all exemplars of the category to the same representation. This would mean that we can’t tell the difference between one instance of a thing and another (e.g. all border collies would be recognised as the same thing).