Chapter 4 Recognizing Visual Objects Pg 117 Flashcards
Object variety
the world contains an enormous variety of objects
*Flexibility, without restrictions
Variable views
Different retinal images that can be projected by the same object or category of objects
Representation
Pattern of neural activity in the brain that contains information about a stimulus and gives rise to the subjective perceptual experience of that stimulus
Recognition
Process of matching the representation of a stimulus to a representation stored in long-term memory, based on previous encounters with that stimulus or with similar stimuli
Perceptual organization
- Represent edges- abrupt, elongated changes in brightness and/or color
- Represent uniform regions bounded by edges
- Divide regions into figure and ground, and assign border ownership
- Group together regions that have similar properties (e.g. color). Groups consisting of “figure” regions are represented as candidate objects; other groups of regions are represented as background.
- Fill in missing edges and surfaces- that is, edges and surfaces that are partly occluded- to obtain more complete representations of candidate objects.
Object recognition
Use higher-level processes to represent objects fully enough to recognize them by matching them to representations stored in memory
Figure
Region of an image that is perceived as being part of an object
Ground
Region of an image that is perceived as part of background
Border ownership
Perception that an edge, or border, is “owned” by a particular region of retinal image
Perceptual grouping
Process by which the visual system combines separate regions of the retinal image that “go together” based on similar properties
Perceptual interpolation
Process by which the visual system fills in hidden edges and surfaces in order to represent the entirety of a partially visible object.
Edge extraction
The visual system determines the location, orientation, and curvature of edges in the retinal image, based on patterns of responses from neurons in areas V1, V2, and V4 (the “what” pathway)
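As a rough illustration of an edge as an abrupt change in brightness (not a model of V1 neurons), the sketch below scans one row of brightness values and marks the positions where neighboring values differ by more than a threshold. The function name `find_edges`, the threshold, and the sample row are assumptions made up for this example.

```python
def find_edges(brightness, threshold=50):
    """Return each index i where the jump from pixel i to pixel i+1 exceeds threshold."""
    edges = []
    for i in range(len(brightness) - 1):
        if abs(brightness[i + 1] - brightness[i]) > threshold:
            edges.append(i)
    return edges


# Hypothetical row of pixels: dark region, bright region, dark region.
row = [10, 12, 11, 200, 205, 198, 20, 22]
print(find_edges(row))  # [2, 5]
```

The two reported positions are the borders of the bright region; the small fluctuations inside each uniform region stay below the threshold and are ignored.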
Uniform connectedness
Characteristic of regions of the retinal image that have approximately uniform properties.
Depth
Front (Figure)
Back (ground)
Surrounding
surround region (owning borders, figure)
Symmetry
Region with symmetrical borders ->figure
Convexity
Convex borders, outward-bulging region(figures)
Concave, inward-going region (background)
Meaningfulness
Figure-ground organization precedes object recognition
- Regions perceived as figure are meaningful: they correspond to object shapes stored in memory
- Recognizes object shapes prior to assignment of border ownership and determination of figure-ground organization
Simplicity
Number and placement of shapes composing image
Proximity
Elements that are close together group more easily than elements that are far apart
*Things that are near to each other are grouped together
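The proximity principle can be sketched as a simple gap rule, assuming elements lie along one dimension: neighbors closer than some maximum gap fall into the same group. The function name `group_by_proximity`, the gap value, and the dot positions are hypothetical choices for illustration, not a claim about how the visual system computes grouping.

```python
def group_by_proximity(positions, max_gap):
    """Group sorted 1-D positions; start a new group when the gap exceeds max_gap."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)   # close to the previous element: same group
        else:
            groups.append([p])     # far from the previous element: new group
    return groups


# Hypothetical dot positions along a line.
dots = [0, 1, 2, 10, 11, 12, 30]
print(group_by_proximity(dots, max_gap=3))  # [[0, 1, 2], [10, 11, 12], [30]]
```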
Similarity
Similar elements tend to group together
*Similar things are grouped together
Common motion (Common fate)
Elements that move in unison are likely to be perceptually grouped
*Things moving in same direction are grouped together
Symmetry and Parallelism
Elements that are symmetrical or parallel tend to group together
Good continuation
Two edges that would meet if extended are perceived as a single edge that has been partially occluded, like the top and bottom edges of a horizontal bar.
- Edges that are aligned, or follow the smoothest or straightest path, are seen as part of the same contours.
- Points that, when connected, result in contours. These contours follow the smoothest path.
Synchronized neural oscillations
Neurons responding to different regions could indicate that the regions belong together by synchronizing their oscillations, that is, by producing clumps of spikes at the same time
*Similarity of orientation, good continuation, common motion
Edge completion
The perception of a partially hidden edge as complete; one of the operations involved in perceptual interpolation
Illusory contours
nonexistent but perceptually real edges perceived as a result of edge completion
*Result of an explicit perceptual representation quite early in the visual stream
Surface completion
The perception of a partially hidden surface as complete; one of the operations involved in perceptual interpolation
Heuristics
In perceptual organization, rules of thumb based on evolved principles and on knowledge of physical regularities
*It’s always possible to reject the most likely interpretation of how a scene is organized perceptually in favor of a less likely interpretation based, perhaps, on subtle clues
Perceptual inference
In vision, the interpretation of a retinal image using heuristics
*Using heuristics in a way, to guide the interpretation of a retinal image based on knowledge of physical regularities in the world
Shape representation in V4
Edges, preferred orientation
Shapes are represented in V4 by the combined activity of all the neurons responding to the contour fragments making up the shape
Population code representing entire shape
Inferotemporal (IT) cortex
Larger receptive fields than V4
- Almost entire retinal image
- More complex
- Specific combinations of contour fragments
- Structural description
Grandmother cell
A neuron that responds to a particular object at a conceptual level, firing in response to the object itself, a photo of it, its printed name, and so on
Throughout the visual hierarchy
- V1 + V2: detailed info about locations of edges of slots and handle of toaster
- V4: info about curvature and orientation of toaster's various contours
- IT: more complex aspects of toaster's shape; highest-level neurons; invariant representation of the concept "toaster"
Modular coding
Representing of an object by a module, a region of the brain that is specialized for representing a particular category of objects
- Certain categories of objects can strongly activate specific brain regions
- Fusiform face area (FFA): faces
- Parahippocampal place area (PPA): buildings, outdoor scenes
- Extrastriate body area: human and animal bodies
Distributed coding
Representation of objects by patterns of activity across many regions of the brain
Visual agnosia
Impairment in object recognition
Prosopagnosia
A type of visual agnosia in which the person is unable to recognize faces, with little or no loss of ability to recognize other types of objects
Topographic agnosia
A type of visual agnosia in which the person is unable to recognize spatial layouts such as buildings, streets, landscapes, and so on
Top-down information
Flow of information from higher regions to lower regions
- Goals, attention, knowledge
- Visual system first creates a representation of the general, overall layout of the scene and tries to match it with representations of the general layout of specific categories of scene stored in memory
Bayesian approach
In object recognition, the use of mathematical probabilities to describe the process of perceptual inference
- The visual system unconsciously combines two probabilities in order to infer what type of scene produced the currently experienced retinal image
(1) Prior probability of all possible scenes and
(2) for each possible scene, the probability that it produced the current retinal image
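The combination of the two probabilities can be sketched with Bayes' rule: the posterior probability of each scene is proportional to its prior probability times the probability that it produced the image. The scene names and all numbers below are hypothetical, chosen only to show the arithmetic.

```python
def posterior(priors, likelihoods):
    """Combine prior and likelihood for each scene, then normalize to sum to 1."""
    unnormalized = {scene: priors[scene] * likelihoods[scene] for scene in priors}
    total = sum(unnormalized.values())
    return {scene: value / total for scene, value in unnormalized.items()}


# (1) Hypothetical prior probability of each possible scene.
priors = {"bump lit from above": 0.8, "dent lit from below": 0.2}
# (2) Hypothetical probability that each scene produced the current retinal image.
likelihoods = {"bump lit from above": 0.6, "dent lit from below": 0.9}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)
print(best)  # bump lit from above
```

In this toy case the dent explains the image slightly better, but the much higher prior for the bump still makes it the inferred scene.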
Feature-Based Approach
- Identifying the most prominent anatomical features of a face and the spatial relations among them
- Differences in pose, the person’s expression, lighting, shadow, and distance from face to camera
- Statistical method for finding the best match
Holistic Approach
Matching an image of a test face as a whole with images of known faces in a database
- Eigenfaces: face images generated from a set of digital images of human faces taken under the same lighting conditions, normalized to line up the eyes and mouths, and rendered with the same spatial resolution
- Start with average face and add a weighted sum of the other eigenfaces
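A minimal sketch of the "average face plus weighted sum of eigenfaces" idea, under loud assumptions: the images are toy 4-pixel lists, and the two eigenfaces are hand-picked orthonormal vectors rather than components computed from real face images (in practice they are typically derived with principal component analysis). All names and values are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project(face, mean_face, eigenfaces):
    """Weights of a face: (face - average face) projected onto each eigenface."""
    centered = [f - m for f, m in zip(face, mean_face)]
    return [dot(centered, e) for e in eigenfaces]


def reconstruct(weights, mean_face, eigenfaces):
    """Start with the average face and add a weighted sum of the eigenfaces."""
    face = list(mean_face)
    for w, e in zip(weights, eigenfaces):
        face = [f + w * x for f, x in zip(face, e)]
    return face


def best_match(test_face, database, mean_face, eigenfaces):
    """Name of the known face whose weights are closest to the test face's weights."""
    test_w = project(test_face, mean_face, eigenfaces)

    def distance(name):
        known_w = project(database[name], mean_face, eigenfaces)
        return sum((a - b) ** 2 for a, b in zip(test_w, known_w))

    return min(database, key=distance)


# Hypothetical toy data: 4-pixel "images" and two orthonormal "eigenfaces".
mean_face = [100, 100, 100, 100]
eigenfaces = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
database = {"Ann": [120, 110, 90, 80], "Bob": [90, 110, 90, 110]}
test_face = [118, 112, 88, 82]

print(project(database["Ann"], mean_face, eigenfaces))
print(best_match(test_face, database, mean_face, eigenfaces))  # Ann
```

Each face is summarized by a short list of weights, so matching compares whole-face descriptions rather than individual anatomical features.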
Hybrid Approach
Combines the best aspects of the holistic and feature-based approaches
*Eigenfaces were supplemented by eigen-feature representations
Image clutter
a characteristic of visual scenes in which many objects are scattered in 3-D space, with partial occlusion of various parts of objects by other objects
Photoreceptors
point-by-point “pixels”, separate arrays for each of 4 photoreceptor types
Retinal midget-ganglion cells
Small, edge-enhancing receptive fields, high spatial resolution
Early cortical cells
Respond to oriented edges and lines, size selectivity, retinotopic maps
How does the brain construct representations of objects?
- Using local features to guide global interpretations
- e.g., fake spiral (Fraser) illusion, impossible figures
- more inductive, bottom-up
- Using global interpretation to guide local interpretations
- e.g. Necker cube, reversible figures
- more deductive, top-down
Figure-ground segregation
Determining what part of environment is the figure so that it “stands out” from the background.
Border assignment
to what region does a particular edge (border) belong
Figure-ground organization
- The figure is more “thinglike” and more memorable than ground.
- The figure is seen in front of the ground.
- The contour separating figure from ground belongs to the figure (border ownership)
Some specific factors that can determine which region is figure
- Depth: region in front of another
- Regions surrounded by other(s)
- Regions that are symmetrical
- Regions that have convex (bulgy) shapes
- Regions that have meaningful shapes
- Regions located in the lower part of scene
- Regions that are small
- Regions oriented vertically and/or horizontally
Structuralist Approach
- Wilhelm Wundt
- States that perceptions are created by combining elements called sensations
- Popular in mid to late 19th century
- Wundt studied conscious experience by examining its structure or component parts (sensations, feelings) using individuals who were trained in introspection. This “school of psychology” became known as structuralism.
- Could not explain apparent motion or illusory contours
Kanizsa Triangle
- First described by Gaetano Kanizsa in 1955
- Two equilateral triangles appear to be present, although none is actually drawn
- The “inside” of the top triangle appears to be a brighter shade of white than the rest.
Gestalt psychology
It seems that the visual system “looks” for regularity, patterns, and structure in particular ways because they are generally useful (accurate, efficient) in constructing perceptions of the world.
Gestalt approach
- The whole is different than the sum of its parts.
- Perception is not built up from sensations but is a result of perceptual organization.
- Specific principles or laws of perception
- Describe how elements in a scene tend to group together.
- Better thought of as heuristics, “best-guess rules”, rather than fixed laws or algorithms
- Pragnanz: every stimulus is seen as simply as possible
- Also called the “law of good figure”
- The central law of Gestalt psychology
Common region
- Gestalt Principles of Perceptual Organization
* Elements in the same region tend to be grouped together
Connectedness
- Gestalt Principles of Perceptual Organization
* Connected regions of the visual scene are perceived as a single unit
Synchrony
- Gestalt Principles of Perceptual Organization
* Elements occurring or disappearing at the same time are seen as belonging together
Theory of unconscious inference
- Hermann von Helmholtz
- Why stimuli are interpreted the way they are and can be interpreted in more than one way
- Perceptions are the result of unconscious assumptions about the environment
- Likelihood principle: objects are perceived based on what is most likely to have caused the pattern
- Opposed to Structuralist views, anticipated aspects of Gestalt views
Bayesian approaches
Modern researchers
*take assumed prior probabilities into account, to model perceptual inference.
Physical regularities
Regularly occurring physical properties
- Border ownership
- Figure-ground organization
- Perceptual grouping
Tilt and size aftereffects
- tuned mechanisms: neurons responsive to particular narrow ranges of orientation or size
- Represents size and orientation by the pattern of activity
- Selective adaptation desensitizes some of these tuned neurons/mechanisms
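The three points above can be sketched as a toy simulation of the tilt aftereffect. It assumes a bank of orientation-tuned units with Gaussian tuning curves, a population-vector readout, and adaptation modeled as a reduced gain for units tuned near the adapting orientation. The tuning width, adaptation strength, and all function names are assumptions for illustration, not measured values.

```python
import math

PREFERRED = list(range(-90, 90, 10))  # preferred orientations in degrees


def angle_diff(a, b):
    """Signed difference between two orientations, wrapped to [-90, 90)."""
    return (a - b + 90) % 180 - 90


def responses(stimulus, gains, width=20.0):
    """Pattern of activity across the tuned units for one stimulus orientation."""
    return [g * math.exp(-angle_diff(stimulus, p) ** 2 / (2 * width ** 2))
            for p, g in zip(PREFERRED, gains)]


def decode(resp):
    """Read out orientation from the pattern of activity (population vector)."""
    x = sum(r * math.cos(math.radians(2 * p)) for r, p in zip(resp, PREFERRED))
    y = sum(r * math.sin(math.radians(2 * p)) for r, p in zip(resp, PREFERRED))
    return math.degrees(math.atan2(y, x)) / 2


def adapt(adapter, strength=0.5, width=20.0):
    """Gains after adaptation: units tuned near the adapter are desensitized most."""
    return [1 - strength * math.exp(-angle_diff(adapter, p) ** 2 / (2 * width ** 2))
            for p in PREFERRED]


fresh = [1.0] * len(PREFERRED)
before = decode(responses(0, fresh))      # vertical test line, unadapted
after = decode(responses(0, adapt(20)))   # same line after adapting to a 20 degree tilt
print(round(before, 3), round(after, 3))
```

Before adaptation the vertical line is read out as close to 0 degrees. After adaptation to a 20 degree tilt, the readout for the same line is negative, that is, tilted away from the adapter, which is the direction of the tilt aftereffect.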