Chapter 4: Recognizing Visual Objects Flashcards
Object Familiarity
visual system must match a mental representation of an object to a representation stored in memory
- doesn’t have to be perfect image
Image Clutter, Object Variety, and Variable Views
each of these represents a complication that the visual system must resolve in order to identify objects in the environment
Variable Views
different retinal images that can be projected by some object or category of objects
Image Clutter
characteristic of visual scenes in which many objects are scattered in 3D space, with partial occlusion of various parts of objects by other objects
- can tell what’s happening without seeing all details
- what we’re seeing often has many missing details, but we still know what’s going on
Object Variety
refers to fact that world contains enormous variety of objects
Representation
gives rise to subjective perceptual experience of that stimulus
- contain info about increasingly complicated aspects of retinal image
- visual system maintains representations of visual components (e.g., color, edges)
Recognition
refers to the process of matching the representation of a stimulus to a representation stored in long-term memory
Perceptual Organization involves
- identifying edges: abrupt, elongated changes in brightness and/or color
- identifying regions bounded by those edges
- determining which object owns the boundaries (establishing figure and ground)
- grouping similar regions (perceptual grouping)
- handling missing sections by determining what to fill them with (perceptual interpolation)
Object Recognition
uses higher-level processes to represent objects fully enough to recognize them
Figure
region of image perceived as being part of object
Ground
region that is perceived as background
Visual System
- must combine, or group together, the separate regions, based on similarity of properties (e.g., regions 13-20 are all the same color)
- must “fill in” the parts of the object that cannot be seen due to occlusion
Perceptual Grouping
process by which visual system combines separate regions of retinal image that go together based on similar properties
- separating things in visual system
- combine image regions into wholes
Perceptual Interpolation
process by which visual system fills hidden edges and surfaces in order to represent entirety of partially visible objects
Perceptual Organization
refers to visual system’s way of dealing with scenes containing multiple overlapping objects
- makes object recognition in complex scenes possible
- representing edges and regions
- edge extraction
- uniform connectedness
Edge Extraction
process by which visual system determines location, orientation, and curvature of edges in retinal images
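A minimal sketch of the location part of edge extraction, assuming an edge is simply an abrupt change in brightness between neighboring pixels. The function name `find_edges`, the `threshold` value, and the sample row are hypothetical illustrations; orientation and curvature are not modeled here.

```python
def find_edges(brightness, threshold=30):
    """Return each index i where brightness jumps abruptly between pixel i and pixel i + 1."""
    edges = []
    for i in range(len(brightness) - 1):
        if abs(brightness[i + 1] - brightness[i]) >= threshold:
            edges.append(i)
    return edges


# One row of a made-up image: light background, dark shape, light background.
row = [200, 200, 198, 60, 62, 61, 199, 200]
print(find_edges(row))  # prints [2, 5]
```

Small fluctuations (198 to 200, 60 to 62) stay below the threshold, so only the two borders of the dark shape are reported.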
Uniform Connectedness
characteristic of regions of retinal image that have approximately uniform properties
- helps put things together
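Uniform connectedness can be illustrated by labeling connected patches of an image whose neighboring pixels have nearly the same brightness. This is only a sketch under assumptions: the function name `label_regions`, the `tolerance` value, and the tiny image are hypothetical, and brightness stands in for whatever properties the visual system actually compares.

```python
def label_regions(image, tolerance=10):
    """Label 4-connected regions whose neighboring pixels differ by at most `tolerance`."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means "not yet labeled"
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            # Start a new region and flood-fill outward from this pixel.
            next_label += 1
            labels[r][c] = next_label
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and abs(image[ny][nx] - image[y][x]) <= tolerance):
                        labels[ny][nx] = next_label
                        stack.append((ny, nx))
    return labels, next_label


# A dark square on a light background (made-up brightness values).
image = [
    [200, 200, 200, 200],
    [200,  60,  62, 200],
    [200,  61,  60, 200],
    [200, 200, 200, 200],
]
labels, count = label_regions(image)
print(count)  # prints 2
```

The light surround and the dark square end up as two separate regions, which is the raw material that figure-ground organization and grouping then work on.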
First step in perceptual organization is to represent scenes
- neurons in areas V1, V2, and V4 of the “what” pathway are responsible for extracting edges from the visual field
- lateral inhibition
- uniform connectedness
Edges and Simple Shapes in a Retinal Image
this illustrates the retinal image of a scene consisting of four dark gray shapes on a lighter gray background
Figure-Ground Organization: Assigning Border Ownership
- Figure-ground organization follows edge extraction
- determines which object a border belongs to; critical to figure-ground segregation
- Visual system principles used to assign border ownership and organize visual scenes into figure and ground:
- depth, surroundedness, symmetry, convexity, meaningfulness, simplicity
- depth occurs when one region is perceived to be in front of another
- a front region owns the border between regions and is perceived as the figure
- other region is perceived as ground
Border Ownership
perception that edge/border is owned by particular region of retinal image
V2 Processing
V1 and V2 include specialized networks that allow important info about border ownership and figure-ground organization to be computed and transmitted rapidly among cells whose combined receptive fields cover contiguous areas of visual scene
- V2 neurons play role in border assignment
Gestalt Laws
- while objects are often described as being grouped (belonging to a group), word “figure” could be easily substituted
- in general, the figure is what the organism pays most attention to; the ground tends to be ignored
- the neural basis for border assignment seems to stem from neurons in V2
- similar results were found with humans in fMRI experiments
Gestalt Principles used to group regions
- Proximity- elements that are close together group more easily than elements that are far apart
- Similarity- similar elements tend to group together
- Common motion (common fate)- elements that move in unison are likely to be perceptually grouped
- Symmetry and parallelism- things that are symmetrical or parallel group together
- Good continuation- two edges that would meet if extended are perceived as single edge that has been partially occluded
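The proximity principle above can be sketched as a simple rule: elements separated by a small gap join the same group, and a large gap starts a new group. The function name `group_by_proximity`, the `max_gap` value, and the dot positions are hypothetical, and positions are one-dimensional to keep the illustration short.

```python
def group_by_proximity(positions, max_gap=2.0):
    """Group positions along a line; a gap larger than `max_gap` starts a new group."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            groups[-1].append(p)  # close to the previous element: same group
        else:
            groups.append([p])    # far from the previous element: new group
    return groups


dots = [0, 1, 2, 10, 11, 12, 20]
print(group_by_proximity(dots))  # prints [[0, 1, 2], [10, 11, 12], [20]]
```

The other principles could be sketched the same way by swapping the distance test for a test on similarity (e.g., color) or on shared direction of motion.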
Good Continuation
visual system’s tendency to perceive two edges that would meet if extended as a single edge that has been partially occluded
Neural bases for perceptual grouping
grouping may be due to neurons working together (synchronized neural oscillations)
- research results suggest that three important principles of perceptual grouping may be represented by synchronized neural oscillations
- similarity (of orientation), good continuation, and common motion
Synchronized Neural Oscillations
- neurons produce spikes in the same temporal pattern
- produce clumps of spikes at same time
Perceptual Interpolation
intelligently filling in edges and surfaces that aren’t visible, because they’re occluded by other elements, but sometimes they also blend in with the background
- two somewhat different operations are at work in perceptual interpolation
- finding edges
- completing surfaces
Edge Completion
perception of partially hidden edge as complete
Illusory Contours
nonexistent but perceptually real edges perceived as result of edge completion
- result of explicit perceptual representation early in visual stream
- putting in edges that don’t actually exist
Surface Completion
perception of partially hidden surface as complete
Neural Basis of Perceptual Interpolation
Neurons in area V2 have been shown to respond to illusory contours
- V2 neuron receptive field location and preferred orientation were determined
- Two bars at the preferred orientation were presented outside the receptive field and moved up to induce an illusory edge
Perceptual Organization Reflects Natural Constraints
- there are cases in which perceptual organization may not be accurate; many such situations are human-made, often exploiting heuristics
- Heuristics: rules of thumb based on evolved principles and on knowledge of physical regularities
- the principles of organization provide information needed to create camouflage
- camouflage occurs whenever the figure blends into the background rather than stands out from it
Perceptual Inference
- interpretation of retinal image using heuristics
- involves using heuristics to guide the interpretation of a retinal image based on knowledge of physical regularities in the world
Object Recognition
Visual system may recognize an object as being the same despite changes in retinal image
Two approaches to object recognition:
- Single representation is activated when an object is seen
- Objects are represented in a view-specific manner
Invariance
ability of the visual system to recognize an object as being the same despite changes in retinal image
Recognition by Components
Proposes that recognizing an object depends on first identifying the primitive geometric components that make up the object
Representation of a Curved Edge by a Neuron in Area V4
Complex contours could be represented by neurons in V4 that combine responses of multiple V1 neurons, perhaps with input from neurons in V2
Properties to which neurons in area V4 respond
Evidence is presented about the stimuli that neurons in V4 and the inferotemporal (IT) cortex respond to in support of object recognition
Individual neurons in area V4 respond most strongly to […] that can be more complex than those in V1
Individual neurons in area V4 respond most strongly to edges that can be more complex than those in V1
- Edges to which V4 neurons respond can be straight or curved
- V4 neurons have a preferred orientation, but a contour with the preferred orientation will elicit a strong response only if the contour is at a particular angular position relative to the entire shape that the contour belongs to
A Shape-Tuned Neuron in Area V4
- V4 neurons have a preferred location in retinal image, but the preferred location covers a larger area of retinal image than it does for V1 neurons
- Shapes are represented in V4 by combined activity of all neurons responding to contour fragments making up shape
A V4 population code for shape
- V4 neurons- tuned to specific curvatures and orientations, located in specific part of retinal image
- IT neurons- respond most strongly to specific combinations of contour fragments, located almost anywhere in visual field
Based on structural description- description that specifies set of parts (contour fragments) and spatial relations
Grandmother cell
- exhibit high degree of invariance with respect to location
- neuron that responds to particular object at a conceptual level, firing in response to the object itself, a photo of it, its printed name, and so on
- cells (no one specific cell, but group of cells) that are so specifically tuned that they respond to a particular object
- Individual neurons in medial temporal lobes
Object recognition in brain
Areas V1 and V2 (detailed info about precise location) and V4 (curvature and orientation) create increasingly complex representations of edges, regions, and shapes
- responses of neurons throughout visual hierarchy contribute to our experience
Lateral Occipital Cortex
Activity in region is not dependent on size, position, or other features
Ways the visual system represents objects
Modular coding: representation of object by module, region of brain is specialized for representing particular category of objects
Distributed coding: representations of objects by patterns of activity across many regions of brain
Face-Selective Region in the Human Inferotemporal Cortex
An fMRI experiment measured activity in the human brain as subjects viewed alternating 30-second sequences of faces versus other types of objects
- Fusiform face area (FFA)- found on fusiform gyrus along lower surface of temporal lobe
- Parahippocampal place area (PPA)- activated by buildings and outdoor spaces
- Extrastriate body area- activated by human and animal body parts
Problems with Object Recognition
- Visual form agnosia- impairment in object recognition
- damage early on in ventral pathway
- may come from trauma
- Prosopagnosia- person is unable to recognize faces
- Topographic agnosia- person is unable to recognize spatial layouts such as buildings, streets, and landscapes
- Neurons outside of category-specific modules carry info about whether viewed object belongs to that category
Hierarchical Coding of Object Information
Brain uses combination of modular and distributed coding
Top-Down Information
- opposite of bottom-up processing
- flow of info from higher to lower regions
- emphasizes the influence of the perceiver’s goals, attention, knowledge, and expectations on perception
- is combined with bottom-up information in ventral pathway to speed up the process of fully recognizing the objects in a scene
Gist of a scene
- perceiving the gist improves recognition of the objects in the scene, and at the same time, recognition of objects in the scene improves perception of the gist
- whenever an object violated the expected relationship between object and scene, the object was more likely to be misidentified or missed
- visual system creates representation of overall layout of scene and tries to match that with representations of layout of specific categories of scene
Unconscious Inference and the Bayesian Approach
Bayesian approach- use of mathematical probabilities to describe the process of perceptual inference
- Visual system unconsciously combines two probabilities to infer what type of scene produced the currently experienced retinal image
- prior probability of all possible scenes
- for each possible scene, the probability that it produced the current retinal image
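The combination of the two probabilities above follows Bayes' rule: the probability of a scene given the image is proportional to the prior probability of the scene times the probability that the scene would produce the image. Below is a minimal sketch; the function name `posterior`, the scene names, and all the numbers are made up for illustration.

```python
def posterior(priors, likelihoods):
    """Combine P(scene) and P(image | scene) into P(scene | image) using Bayes' rule."""
    unnormalized = {scene: priors[scene] * likelihoods[scene] for scene in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no candidate scene could have produced this image")
    # Dividing by the total makes the posterior probabilities sum to 1.
    return {scene: value / total for scene, value in unnormalized.items()}


# Hypothetical prior probability of each scene type.
priors = {"kitchen": 0.6, "office": 0.3, "beach": 0.1}
# Hypothetical probability that each scene would produce the current retinal image.
likelihoods = {"kitchen": 0.2, "office": 0.7, "beach": 0.05}

beliefs = posterior(priors, likelihoods)
best_guess = max(beliefs, key=beliefs.get)
print(best_guess)  # prints office
```

Although "kitchen" has the higher prior in this example, the image is much more likely to come from an office, so the inferred scene is "office".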
Automatic Face Recognition
Feature- based approach- focuses on identifying most prominent anatomical features of face and spatial relations among them
Holistic approach- uses eigenfaces- face images generated from a set of digital images of human faces taken under the same lighting, normalized to line up eyes and mouths, and rendered at the same spatial resolution
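Eigenfaces are commonly computed as the principal components of a set of aligned face images. The sketch below is one way to approximate the first eigenface, assuming each face is a flat list of pixel brightness values and using power iteration rather than a full decomposition. The function names, the `iterations` and `seed` parameters, and the four-pixel "faces" are hypothetical.

```python
import math
import random


def mean_face(faces):
    """Average the faces pixel by pixel."""
    count = len(faces)
    return [sum(face[i] for face in faces) / count for i in range(len(faces[0]))]


def first_eigenface(faces, iterations=200, seed=0):
    """Approximate the main direction of variation among the faces by power iteration."""
    mean = mean_face(faces)
    centered = [[p - m for p, m in zip(face, mean)] for face in faces]
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in mean]  # random starting direction
    for _ in range(iterations):
        # Project every centered face onto v, then recombine the faces by those weights.
        weights = [sum(a * b for a, b in zip(face, v)) for face in centered]
        v = [sum(w * face[i] for w, face in zip(weights, centered))
             for i in range(len(mean))]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0:
            raise ValueError("no variation found among the faces")
        v = [x / norm for x in v]  # keep the direction at unit length
    return mean, v


# Four made-up 4-pixel "faces" that differ only in the first and last pixel.
faces = [
    [10, 20, 30, 40],
    [12, 20, 30, 38],
    [8, 20, 30, 42],
    [14, 20, 30, 36],
]
mean, eigenface = first_eigenface(faces)
print(mean)  # prints [11.0, 20.0, 30.0, 39.0]
```

In this toy set the eigenface has weight only on the first and last pixels, with opposite signs, because those are the only pixels that vary across faces. A new face can then be described by how far it lies along that direction from the mean face.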