perceiving and recognizing objects Flashcards
Orientation selectivity
Cells are tuned to detect lines in a specific orientation
Maximal response to a contour at the preferred orientation (e.g., vertical)
Fires less as the contour is tilted away from that orientation
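A rough sketch of the tuning idea on this card, assuming a Gaussian-shaped tuning curve. The preferred orientation (90 degrees, taken here to mean vertical), the bandwidth, and the peak firing rate are illustrative values, not measurements.

```python
import math

def orientation_response(theta_deg, preferred_deg=90.0, bandwidth_deg=20.0, max_rate=50.0):
    """Firing rate (spikes/s) of a hypothetical orientation-tuned cell."""
    # Orientation is circular with a period of 180 degrees:
    # a line at 0 degrees and a line at 180 degrees look the same.
    diff = (theta_deg - preferred_deg + 90.0) % 180.0 - 90.0
    return max_rate * math.exp(-(diff ** 2) / (2.0 * bandwidth_deg ** 2))

# The response is largest at the preferred orientation and falls off with tilt.
for theta in (90, 70, 45, 0):
    print(theta, round(orientation_response(theta), 1))
```

The response depends only on how far the line is tilted from the preferred orientation, so tilting 20 degrees either way gives the same drop.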
receptive fields in the striate cortex
cells in the striate cortex respond best to bars of light (edges, lines of specific orientation, motion and size), rather than to spots of light (which is what the retina and LGN prefer)
interested in the basic features of the visual image
the pattern of illumination (or contrast), not the overall light level, is the primary concern of the visual system.
combining info to get more complicated info
Border ownership
when one object is in front of another, a visual border is formed between the object and the background.
That border is “owned” by the object.
It is the edge of the object, not a property of the background
Extrastriate cortex
Region of the cortex bordering the primary visual cortex and containing multiple areas involved in visual processing
Beyond V1
Lies just outside the primary visual (striate) cortex
Receptive field properties
Stimulus (bar) length
Orientation
Width (spatial frequency)
Direction of movement
Colour
Ocular dominance
Cortical neurons respond to both eyes but have a preferred eye (respond more to one eye than the other)
Stimulus (bar) length
End stopping
neuron fires less if a bar does not reach the edge of the receptive field or extends beyond it
Tuned to bars of a specific length
Contours
sudden transitions in an image
(luminance, colour)
hue can produce contour
the basic elements of visual perception
Without them, we would not see anything (Ganzfeld)
how do you find the edges of objects?
the receptive fields of cells in the primary visual cortex are too small
each neuron is sensitive to only a small area, so we pick up only fragments of an edge
if we went by contours alone, things would have to touch to be part of the same object (house example: the snowman and the car would be a part of the house, but the windows would not), so it is more complicated than this
lack of edge does not bother the visual system
Ganzfeld
People report being blind
Featureless environments
Happens to pilots when they go into a cloud
Snow blindness for skiers
Middle Vision
A loosely defined stage of visual processing that comes after basic features have been extracted from the image (low-level vision) and before object recognition and scene understanding (high-level vision)
Involves the perception of edges and surfaces
Determines which regions of an image should be grouped together into objects
matching what we perceive with a memory of something perceived in the past
organize elements of a visual scene into groups we can recognize as objects
Illusory contour
a contour that is perceived even though nothing changes from one side of the contour to the other
People report seeing something that is not actually in the image (e.g., an arrow), although only the black elements are physically present
The contour is not present; it is something we create
People try to simplify their environment
a line or edge we perceive even though it is not actually there
sometimes our brains fill in gaps to complete an image
Occlusion
notches in the circle seem to support a horizontal contour; the arrangement of notches may imply that they are part of a bigger scene
happens when one object obstructs or overlaps an object in the visual scene
our brain uses this so that we can interpret that the overlapping object is in front of the other one
depth cue
Structuralism
a school of thought believing that complex objects could be understood by analysis of the components
Argued that perceptions are the sum of atoms of sensation – bits of colour, orientation, and so forth
Perception is built up of local sensations
Challenged by illusory contours
An extended edge is seen bridging a gap where no local atom of “edgeness” can be found
Gestalt school
a school of thought stressing that the perceptual whole could be greater than the apparent sum of the parts
Proximity
items that are near each other tend to group
Because they are closer together
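A toy illustration of grouping by proximity, assuming dots on a single line and a made-up distance threshold (`max_gap`). It is a sketch of the rule, not a model of how the visual system computes it.

```python
def group_by_proximity(positions, max_gap):
    """Group 1-D positions: neighbours no farther apart than max_gap share a group."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups

dots = [0, 1, 2, 10, 11, 12, 20, 21]
print(group_by_proximity(dots, max_gap=2))
# [[0, 1, 2], [10, 11, 12], [20, 21]]
```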
Similarity
similar looking items tend to group
Colour, size, orientation, aspects of form
Conjunctions (combinations) of features do not work well
Texture segmentation
carving an image into regions of common texture properties
No hard contours dividing the regions
Divides the image into regions separated by a perceived border
Good continuation
a Gestalt grouping rule stating that there is a tendency to perceive a line as continuing in its established direction
If two contours are close and collinear they are likely to come from the same contour
Closure
a Gestalt principle that holds that a closed contour is preferred to an open contour
The mind tends to fill in gaps in a visual image to perceive it as a whole. This means that if a shape is partially obscured or incomplete, we will still perceive it as a complete object.
Example: If you see a circle that is broken in a few places, your brain will interpret it as a complete circle rather than as separate segments.
Parallelism
parallel contours are likely to belong to the same group
regions with parallel contours are more likely to be seen as figure
Symmetry
symmetrical regions are more likely to be seen as a group
a symmetrical region is more likely to be seen as figure
Common region
items will group if they appear to be part of the same larger region
Connectedness
items will tend to group if they are connected
common fate
elements that move in the same direction tend to group together
synchrony
elements that change at the same time tend to group together
Prägnanz
people will perceive and interpret ambiguous or complex images as the simplest form(s) possible
The most fundamental principle of Gestalt
aka: Good Figure, Simplicity
Ambiguous figure
a visual stimulus that gives rise to two or more interpretations of its identity or structure
The same image can be interpreted in different ways
Every image is, in theory, ambiguous, but the perceptual committees almost always agree on a single interpretation…
Necker cube
an outline that is perceptually bi-stable
two interpretations continually battle for perceptual dominance
We can entertain either version
Figure-ground assignment
the process of determining that some regions of an image belong to a foreground object (figure) and other regions are part of the background (ground)
One of the reasons a figure is ambiguous
Surroundedness
a rule for figure-ground assignment stating that if one region is entirely surrounded by another, it is likely that the surrounded region is the figure
Size
the smaller region is likely to be figure
Relative motion
how surface details move relative to an edge can also determine which portion of a display is the foreground figure and which is the background
If one region moves in front of another, then the closer region is figure
Extremal edges
if edges of an object are shaded such that they seem to recede in the distance, they tend to be seen as figure
Relatability
Degree to which two line segments appear to be part of the same contour
Non accidental feature
A feature of an object that is not dependent on the exact (or accidental) viewing position of the viewer
Global superiority effect
the finding in various experiments that the properties of the whole object take precedence over the properties of parts of the object
Carve the retinal image into large-scale objects
Accidental viewpoint
a viewing position that produces some regularity in the visual image that is not present in the world
Perceptual committees assume viewpoints are not accidental
Slight shift in viewpoints
E.g., the Leaning Tower of Pisa viewed from a different angle would not look like it is leaning
Dazzle camouflage
Camouflage meant not to conceal an object but to make it difficult to identify, and to make its range, speed, and heading hard to judge
Structural description
a description of an object in terms of the nature of its constituent parts and the relationships between those parts
E.g., capital A: two flanking lines meet and a third line spans the angle created by those two lines
Many versions of structural-description hypotheses have been proposed…
Problem with templates
Many templates are needed to recognize the same object in all its forms and orientations
Recognition-by-components model
Biederman’s model of object recognition, which holds that objects are recognized by the identities and relationships of their component parts
A version of a structural description hypothesis
Proposed that a set of geons (“geometric ions”) are combined to build perceptual objects
Visual system should be able to recognize an object on the basis of the relationships among its geons
The finite set of geons (~36) can be used to construct a very large number of object representations
Predicts viewpoint-invariant recognition, but real object recognition is not completely viewpoint invariant
Entry-level category
for an object, the label that comes to mind most quickly when we identify the object
E.g., bird
Subordinate-level category
a more specific term for an object
E.g., eagle
Superordinate-level category
a more general term for an object
E.g., animal
More broad
Five Principles of middle vision
1) Bring together that which should be brought together
Gestalt grouping principles (e.g., similarity, proximity, etc.)
The processes that complete contours and objects even when they are partially hidden behind occluders (e.g., the relatability heuristic)
2) Split asunder that which should be split asunder
Edge-finding processes
Figure-ground mechanisms
Texture segmentation
3) Use what you know
Implicit knowledge of the physics of image formation
4) Avoid accidents
Avoid interpretations that require the assumption of highly specific, accidental combinations of features or accidental viewpoints
5) Seek consensus and avoid ambiguity
Using the “perceptual committees”, eliminate all but one of the multiple possibilities to deliver a single solution/perception
Overarching goal of the visual system
The visual system is trying to make sense of the vast and often ambiguous and noisy inputs from the early stage of visual processing
Camouflage
Animals exploit Gestalt grouping principles to blend (or group) into their surroundings
Art of getting your features to group with the features of the environment
The same principles that help us find objects are also used to hide them
Familiarity
things that form patterns that are familiar or meaningful are likely to be grouped together
E.g., faces
Gestalt grouping rules
a set of rules describing which elements in an image will appear to group together
Describe how the raw retinal image is organized
Reflect regularities in the world
Perceptual “Committees”
A host of rules, principles, and good guesses contribute to our organized perception of the world
Committees must integrate conflicting opinions and reach a consensus
Perception results from the consensus that emerges
Texture segregation
Segregation can occur based on shape, orientation, colour, motion, etc.
Segregation does NOT occur based on conjunctions (combinations) of features
Computer vision
Computer-based edge detectors are not as good as humans
Sometimes computers do not find edges that humans see easily (e.g., illusory contours)
Sometimes computers find too many edges
Edge detection asks: where is there a sudden transition in the image?
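A minimal sketch of the "sudden transition" idea, assuming a single row of luminance values and a hand-picked threshold. Real edge detectors are more elaborate; the numbers here are made up.

```python
def find_edges(luminance, threshold):
    """Return indices i where the jump between samples i and i+1 exceeds threshold."""
    return [i for i in range(len(luminance) - 1)
            if abs(luminance[i + 1] - luminance[i]) > threshold]

row = [10, 10, 11, 10, 80, 81, 80, 79, 20, 20]   # dark, bright, dark, with a little noise
print(find_edges(row, threshold=30))  # [3, 7]: the two large transitions
print(find_edges(row, threshold=0))   # [1, 2, 3, 4, 5, 6, 7]: too many "edges"
print(find_edges([50] * 10, threshold=30))  # []: no luminance change, so nothing is found
```

The last call shows why a detector of this kind misses illusory contours: where nothing changes in the image, it has nothing to detect.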
Naïve template theory
the proposal that the visual system recognizes objects by matching the neural representation of the image with a stored representation of the same “shape” in the brain
The idea that we recognize objects by matching every pixel or every low-level feature of the input to a representation in memory
Like a lock and key
Will not work
Too many templates are required
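A toy version of the lock-and-key idea, assuming images are tiny grids of characters and that a match means every "pixel" agrees. The letter shape and helper names are illustrative only.

```python
def matches(image, template):
    """Naive template match: every pixel must agree."""
    return image == template

def rotate_90(grid):
    """Rotate a grid of strings 90 degrees clockwise."""
    return ["".join(row[c] for row in reversed(grid)) for c in range(len(grid[0]))]

T_UPRIGHT = ["###",
             ".#.",
             ".#."]

print(matches(T_UPRIGHT, T_UPRIGHT))             # True
print(matches(rotate_90(T_UPRIGHT), T_UPRIGHT))  # False: the rotated T needs its own template
```

Every new orientation, size, or position would need another stored template, which is why the number of templates grows out of hand.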
Prosopagnosia
an inability to recognize faces
May be able to recognize an object as a face but will not know who the person might be
May know it’s a face and that they are angry
Typically results from injury to the nervous system
congenital prosopagnosia
form of face blindness present from birth
holistic processing
Process based on analysis of entire object or scene, not adding together a set of smaller parts or features
E.g., a friend's face is processed as a complex whole, not just as an eye or another single feature
Feedback and Re-entrant Processing
Perception and neural processing, more generally, is a two-way street involving feedback and re-entrant processing
Precision is probably achieved by going down the pathway, once you have some information about the object, and interrogating earlier visual areas about the details of this instance of the object
We recognize the parts with information from the context of the whole, and we recognize the whole with information about the parts. Perception typically proceeds in both the bottom-up and top-down directions at the same time
Where pathways
Dorsal, into parietal lobe
Location of objects in space
Actions required to interact with them (moving hands/eyes)
Deployment of attention
What pathways
Ventral, into temporal lobe
Receptive fields get bigger into temporal lobe
Lesion
Region of damaged brain
(As a verb) to destroy a section of the brain
Agnosia
A failure to recognize objects, in spite of being able to see them
Ability to see without recognizing
Typically due to brain damage
Psychic blindness
Homologous regions
Brain regions that appear to have the same function in different species
Regions of extrastriate cortex
Fusiform Face Area (FFA)
Activated by human faces
Extrastriate body area (EBA)
Activated by images of the body other than face
Parahippocampal Place Area (PPA)
Activated more by images of places than by other stimuli, e.g., rooms with furniture
Visual word form area (VWFA)
Activated by images of written words
Feed forward process
Process that carries out a computation one neural step after another, without the need for feedback from a later stage to an earlier stage
Reverse hierarchy theory
Argues that feed-forward processes give you a general, categorical impression of the world, but not the details
Bayesian Approach
Prior probability: how likely a hypothesis is before the current observation; it is combined with how consistent the observation is with that hypothesis
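A small worked sketch of Bayes' rule, assuming two made-up hypotheses about a shaded patch ("convex bump" vs "concave dent"). All the probabilities below are hypothetical, chosen only to show how the prior and the evidence combine.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: posterior is proportional to prior * likelihood, scaled to sum to 1."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}

# Hypothetical priors: bumps are assumed to be more common than dents.
priors = {"convex bump": 0.8, "concave dent": 0.2}

# Case 1: the image is equally consistent with both, so the prior decides.
ambiguous = posterior(priors, {"convex bump": 0.5, "concave dent": 0.5})

# Case 2: the image strongly favours a dent, which outweighs the prior.
strong_evidence = posterior(priors, {"convex bump": 0.1, "concave dent": 0.9})

print(ambiguous)
print(strong_evidence)
```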
Subtraction method
Comparing brain activity measured in two conditions, one that involves the mental process of interest and one that does not
The difference between the two shows the regions of the brain specifically activated by that process
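A minimal sketch of the subtraction logic, assuming one activity value per region in each condition. The region names come from these cards; the activity values and the cutoff are invented for illustration.

```python
def subtract_conditions(task, control):
    """Region-by-region difference in activity (task minus control)."""
    return {region: task[region] - control[region] for region in task}

# Hypothetical activity levels (arbitrary units).
viewing_faces   = {"V1": 5.0, "FFA": 9.0, "PPA": 2.0}
viewing_objects = {"V1": 5.0, "FFA": 3.0, "PPA": 2.0}

difference = subtract_conditions(viewing_faces, viewing_objects)
specific = [region for region, d in difference.items() if d > 1.0]  # arbitrary cutoff

print(difference)  # {'V1': 0.0, 'FFA': 6.0, 'PPA': 0.0}
print(specific)    # ['FFA']
```

Activity that both conditions share (here V1) cancels out, leaving only what the process of interest adds.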
Decoding
Determining the nature of the stimulus from the pattern of responses measured
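One simple way to picture decoding, assuming a stored "typical" response pattern for each stimulus and a nearest-pattern rule. The patterns and labels are hypothetical, and real decoding methods vary.

```python
def decode(pattern, prototypes):
    """Return the stimulus label whose stored response pattern is closest to `pattern`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda label: squared_distance(pattern, prototypes[label]))

# Hypothetical typical responses of three measurement sites to each stimulus.
prototypes = {"face": [0.9, 0.1, 0.2], "house": [0.1, 0.8, 0.3]}

print(decode([0.8, 0.2, 0.1], prototypes))  # face
print(decode([0.2, 0.7, 0.4], prototypes))  # house
```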
what are the refractive components of the eye
cornea, aqueous humour, lens, vitreous humour
dorsal pathway
where pathway
heading toward parietal lobe
ventral pathway
what pathway
heading toward temporal lobe
what are the three areas that the Gestalt school investigates
laws of grouping
the “goodness” of figures
figure-ground relationships
the pandemonium model
simple model of letter recognition
demons loosely represent neurons
each level is like a different brain area
feature demons, cognitive demons and a decision demon
feature demons
respond to the oriented lines and curves in the letter
cognitive demons
each represents a letter and responds to the feature demons for that letter's features
E.g., if the letter is A, the A demon responds most, but the H and X demons also respond because they share some features with A
decision demon
pools info from the other demons and chooses the loudest demon as the answer
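A toy sketch of the three levels, assuming a hand-written feature list for A, H, and X and a simple scoring rule (matching features minus mismatching ones). The feature names and the scoring are simplifications for illustration, not the original model's details.

```python
# Hypothetical feature sets for three letters.
LETTER_FEATURES = {
    "A": {"left_diagonal", "right_diagonal", "horizontal_bar"},
    "H": {"left_vertical", "right_vertical", "horizontal_bar"},
    "X": {"left_diagonal", "right_diagonal"},
}

def cognitive_demons(detected_features):
    """Each letter demon 'shouts' according to how well the detected features fit it."""
    shouts = {}
    for letter, features in LETTER_FEATURES.items():
        hits = len(features & detected_features)    # features shared with the input
        misses = len(features ^ detected_features)  # features in only one of the two sets
        shouts[letter] = hits - misses
    return shouts

def decision_demon(shouts):
    """Pick the loudest cognitive demon."""
    return max(shouts, key=shouts.get)

# Output of the feature demons for a printed "A".
detected = {"left_diagonal", "right_diagonal", "horizontal_bar"}
shouts = cognitive_demons(detected)
print(shouts)                  # {'A': 3, 'H': -3, 'X': 1}
print(decision_demon(shouts))  # A
```

X still gets a positive score because it shares the two diagonals with A, which mirrors the point that similar letters also respond, just less loudly.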
viewpoint invariance
a property of an object that does not change when observer viewpoint changes
a class of theories of object recognition that proposes representations of objects that do not change when viewpoint changes
advantage over templates
if we can derive the same structural description from any encounter with the object then we need to store only one representation of the object in memory
problems with evaluation of structural description theories
Problems:
Object perception is not completely viewpoint invariant
E.g., the farther an object is rotated away from a learned view, the longer it takes to recognize
Geons (or any of the other “alphabets” of structural-description models) do not always provide adequate descriptions of objects
E.g., book vs cigar box
A structural description would have to be just a part of the answer to the problem of object recognition, not the whole solution
recognition by committees
object recognition is not a single process
there may be several object recognition processes depending on the category level
entry level
subordinate-level
superordinate-level