Chapter 5 Perceiving Objects and Scenes Flashcards
What is Object Recognition?
Detecting objects in an image and then matching those objects to existing stored representations of what those objects are.
Why is it so difficult to design a perceiving machine?
Overall: the mechanisms behind visual perception are remarkably complicated and not fully understood, so replicating them in a machine is extremely difficult.
Specifically, things that make it difficult:
- The inverse projection problem
- Occlusion/hidden objects
- Viewpoint invariance
- Blurred Objects
What is the Inverse Projection problem?
This is the task of determining the object responsible for a particular image on the retina. It is a problem because many different 3D objects can cast the same 2D image on the retina. The problem is compounded by changes in viewpoint: the same object does not always cast the same 2D image on the retina.
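The ambiguity can be illustrated with a minimal sketch, assuming a simple pinhole model of the eye (an illustrative simplification, not a claim about retinal optics). The function name `project` and the example coordinates are hypothetical.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto a 2D image plane.

    Assumes z > 0 (the point is in front of the viewer).
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points: one nearby, one twice as far away and
# twice as far from the line of sight (hypothetical values).
near_small = (1.0, 2.0, 4.0)
far_large = (2.0, 4.0, 8.0)

image_a = project(near_small)
image_b = project(far_large)

# Both points land on the same 2D image location, so the image alone
# cannot tell us which 3D point produced it.
print(image_a, image_b)
```

Any point along the same line of sight projects to the same image location, which is why a single 2D image is consistent with many 3D objects.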
What is occlusion?
Occlusion: when objects in our visual field overlap with other objects, blocking out sections.
What is Viewpoint invariance?
Our ability to recognize objects from various viewpoints and directions.
What is perceptual organization and what are the two main processes involved?
Perceptual Organization: the process by which elements in a person’s visual field become perceptually grouped and segmented to create a perception.
Two Processes involved:
1. Grouping: the process by which elements in the visual scene are put together into coherent units or objects; putting visual events together in a way that produces meaning.
2. Segregation: the process of separating one area or object from another; how we determine what things don’t go together.
These two processes work together to shape our perception.
What is Structuralism and how does it relate to Gestalt psychology?
Structuralism: sees sensation and perception as distinct processes, where many elementary sensations combine to create complex perceptions. Gestalt psychology rejects this approach, holding that the whole is different from the sum of its parts.
What is apparent Motion?
An illusion where something appears to move but does not actually move. It occurs when two images are flashed in separate locations with a moment of darkness between the flashes. The brain fills in the darkness with movement.
What are the 7 Gestalt Principles of Perceptual organization?
- The Principle of Good Continuation: points that, when connected, result in straight or smoothly curving lines are seen as belonging together. The lines tend to be seen in a way that follows the smoothest path. We see objects as continuing even when they are partially obscured.
- The Principle of Pragnanz: aka the principle of simplicity or of good figure. Every pattern will be seen in a way that results in the simplest structure possible.
- The Principle of Similarity: similar things appear to be grouped together.
- The Principle of Proximity: Things that are near each other appear to be grouped together.
- Principle of Common Fate: things that move in the same direction tend to be grouped together
- Principle of Common Region: elements in the same region of space appear to be grouped together. If they are contained within the same boundary, we see them as being grouped together.
- Principle of Uniform Connectedness: a connected region of the same visual properties (e.g. light, colour, texture, motion etc.) is perceived as a single unit.
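As a toy illustration of the Principle of Proximity only, the sketch below groups dots on a line whenever neighbouring dots are close together. It assumes 1D positions and a hypothetical distance threshold `max_gap`; it is not a model of how the visual system actually groups elements.

```python
def group_by_proximity(positions, max_gap):
    """Group 1D positions: neighbours no farther apart than max_gap share a group."""
    groups = []
    for p in sorted(positions):
        if groups and p - groups[-1][-1] <= max_gap:
            # Close enough to the previous dot: same group.
            groups[-1].append(p)
        else:
            # Large gap (or first dot): start a new group.
            groups.append([p])
    return groups


# Hypothetical dot positions with two large gaps between clusters.
dots = [0, 1, 2, 10, 11, 12, 20, 21]
print(group_by_proximity(dots, max_gap=2))
```

With these values the dots fall into three groups, matching the impression of three separate clusters.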
What are illusory contours?
Perceiving edges that aren’t actually there.
What is figure-ground segregation? How does it relate to border ownership?
When we see a separate object, we see it as a figure that stands out against the background.
Border ownership is the idea that we see borders as belonging to the figure.
What do we use as figure cues?
Figure cues are how we decide if something is the figure. They include:
- Location: areas in the lower field of vision are often perceived as the figure
- Convexity: the convex side of a border (the bulge) is more likely to be perceived as the figure
- Experience: we assign figure status to the part of the visual field that creates more meaning.
- Figures appear more thing-like than the ground
What is Recognition by Components Theory?
RBC theory is the idea that objects are composed of individual geometric shapes called geons and we recognize objects based on the arrangement of those geons.
This theory suggests there are 36 geons, and it explains how we are able to recognize objects from different angles. It does not explain how we distinguish between members of the same category (e.g. identifying types of birds, or different mugs).
What are the two components of a scene? How do we distinguish between objects and scenes?
A scene is a view of a real-world environment that contains both background elements AND multiple objects that are organized in a meaningful way relative to each other and the background.
Objects are acted upon, scenes are acted within.
What is the gist of a scene and how do we perceive it so quickly?
The gist of a scene is a general description of the scene that we get after viewing it for only a fraction of a second (250 ms). We perceive the gist of a scene before we perceive its objects and details. We use global image features to perceive the gist so quickly.
What are the five global image features?
- Degree of naturalness: natural scenes have textured zones, while man-made scenes have straight lines with lots of horizontal and vertical orientations
- Degree of openness: open scenes have visible horizons and few objects
- Degree of roughness: smooth scenes have few small elements, rough scenes contain many small elements and are more complex
- Degree of expansion: the convergence of parallel lines, which indicates how far the scene recedes into the distance
- Colour: having lots of a characteristic colour (e.g. forests have lots of green, hospitals have lots of white)
What are two regularities that influence our perception?
- Physical regularities: Frequently occurring physical properties of the environment.
  - E.g. The oblique effect: people perceive horizontal and vertical lines more easily than any other orientation because these orientations occur more frequently in the environment
  - E.g. Uniform Connectedness: Objects are defined by areas of the same colour/texture
  - E.g. Light-from-above heuristic
- Semantic regularities: What we expect to be in a scene based on its meaning; characteristics associated with activities that are common in different types of scenes.
  - E.g. Likelihood Principle
What is the likelihood principle?
We perceive the object most likely to have caused the pattern of stimulation we received. We make this judgement by unconscious inference; our perceptions are the result of unconscious assumptions that we make about the environment.
What is Bayesian inference?
Our estimate of the probability of an outcome is determined by two factors:
- The prior probability: our initial belief about how probable the outcome is
- The likelihood: the extent to which the available evidence is consistent with the outcome
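A minimal sketch of how the two factors combine, assuming the standard form of Bayes' rule (posterior is proportional to prior times likelihood). The function name `posterior`, the shadow/hole scenario, and all the numbers are hypothetical, chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Combine prior probabilities and likelihoods into posterior probabilities.

    priors:      {hypothesis: P(hypothesis)}
    likelihoods: {hypothesis: P(evidence | hypothesis)}
    """
    # Multiply prior by likelihood for each hypothesis.
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # Normalize so the posteriors sum to 1.
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical example: a dark patch on the floor could be a shadow or a hole.
priors = {"shadow": 0.9, "hole": 0.1}        # shadows are far more common
likelihoods = {"shadow": 0.5, "hole": 0.8}   # the image fits a hole slightly better

result = posterior(priors, likelihoods)
print(result)
```

In this made-up case the evidence fits "hole" somewhat better, but the strong prior for "shadow" still makes it the more probable interpretation.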
What is predictive coding?
The theory that describes how the brain uses past experiences to predict what we will perceive.
What is the role of the following brain areas in object/scene recognition?
- Lateral occipital complex (LOC)
- Fusiform Face Area (FFA)
- Extrastriate Body Area (EBA)
- Parahippocampal Place Area (PPA)
LOC: Is part of the ventral pathway and is involved in recognizing objects
FFA: located in the fusiform gyrus; is integral to face recognition
EBA: Activated by pictures of bodies and parts of bodies
PPA: activated by places and is important for information about spatial layout