Object Perception LOs Flashcards

1
Q

Template approach theory

A

Compares input to a model/template previously stored in memory; the stimulus is categorized only if it is an exact match to a stored template.

PRO: successfully used by machines (e.g. reading MICR numbers at the bottom of a cheque)

CON: cannot handle novel stimuli, cannot handle variations within a stimulus, too many templates required, cannot handle context
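
A minimal sketch of exact template matching, to show why variation within a stimulus defeats it. The 3x3 binary glyphs and the names `TEMPLATES` and `match_template` are hypothetical, chosen only for illustration.

```python
# Hypothetical 3x3 binary templates; the glyph shapes are illustrative only.
TEMPLATES = {
    "I": ((0, 1, 0), (0, 1, 0), (0, 1, 0)),
    "L": ((1, 0, 0), (1, 0, 0), (1, 1, 1)),
}

def match_template(stimulus):
    """Return the label whose stored template equals the stimulus exactly, else None."""
    for label, template in TEMPLATES.items():
        if stimulus == template:
            return label
    return None

exact_i = ((0, 1, 0), (0, 1, 0), (0, 1, 0))
shifted_i = ((1, 0, 0), (1, 0, 0), (1, 0, 0))  # same vertical bar, moved one column left

print(match_template(exact_i))    # matches the stored "I" template
print(match_template(shifted_i))  # no exact match, so the shifted bar is not recognized
```

Handling the shifted bar would require storing a second template for it, which is the "too many templates required" problem.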

1
Q

What are the theories of object perception?

A

Template approach, prototype approach, Pandemonium, Marr’s approach, recognition by components

2
Q

Prototype approach

A

Individual instances are not stored; each category is represented by a prototype (an abstraction of the typical or best example of the object). Categorization is based on the distance between the perceived item and the prototype.

PRO: more flexible than templates

CON: cannot handle context
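
A minimal sketch of categorization by distance to a prototype. The two-feature space, the prototype values, and the use of Euclidean distance are all assumptions made for illustration; the card does not specify a distance measure.

```python
import math

# Hypothetical prototypes in a made-up 2-feature space (e.g. elongation, roundness).
PROTOTYPES = {"apple": (3.0, 9.0), "banana": (6.0, 2.0)}

def categorize(item):
    """Assign the item to the category whose prototype is nearest (Euclidean distance)."""
    return min(PROTOTYPES, key=lambda label: math.dist(item, PROTOTYPES[label]))

# Neither item equals a stored prototype, yet both are still categorized.
print(categorize((3.5, 8.0)))
print(categorize((5.5, 3.0)))
```

Because only one prototype per category is stored and inexact items still get a label, this is more flexible than exact template matching.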

3
Q

Pandemonium

A

Stage 1: Image demon: receives the sensory input.
Stage 2: Feature demons: analyze the input in terms of features; each is activated by its specific feature.
Stage 3: Cognitive demons: determine which patterns of features are present, corresponding to known objects.
Stage 4: Decision demon: identifies the pattern by listening for the cognitive demon shouting the loudest.

PRO: can identify a wide range of stimuli (just specify the component features); feature detectors correspond physiologically to cells in the visual system

CON: doesn’t define features, cannot handle organizational principles (Gestalt Laws), cannot handle context effects, cannot be applied to 3-D objects
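
The four stages above can be sketched as a small pipeline. The feature inventory in `LETTER_FEATURES` and the "fraction of features found" loudness rule are assumptions for illustration; as the CON notes, the model itself does not define the features.

```python
# Hypothetical feature inventory; Pandemonium itself does not define the features.
LETTER_FEATURES = {
    "A": {"oblique_left", "oblique_right", "horizontal"},
    "H": {"vertical_left", "vertical_right", "horizontal"},
    "L": {"vertical_left", "horizontal_bottom"},
}

def feature_demons(image):
    """Stage 2: each feature demon fires if its own feature is present in the input."""
    all_features = set().union(*LETTER_FEATURES.values())
    return {feature for feature in all_features if feature in image}

def cognitive_demons(active):
    """Stage 3: each cognitive demon shouts in proportion to the share of its features found."""
    return {
        letter: len(features & active) / len(features)
        for letter, features in LETTER_FEATURES.items()
    }

def decision_demon(shouts):
    """Stage 4: pick the cognitive demon shouting the loudest."""
    return max(shouts, key=shouts.get)

def recognize(image):
    # Stage 1: the image demon simply passes the sensory input along.
    # Here the input is assumed to be already encoded as a set of feature names.
    return decision_demon(cognitive_demons(feature_demons(image)))

print(recognize({"vertical_left", "vertical_right", "horizontal"}))
```

Adding a new letter only requires listing its component features, which is the flexibility the PRO line describes.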

4
Q

Marr’s Approach

A

Defines the object with respect to the object itself (object-centered); determines the object's primary axis using generalized cones (each has an axis of orientation, a certain location/centre of mass, and an overall size).

Creates shape descriptions of the object at different levels of detail.

Each level of the hierarchy contains info about: axes of cones, arrangement of axes of component cones, internal reference to 3-D description of component models

3-D model description: object-centered, invariant over changes in position of viewer (viewpoint invariance)

Object identification: finds match between 3-D model description and stored catalog of 3-D models of known objects

Specificity Index (level of detail): searches through hierarchy of stored info until info in model and in catalog have the same level of specificity - bottom up (e.g. object -> biped -> human -> male or female)

Adjunct (subcomponent) index (whole-to-parts reference): relational info about components (location, orientation, relative sizes) to help determine object TOP DOWN (e.g. human -> arm -> forearm -> hand -> David)

Parent (supercomponent) index (parts-to-whole reference): as each component is identified it provides info on what the whole object is likely to be TOP DOWN (e.g. hand -> forearm -> arm -> human -> David)

PRO: doesn’t rely on catalog of features, is economical, handles variation and novel stimuli, allows for top-down processing, accounts for organizational principles (gestalt laws)

CON: physiological evidence is questionable, identifies objects by gross features not details
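
A rough sketch of the specificity-index search, descending a stored catalog from general to specific until the observed description supports no further detail. The catalog entries, the component labels, and the subset-matching rule are all hypothetical; this is not Marr's actual matching procedure, only an illustration of the hierarchy walk.

```python
# Hypothetical catalog: each entry lists the components it requires
# and the more specific models stored beneath it.
CATALOG = {
    "object": {"needs": set(), "children": ["biped", "quadruped"]},
    "biped": {"needs": {"two_legs"}, "children": ["human", "bird"]},
    "quadruped": {"needs": {"four_legs"}, "children": []},
    "human": {"needs": {"two_legs", "arms"}, "children": []},
    "bird": {"needs": {"two_legs", "wings"}, "children": []},
}

def identify(observed):
    """Descend the catalog while some child's required components are all observed."""
    node = "object"
    path = [node]
    while True:
        matches = [
            child for child in CATALOG[node]["children"]
            if CATALOG[child]["needs"] <= observed
        ]
        if not matches:
            return path  # model and catalog now share the same level of specificity
        node = matches[0]
        path.append(node)

print(identify({"two_legs", "arms"}))
print(identify({"two_legs"}))
```

With only gross components observed, the search stops at a coarse level such as "biped", which mirrors the CON about identifying objects by gross features.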

5
Q

Recognition By Components

A

Assumes that the visual scene can be decomposed into constant basic elements called geons. Different geons have different non-accidental properties (not an artefact of viewing position but rather a reflection of a property of the world).

Principle of componential recovery: if an object's geons can be determined, then the object can be recognized or identified even if the object is partially obscured

Edge extraction -> detection of non-accidental properties / parsing of regions of concavity -> determination of components -> matching components to ____ representations

PRO: has well-defined components, can handle novel stimuli and variation, is economical

CON: geons not always reliably determined, may be too broad (objects also differ in their details), is viewpoint invariant (yet objects are most easily identified from a canonical (typical) viewpoint)

6
Q

What is the basis of the structural description approach?

A

Is different from image-based models

Image-based models: traditional models of visual perception focus on analyzing aspects of the 2-D retinal image (junctions, features, etc.) and rely on a viewpoint-dependent frame of reference, which makes it difficult to represent a fully 3-D world

Structural-description models: a structural description is a set of symbolic propositions about a particular configuration. These differ in the picture domain (2-D) but are the same in the object domain (3-D); relationships among components are important (e.g. brick joined at midpoint to another brick)
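
A small sketch of a structural description as a set of symbolic propositions. The `Proposition` type, the relation names, and the picture-domain features for the two views are hypothetical; the sketch assumes the structural description has already been recovered for each view and only shows that it stays the same while the 2-D features change.

```python
from typing import NamedTuple

class Proposition(NamedTuple):
    relation: str
    part_a: str
    part_b: str

# Object-domain (3-D) description: the same propositions hold from any viewpoint.
T_SHAPE = frozenset({
    Proposition("joined_at_midpoint", "brick_1", "brick_2"),
    Proposition("perpendicular_to", "brick_1", "brick_2"),
})

# Hypothetical picture-domain (2-D) features for two views of that one object.
front_view = {
    "picture": {"visible_faces": 2, "junctions": ("T",)},
    "structure": T_SHAPE,
}
oblique_view = {
    "picture": {"visible_faces": 5, "junctions": ("L", "Y", "T")},
    "structure": T_SHAPE,
}

def same_object(view_a, view_b):
    """Two views show the same object if their structural descriptions agree."""
    return view_a["structure"] == view_b["structure"]

print(front_view["picture"] == oblique_view["picture"])  # 2-D features differ
print(same_object(front_view, oblique_view))             # 3-D description agrees
```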

7
Q

How does Stanley’s vision work? What is the current state of robotic vision in autonomous vehicles?

A

Stanley used environment sensors, positioning sensors and 6 Pentium M computers running Linux

Environment state consists of multiple maps that together construct a 2-D environment map

Drivable area was determined by laser analysis projected into visual image

Extrapolation was made to similar visual areas beyond laser range: vision initially classified grass as non-drivable (green area) until the lasers scanned it and concluded that grass was drivable; all grass areas in visual range were then reclassified as drivable (red area)

Data continually evaluated by a learning algorithm which can adapt to new terrain
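
A simplified sketch of the extrapolation idea: colors that the lasers have verified as drivable are used to label pixels beyond laser range. The nearest-sample color rule, the threshold, and the RGB values are assumptions made for illustration and are not Stanley's actual learning algorithm.

```python
import math

def classify_far_pixels(drivable_samples, far_pixels, threshold=30.0):
    """Label a far pixel drivable if its color is close to any laser-verified drivable color."""
    return [
        any(math.dist(pixel, sample) <= threshold for sample in drivable_samples)
        for pixel in far_pixels
    ]

# Hypothetical RGB samples from terrain the lasers have scanned.
dirt = [(150, 120, 90), (155, 125, 95)]
grass = [(60, 140, 50), (65, 145, 55)]

# Pixels beyond laser range: one dirt-colored, one grass-colored.
far = [(152, 122, 92), (62, 142, 52)]

print(classify_far_pixels(dirt, far))          # grass is not yet known to be drivable
print(classify_far_pixels(dirt + grass, far))  # after lasers verify grass, it is reclassified
```

Adding the newly verified grass samples changes the labels for every grass-colored pixel in view, which is the adaptation to new terrain described above.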

Vision not used for steering control but for velocity control

NOW: Waymo autonomous vehicles, Tesla enhanced autopilot

Consumer technologies: autonomous cruise control, automatic parking, lane departure warning, pre-collision braking and throttle management

Current technology: no single sensor currently equals human visual perception; some sensors have capabilities that human drivers do not (sensing through fog with radar); equaling or exceeding human sensing capabilities requires a variety of sensors whose data must be integrated to form a unified representation of the roadway and environment
