Lecture 13 Flashcards by andrea shelton

Gestalt approach to object perception

The whole differs from the sum of its parts.

Only a limited set of interpretations is considered: what is most likely, given what I expect to see.

– Perception is not built up from sensations, but is a result of perceptual organization. [strong top-down influence]
– The mind (somehow) makes simple assumptions about objects in order to recognize them in the environment. (The Gestalt theorists knew nothing about the neuroscience involved.)

Principles of perceptual organization

Eight principles for organizing objects within perceptual scenes have been proposed:

1) Prägnanz (good figure/simplicity),
2) similarity,
3) good continuation,
4) proximity (nearness),
5) common region,
6) uniform connectedness,
7) common fate,
8) meaningfulness.

Prägnanz (good figure/simplicity)

The simplest interpretation wins: you keep your cognitive resources as limited as possible when solving a problem, i.e. NOT overthinking.

When you see a scene, you assume it is the simplest possible version of that scene.

ex: the Olympic rings are seen as five overlapping circles; you wouldn't try to break the figure up into all the possible objects it could be.

• Every stimulus is seen as simply as possible.

• The easiest interpretation takes fewer cognitive resources.

similarity

try to group objects by some feature that they have in common

• Similar things are grouped together

• Color is one measure of similarity, but it could also be shape, texture, orientation, etc.: any feature that is easy to find.

good continuation

when you're looking at lines within a scene, you're going to try to keep things as smooth as possible

our visual system assumes that things continue in nice, smooth curves, rather than breaking off and continuing in a different direction

• Connected points resulting in straight or smooth curves belong together.

– Lines are seen as following the smoothest path

– Holds true even for complex images

common region

you’re going to group objects if they’ve been somehow defined graphically as belonging to one another

elements in the same defined region tend to be grouped together: within a border

proximity

we tend to group things that are a little closer together than other things

things that are near to each other are grouped together
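
As a rough illustration of grouping by proximity (not a model of the visual system), the sketch below puts dots in the same group whenever they are within a chosen distance of a dot already in that group. The function name `group_by_proximity` and the `max_gap` threshold are hypothetical, made up for this example.

```python
from math import dist

def group_by_proximity(points, max_gap):
    """Group points so that near neighbours (within max_gap) share a group."""
    groups = []
    for p in points:
        # Find every existing group with at least one member close to p.
        near = [g for g in groups if any(dist(p, q) <= max_gap for q in g)]
        merged = [p]
        for g in near:
            # p links these groups together, so merge them into one.
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return [sorted(g) for g in groups]

# Two clusters of dots separated by a wide gap.
dots = [(0, 0), (1, 0), (2, 0), (6, 0), (7, 0), (8, 0)]
groups = group_by_proximity(dots, max_gap=1.5)
print(groups)
```

With a small `max_gap` the six dots fall into two groups of three; with a large enough `max_gap` they merge into one, which mirrors how spacing changes what we see as belonging together.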

uniform connectedness

connected regions of visual stimuli are perceived as a single unit

meaningfulness (familiarity)

looking at a complex scene, you’re going to intuitively form groups out of stimuli that seem to go together and form something that means something to you

Stimuli form groups if they appear familiar or meaningful

ex: hidden faces in a picture: we impose meaningful patterns

common fate

as things move together we’re going to see them as an object

Things moving in same direction are grouped together
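
A toy sketch of common fate, assuming each element has a known motion vector `(dx, dy)`: elements whose directions fall into the same angular bin are grouped. The names `group_by_common_fate` and `bin_size` are hypothetical, and the 45-degree bin is an arbitrary choice for illustration.

```python
from collections import defaultdict
from math import atan2, degrees

def group_by_common_fate(elements, bin_size=45):
    """Group moving elements by the (binned) direction of their motion."""
    groups = defaultdict(list)
    for name, (dx, dy) in elements.items():
        if dx == 0 and dy == 0:
            continue  # stationary elements have no direction to share
        angle = degrees(atan2(dy, dx)) % 360
        # Snap the angle to the nearest multiple of bin_size (wrapping at 360).
        bin_index = int(round(angle / bin_size)) % (360 // bin_size)
        groups[bin_index * bin_size].append(name)
    return dict(groups)

# "a" and "b" drift rightward; "c" and "d" drift upward.
birds = {"a": (1, 0), "b": (2, 0.1), "c": (0, 1), "d": (0.1, 3)}
flocks = group_by_common_fate(birds)
print(flocks)
```

Speed is ignored on purpose: "b" moves twice as fast as "a" but still groups with it, because only the direction of motion is compared here.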

do any principles override others?

some are really important (Prägnanz), but they're really heuristics: they help by narrowing down the possible answers

we more or less use them all together

Gestalt theorists were also interested in figure-ground segregation

how you pick out objects from a complex visual scene

what is the figure within a scene and what is the background: what rules can you apply to figure that out

Properties of figure and ground:

– The figure is more “thinglike” and more memorable than ground.
– The figure is seen in front of the ground.

– The ground is more uniform (e.g. one color or texture) and extends behind the figure.

– The contour separating figure from ground belongs to the figure (border ownership).

Factors that determine which area is the figure:

– Elements located in the lower part of displays (bias to seeing things on the bottom as a figure)

– Units that are symmetrical

– Elements that are small tend to be more figure-like (slight bias)

– Units that are oriented vertically (slight bias)

– Elements that have meaning

Gestalt principles operate as...

...heuristics (probabilistic) that give us the means to quickly organize stimuli in the environment.

They try to describe how the visual system will quickly organize things.

They aren't an algorithm, because the same input can give you a different output, BUT they are an attempt to operationalize steps.
- For our purposes, they operate at Marr's 2nd level of analysis, as they give us steps followed in the black box to yield a perceptual result.

Gestalt principles give us methods by which the environment can
be organized, but don’t get far in solving the problems of
identifying occluded objects or seeing objects from different
viewpoints.

Recognition-by-components (RBC) theory tries to go further in
addressing these issues.

Under the Recognition-by-components (RBC) theory

RBC tries to reduce some of the error (the varying responses) that comes from just applying heuristics.

It imposes some organization on the environment by looking for specific features in the environment.

Objects are recognized by volumetric features called geons.

– Theory proposes there are 36 geons that combine to make all 3-D objects. (basic visual building blocks)

– Geons include cylinders, rectangular solids, pyramids, etc.

A basic toolkit for building up the objects we expect to see: imagine the visual system has 36 types of Lego bricks, and anything you see is built up from those bricks.

Properties of geons – how they function

still Marr's second level: it's a model

– View-invariant properties

– Non-accidental properties

– Discriminability

– Principle of componential recovery

View-invariant properties -

aspects of the object that remain visible from (most) different viewpoints.

ex: when you look at a rectangle from different views, you can usually still see its edges

Non-accidental properties -

properties of edges in the retinal image that correspond with the 3-D environment.

you don't have to recognize an object by looking at it in one particular way (e.g. looking at a cylinder from the bottom and seeing only a circle)

Discriminability -

shouldn’t be able to mistake one geon for another geon

the ability to distinguish geons from one another.

Principle of componential recovery -

the ability to recognize an object if we can identify its geons.

To identify an object, you should be able to pick out the component geons that make up that object.

– Overcomes the problem of occlusion.
– Might bridge the gap between top-down and bottom-up processing?
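
The idea of componential recovery can be sketched as a lookup: an object counts as recognized when all of its geons can be picked out of what is visible. This is only a toy under stated assumptions. The object inventories in `KNOWN_OBJECTS` and the function `recognize` are hypothetical, and the actual theory also takes into account how the geons are arranged relative to one another, which is ignored here.

```python
# Hypothetical geon inventories, invented for illustration only.
KNOWN_OBJECTS = {
    "mug": {"cylinder", "curved handle"},
    "flashlight": {"cylinder", "cone"},
    "briefcase": {"rectangular solid", "curved handle"},
}

def recognize(visible_geons):
    """Return every known object whose geons can all be recovered."""
    return [
        name
        for name, geons in KNOWN_OBJECTS.items()
        if geons <= visible_geons  # subset test: every geon was identified
    ]

# Both geons of the mug are recoverable, so it is recognized.
print(recognize({"cylinder", "curved handle"}))

# With the handle hidden, a lone cylinder is not enough to pick an object.
print(recognize({"cylinder"}))
```

The second call returns an empty list: when occlusion hides a geon that cannot be recovered, recognition fails in this sketch.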

A scene contains:

– background elements.

– objects organized in meaningful ways with each other and the background. (the kind of organization that you're used to seeing)

Difference between objects and scenes:

– A scene is acted within (the setting in which you use tools)

– An object is acted upon (tools, things you use)

Research on perceiving gists of scenes

– The gist is a quick understanding and recognition of major elements in a complex picture.

– Mary Potter (1976) showed that people can do this very accurately when a picture is presented for only 250 ms. (Important because our eyes saccade around a visual scene about 3 times a second, always looking for novelty; this behavioral data matches up with what we know about the visual system.)

– Li Fei-Fei (2007) extended this research to demonstrate the range of information that becomes available with more viewing time, from 27 to 500 ms.

– The point is that our visual system needs time to construct complex images, but can do surprisingly well with a brief glimpse: the eye is trained to pick up that info very quickly.

– But how?

Mary Potter (1976)

Participants were given a verbal description ("there's a girl clapping") or an actual picture of the action, then shown 16 other pictures in a row, and asked: was there a girl clapping? They answered YES if the image had been presented.

Showed that people can do this very accurately when a picture is presented for only 250 ms. (Important because our eyes saccade around a visual scene about 3 times a second, always looking for novelty; this behavioral data matches up with what we know about the visual system.)

Li Fei-Fei (2007)

What's the critical time period? Looked at time windows.

Extended Potter's research to demonstrate the range of information that becomes available with more viewing time, from 27 to 500 ms.

Used a masked stimulus: present a stimulus and then a blank screen to take away the memory of that image.

27 ms is shorter than the time it takes for the signal to travel from the retina to V1 (35-40 ms).

By the time you get to 500 ms you can recreate a scene and put a narrative to it.