Lec 10- Flashcards
What did the Gestalt psychologists say?
What does the Full Primal Sketch use?
The whole is greater than the sum of the parts. Though individual lines may be different, they all group together to produce a long horizontal bar. Not just features in isolation.
Full primal sketch: Marr used the Gestalt laws of perceptual organisation. Larger structures, boundaries and regions are made explicit using Gestalt-like grouping rules applied to the primitives in the raw primal sketch.
Full primal sketch some info
What does it provide a list of, what is the goal of the full primal sketch, and how is the task of recognition made easier?
Grouping a few lines into a blob symbol reveals what? Where are these properties present?
New symbols and higher level symbol. Old lower level symbols.
The Raw Primal Sketch provides a list of symbols like edges and line terminations. The goal of the Full Primal Sketch is to group these primitive elements into more meaningful clusters: the resulting compound descriptive elements have new properties, which may make the task of recognition much easier.
For example, ‘spots’ have properties such as position and size. A line of spots can be described in terms of their individual positions but, if the spots are grouped explicitly into a new line symbol, the symbol has the new explicit properties of length, width and orientation.
Similarly, grouping a few lines into a new blob symbol reveals new properties such as shape. In each case, the new properties are actually present in the low-level elements, but the process of forming the higher level symbol makes these properties explicit. The new symbols are thus much more useful for further processing. They do not completely replace the old lower-level symbols but just provide them with a new ‘handle’: the original descriptions of the low-level elements are still available if needed.
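A minimal Python sketch of this idea, not Marr's actual algorithm: a row of spots is grouped into one line symbol whose length, width and orientation become explicit, while the original spots stay reachable through the symbol. The class names, the function name and the use of a principal-axis fit are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Spot:
    x: float
    y: float
    size: float

@dataclass
class LineSymbol:
    length: float
    width: float
    orientation_deg: float
    members: list  # the original spots stay available (the 'handle')

def group_spots_into_line(spots):
    """Form one higher-level line symbol from a row of spots (illustrative only)."""
    n = len(spots)
    mean_x = sum(s.x for s in spots) / n
    mean_y = sum(s.y for s in spots) / n
    # Second moments of the spot positions about their centroid.
    sxx = sum((s.x - mean_x) ** 2 for s in spots)
    syy = sum((s.y - mean_y) ** 2 for s in spots)
    sxy = sum((s.x - mean_x) * (s.y - mean_y) for s in spots)
    # Angle of the principal (long) axis of the point set.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(theta), math.sin(theta)
    # Project each spot onto that axis to measure the extent of the line.
    proj = [(s.x - mean_x) * ux + (s.y - mean_y) * uy for s in spots]
    length = max(proj) - min(proj)
    width = max(s.size for s in spots)
    return LineSymbol(length, width, math.degrees(theta), list(spots))

# Hypothetical input: five spots lying along a diagonal.
spots = [Spot(float(i), float(i), 0.5) for i in range(5)]
line = group_spots_into_line(spots)
```

Length and orientation were implicit in the spot positions all along; the grouping step just computes them once and stores them on the new symbol.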
The processes of visual grouping were studied extensively by the Gestalt psychologists, and one interpretation of their famous dictum “the whole is greater than the sum of its parts” is in terms of the higher order descriptions that emerge when primitive descriptive elements are combined into higher-order symbols.
The Gestalt psychologists formulated a set of grouping rules to describe typical human perceptions.
(All individual lines differ but group together to produce a long horizontal bar, so grouping = larger structure. The Full Primal Sketch used the Gestalt laws of perceptual organisation.)
Grouping rules
Proximity
Proximity: we group into rows rather than columns when the nearest elements are to the right, not underneath. The reverse is true if the vertical neighbours are closer: we then see columns.
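A toy Python sketch of the proximity rule for a regular grid of dots, assuming the only thing that matters is which gap is smaller. The function name and parameters are made up for illustration.

```python
def proximity_grouping(dx, dy):
    """Predicted organisation of a regular dot grid from proximity alone.

    dx: horizontal gap between neighbouring dots
    dy: vertical gap between neighbouring dots
    """
    if dx < dy:
        return "rows"      # nearest neighbour is to the side
    if dy < dx:
        return "columns"   # nearest neighbour is above or below
    return "ambiguous"     # equal gaps: proximity alone does not decide
```

This ignores the other grouping rules, which can compete with proximity and sometimes win.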
Similarity: can override proximity. Here the vertical gaps are bigger than the horizontal gaps, so by proximity we should see rows, and we would if the elements weren’t coloured. When they are coloured, we see similar objects together and the figure is organised into columns of similar elements (e.g. by brightness). But proximity can sometimes override similarity.
Closure: closure of elements. Symmetry is important for this. Basically, with outlines of objects you fill in the gap in the outline, so the top part closes with the bottom part to give one complete figure (e.g. a hidden message), not individual parts.
Closure can also make us see contours that aren’t actually there, because our brains are trying to achieve closure: illusory contours arise as the brain fills the gaps. The whole is greater than the parts, so we see a panda, for example: even though the white is the same, we see an illusory contour as the brain fills the space between its ears.
Good continuation
Symmetry
Continuing on so we see them together.
Symmetry: makes a greater whole, but we do see both parts individually as well. So it binds the two parts together in a greater whole. When we break that symmetry, it now looks like two unconnected objects, so symmetry is important in binding things together. Clearly defined axis of symmetry = stronger orientation.
Amodal completion
Not exactly a grouping rule. Parts of the image we can’t see are completed: we suppose this character has legs when they are cut off, even though we don’t see them. So it binds what is visibly there with what is implied to be there, and groups them together in that way.
Works in harmony with good continuation: we see two sticks crossing behind a tree trunk and we amodally complete the missing part of the image behind the tree. The completion may be wrong.
Good continuation= simpler overall image
Common fate
The tendency to group things moving in the same way
If staying still, we can’t see it as well; if elements are moving together, they probably belong to a common object.
Segments objects from a scattered background: movement helps group the elements together so the object is easier to see, but when stationary it can’t be picked out, e.g. a moth. So prey stays stationary when camouflaged, so that it is not distinguished from the background.
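Common fate can be sketched in Python as grouping elements that share (roughly) the same velocity. This is only an illustration: the function name, the tolerance and the velocity values are all made up.

```python
def group_by_common_fate(velocities, tol=0.1):
    """Group element indices whose velocity vectors match within tol."""
    groups = []  # each entry: (representative velocity, [member indices])
    for i, (vx, vy) in enumerate(velocities):
        for (rx, ry), members in groups:
            if abs(vx - rx) <= tol and abs(vy - ry) <= tol:
                members.append(i)
                break
        else:
            # No existing group moves this way, so start a new one.
            groups.append(((vx, vy), [i]))
    return [members for _, members in groups]

# Hypothetical dots: 1 and 3 move right together, the rest are stationary.
velocities = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
groups = group_by_common_fate(velocities)
```

If every velocity is zero, everything falls into one group: with no motion, common fate gives nothing to separate the moth from its background.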
The Marroquin pattern
Spontaneous organisation and reorganisation: your brain is trying to group things, so it switches between groupings as it can’t fixate on a set structure. There are lots of competing structures with different grouping rules. It can’t settle on a solution, so it cycles around all the solutions.
This shows these rules are always operating. Grouping by similarity allows us to do this.
Figure ground segregation and competition between grouping rules
An important stage that follows grouping is the assignment of figure and ground. By ‘figure’, we usually mean, object, though it could be a collection of objects (e.g. a pile of trash, where the objects are grouped together) or a flat 2D image (e.g. a painting), where the content is enclosed by the frame. By ‘ground’, we mean the space between the objects, sometimes called background.
In the main, the visual system is very good at this: look around you and consider the various challenges that face the visual system in performing this task. On occasions though, the decision can be ambiguous, as in Rubin’s vase in Fig 6. This is another example of an ambiguous figure: is it a vase or two faces? Note that figure and ground are being assigned differently in each case. This task of border ownership (whether the boundary is owned by a white region or a black region) is thought to be performed by cells in V2 (Zhou et al, 2000). So the question is whether the boundary is owned by the white region (the vase) or the black region (the faces); V2 cells are thought to decide this.
Competition between grouping rules
Closure and proximity can work together, but if we add elements and close them off, then closure wins.
Proximity vs similarity (rows and columns): if we pull the elements apart, proximity wins, but sometimes similarity wins. Can’t put the rules in a hierarchy.
From the raw primal sketch to full primal sketch
Marr’s technique for getting from the Raw Primal Sketch to the Full Primal Sketch is really just an implementation of these simple grouping principles.
But whereas the Gestalt psychologists simply described grouping rules, Marr provides some (admittedly rather vague) theoretical justification by pointing out that image elements that share any physical similarity (e.g. scale, brightness, orientation) are likely to arise from some common cause in the external world (e.g. a group of points moving together are really very likely to belong to the same object). In other words, Marr linked the grouping rules, which apply to the proximal stimulus, back to the distal stimulus.
Pragnanz
General principle to all grouping rules.
Law of good figure = simplicity.
Simplicity = compared with lightness contrast and orientation.
Texture
Simple images don’t imply pictorial relief, they are just lines on a screen, but they have visual texture. All similar elements group by similarity to form 2 different textures = a texture boundary between them. Another route to figure-ground segregation.
Texture is a key aspect that emerges from grouping in visual perception. Grouping line elements by orientation can quickly reveal distinct regions, a process known as “preattentive” segmentation, indicating rapid visual processing. Texture descriptions aid in recognition and depth perception, and texture boundaries, highlighted in the Full Primal Sketch, help identify object boundaries not visible in the Raw Primal Sketch.
Experiments with texture patterns help explore human vision’s grouping processes. The ability or inability to segment textures may reflect natural properties. These experiments prompt us to consider the brain’s wiring and the role of filtering processes by simple and complex cells in texture segmentation.
Criticisms
Visual search: the odd item pops out and the rest segment away (the pop-out phenomenon). When the search takes longer, this is a conjunction search.
We can do experiments that manipulate the number of distractors (the independent variable) and measure reaction time. With pop-out, adding distractors still doesn’t slow the search down; in conjunction search, the greater the number of distractors, the more we have to search.
Grouping and parallel search can be driven by depth info, so information can get back down from the 2-1/2D sketch to the primal sketch. Either way, a strict feed-forward arrangement, with each stage solving a different task, cannot be true.
Criticisms once again
In Marr’s scheme, grouping takes place in the primal sketch based on 2D retinal coordinates. In Fig 7, however, the odd item out is defined according to its three-dimensional geometry, implying a deeper level for grouping than is often supposed.
Note that the number and orientation of lines [i.e. the primitives in the raw primal sketch] is identical for all of the items in Fig 7, so the grouping that allows the odd-item out to be recognised has access to information of an order higher than a simple edge description.
What may the grouping rules be, compared to Marr’s scheme?
When will pop out occur?
In the visual search paradigm
Grouping rules may be deeper than supposed in Marr’s scheme, because stimuli can be created in which perception of depth plays an important contribution to grouping.
Pop out will occur so long as basic features for the target are arranged differently from the distractors.
Visual search: if the time taken to search for a target does not depend on the number of distractors in the display, the graph of results (reaction time against number of distractors) will be a flat line, i.e. the reaction time stays the same however many distractors there are.
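The flat line versus rising line can be checked by fitting a slope to reaction time against display size. A short Python sketch follows; the function name is made up, and the reaction times are invented purely for illustration, not experimental data.

```python
def search_slope(set_sizes, reaction_times_ms):
    """Least-squares slope of reaction time against display size (ms per item)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(reaction_times_ms) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(set_sizes, reaction_times_ms))
    den = sum((x - mean_x) ** 2 for x in set_sizes)
    return num / den

# Made-up numbers for illustration only.
set_sizes = [4, 8, 16, 32]
popout_rt = [450, 450, 450, 450]        # flat: RT independent of distractors
conjunction_rt = [500, 600, 800, 1200]  # RT grows with every added item

popout_slope = search_slope(set_sizes, popout_rt)
conjunction_slope = search_slope(set_sizes, conjunction_rt)
```

A slope near zero is the flat-line (pop-out, parallel search) case; a clearly positive slope means each extra distractor costs extra search time.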
Is Pragnanz associated with the Gestalt psychologists?
What about the 2-1/2D sketch is true?
Yes: Pragnanz (the law of good figure) is a Gestalt principle.
2-1/2D sketch: the types of things made explicit at this representation level would be useful for navigation, e.g. depth cues and surface properties are valuable for judging distances, avoiding obstacles etc. The 2-1/2D sketch describes only the visible parts of the scene.
Correct perception of the size of an unfamiliar object requires
That the visual scene contains reliable depth cues.
If image features don’t group
They segment
Does grouping involve top-down and bottom-up processing?
Yes, grouping involves both top-down and bottom-up processing.
Marr’s 2-1/2D sketch. What did we see in it? What’s the problem?
Marr’s 2-1/2D Sketch describes surface properties of the image (with no regard for the objects for which those surfaces are part of) and includes a depth map of the 3D layout of the world from the current viewpoint (i.e. it is viewer centred).
Even though images are flat there are many ways to recover depth information. Collectively, these are called depth cues (some of which are shown in Fig 1) and include such things as convergence and stereopsis from binocular vision. However, as shown in Fig 2, even with only the type of information potentially available from a single (i.e. monocular), black and white Full Primal Sketch, we can make good sense of depth using static pictorial depth cues.
Depth cues: what are they?
In vision, a cue is something that allows us to make inferences about the stimulus. Thus, in the case of visual depth cues, certain aspects of the retinal image allow inferences to be drawn about (relative) depth and the order of objects in the 3D world.
Computational theories of vision are really just a formal elaboration of the properties of the most useful cues.
Consider a cube: what info is in the 2-1/2D sketch?
Retinal image is flat but the world is 3D.
What 2 categories do depth cues fall into
2 types of visual cues
Occluding contours: segment the object from the background.
Surface orientation discontinuities: made explicit at this stage.
Visible surfaces: only what is seen; what lies behind is not made explicit.
Surface orientations, i.e. slant and tilt.
Distance of each surface from the observer (an estimate of distance).
The 3rd dimension is embedded in the 2D retinal image.
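One way to picture what the 2-1/2D sketch holds for the cube is as a small data structure. The Python sketch below is only an illustration of the list above: the class names, field names and all the numbers are hypothetical, not taken from Marr.

```python
from dataclasses import dataclass, field

@dataclass
class SurfacePatch:
    name: str
    slant_deg: float    # how far the surface is angled away from the viewer
    tilt_deg: float     # direction in which it slants
    distance_m: float   # estimated distance from the observer

@dataclass
class TwoAndHalfDSketch:
    # Viewer-centred: only surfaces visible from the current viewpoint.
    visible_surfaces: list = field(default_factory=list)
    occluding_contours: list = field(default_factory=list)
    orientation_discontinuities: list = field(default_factory=list)

    def nearest_surface(self):
        return min(self.visible_surfaces, key=lambda s: s.distance_m)

# Hypothetical cube seen from above one corner: three faces visible, three hidden.
sketch = TwoAndHalfDSketch(
    visible_surfaces=[
        SurfacePatch("top", 60.0, 90.0, 2.0),
        SurfacePatch("left", 45.0, 180.0, 2.1),
        SurfacePatch("right", 45.0, 0.0, 2.2),
    ],
    occluding_contours=["outer silhouette of the cube"],
    orientation_discontinuities=["top/left edge", "top/right edge", "left/right edge"],
)
```

The three hidden faces have no entry at all, which matches the point that what lies behind the visible surfaces is not made explicit at this level.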
Visual and oculomotor, which splits into convergence and accommodation. The convergence angle can tell us how far away the object is, as our eyes swing in together as objects come nearer. Accommodation works the same way, if we can get feedback from the state of the lens.
Oculomotor cues: work for one surface or object at a time, are only useful over short distances, and so are a fairly minor source of depth info.
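The geometry behind convergence can be sketched in Python. Assuming the object is straight ahead and the eyes are a distance a apart, the convergence angle is 2·atan(a / 2d) for an object at distance d. The interocular distance of 0.065 m is an assumed typical value, and the function names are made up for illustration.

```python
import math

def convergence_angle_deg(distance_m, interocular_m=0.065):
    """Angle between the two lines of sight when fixating at distance_m."""
    return math.degrees(2 * math.atan(interocular_m / (2 * distance_m)))

def distance_from_convergence(angle_deg, interocular_m=0.065):
    """Invert the relation: recover distance from the convergence angle."""
    return (interocular_m / 2) / math.tan(math.radians(angle_deg) / 2)

near_angle = convergence_angle_deg(0.25)   # reading distance
mid_angle = convergence_angle_deg(1.0)
far_angle = convergence_angle_deg(10.0)    # across a room
```

The angle shrinks quickly with distance, so beyond a few metres it barely changes, which fits the point that oculomotor cues are only useful over short distances.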
Visual cues are more valuable.
Visual cues split into monocular and binocular. A monocular cue doesn’t require comparison between the two eyes, so with one eye closed monocular cues to depth are still available.
Static pictorial cues
Monocular. Don’t involve motion. These determine how stationary 3D objects appear when projected onto a flat 2D image. Major category of depth cue.
Elevation
Pictorial cues include elevation (height) in the visual field.
For things above the horizon, more distant objects appear progressively lower in the image (this is not really apparent in Fig 2, but you could ‘sketch-in’ a couple of flying swallows to get the idea!).
Thus, vertical position in the image can provide an important cue to depth. Note, however, that just because an object is higher in the image and below the horizon, it does not mean that it has to be further away, just that it probably is. This is the essence of a cue—typically, it is something that generally points us in the right direction.
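The elevation cue falls out of simple pinhole projection, where image height = focal length × world height / distance. A small Python sketch follows; the eye height, focal length, distances and function name are all assumed values for illustration.

```python
def image_height(world_height_m, distance_m, focal_length=1.0):
    """Vertical image position of a point under pinhole projection.

    world_height_m is measured relative to eye level (negative = below the eye),
    so the horizon projects to 0.
    """
    return focal_length * world_height_m / distance_m

eye_height = 1.6  # assumed eye height in metres

# Points on the ground (below the horizon), near and far.
ground_near = image_height(-eye_height, 2.0)
ground_far = image_height(-eye_height, 20.0)

# Birds flying 5 m above eye level (above the horizon), near and far.
bird_near = image_height(5.0, 10.0)
bird_far = image_height(5.0, 100.0)
```

Below the horizon the far point sits higher in the image than the near one; above the horizon the far bird sits lower. Both move towards the horizon line as distance grows.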
Linear perspective
One of the most obvious pictorial cues to depth is that of linear perspective: parallel lines in the 3D world that recede in depth project to converging lines in the 2D image (see Fig 4). Example: train tracks.
Converging image lines are thus a potential cue to depth. Notice, too, that they can provide important cues to 3D boundaries. Indeed, although the central line in Fig 4 is in fact straight, one can almost ‘feel’ it kink halfway up the figure.
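Why parallel rails converge in the image can be shown with the same pinhole projection idea: image position = focal length × world position / distance. In the Python sketch below, the track width, focal length and function name are assumptions for illustration.

```python
def rail_separation_in_image(track_width_m, distance_m, focal_length=1.0):
    """Image-plane gap between two parallel rails at a given distance."""
    left = focal_length * (-track_width_m / 2) / distance_m
    right = focal_length * (track_width_m / 2) / distance_m
    return right - left

# The rails stay 1.5 m apart in the world (assumed width) at every distance.
separations = [rail_separation_in_image(1.5, z) for z in (2.0, 10.0, 50.0)]
```

The world separation never changes, but the image separation shrinks in proportion to 1/distance, so the projected rails head towards a single vanishing point.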