Seeing in 3D Flashcards
What is 3D vision?
When a 3D object or scene is projected onto the 2D retina, information about the third dimension is lost.
3D objects with very different shapes can give rise to exactly the same retinal image.
There is insufficient information in the retinal image alone to reconstruct the shape of the object
An infinite number of 3D shapes could give rise to a given 2D projection.
Can only be disambiguated by knowing the distance and depth or relative distance of their features.
Concept originates with Berkeley and the veil of perception, whereby the perception of an object and the object itself are two distinct entities and all we can be certain of is the existence of the perception.
What are direct cues for depth and distance?
Sources of information for which there is a direct 1:1 correspondence between physical parameters and physiological signals.
Examples are accommodation, vergence eye movements and binocular disparity.
What are pictorial cues for depth and distance?
Sources of information about depth and distance for which there is no direct correspondence between physical parameters and physiological signals.
Instead they derive from our ability to learn about the relationships between distance and other cues in the environment.
Pictorial because they’re the cues used by artists to depict depth and distance in paintings.
What is accommodation?
A direct cue
The process of adjusting the thickness of the lens of the eye in order to bring something into focus.
When the ciliary muscles are completely relaxed the lens is flat and the eye is focused at infinity.
When the ciliary muscles contract, the lens becomes fatter and the focal length decreases, so that at the limit of our near vision the lens is as fat as it can be.
Theoretically, the effort produced by the ciliary muscles to bring an object into focus, or the value of the nervous signal causing them to contract, could be read out to provide an index of the distance to the plane of focus.
This is not a very useful source of information because the change in lens shape is only significant for distances up to about 2 m.
The readout might specify distance to the point of focus but it provides little information about depth - the relative distance between two points.
Points in front of or behind the plane of focus are blurred, and the blur contains no useful information about whether the point is nearer or further than the plane of focus or by how much.
Recovering 3D shape from accommodation would entail keeping track of multiple readings from successive fixations over time.
A slow process that would only be possible over short distances
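A rough way to see why accommodation is only informative at short range is to express focusing demand in dioptres, the reciprocal of distance in metres. This is a standard optics convention rather than something stated in these notes, and the function names are hypothetical. A minimal sketch:

```python
def accommodation_demand(distance_m: float) -> float:
    """Focusing demand in dioptres for an object at distance_m metres."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return 1.0 / distance_m


def demand_change(near_m: float, far_m: float) -> float:
    """Change in focusing demand when refocusing from far_m to near_m."""
    return accommodation_demand(near_m) - accommodation_demand(far_m)


if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 6.0, 100.0):
        print(f"{d:6.2f} m -> {accommodation_demand(d):.2f} D")
```

On this measure everything from 2 m out to infinity spans only 0.5 D, whereas refocusing from 2 m to 25 cm spans 3.5 D, so any readout of lens state would discriminate distances poorly beyond a couple of metres.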
What are vergence eye movements?
Vergence is the angle formed between the optic axes of the two eyes when we fixate on a particular point in the scene.
The nearer the point, the more the eyes turn inwards, or converge, to fixate it.
Berkeley discounted this as a viable source of information about distance; however, it is useful for distances up to 6m.
Vergence codes for distance not depth therefore depth can only be inferred from multiple fixations.
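The geometry behind vergence can be sketched with simple trigonometry. The sketch below assumes the fixation point lies straight ahead on the midline and uses an eye separation of 65 mm; the function name is hypothetical.

```python
import math


def vergence_angle_deg(distance_m: float, interocular_m: float = 0.065) -> float:
    """Angle in degrees between the two optic axes when fixating a midline point."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    # Each eye rotates inwards by atan((interocular / 2) / distance).
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))


if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 6.0, 12.0):
        print(f"{d:5.2f} m -> {vergence_angle_deg(d):.2f} deg")
```

Under these assumptions the angle is about 3.7 degrees at 1 m but only about 0.62 degrees at 6 m, and it changes very little beyond that, which fits the idea that vergence is mainly useful at near distances.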
What is binocular disparity?
The eyes are set apart by 62-65mm on average.
As a result they have different view points.
When we fixate at a point in a scene they receive slightly different images.
Depending on the position of an object in one’s field of vision, it can form an image at different distances away from the fovea of each eye - disparity.
Disparity is best measured in terms of visual angle.
If the rays from the object project into the right eye at an angle β and into the left eye at an angle δ, then the disparity is given by the difference between the two angles, β − δ.
Disparity increases with depth and with the distance between the eyes, and it also depends on the distance to the object.
Binocular disparity could be very useful for assessing 3D shape because the depth of all points in the image, relative to the fovea, could in theory be measured simultaneously.
Also depends on obtaining an independent reading of the distance.
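To illustrate why disparity on its own does not specify depth, the sketch below computes the disparity between a fixated point and a second point behind it as the difference between the angles each point subtends at the two eyes. It assumes both points lie on the midline and an eye separation of 65 mm; the function names are hypothetical.

```python
import math


def subtended_angle(distance_m: float, interocular_m: float = 0.065) -> float:
    """Angle in radians between the two eyes' lines of sight to a midline point."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def disparity_arcmin(fixation_m: float, depth_m: float,
                     interocular_m: float = 0.065) -> float:
    """Disparity, in minutes of arc, of a point depth_m behind the fixation point."""
    near = subtended_angle(fixation_m, interocular_m)
    far = subtended_angle(fixation_m + depth_m, interocular_m)
    return math.degrees(near - far) * 60.0


if __name__ == "__main__":
    for fixation in (0.5, 1.0, 2.0, 4.0):
        print(f"10 cm of depth at {fixation} m -> "
              f"{disparity_arcmin(fixation, 0.10):.2f} arcmin")
```

The same 10 cm of depth produces a much smaller disparity at 2 m than at 1 m, so a given disparity can only be turned into a depth once the viewing distance is known.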
What did Wheatstone’s (1838) stereoscope study demonstrate?
Demonstrated that binocular disparity gives rise to the perception of depth
Observer sits in front of a pair of mirrors angled so that each eye sees an image of a scene/object, drawn/photographed as if seen from the viewpoint corresponding to that eye.
The disparities in the two drawings correspond to the disparities that would have resulted from the object being projected onto the retina.
The observer fuses the two images so that they are seen as one.
The disparities are interpreted as if the observer were looking at the original 3D scene, which is what the observer sees.
What do stereoanaglyph spectacles tell us about 3D vision?
Red and green filter over each eye
Red filter should only allow red light to pass while a green filter should only allow green light to pass
If we draw an object or a scene from 2 viewpoints 65 mm apart, one in red and one in green, superimpose them and then view them through the anaglyph spectacles, each eye will only see the drawing drawn in the colour of its corresponding filter.
The right eye covered with the red filter will only see the red drawing, the left eye covered with the green filter will only see the green drawing.
The visual system can detect the disparities between the two images in the same way as in the Wheatstone stereoscope and then interprets them as depth.
The images will stand out in 3D.
Wireframe model of a cube. If you look at it through one eye, all disparities are removed and it appears as the standard Necker cube illusion.
Its 3D shape is ambiguous and the percept may flip back and forth between a configuration with the upper face out and one with the lower face out.
When observed through both eyes with the anaglyph spectacles it adopts a stable upper face out configuration - the shape depicted by the disparities between the red and green images.
More elaborate images can be used where the surface of an image varies continuously with depth
Apex of a cone flipping between being nearer and further away
What can random dot stereograms resolve?
Can be argued that the perception of depth from binocular disparity is a prerequisite for 3D shape perception OR that depth perception depends on first having identified shapes.
Images that contain disparity information but not detectable shape information
Julesz (1960) took a pair of identical images consisting of randomly assigned black and white pixels and then displaced a patch of pixels corresponding to a shape by a small amount corresponding to some disparity.
The small gap that was left was filled with random values.
When viewed through a stereoscope the visual system fuses or aligns the pixels in the background of the two images and then detects that there is a disparity between the pixels that are in the foreground part of the image relative to the background and interprets their depth as being different.
Does this happen in normal vision, and is it sufficient to allow the visual system to work out the shape of the patch?
If this experiment is carried out using red and green dots and anaglyph glasses a square can be seen to stand out in the foreground against the background. When one eye is covered the square disappears because there is no information about the shape in each of the red and green images alone
The only information is in the disparity between the two images which we only detect with both eyes open.
The shape is inferred from the disparity.
The further away the stereogram is viewed from, the greater the perceived depth, which demonstrates that disparity has to be scaled for viewing distance.
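The construction described above can be sketched in a few lines. This is a minimal illustration rather than Julesz's actual procedure: the image size, patch size, shift and function name are all assumptions, and the images are plain lists of 0/1 pixels.

```python
import random


def make_stereogram(size=40, patch=12, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 pixels."""
    start = (size - patch) // 2
    stop = start + patch
    if not 0 < shift <= min(start, patch):
        raise ValueError("shift must be positive and fit inside the image and patch")
    rng = random.Random(seed)
    # Both eyes start with the same random dot pattern.
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]
    for y in range(start, stop):
        # Displace the central square patch leftwards in the right eye's image.
        for x in range(start, stop):
            right[y][x - shift] = left[y][x]
        # Fill the gap left behind the displaced patch with fresh random dots.
        for x in range(stop - shift, stop):
            right[y][x] = rng.randint(0, 1)
    return left, right


if __name__ == "__main__":
    left, right = make_stereogram()
    for row in right[:5]:
        print("".join("#" if p else "." for p in row))
```

Each image on its own is just noise, so the square exists only in the correlation between the two images. Shifting the patch leftwards in the right eye's image is intended to give a crossed disparity, so that the square appears in front of the background.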
Evaluate binocular disparity for 3D vision?
Binocular disparity is a useful source of information about depth but not distance
Estimates of disparity thresholds vary according to how they are measured
Using the Howard-Dolman peg test, which involves two pegs placed at different positions and distances in the visual field, the minimum detectable disparity is 0.5 minutes of arc, which is comparable to the diameter of a cone photoreceptor.
So sensitive that stereoacuity, the ability to detect binocular disparity, is referred to as hyperacuity.
In real life this corresponds to a depth of 8 cm at a viewing distance of 6m
Stereopsis can precede shape perception but it does not necessarily do so.
Disparity must be scaled by viewing distance to be correctly interpreted
VIP seats in middle are optimum viewing distance for 3D movies
Viewing distance is necessary for interpreting depth and is measured sufficiently for close distances by accommodation and vergence eye movements but not further away.
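The figure of roughly 8 cm at 6 m can be checked with the small-angle approximation depth ≈ disparity × distance² / eye separation. This is a standard geometric approximation rather than a formula given in these notes, and the sketch assumes an eye separation of 65 mm; the function name is hypothetical.

```python
import math


def depth_for_disparity(disparity_arcmin: float, distance_m: float,
                        interocular_m: float = 0.065) -> float:
    """Approximate depth in metres signalled by a disparity at a viewing distance."""
    disparity_rad = math.radians(disparity_arcmin / 60.0)
    return disparity_rad * distance_m ** 2 / interocular_m


if __name__ == "__main__":
    for d in (1.0, 3.0, 6.0, 12.0):
        print(f"0.5 arcmin at {d:4.1f} m -> {depth_for_disparity(0.5, d) * 100:.1f} cm")
```

With these assumptions a 0.5 arcmin threshold at 6 m comes out at about 0.08 m. Because the depth grows with the square of the viewing distance, the same disparity signals four times as much depth when the distance doubles, which is why disparity has to be scaled by viewing distance.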
What is occlusion?
Pictorial cues - interposition (occlusion)
E.g. some cards seem to be placed further away than others - this is inferred from the fact that the nearer cards occlude those behind them.
When we look at the set up we see that it is different.
The cards that seem to be at the front are actually the furthest away and vice versa
But they have corners cut away so as to allow the further away cards to be seen
This allows the visual system to assume that the further away cards are occluding the nearer ones
Even when we know the layout of the scene we are unable to override the illusion that the smallest card is the nearest one
Knowledge does not override perception, therefore the illusion is cognitively impenetrable.
Perhaps because occlusion is so common in the natural environment that it's safest for the visual system always to assume that if it looks like occlusion, it is.
What is shape and shading?
Pictorial cues - shape and shading
Visual system tends to assume that light is directional and that brighter patches then correspond to directly illuminated surfaces
The darker patches correspond to surfaces in the shade
So the best shape compatible with this can change depending on the orientation of a picture - can appear to be a dome or crater
Visual system assumes that light comes from above
Shadows can be used to convey an impression of depth
The less a shadow is occluded by its corresponding object, the higher it seems above the background
The visual system assumes that the shadow lies on the surface of the background and that less occlusion implies a greater distance between the object and its simulated shadow.
Compatible with properties of surfaces and light sources that the visual system may have learnt through experience
What is aerial perspective?
Aerial perspective is the tendency for things to seem less distinct and bluer the further away they are due to moisture and pollution in the atmosphere
Perception of size is scaled by perception of distance: the same retinal image size will produce a larger or smaller apparent size depending on the object's perceived distance, i.e. where it is placed in an image.
The further away it seems, the larger it must be
What are textural gradients?
Texture gradients - texture elements become smaller and more densely packed as a surface recedes with distance.
Perception of size is scaled by perception of distance: the same retinal image size will produce a larger or smaller apparent size depending on the object's perceived distance, i.e. where it is placed in an image.
The further away it seems, the larger it must be
What is height relative to the horizon?
The higher in the image and the closer to the horizon, the further away the object.
Perception of size is scaled by perception of distance: the same retinal image size will produce a larger or smaller apparent size depending on the object's perceived distance, i.e. where it is placed in an image.
The further away it seems, the larger it must be
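The size scaling that recurs in the pictorial cue cards above can be sketched with simple geometry: an image subtending a fixed visual angle corresponds to a larger object the further away it is taken to be. The function name is hypothetical, and the formula describes the geometry only, not how the visual system actually computes size.

```python
import math


def apparent_size(visual_angle_deg: float, perceived_distance_m: float) -> float:
    """Object size in metres implied by a visual angle at a perceived distance."""
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * perceived_distance_m * math.tan(half_angle)


if __name__ == "__main__":
    for d in (1.0, 10.0, 20.0):
        print(f"1 deg at {d:4.1f} m -> {apparent_size(1.0, d):.3f} m")
```

Doubling the perceived distance doubles the implied size for the same retinal image, so any cue that makes an object seem further away also makes it seem larger.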