3D vision 2 Flashcards

Question 1

Q

what do we need to estimate in order to process depth and distance

Answer

A

depth order: closest vs furthest objects
depth intervals: how much further away from us is one object than another
absolute distance: how far away is an object from us
shape: what is the 3D shape of the object and how does it vary in depth

Question 2

Q

how bad is the single vantage point

Answer

A

all information would be lost in a world of points

but this is not the case for a world of surfaces as we have other information available

Question 3

Q

how can occlusion tell us about depth order

Answer

A

surfaces are generally opaque
hence there is information in the covering of one surface by another
occlusion creates characteristic T-shaped intersections of contours (intersection of edges)
gives information about depth order rather than distance - relational rather than absolute

Question 4

Q

how can a bisection in linear perspective inform about depth interval

Answer

A

observer O views a picture containing points a and c

bisection in the picture plane is given by dot b

if a and c are images of points A and C on the ground surface, the bisection of AC on the ground is P, which corresponds to p in the picture
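
a minimal numerical sketch of this geometry, assuming a pinhole observer at eye height 1.6 m above a flat ground plane and a vertical picture plane 1 m away (the function names and all numbers are illustrative assumptions, not from the lecture):

```python
def image_height(ground_distance, eye_height=1.6, picture_distance=1.0):
    """Height below the horizon, in the picture plane, of a ground point.

    Assumes a pinhole eye `eye_height` above a flat ground plane, looking
    horizontally at a vertical picture plane `picture_distance` away.
    By similar triangles the image height is proportional to 1 / distance.
    """
    return picture_distance * eye_height / ground_distance


def ground_distance_of(image_h, eye_height=1.6, picture_distance=1.0):
    """Back-project a picture-plane height onto the ground plane."""
    return picture_distance * eye_height / image_h


a = image_height(2.0)                  # image of ground point A at 2 m
c = image_height(6.0)                  # image of ground point C at 6 m
b = (a + c) / 2.0                      # bisection in the picture plane
p = image_height((2.0 + 6.0) / 2.0)    # image of the true ground bisection P

print(round(b, 3), round(p, 3))          # b and p are different picture points
print(round(ground_distance_of(b), 3))   # ground distance that b corresponds to
```

under these assumptions the true ground midpoint P (4 m) projects to p, which lies closer to the horizon than the picture midpoint b; b corresponds to a ground point only 3 m away, so equal picture intervals do not mean equal depth intervals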

Question 5

Q

how can vergence and vertical disparity inform about absolute distance

Answer

A

vergence angle: rotation of the two eyes to fixate on a particular fixation point (larger for closer objects)

vertical disparities: the difference in vertical height of the sides of a projected object (projections onto the eyes distort)

because of the inverse square law, to interpret disparities they must be scaled using some estimate of viewing distance (disparity scaling)

thus disparity information on its own doesn't give us depth; it has to be scaled relative to viewing distance
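
a rough sketch of why scaling is needed, assuming symmetric fixation on the midline and an interocular separation of 6.5 cm (an assumed typical value); the function names are hypothetical and the last function uses the small-angle approximation:

```python
import math


def vergence_angle(distance_m, interocular_m=0.065):
    """Vergence angle (radians) for symmetric fixation at `distance_m`."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def relative_disparity(near_m, far_m, interocular_m=0.065):
    """Horizontal disparity (radians) between two points on the midline,
    computed as the difference between their vergence angles."""
    return vergence_angle(near_m, interocular_m) - vergence_angle(far_m, interocular_m)


def depth_from_disparity(disparity_rad, viewing_distance_m, interocular_m=0.065):
    """Disparity scaling (small-angle approximation):
    depth interval ~ disparity * distance**2 / interocular separation."""
    return disparity_rad * viewing_distance_m ** 2 / interocular_m


# The same 10 cm depth interval produces a much smaller disparity further away.
near_view = relative_disparity(0.5, 0.6)
far_view = relative_disparity(1.0, 1.1)
print(near_view > far_view)
```

the same disparity value therefore maps onto roughly four times the depth interval when the viewing distance doubles, which is why an independent estimate of distance (e.g. from vergence) is needed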

Question 6

Q

how can we get shape information from shading

Answer

A

surfaces often have even / homogeneous reflection properties

different amounts of light will be reflected according to the orientation of the surface with respect to the light source - shading

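shading can indicate whether a surface is protruding (convex) or concave
hollow face illusion: a concave mask is seen as a normal convex face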
prior experience of viewing faces can have an effect on our interpretation of shape
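
a minimal sketch of how orientation changes reflected light, assuming a Lambertian (perfectly matte) surface; the Lambertian model and the function name are assumptions made for illustration, the notes only say that surfaces often have homogeneous reflection properties:

```python
import math


def lambertian_intensity(surface_normal, light_direction, albedo=1.0):
    """Reflected intensity under a Lambertian model: albedo * max(0, n . l).

    Both vectors are normalised here, so only their directions matter.
    """
    def unit(v):
        length = math.sqrt(sum(x * x for x in v))
        return [x / length for x in v]

    n = unit(surface_normal)
    l = unit(light_direction)
    return albedo * max(0.0, sum(a * b for a, b in zip(n, l)))


light = (0.0, 0.0, 1.0)                       # light arriving along the line of sight
facing = lambertian_intensity((0.0, 0.0, 1.0), light)
tilted_left = lambertian_intensity((-math.sin(math.radians(60)), 0.0, math.cos(math.radians(60))), light)
tilted_right = lambertian_intensity((math.sin(math.radians(60)), 0.0, math.cos(math.radians(60))), light)
print(facing, round(tilted_left, 3), round(tilted_right, 3))
```

with frontal lighting, patches tilted by the same angle in opposite directions reflect the same amount of light, so the shading pattern alone cannot say whether the surface is protruding or concave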

Question 7

Q

Why are linear perspective, shading and occlusion regarded as “cues” rather than sources of information?

Answer

A

The term “cue” implies uncertainty and ambiguity, often reinforced by demonstrations of “apparent” uncertainty/ambiguity (e.g. Ames room).
Different sources of information have different reliabilities
On the contrary, the Ames room tells us that perspective is a very important source of information, overriding our “knowledge” about the size of familiar objects.
Therefore it is important to separate whether the information exists (computational theory: a proper analysis of the properties of the optic array) from whether we can use it.
e.g. Accommodation of the eye’s lens could potentially provide information about focus distance, but the human visual system is very poor at using it.

Question 8

Q

how well do we see depth in random-dot stereograms (RDS)

Answer

A

How far must two surfaces be separated in depth before you are just able to see that one surface is in front of or behind the other?
Surfaces with changes in depth
Measuring disparity thresholds: the amplitude of the corrugation that is just required for you to be able to perceive the corrugations (low freq vs high freq)
(compare “decreasing correlation” from 3D vision 1)
To find a threshold correlation at which you can just see the square in front or behind

By 8 cycles per image it is hard to resolve the corrugations

lowest thresholds (17 – 40 arcsec peak-to-trough disparity) at a frequency of depth modulation of 0.3 cycles/deg
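
to make "peak-to-trough disparity" and "frequency of depth modulation" concrete, here is a small sketch of a sinusoidal disparity profile; it assumes the corrugation varies with vertical position, and the function name is hypothetical:

```python
import math


def corrugation_disparity(y_deg, peak_to_trough_arcsec, cycles_per_deg):
    """Disparity (arcsec) at vertical position `y_deg` (degrees of visual angle)
    for a sinusoidal depth corrugation.

    The sinusoid's amplitude is half the peak-to-trough disparity.
    """
    amplitude = peak_to_trough_arcsec / 2.0
    return amplitude * math.sin(2.0 * math.pi * cycles_per_deg * y_deg)


# One full cycle at 0.3 cycles/deg spans about 3.33 deg; sample it every 0.01 deg.
profile = [corrugation_disparity(i * 0.01, 40.0, 0.3) for i in range(334)]
print(round(max(profile) - min(profile), 1))   # recovered peak-to-trough disparity
```

a threshold experiment then asks how small the peak-to-trough value can be made, at each corrugation frequency, before the corrugations can no longer be seen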

Question 9

Q

what do comparisons between disparity and parallax threshold functions show

Answer

A

both functions show maximum sensitivity around 0.3 c/deg corrugation frequency
motion parallax thresholds are higher than disparity thresholds
same tuning - due to how we extract information from the stimulus
suggests we do have dedicated binocular disparity detectors

Question 10

Q

what can we learn from texture gradients (Gibson, 1950)

Answer

A

surfaces generally have a texture of smaller elements at a visible scale
because of Euclid’s law there will be a gradient of texture size in the image (similar triangles)
geometrically highly related to linear perspective
Works particularly well when the elements in the scene are, on average, at the same spatial scale, so that the scaling of these elements in the retinal image depends on their distance from you
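
the size gradient can be sketched with the visual angle subtended by identical elements at increasing distances; the 5 cm element size and the distances are assumed values, and the function name is hypothetical:

```python
import math


def visual_angle_deg(physical_size_m, distance_m):
    """Visual angle (degrees) subtended by a frontal element of the given size."""
    return math.degrees(2.0 * math.atan(physical_size_m / (2.0 * distance_m)))


# Identical 5 cm texture elements at 1, 2, 4 and 8 m from the observer.
distances = [1.0, 2.0, 4.0, 8.0]
angles = [visual_angle_deg(0.05, d) for d in distances]
for d, angle in zip(distances, angles):
    print(d, round(angle, 3))
```

each doubling of distance roughly halves the image size of an element, so a surface covered in same-sized elements produces a smooth gradient of image size with distance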

Question 11

Q

how can texture gradients provide shape information

Answer

A

Assumption: The texture elements are assumed to be black discs and the changes in their size and ellipticity are assumed to result from the shape of the surface on which they sit.
The pattern is therefore interpreted as a particular 3D shape (concave or convex) rather than as a flat 2D pattern of ellipses that differ in size and shape

Question 12

Q

what information is there in texture gradients

Answer

A

foreshortening: ratio of width to length
how elliptical a texture element is
orientation and distance

assumes the pattern is homogeneous - texture element is the same everywhere on the surface (physical size)
the pattern is isotropic - texture element is the same in all directions (disc not ellipse)
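
a minimal sketch of the foreshortening relation under these two assumptions, treating each element as a circular disc viewed from far enough away that projection is approximately orthographic; here aspect ratio means short axis divided by long axis of the projected ellipse, and the function names are hypothetical:

```python
import math


def aspect_ratio_from_slant(slant_deg):
    """Projected short/long axis ratio of a disc on a surface slanted by
    `slant_deg` away from frontoparallel (orthographic approximation)."""
    return math.cos(math.radians(slant_deg))


def slant_from_aspect_ratio(aspect_ratio):
    """Invert the relation: slant (degrees) implied by a projected aspect ratio,
    valid only if the element really is a disc (isotropy assumption)."""
    return math.degrees(math.acos(aspect_ratio))


for slant in (0, 30, 60, 80):
    print(slant, round(aspect_ratio_from_slant(slant), 3))
```

if the isotropy assumption is false (the elements are really ellipses), the same calculation returns a slant that is not there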

Question 13

Q

does foreshortening always lead to the perception of a slanted surface?

Answer

A

a change in orientation, i.e. surface slant, causes foreshortening
if the field of view is reduced we lose information about the texture gradient
texture gradient seems to be one of the most important things in giving us information about slant: foreshortening and changes in interspacing

Question 14

Q

what did Todd et al. (2005) demonstrate about the effect of field of view on the ability to extract information from plaid and blob stimuli

Answer

A

Stimuli were either convex or concave, and people were asked to discriminate between these or to indicate the perceived slant of the surface
Measured perceptual gain as a function of field of view
Blobs - Lower accuracy for small field of view than plaids
For blob patterns, information from spatially separated texture elements must be interpolated
Plaids - higher accuracy for small field of view than blobs
For plaid patterns there is a smooth and continuous change - linear perspective

Question 15

Q

what do we know about multiple cues

Answer

A

Often there are multiple cues in the scene
In psychophysical experiments we often isolate individual cues to understand their role in a task
Kraft & Brainard (1999)
Important to think about how the visual system deals with multiple sources of information as not all are available at the same time and they may have conflicting information
But the multiple cues in the environment may each have their own advantages and disadvantages.
The magnitudes of different cues vary in a particular way relative to the change in the stimulus, and they are correlated with one another, as they rely on similar underlying processes
Can think about when their relevance changes to assess which are used when

Question 16

Q

what is perceptual gain

Answer

A

perceived slant / true slant

Question 17

Q

how are cues combined

Answer

A

Cues are assigned different weights depending on the context and the task. The weights assigned can vary between stimulus conditions and between observers
Models of cue combination, such as Bayesian approaches and perceptual learning, assume that the visual system computes the most probable percept. The weights assigned in these models take account of the reliability of each cue and the likelihood of alternative percepts

imprecise or incomplete cues:
• Occlusion provides information on depth order but not depth interval.
• Texture gradients are only reliable for evenly textured surfaces.
• Horizontal disparities do not reveal absolute distance.

Question 18

Q

what is the optimal combination of disparity and texture cues to slant

Answer

A

complex changes in weight given to the cues depending on slant and viewing distance

reliability of texture: increases with increasing slant

reliability of binocular disparity: decreases as viewing distance is increased (inverse square)

Question 19

Q

what do we know about the reliability of individual cues

Answer

A

we can predict weights for different slants and distances

weight is inversely related to the just noticeable difference (JND) in slant (in the standard reliability-weighted model, reliability is proportional to 1/JND^2)
if the cue allows fine discrimination between slant angles it is more reliable and will be given a higher weight

the reliability of texture should increase with increasing slant

the reliability of binocular disparity should decrease as viewing distance is increased

the weights must sum to one so the texture weight appears to increase overall with distance but this is just because the disparity weight decreases
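
a small sketch of how weights could be derived from single-cue JNDs, assuming reliability is taken as 1/JND^2 as in standard maximum-likelihood cue-combination models; the JND values and the function name are made up for illustration:

```python
def cue_weights(jnd_texture, jnd_disparity):
    """Normalised weights (texture, disparity) from single-cue JNDs.

    Reliability is assumed to be 1 / JND**2; each weight is that cue's
    share of the total reliability, so the two weights sum to one.
    """
    reliability_texture = 1.0 / jnd_texture ** 2
    reliability_disparity = 1.0 / jnd_disparity ** 2
    total = reliability_texture + reliability_disparity
    return reliability_texture / total, reliability_disparity / total


# Texture JND held fixed at 4 deg; disparity JND grows with viewing distance.
near = cue_weights(jnd_texture=4.0, jnd_disparity=2.0)
far = cue_weights(jnd_texture=4.0, jnd_disparity=4.0)
print(near, far)
```

the texture weight rises from near to far even though the texture JND has not changed, purely because the disparity cue has become less reliable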

Question 20

Q

how are cues combined

Answer

A

when disparity and texture cues don’t agree, the relative weighting of the cues will lead to an estimate that is between the two

cue 1 is more reliable (higher peak likelihood, smaller spread) than cue 2 and so more weight is given to it

when both cues are present the estimate is closer to the more reliable cue
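
this can be sketched by multiplying two likelihood functions and taking the peak; the sketch assumes Gaussian likelihoods and searches a 0.1 deg grid of candidate slants, and the slant values, spreads and function names are illustrative assumptions:

```python
import math


def gaussian_likelihood(slant, cue_estimate, sigma):
    """Likelihood of a candidate slant given one cue's estimate and spread."""
    z = (slant - cue_estimate) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def most_probable_slant(est_1, sigma_1, est_2, sigma_2):
    """Peak of the product of the two likelihoods, found by grid search
    over candidate slants from 0 to 90 deg in 0.1 deg steps."""
    candidates = [i / 10.0 for i in range(901)]
    return max(
        candidates,
        key=lambda s: gaussian_likelihood(s, est_1, sigma_1)
        * gaussian_likelihood(s, est_2, sigma_2),
    )


# Cue 1 says 30 deg with a small spread; cue 2 says 40 deg with a larger spread.
combined = most_probable_slant(30.0, 2.0, 40.0, 4.0)
print(combined)
```

the combined estimate falls between the two single-cue estimates and sits nearer to cue 1, the cue with the narrower, taller likelihood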

Question 21

Q

what predictions can we make from the reliability of individual cues

Answer

A

Using the weights for each cue alone, Hillis et al. (2004) then predicted the participant’s just noticeable differences when both texture and disparity cues are present.
The predictions match the responses collected from the participant well
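
a sketch of the kind of prediction involved, using the standard maximum-likelihood formula for the combined JND; this is an assumed textbook form rather than necessarily the exact procedure in Hillis et al. (2004), and the JND values are made up:

```python
import math


def predicted_combined_jnd(jnd_texture, jnd_disparity):
    """Predicted two-cue JND if the cues are combined optimally.

    Assumes JND is proportional to the standard deviation of each cue's
    estimate, so variances combine as (s1^2 * s2^2) / (s1^2 + s2^2).
    """
    var_t = jnd_texture ** 2
    var_d = jnd_disparity ** 2
    return math.sqrt(var_t * var_d / (var_t + var_d))


combined_jnd = predicted_combined_jnd(3.0, 4.0)
print(round(combined_jnd, 2))
```

under this model the predicted two-cue JND is always smaller than either single-cue JND, which is the pattern to check against the measured two-cue thresholds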

Question 22

Q

summary

Answer

A

the visual system processes depth information to determine

depth order
depth interval
absolute depth
3D shape

while there are many sources of information available to solve the problem, the effectiveness with which we use them varies
when there are multiple cues available, each cue is weighted by its reliability and this is dependent on the stimulus conditions and the task
these weighted cues are combined in a flexible way to make a perceptual judgement