Lec 6/ TB Ch 6&8b Flashcards
- Retinas are 2D or 3D
- 2 things it requires
Take home msg
- Our retinas are 2D surfaces
- that require eye movements
- (e.m. help but also complicate things)
- How to work around: oculomotor control, spatial constancy
- that require 3D info to be recovered from flattened and distorted images
- So we are dealing with Plato’s shadows in the cave
- This disadvantage can be an advantage
- Linear and Aerial perspective, Binocular disparity, Horopter
Eye movements
- 6 muscles on each eye
- Purpose of 3 pairs of muscle on each eye
- Left eye: how to move left vs right
- 3 cranial nerves that control the eyes
- # ?
- Where does it start?
- What does it innervate?
- Which system controls these nerves?
- Superior colliculus
- location
- role
- is it part of the visual pathway?
- Inferior colliculus fx
- Cerebral cortex fx
- location
*
Eye movements
- Eye movements: six muscles are attached to each eye and are arranged in three pairs:
- 4 rectus muscles: Inferior/superior/lateral/medial rectus
- 2 oblique muscles: Inferior/superior oblique
- Lateral rectus = horizontal
- Medial rectus = b/w eyes and nose
- Superior rectus = top of the eye
- Inferior rectus = bottom
- Superior oblique = the top hook
- Inferior oblique = bottom hook
- It makes sense to have 3 pairs (2 muscles for each dimension): they allow horizontal, vertical, and torsional (rotating) eye movements
- X
- Left eye
- To look to the left, the left eye will contract the lateral rectus muscle
- To look right, the eye will contract the medial rectus muscle
- X
- Eye muscles are controlled by 3 cranial nerves
- Cranial nerve III: aka oculomotor nerve
- It starts in the oculomotor nucleus
- It innervates all the muscles except for 2 (the superior oblique and the lateral rectus)
- Cranial nerve IV: trochlear nerve
- It starts in the trochlear nucleus (trochlear = the hook on the superior oblique muscle)
- It innervates only the superior oblique muscle
- Cranial nerve VI: abducens nerve
- Starts in the abducens nucleus
- It innervates the lateral rectus muscle
- Cranial nerves start in the brainstem and are controlled by several other nuclei for horizontal and vertical eye movements
- Superior colliculus: Structure in midbrain that plays important role in initiating and guiding eye movements
- The 3 cranial nerves are controlled by the superior colliculus
- Not part of visual pathway
- Has neurons for motor control of the eyes and head
- It receives its own visual input and immediately converts it into eye movement
- Inferior colliculus = hearing
- Cerebral cortex: eye fields for motor control, controls the superior colliculus; Frontal & parietal (etc.) eye fields
- Parietal eye field: located at the intraparietal sulcus
Eye movements cont
- 3 step pathway
- 6 types of eye movement
- x
- smooth pursuit: fx
- Saccades fx
- speed
- Vergence movements fx
- when are these seen?
- Stereovision
Eye movements cont
Cortical areas project to SC -> brainstem -> eye movements
6 types of eye movements (focus on 1st 4)
- Smooth pursuit: Eyes move smoothly to follow moving object
- Saccade: Rapid movement of eyes that change fixation from one object or location to another
- Vergence eye movements: Type of eye movement in which two eyes move in opposite directions
- Fixational eye movements, microsaccades: when you don’t move your eyes, your eyes will still make these little jerking eye movements
- 2 more to keep the retinal image stable during (self-)motion -> won’t be elab
Smooth pursuit: When smth is moving, we keep the object of interest stable and on the fovea; this is b/c the brain perceives this as eye movement (ex. pencil)
Saccades
- Function of saccadic eye movements: move (rotate) fovea to object of interest quickly to reduce travel time during which vision is blurred.
- The eye moves at 700 visual degrees/sec, and vision is blurred during the movement
- Yarbus (1967): scan paths reveal intentions and interests.
- Thin lines = saccades
- Knobs = fixation
- • 3-4 saccades/sec when u are awake (v fast)
Vergence movements - Function of vergence movements: looking at objects in depth so that retinal images are overlapping
- Seen in Converging (when objects are close)/ diverging movements
- Ex. you look at the tree in front of you (converge) -> images perfectly overlap
- Ex. you look at mountains far away (diverge) -> double vision
- Stereovision: images are overlapping
- Done deliberately; you can control this
- Eye movement cont
How do we achieve spatial constancy?
- False motion & tap eyeball example
- Explanation
- Corollary discharge
- x
- spatial constancy
- 2 purposes
- Compensation theory define
- 3 steps
Eye movement cont
How do we achieve spatial constancy?
- When we look at something, the world doesn’t “jump around”
- false motion!
- Demo: when you close one eye, and gently tap on the other eye, it seems like the world is shaking
- When we tap on the eye, we move our eye and move the retinal image
- It is not moving the normal way (6 eye muscles moving it)
- corollary discharge (see later) tells you whether the eye muscles are contracting or not
- When you tap your eye, the eye muscles are not doing anything and the retinal image is moving
- So, that must be the world is shaking
- This is related to spatial constancy
- Spatial constancy: the ability to perceive the world as stable and continuous despite eye movements.
- Enables us to discriminate motion across the retina that is due to eye movements vs. object movements
- Enables us to tell where things are (ex location of yellow heart, even I look directly at it)
- x
- How do we perceive the world as stable?
- Compensation theory: Perceptual system receives information about the eye movement and discounts changes in retinal image that result from it
- # 1 Oculomotor system sends motor command to eye muscles
- # 2 A copy of that command (“efference copy” or “corollary discharge”) goes to an area of the visual system that has been dubbed the “comparator”
- # 3 Comparator compensates for image changes caused by the eye movement, inhibiting any attempts by other parts of the visual system to interpret changes as object motion
Eye movement cont
- The comparator
- 2 general steps
- Case A: 1 eye follows a moving pencil - 2 steps
- B: a pencil closer to you, a finger farther from you
- The pencil is moving, finger is stationary
- If your eye follows that pencil; The retinal image of the pencil is stable; the retinal image of the finger is not (moving)
- → 2 steps
- Issue: w/ compensator
- motion sickness
- Compensation theory & Bayesian inference connection
Eye movement cont
The comparator
- # 1: eye follows the moving pencil/ you look at a dot, and a pencil moves within your receptive field
- Image movement signal (Blue): send signals on whether you detect motion in the world or not
- Motor signal: Brain structure send signals to the eye to control ocular movement
- Efference copy/corollary discharge signal: a copy of the motor command is sent to the comparator, telling it whether the eye was commanded to move or not
- # 2: The corollary discharge signal and image movement signal both enter the comparator
- The comparator determines if there is motion in the world or not
- x
- Ex A: #1 eye follows a moving pencil
- Corollary discharge signal: the eye is moving
- Image movement signal: there’s no motion on the retina
- # 2: Comparator: since the retinal image is stable but the eye is moving, there’s motion outside, in the world
- x
- Ex B: a pencil closer to you, a finger farther from you
- The pencil closer to you is moving
- The finger farther away is stationary
- If your eye follows that pencil
- The retinal image of the pencil is stable; the retinal image of the finger is not (moving)
- # 1: signals
- Image movement signal: The retinal image of the finger signals that there is motion on the retina
- Corollary discharge signal: tells the comparator there’s also eye movement
- # 2: The comparator concludes that the finger is stationary (no motion)
- X
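The two examples above (and the eye-tap demo) can be sketched as a tiny comparator. This is only an illustration under simplifying assumptions: motion is 1-D, the two signals add linearly, and the function name `comparator` and all numbers are made up for the example, not taken from the lecture.

```python
def comparator(image_motion, corollary_discharge):
    """Estimate world motion (deg/s) from the two signals entering the comparator.

    image_motion: velocity of the image across the retina
    corollary_discharge: eye velocity reported by the copy of the motor command
    Assumption: 1-D motion, and the two signals simply add.
    """
    return image_motion + corollary_discharge

# Ex A: the eye pursues a pencil moving at 5 deg/s, so the pencil's image is stable
pencil = comparator(image_motion=0.0, corollary_discharge=5.0)
# Ex B: a stationary finger; its image slips backwards while the eye pursues
finger = comparator(image_motion=-5.0, corollary_discharge=5.0)
# Eye tap: the image slips, but no motor command was sent to the eye muscles
tap = comparator(image_motion=-3.0, corollary_discharge=0.0)

print(pencil, finger, tap)
```

A non-zero result means "motion in the world": the pencil is seen as moving, the finger as stationary, and the tapped eye produces false motion because nothing cancels the image slip.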
- But: compensation (from compensator) wouldn’t be precise enough.
- Discounting motion caused by movement is never precise enough (v difficult to be precise)
- Whenever you move your eyes, you will misperceive a little bit of motion b/c the compensator makes little mistakes -> this amounts to lots of false motion perceived even though there is compensation
- This causes motion sickness: Our balance system tells us the world is not moving; but the visual system tells us the world is moving
- So compensation alone is not precise enough, and its mistakes have unpleasant consequences (ex motion sickness)
- Prof’s study showed that compensation theory follows Bayesian inference (e.g., Niemeier et al., 2003)
- The brain achieves spatial constancy because it assumes a priori that the world is not moving
- IOW: Small movements in the world that coincide with saccades are ignored
- IOW: the movements are so small that the system discards them
- Thus, p(S) = prior probability = we assume the world as stable
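A minimal sketch of how a "the world is stable" prior can swallow small movements. It assumes a Gaussian prior centred on zero world motion and a Gaussian sensory estimate; this is a generic textbook illustration, not the actual model from Niemeier et al. (2003), and the function name and numbers are hypothetical.

```python
def posterior_motion(measured, sigma_sensory, sigma_prior):
    """Posterior mean of world motion, given a Gaussian prior centred on 0.

    measured: world motion suggested by the senses (deg)
    sigma_sensory: uncertainty of that measurement
    sigma_prior: width of the "world is stable" prior
    """
    weight = sigma_prior**2 / (sigma_prior**2 + sigma_sensory**2)
    return weight * measured

# Around a saccade the sensory estimate is noisy, so a small measured jump
# is pulled almost all the way back to "no motion".
during_saccade = posterior_motion(measured=1.0, sigma_sensory=3.0, sigma_prior=0.5)
# During steady fixation the estimate is reliable, so the same jump is mostly believed.
during_fixation = posterior_motion(measured=1.0, sigma_sensory=0.2, sigma_prior=0.5)

print(during_saccade, during_fixation)
```

The noisier the sensory signal, the more the prior dominates, which is one way to read "small movements that coincide with saccades are ignored".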
How do we perceive the world as continuous?
- Why don’t we perceive smears when things move fast?
- Saccadic suppression
- Graph
- y-axis
- x-axis
- meaning
- How do we perceive the world as continuous?
- When we look at heart then circle pictures very quickly, we don’t see a smear
- When you move a camera very fast while taking a picture, you see a smear
- Why don’t we notice retinal smear during saccades? -> we have saccadic suppression
- Saccadic suppression (of vision, incl. motion): Reduction of visual sensitivity that occurs when one makes a saccadic eye movement; eliminates smear from retinal image motion during an eye movement
- IOW: when we are about to make an eye movement, our visual system shuts down visual sensitivity
- Y-axis: d’ (sensitivity); over 3 = really good at the task; 0 = chance level, no motion perception
- X-axis: time at which the motion is presented, relative to the saccade
- 0 = saccade onset
- Motion perception sensitivity drops before the saccade starts, to prep for not seeing smears
- IOW: visual system ignores vision during saccadic phase
- X
- We have short periods of blindness (“grey-out”) when we make a saccade
- Distorted time perception around the time of saccades
- Still dunno mechanism
- But saccadic suppression helps us perceive the world as continuous
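For the d’ values on the y-axis of the graph, here is a minimal sketch of the standard signal-detection formula d’ = z(hit rate) - z(false-alarm rate). The hit and false-alarm rates below are invented for illustration; they are not data from the lecture.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index: difference of the z-transformed hit and false-alarm rates."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical observer well before the saccade: motion is detected reliably
good = d_prime(0.95, 0.05)
# Hypothetical observer around saccade onset: "yes" is equally likely with or without motion
chance = d_prime(0.50, 0.50)

print(round(good, 2), round(chance, 2))
```

A d’ above 3 corresponds to very few misses and false alarms; a d’ of 0 means the observer cannot tell motion from no motion.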
Space perception and binocular vision
- euclidian geometry
- // lines
- assumption
- Which sense (among our 5 senses) is governed by Euclidian geometry?
- Does this apply for vision?
- What problem is there when we received 3D info from 2D projections
- object size
- What does our visual system
- Parallax
- aka?
- Example of horizontal parallax (fingers)
- What is parallax helpful for?
- x
- How do we perceive depth?
- what process is involved?
- Monocular depth cues vs. Binocular depth cues
- Binocular depth cues provide 3 things
Intro to space perception
- Euclidian geometry: Parallel lines remain parallel as they are extended in space
- Objects maintain the same size and shape as they move around in space
- Which sense is governed by Euclidian geometry?
- Touch
- Ex no matter how far the cup is, it’s still the same size
- But Euclidian geometry does not apply for vision
- x
- Problem for vision: recover 3D info from 2D projections
- When the retina image is 2D -> distortions
- Ex, object close by is way bigger than object far away (this is not true in reality)
- IOW: It looks Euclidian, but it is not the case
- However, most depth cues can be derived from geometrical consequences (Euclidian) of the projection
- IOW: we can reconstruct 3D reality with these distortions
- Parallax: The two retinal images of a three-dimensional world are not the same
- Ex: you hold R finger in front of your nose, L finger far away from your nose
- When you only look via the R eye, the fingers overlap; when you look w/ L eye only, the fingers are separate
- -> horizontal parallax
- Parallax helps w/ stereovision
- Binocular disparity (aka parallax): The diff between the two retinal images of the same scene. It is the basis of stereopsis; a vivid perception of the 3D of the world that is not available with monocular vision.
- Disparity in the horizontal dimension
Depth perception
- Our retinas are 2D projection surfaces.
- The brain creates a 3D image from the projections.
- Ex. via free fusion by converging your eyes
Depth perception w/ 1 eye vs 2 eyes
- Monocular depth cues vs. Binocular depth cues = One eye sufficient vs. two eyes necessary
- Binocular depth cues (from overlapping visual fields) provide:
- 1: Convergence
- 2: Stereopsis: see the same object w/ slightly different vantage point
- Pigeon has eyes at the side of its head -> little binocular vision; but it can see behind itself for predators
- Owl: have eyes in front of skull -> have binocular depth cues
- 3: Ability of two eyes to see more of an object than one eye
- One eye sees more of the left side of the object; the other eye sees more of the right side
Space perception and binocular vision cont
- 7 Monocular Cues to Three-Dimensional Space
- definition
- example
- nonmetrical depth cue vs metric
- occlusion
- familiar size
- 3 main cues in bunny picture; what is missing in the 2nd one?
- Euclid’s remoteness theorem & relative height
- Why is there more depth in the rotated image?
- Familiar size
- woman’s hand and head
- books in the painting
- penny and the toy car
Monocular Cues to Three-Dimensional Space
- – Occlusion
- – Relative size
- – Position cues
- – Familiar size
- – Aerial perspective
- – Linear perspective
- – Motion cues
- x
- Occlusion
- Occlusion: A cue to relative depth order when, for example, one object obstructs the view of part of another object
- T-junctions tell us there is occlusion
- Nonmetrical depth cue: provides info about depth order but not magnitude.
- (Metrical depth cues: Provide quantitative information about distance)
- X
- Size and position cues:
- Relative Size: we can compare the size between items without knowing the absolute size of either one
- We can tell the right balls are smaller, but we don’t know how small in terms of magnitudes
- Flowers closer to us = bigger; further down = smaller -> creates a sense of depth
- x
- Texture Gradient: A depth cue based on the geometric fact that items of the same size form smaller images when they are farther away
- Ex. squares closer to us appear bigger; vv
- Ex. some bubbles are circles, some are ovals, and they are oriented in different directions
- Together, these distortions in texture create the impression of depth and waves
- x
- Relative Height: Objects at different distances from the viewer on the ground plane (where we stand on) will form images at different heights in the retinal image
- Diff heights on the ground plane can create depth
- objects further away are seen higher in the visual field
- Ex. bubbles on the top are further away b/c they are located higher in our visual field
- Ex. relative height: bunny at the bottom = closer; bunny at the top = further away
- Ex. texture gradient: these geometric images (all bunnies); smaller = farther away
- Ex. Relative size: bunnies closer to us = bigger; bunnies farther away = smaller
- There’s texture gradient and relative size (bunnies on the right might be further away as they are smaller)
- But the depth cue is not as convincing
- This is missing relative height
- When we look at a plane, objects that are smaller are usually higher on the visual field (not on the right)
- Relative height
- Tells us about depth
- Euclid’s remoteness theorem: The more remote parts in planes situated below the eye, appear higher (the projection EF of BC appears higher than the projection DE of AB).
- Ex bunnies are on B and C -> projected higher on our visual field as E and F
- Natural scene statistics. (we live in a world w/ gravity)
- Which one looks deeper?
- When I rotate the image by 180 degrees, the rotated image still has depth cues (texture gradient and relative size), but relative height is reversed
- Here, objects that are lower in the image = objects that are further away
- We see less depth here
- x
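Euclid's remoteness theorem can be checked with a simple pinhole projection. This is a sketch under stated assumptions: an upright picture plane in front of the eye (not the inverted retinal image), a flat ground plane, and a made-up eye height of 1.6 m; the function name is hypothetical.

```python
def image_height(distance, eye_height=1.6, focal=1.0):
    """Vertical image coordinate of a point on the ground plane.

    0 is the horizon; negative values lie below the horizon.
    distance: how far ahead the ground point is (same units as eye_height)
    """
    return -focal * eye_height / distance

near = image_height(2.0)   # bunny close to the viewer
far = image_height(10.0)   # bunny further away

print(near, far)
```

Both points project below the horizon, but the far one lands closer to it, i.e. higher in the picture. Flipping the image by 180 degrees reverses this relation, which fits the observation that the rotated picture shows less depth.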
- Familiar size: depth cue based on knowledge of the typical size of objects
- Absolute metrical depth cue
- Ex: we can tell the woman is holding out her hand toward us b/c the hand here looks bigger than her head
- Ex: painting
- We know books are kinda small, around the size of a hand
- Painter painted books and put them near the building; this makes the books look massive
- It seems like the size of the objects is changing
- Ex. it looks like a penny is on the toy car
- In reality, an artist made a huge coin and put it on a real car -> took a pic
- This plays w/ our familiar size depth cue
- Aerial perspective: A depth cue that is based on the implicit understanding that light is scattered by the atmosphere
- Although air is transparent, it does filter out some light
- This leads to reduction in contrast, saturation, hue -> cooler colours, blue (bluish)
- Example: Haze (light fog)
- B: the top trees (mountains) that are bluer seem to be farther away
- x
Space perception and binocular vision cont
- Linear perspective
- How does the instrument construct the linear perspective?
- Vanishing points
- how many vanishing points are in real life?
- 3 point perspective
- Foreshortening
- Raphael’s trick in painting
- Limitation of linear perspective
- anamorphosis
- Monocular cues fail
- painting lines in the rm
- Ames room
- Motion parallax
- Ex. column and brick wall when train leaves
- stereokinetic effect
- Linear perspective: A depth cue based on the fact that lines that are parallel in the three-dimensional world will appear to converge in a two-dimensional image
- 1415, Filippo Brunelleschi
- Pioneered in linear perspective
- Sistine Chapel in Vatican, mainly painted by Michelangelo
- There’s one painting done by Filippo
- Used linear perspective
- The // lines converge into a single point at the church
- This produces more depth in painting
- You can construct linear perspective via the instrument below
- The thread indicates the vantage point
- The artist locates the position of the thread on the grid
- Then draws/dots the location on the paper
- Vanishing point: The apparent point at which parallel lines receding in depth converge
- We can have multiple vanishing points, up to 3
- Ex. we can have 2 vanishing points in the horizon, 1 in the vertical dimension -> 3 point perspective
- Brunelleschi didn’t know there’s a 3rd one
- The 3rd one: Ex when you stand in front of skyscraper, one of the vanishing points goes into the sky
- Ex. when we look down the skyscraper, we also see lines converging to the ground
- 3-point perspective: discovered after the invention of photo cameras.
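Why parallel lines converge to a vanishing point can be seen from the projection itself. A minimal sketch, assuming a pinhole model with an upright picture plane; the rail positions, eye height, and the function name `project` are invented for the example.

```python
def project(x, y, z, focal=1.0):
    """Pinhole projection of a 3-D point (x, y, z) onto an upright picture plane."""
    return (focal * x / z, focal * y / z)

# Two parallel rails on the ground, 1 m either side of the viewer,
# 1.6 m below eye level, sampled at increasing depth z.
depths = (2.0, 20.0, 2000.0)
gaps = []
for z in depths:
    left = project(-1.0, -1.6, z)
    right = project(+1.0, -1.6, z)
    gaps.append(right[0] - left[0])  # horizontal gap between the rails in the image

print(gaps)
```

The rails stay 2 m apart in the world, but their image gap shrinks with depth and both rails approach the image point (0, 0), which plays the role of the vanishing point on the horizon.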
- Foreshortening: refers to the visual effect that an object or distance appears shorter than it actually is because it is angled toward the projection screen/retina/picture plane.
- Ex. Your pinky has a specific size
- When the hand is oriented to or away from you, the pinky is drawn w/ a smaller size
- Ex. the width of the actual door does not match the frame in the picture (foreshortening) -> creates depth
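Foreshortening can be approximated with one line of trigonometry. This sketch assumes the object is far away relative to its size (so projection is roughly orthographic) and that `tilt_deg` is the angle between the object and the picture plane; the door width of 80 cm is a made-up number.

```python
import math

def projected_length(true_length, tilt_deg):
    """Length of an object in the picture plane when tilted by tilt_deg away from it."""
    return true_length * math.cos(math.radians(tilt_deg))

closed_door = projected_length(80.0, 0.0)    # door parallel to the picture plane
ajar_door = projected_length(80.0, 60.0)     # door swung 60 degrees toward the viewer

print(closed_door, ajar_door)
```

The more the door is angled toward the viewer, the narrower it is drawn, even though its real width is unchanged.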
- Raphael’s tricks
- Square tiles are used in linear perspective paintings
- Linear perspective doesn’t seem to be working all the time: The square tiles can look distorted in the centre and the sides (circles)
- Raphael put stuff in the centre and sides of the painting to cover the square tiles
- Red arrow: the globe is a perfect circle
- In linear perspective, the globe should be distorted
- Raphael ignored rules of linear perspective
- Linear perspective is designed to work from only 1 vantage point (ex the artists use a device with a thread to draw an object from only 1 vantage point)
- Pictures are relatively robust to vantage point of the observer. But only to a certain point.
- Ex when you look from the front, picture looks fine
- When you look at it from the side (ex 45 deg angle), it is distorted but your brain recognizes it is a picture and can compensate for it
- Anamorphosis: a distorted projection or perspective requiring the viewer to use special devices or occupy a specific vantage point to reconstitute the image.
- Ex. the smear on the image looks so random
- After adjustment, we see a skull
- Monocular cues can fail/ trick monocular cues
- Top image: it seems like we are looking at a rectangle w/ 2 oblique lines floating in mid air
- But these are just paint markings on the floors and walls when you take a picture at a different vantage point
- Ames room
- Here the depth cues are removed. The girl on the left is much further away, but the perspective cues are manipulated.
- Only works for a single view point (look w/ 1 eye through the peep hole)
- Ex: girl on the right looks like a giant, but it’s just b/c she is closer (the objects in the room are all distorted)
- x
- Most monocular cues work in paintings
- this one only works when things are in motion
- Motion cues: parallax in time
- Imagine you are on a train that is leaving the platform
- The platform has columns and wall
- The wall seems to be moving less (more slowly) compared to the columns
- Here, objects closer to you (column) move more than objects farther away (wall)
- Motion provides cue for depth/distance
- Motion parallax: the fact that objects moving at the same speed relative to the observer will move a greater amount/faster across the retina if they are closer to the observer
- X
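The train example can be put into numbers. A minimal sketch, assuming the object is stationary and directly to the side of the moving observer at that instant, in which case its angular speed is observer speed divided by distance; the speeds and distances are invented.

```python
import math

def angular_speed_deg(observer_speed, distance):
    """Angular speed (deg/s) of a stationary object directly abeam of a moving observer."""
    return math.degrees(observer_speed / distance)

train_speed = 1.0                              # m/s, train just leaving the platform
column = angular_speed_deg(train_speed, 2.0)   # column 2 m from the window
wall = angular_speed_deg(train_speed, 10.0)    # wall 10 m from the window

print(round(column, 1), round(wall, 1))
```

The column sweeps across the visual field five times faster than the wall, so relative image speed can be read as a cue to relative distance.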
- Stereokinetic effect (another type of motion parallax): the rotating figure seems to show a large cone (sticking out) and a tiny one (sticking in) rotating
- It is an illusion of depth
- Unlike simple depth illusions (ex Necker cube), stereokinetic effect is an example of depth from motion
Space perception and binocular vision cont
- Combining cues: 4 main cues on the scene
- Depth Cues involving intra-/extraocular muscles - 3
- Accommodation define
- What does it provide
- Monocular or binocular cue?
- Visual or non-visual cue?
- Convergence & Divergence
- What do your eyes do?
- What does it reduce?
- What does it tell you about the distance of the object?
- Visual or non-visual cue?
- x
- Vergence
- triangulation
- define
- measure river width
- How do our eyes use triangulation?
- Monocular or binocular cue?
- Visual or non-visual cue?
Combining cues
- Most scenes have multiple cues
- • Texture gradient (the stone tiles)
- • Relative height (people towards the top of the painting are smaller)
- • Aerial perspective (the black of the clothes of ppl in the front is more saturated than the black of the clothes of ppl at the back)
- • Linear perspective (the building has 2 vanishing points)
Depth Cues involving intra-/extraocular muscles
- Non-visual monocular cues
- Binocular cues
- Non-visual binocular cues
- X
- Accommodation and vergence help eyes perceive depth:
- • Accommodation: Eye changes its focus (ciliary muscles/intraocular muscle change the shape of the lens)
- Signals for accommodation provides you depth info
- Close by objects: ciliary muscles have to contract to make the lens more round; vv
- It is a monocular depth cue (you do not need both eyes)
- But this is NOT a visual monocular depth cue
- This is proprioception/ somatosensation
- • Convergence: Ability of the two eyes to turn inward; reduces the disparity of a feature to (near) zero
- • Divergence: Ability of the two eyes to turn outward; reduces the disparity of the feature to (near) zero
- * It is near zero b/c you don’t want double vision
- The 6 muscles mentioned in this lecture (outside the eyes) help w/ convergence and divergence
- When the eyes are converging, this indicates you are looking at an object close by; vv
- This binocular cue is NOT a stereo/visual cue
- It’s about the extraocular muscles outside the eyes
- X
- Vergence: angles of eye positions
- • Triangulation: a technique that helps you tell the distance of things
- Ex want to measure the width of the river
- Running a measuring tape across the river won’t really work
- # 1: set up 2 instruments on your shore, adjust them so both instruments are looking at the same tree on the opposite shore
- # 2: since you know 2 angles of the triangle, and the distance b/w the 2 instruments, you can figure out the width of the river (the height of the triangle)
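The two triangulation steps can be written out with the law of sines. A sketch under the assumption that both instruments stand on a straight baseline along the near shore and that each angle is measured between the baseline and the line of sight to the tree; the function name and numbers are hypothetical.

```python
import math

def river_width(baseline, angle_a_deg, angle_b_deg):
    """Perpendicular distance from the baseline to the sighted tree.

    baseline: distance between the two instruments
    angle_a_deg, angle_b_deg: angles between the baseline and each line of sight
    """
    a = math.radians(angle_a_deg)
    b = math.radians(angle_b_deg)
    return baseline * math.sin(a) * math.sin(b) / math.sin(a + b)

# Instruments 10 m apart; one looks straight across (90 deg), the other at 45 deg
width = river_width(10.0, 90.0, 45.0)
print(round(width, 2))
```

With one instrument looking straight across and the other at 45 degrees, the width equals the baseline, which is an easy case to verify by drawing the triangle.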
Vergence (triangulation is also used by our eyes)
- Ex: when we look at the blue crayon, it creates a triangle
- Since we “know” the angles the eyes are converging, we know how far the blue crayon is
- Ex: when we look at the red crayon -> divergence (same logic)
- This is a binocular cue (uses 2 eyes)
- But NOT a VISUAL binocular cue (b/c we are using the muscles of the eyes)
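The same triangle applies to the two eyes. A minimal sketch, assuming the object is straight ahead and an interocular distance of 6.3 cm (an illustrative value, not from the lecture); both function names are hypothetical.

```python
import math

def vergence_angle_deg(distance, ipd=0.063):
    """Angle between the two lines of sight when fixating a point straight ahead."""
    return math.degrees(2 * math.atan(ipd / (2 * distance)))

def distance_from_vergence(angle_deg, ipd=0.063):
    """Invert the triangle: recover fixation distance from the vergence angle."""
    return ipd / (2 * math.tan(math.radians(angle_deg) / 2))

near_crayon = vergence_angle_deg(0.3)   # crayon 30 cm away: eyes converge strongly
far_crayon = vergence_angle_deg(3.0)    # crayon 3 m away: eyes are nearly parallel
recovered = distance_from_vergence(near_crayon)

print(round(near_crayon, 2), round(far_crayon, 2), round(recovered, 3))
```

Because the vergence angle changes very little beyond a few metres, this cue is mainly informative for nearby objects.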
- Binocular disparity
- stereopsis
- Bob & crayons
- What do his retinal images actually look like (orientations)?
- Dashed lines
- Define Zero binocular disparity
- which 2 crayons have 0 binocular disparity?
- x
- Vieth-Muller Circle/ horopter
- what happens if you fixate on a closer object?
- Panum’s fusion area
- Diplopia
Binocular vision and stereopsis
- Binocular disparity: Differences between the images falling on the two retinas due to parallax
- Stereopsis: “Popping out in depth”
- Most humans are able to see this way; NOT ALL
- How exactly does this translation from stimulus attribute to perception take place?
Stereopsis from binocular disparity
- Bob is looking at the crayons
- He is fixating on the red crayon
- So the eyes are rotated/converged
- So the image of the red crayon falls on the fovea
- The images on Bob’s retina will be upside down, and the left and right will be reversed
- Images on Bob’s 2 retinas.
- Dashed lines: Bob is fixating on the red crayon, the image of the red crayon is on his fovea
- Bob fixates red crayon:
- For the red crayon, there are corresponding retinal points
- IOW, the points of retinal images have the same distance from the fovea.
- The red crayon is on the dashed line (fovea), horizontally – distance = 0
- “Zero binocular disparity”.
- This happens when we look at the object w/ both of our eyes
- The same happens to be true for the blue crayon.
- The horizontal distance is not 0
- But there is zero binocular disparity
- The distance b/w the red and blue crayon on the left retinal image = right retinal image (no disparity)
Vieth-Muller Circle
- Horopter: location of objects in space whose images lie on corresponding points. The surface of zero disparity.
- IOW: the red and blue crayons are both on the horopter
- From the top, it looks like a circle (aka Vieth-Muller circle; just the horizontal/cross sectional plane)
- But in reality, it is 3D, like folding a piece of paper
- If you fixate on a closer object (ex object in front of red crayon), the horopter changes
- Panum’s fusion area: region of space in front and behind the horopter within which binocular single vision is possible.
- A crayon is 3D
- Part of the crayon is sticking out of the horopter (in front and behind)
- IOW: there are parts of the crayon that are not lying on the horopter and do not have zero disparity
- This is also the case for the blue crayon
- But our brain does not perceive those regions as double vision
- These regions = Panum’s fusion area
- Diplopia: double vision for points outside Panum’s fusion area.
- Ex. Brown crayon
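That points on the Vieth-Muller circle have zero disparity follows from the inscribed angle theorem: every point on the circle through the two eyes and the fixation point subtends the same angle at the eyes. A minimal numeric check, assuming idealized point-like eyes 6 cm apart and fixation at 50 cm; all names and numbers are made up for illustration.

```python
import math

def subtended_angle(px, py, half_ipd):
    """Angle (rad) between the two lines of sight to (px, py); eyes at (+/-half_ipd, 0)."""
    return math.atan2(px + half_ipd, py) - math.atan2(px - half_ipd, py)

half_ipd, fix_dist = 0.03, 0.5
# Circle through both eyes and the fixation point (0, fix_dist)
centre_y = (fix_dist**2 - half_ipd**2) / (2 * fix_dist)
radius = fix_dist - centre_y

fixation_angle = subtended_angle(0.0, fix_dist, half_ipd)
disparities = []
for t_deg in (-40, -20, 0, 20, 40):
    t = math.radians(t_deg)
    px, py = radius * math.sin(t), centre_y + radius * math.cos(t)
    disparities.append(subtended_angle(px, py, half_ipd) - fixation_angle)

# A point in front of the circle subtends a larger angle than the fixation point
nearer = subtended_angle(0.0, 0.4, half_ipd) - fixation_angle

print([round(d, 12) for d in disparities], nearer > 0)
```

All sampled points on the circle come out with (numerically) zero disparity, while a point in front of the circle does not.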
Binocular vision and stereopsis cont
- which crayons show relative disparity
- What info does disparity provide?
Crossed vs uncrossed disparity
- “Sign” of disparity
- LHS image
- bottom image = ?
- what are we fixating at
- how much disparity is there?
- blue crayon location
- Crossed disparity
- RHS image
- where is the horopter
- bottom box = ?
- what are we fixating at
- how much disparity is there?
- red crayon location
- implication
- uncrossed disparity
- bottom image = ?
- Absolute disparity
- relative disparity
- B&W image
- LHS
- what are we fixating at? (further or closer object?)
- Absolute disparity of further object?
- Absolute disparity of closer object?
- RHS
- what are we fixating at?
- What does +1/2: +ve disparity mean?
- What does -1/2: -ve disparity mean?
- do they have zero disparity?
- The absolute vs relative disparity here
- Main implication
- LHS
Relative disparity
- When we superimpose these images
- We get this
- The red and blue crayons lie inside Panum’s fusion area
- So we see one blue and one red crayon
- The brown and purple crayons are behind Panum’s fusion area
- So we have diplopia/double vision for those
- In daily life, we do not notice this b/c we tend to focus on the object in front of us (red crayon)
- For objects outside the Panum fusion area
- There are differences in disparity
- There is a larger disparity for the purple crayon than for the brown crayon
- This suggests the purple crayon is located farther away than the brown crayon
- IOW, based on the amount of disparity, we can tell how far away the object is from the horopter/Panum’s fusion area
Crossed vs uncrossed disparity
- “Sign” of disparity: based on whether the object is located in front of or behind the horopter
- Crossing vs. uncrossing the eyes
- LHS: the blue crayon is located in front of the red crayon
- Bottom: what we perceive
- The red crayon is in the middle of picture
- -> We are fixating at the red crayon
- There is 0 disparity; the image of the red crayon is the same on the L and R eye
- The blue crayon is not located in the same position
- Since there is disparity, this suggests the blue crayon is not on the horopter
- Specifically, the blue crayon is in front of the horopter
- In left eye: blue crayon is R of red crayon
- In R eye: vv
- Crossed disparity: object is in front of the horopter
- Bottom: what we perceive
- RHS: when Bob fixates on the blue crayon instead (image of blue crayon is on the fovea)
- The horopter shifts to the blue crayon
- Bottom boxes: the blue crayon is in the middle of the retinal image
- This is b/c Bob is fixating at the blue crayon (zero disparity here)
- The red crayon is not located in the same position in the left and right eye
- There is non-zero disparity; this suggests the red crayon is not on the horopter
- Specifically, it is behind the horopter b/c
- In the left eye, the red crayon is on the L of blue crayon
- R eye: the red crayon is on the R of the blue crayon
- Thus, this is uncrossed disparity, indicating the object is behind the horopter
- X
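The sign of disparity for the two crayon cases can be sketched from vergence angles. Assumptions: both crayons are straight ahead, the eyes are 6.3 cm apart, and positive values are labelled "crossed" (closer than fixation); the distances and function names are illustrative only.

```python
import math

def vergence(distance, ipd=0.063):
    """Angle (rad) between the lines of sight to a point straight ahead."""
    return 2 * math.atan(ipd / (2 * distance))

def disparity_sign(target_dist, fixation_dist):
    """Classify the disparity of a target relative to the fixated distance."""
    d = vergence(target_dist) - vergence(fixation_dist)
    if d > 0:
        return "crossed"      # target is in front of the horopter
    if d < 0:
        return "uncrossed"    # target is behind the horopter
    return "zero"

# LHS: Bob fixates the red crayon (0.5 m); the blue crayon (0.4 m) is in front
lhs = disparity_sign(target_dist=0.4, fixation_dist=0.5)
# RHS: Bob fixates the blue crayon (0.4 m); the red crayon (0.5 m) is behind
rhs = disparity_sign(target_dist=0.5, fixation_dist=0.4)

print(lhs, rhs)
```

The crayons have not moved between the two cases; only the fixation point, and with it the horopter, has changed.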
- Absolute vs. relative disparity info can be extracted:
- Absolute disparity: A difference in the actual retinal coordinates in the left & right eyes of the image of a feature in the visual scene
- Relative disparity: The difference in absolute disparities of two elements in the visual scene
- Ex.
- LHS image
- There are 2 crayons
- The eyes are fixating at the object that is located farther away
- For the object located farther away, the absolute disparity = 0
- For the object that is crossed/in front of the horopter, the absolute disparity is not 0, and is positive value (i.e. 1)
- RHS image
- The person is fixating b/w the 2 crayons
- There are 2 diff types of disparity
- +1/2: +ve disparity, closer to the observer, Half the magnitude
- -1/2: -ve disparity; this is b/c the object is behind the horopter
- IOW, there are non-zero disparities for both objects
- Here, since both crayons are located outside of the horopter, the absolute disparities have changed
- The relative disparity: difference b/w the absolute disparities
- LHS: difference = 1 = RHS
- So the relative disparity has NOT changed
- Why is this important?
- Even when we fixate on different things and the horopter changes, we don’t perceive any change in the distance/depth (we can make this out based on relative disparity)
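The LHS/RHS comparison is just a subtraction. A minimal sketch using the disparity values from the example above (arbitrary units); the function name is hypothetical.

```python
def relative_disparity(abs_near, abs_far):
    """Relative disparity = difference between two absolute disparities."""
    return abs_near - abs_far

# LHS: fixating the farther crayon -> absolute disparities 1 (near) and 0 (far)
lhs = relative_disparity(1.0, 0.0)
# RHS: fixating between the crayons -> absolute disparities +1/2 and -1/2
rhs = relative_disparity(0.5, -0.5)

print(lhs, rhs)
```

Both absolute disparities change when fixation changes, but their difference stays at 1, which is why perceived depth between the two crayons stays the same.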
*
- Binocular Vision and Stereopsis (cont’d)
- correspondence problem
- Free fusion
- what happens when ppl are stereo blind
- Purpose of Julesz: random dot stereogram
- for the 3 diagrams: RHS → explain
- 3 ways to solve the correspondence problem
- x
- How is stereopsis implemented in the human brain? - 3 steps
Free fusion
- To compute disparity (absolute or relative), we need to know which objects are the same b/w the 2 eyes
- This is another correspondence problem
- There are 2 photos (a stereogram) taken from 2 different vantage points
- Free fusion: The technique of converging (crossing) or diverging the eyes in order to view a stereogram without a stereoscope
- It takes practice
- When we cross our eyes, accommodation is tied to this (which is why it takes practice)
- When we cross our eyes, our brain thinks we are seeing smth closer, so it adjusts the focus
- But our accommodation should not change b/c the distance to the screen did not change
- Some ppl can’t free fuse or see 3D things w/ a stereoscope
Binocular Vision and Stereopsis (cont’d)
- Some people do not experience stereoscopic depth perception because they have stereoblindness
- An inability to make use of binocular disparity as a depth cue
- Can result from a childhood visual disorder, such as strabismus, in which the two eyes are misaligned
- Why are ppl stereoblind? For them, it is challenging to find the corresponding pts in the 2 retinal images
- Julesz: random dot stereograms can only be seen with binocular cues; they contain no monocular depth cues
- Julesz developed these random-pixel images: some pixels are systematically shifted, and the gaps are filled w/ more random pixels
- There’s only disparity; and no other depth cues (ex. monocular, linear perspective)
- You can still see 3D
- Thus, this shows that disparity is sufficient for stereopsis/stereovision. No need for cues from object perception
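A rough Python sketch of how a random dot stereogram of this kind could be built. It is an illustration under assumptions, not Julesz's actual procedure: the function name `make_rds`, the grid size, the square size, and the 2-pixel shift are all made up, and `shift` is assumed to be no larger than the margin around the square.

```python
import random


def make_rds(size=20, square=8, shift=2, seed=0):
    """Return (left, right) images as lists of rows of 0/1 pixels.

    The right image is a copy of the left one, except that a central square
    is shifted `shift` pixels horizontally and the uncovered gap is refilled
    with fresh random pixels. Neither image alone shows the square.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]

    start = (size - square) // 2  # top-left corner of the central square
    for r in range(start, start + square):
        # Copy the square into the right image, displaced to the left.
        for c in range(start, start + square):
            right[r][c - shift] = left[r][c]
        # Refill the strip the square uncovered with new random pixels.
        for c in range(start + square - shift, start + square):
            right[r][c] = rng.randint(0, 1)
    return left, right


left_img, right_img = make_rds()
for row_l, row_r in zip(left_img, right_img):
    print("".join(".#"[p] for p in row_l), "  ", "".join(".#"[p] for p in row_r))
```

The only difference b/w the 2 images is the horizontal offset of the central square, so any depth seen when they are fused has to come from disparity alone.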
- X
- Correspondence problem: Figuring out which bit of the image in the left eye should be matched with which bit in the right eye
- Correspondence between two apples that actually are the same apple (easy).
- You can use colors
- Correspondence between pixels that are the same (hard!!!).
- How to solve the correspondence problems?
The correspondence problem (part 1)
- There are 3 objects
- LHS: Each eye sees 3 items (1,2,3)
- When you free fuse, you need to rotate your eyes so the free fused image has 3 items (not 6 items)
- Centre: you aren’t completely sure there are 3 items in the world
- We only know the L eye detects 3 items; R eye detects 3 items
- RHS: A hypothetical scenario that is possible – there are 5 items
- The L eye sees 3 purple items on the left
- (the 4th item on the right is further away and occluded by one of the items)
- # 2 is also occluded
- R eye only sees 3 items on the right
- 1 purple dot is occluded; the white dot is also occluded
- Since you only detect 3 items on L eye, 3 items on R eye, you think the 3 items correspond to each other
- But in reality they actually do not
- This is a really rare case
- X
- A few ways to solve the correspondence problem:
- Blurring the image: Focusing on low-spatial frequency information
* So the white angle (top right) correspond to eo
- Uniqueness constraint: A feature in the world will be represented exactly once in each retinal image (1 feature in one eye paired 1 feature in the other eye)
* This means that this occlusion case is so rare that we dismiss it from the get go
- Continuity constraint: Except at the edges of objects, neighboring points in the world lie at similar distances from the viewer
* Ex. the purple dot/crayon: points on the surface of the crayon lie at similar distances from the eye
- IOW: once you find correspondence for one point, we can assume the neighboring points have similar distances
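A toy Python sketch of how the uniqueness and continuity constraints can pick out one matching. This is only an illustration, not how the visual system is claimed to do it: features are reduced to horizontal positions along one row, `left_x` is assumed to be sorted, and the function name `match_features` and the cost are hypothetical. Uniqueness is enforced by only trying one-to-one pairings; continuity is scored as the change in disparity b/w neighbouring features.

```python
from itertools import permutations


def match_features(left_x, right_x):
    """Pair each left-eye feature with exactly one right-eye feature.

    Tries every one-to-one pairing (uniqueness constraint) and keeps the one
    whose disparities change least between neighbours (continuity constraint).
    Brute force, so only sensible for a handful of features.
    """
    best_cost, best_pairs = None, None
    for perm in permutations(range(len(right_x))):
        disparities = [left_x[i] - right_x[j] for i, j in enumerate(perm)]
        cost = sum(abs(a - b) for a, b in zip(disparities, disparities[1:]))
        if best_cost is None or cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    return best_pairs


# Three items seen by each eye, all shifted by the same amount in the right eye.
print(match_features([10, 20, 30], [8, 18, 28]))
```

With 3 items per eye there are 6 possible one-to-one pairings; the smoothness cost prefers the pairing in which all 3 items have the same disparity.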
- How is stereopsis (see w/ 2 eyes w/ depth) implemented in the human brain?
- Input from two eyes converges onto the same cell (V1 or later)
- Ex. in the LGN, the inputs from the 2 eyes arrive in diff layers
- IOW, stereovision can’t happen there
- Ex. V1 – receives info from both eyes
- Many binocular neurons (i.e. they have receptive fields for both eyes) respond best when the retinal images are on corresponding points in the two retinas: Neural basis for the horopter
- Ex. if the receptive field in 1 eye is directly on the fovea (or 2 deg left of it), the receptive field in the other eye will also be on the fovea (or 2 deg left of it)
- IOW, cells with receptive fields w/ the same distance from the fovea -> they form the neural basis of the horopter
- When these cells are active, this means something is on the horopter
- However, many other binocular neurons respond best when similar images occupy slightly different positions on the retinas of the two eyes (tuned to particular binocular disparity)
- Ex. a cell w/ a receptive field in the L eye that is 2 deg L of the fovea, and a receptive field in the R eye that is 3 deg L of the fovea
- This suggests there is binocular disparity b/w the 2 receptive fields
- This is the neural basis of space in front or behind the horopter
Disparity sensitive neurons
- LHS:
- arrows - pathway: 3 steps
- L & R eye - red and blue neuron
- What happens
- Centre
- grey dot
- L & R eye - red and blue neuron
- What happens
- RHS
- grey dot
- L & R eye - red and blue neuron
- What happens
- Binocular Rivalry
Disparity sensitive neurons
- In these images, the eyes are fixating at an object
- Lines = light from the fixation point to the eye’s fovea
- LHS: there are 2 neurons that have receptive fields in each eye
- Arrows indicate indirect connections
- Receptive field on retina (ganglion cell) -> signal reaches LGN -> striate cortex (v1)
- We can see the receptive field of the red neuron on the L eye is further away from the fovea (black line) than that in the right eye
- So, the light from the fixation point hits the fovea
- In the L eye, no light falls on the red neuron’s receptive field
- In the R eye, no light falls on the red neuron’s receptive field
- Since neither receptive field detects light, the red neuron does not respond
- The same is true for the blue neuron
- Centre image
- The eyes are still fixating at the black dot (horopter is still there)
- The grey dot: indicates there is an object located in front of the black dot
- Black lines = light travelling from the grey object to each retina
- For the red neuron
- The light falls on the receptive field of the red neuron in both the L and R eye
- The red neuron responds b/c its receptive fields are offset for crossed disparity
- The grey object is in front of the horopter
- The blue neuron has receptive field at slightly different location
- In the L eye, light does not fall on the receptive field of the blue neuron -> blue neuron does not respond
- RHS
- The eyes still fixate at the black dot
- The grey object is located farther away, and its light rays travel to each retina
- Blue neuron: the light rays fall on the receptive field of the blue neuron in the L eye and R eye -> blue neuron responds
- Red neuron: no light falls on its receptive field -> no response
- IOW: blue neuron responds to uncrossed disparity (objects behind the horopter)
- Thus, these neurons respond to non-zero retinal disparity (specifically objects located in front of or behind the horopter)
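A toy Python model of the red and blue neurons, as a sketch only. The assumptions are mine, not the lecture's: retinal position is a single number in degrees from the fovea, an object with disparity `d` lands at `+d/2` in the L eye and `-d/2` in the R eye (+ve = crossed), receptive fields are simple intervals, and a neuron responds only when the image falls inside its receptive field in both eyes. All names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BinocularNeuron:
    name: str
    rf_left: float   # receptive field centre in the L eye (deg from fovea)
    rf_right: float  # receptive field centre in the R eye (deg from fovea)
    half_width: float = 0.25  # assumed receptive field half-width (deg)

    def responds(self, img_left, img_right):
        """Fire only if the image falls in the receptive field of BOTH eyes."""
        in_left = abs(img_left - self.rf_left) <= self.half_width
        in_right = abs(img_right - self.rf_right) <= self.half_width
        return in_left and in_right


def image_positions(disparity):
    """Toy mapping from disparity to (L eye, R eye) image positions."""
    return disparity / 2, -disparity / 2


RED = BinocularNeuron("red", rf_left=0.5, rf_right=-0.5)    # tuned to crossed
BLUE = BinocularNeuron("blue", rf_left=-0.5, rf_right=0.5)  # tuned to uncrossed

for label, d in [("on horopter", 0.0), ("in front", 1.0), ("behind", -1.0)]:
    positions = image_positions(d)
    print(label, "red:", RED.responds(*positions), "blue:", BLUE.responds(*positions))
```

In this sketch neither neuron responds to the fixated point, the red neuron responds only to the nearer object, and the blue neuron responds only to the farther object, which mirrors the 3 panels.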
- X
- Disparity that is too large to fuse gives you double vision, and the visual system needs to resolve this conflict
- Binocular Rivalry: the visual system struggles b/w the 2 inputs, and one eye dominates the other
- Binocular Rivalry: The competition between the two eyes for control of visual perception, which is evident when completely different stimuli are presented to the two eyes
- Ex. when you free fuse the stereogram below, you get a blend of the 2
*
Space Perception and Binocular Vision
- Bayesian approach
- Optimal inference from cues
- How they are similar
- x
- Coins
- 3 cases (pics: LHS, centre, RHS)
- Which one do we choose based on Bayesian stats?
- Which cue do we use?
- x
- Grey pics
- grey dot
- black dot
- Specific distance tendency
- Equidistance tendency
Space Perception and Binocular Vision
Combining depth cues
- Bayesian approach: A statistical model suggesting that prior knowledge could influence your estimates of the probability of a current event
- Given I sit in the car, how often do I hold the steering wheel
- Given I sit in a chair, how often do I hold the steering wheel
- The probabilities are very different
- Optimal inference from cues: perception should choose the interpretation that is most likely.
- Very often perception comes close to what is optimally possible.
- Bayesian inference: calculates what is optimal
- Perception does this
- x
- Retinal image of a simple scene
- There are 2 American coins
- LHS: It seems like the coin on the R is closer to us than the coin on the L, b/c the coin on the L is occluded
- Centre: The coin on the left is very far away and is massive
- This is unlikely b/c objects that are similar tend to have the same size
- This is dismissed based on our prior knowledge
- RHS: both coins are at the same distance
- The red coin is slightly smaller and has a piece cut out, and the 2 coins are put together
- We dismiss this -> unlikely
- This shows how Bayesian inference is used in perception: we select the most likely scenario
- X
- How does the visual system decide what you are actually seeing?
- We select the interpretation that is most likely (basis of the Bayesian approach)
- Our selection is based on the familiar size cue, i.e. prior knowledge
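A small Python sketch of the coin example as a Bayesian choice b/w 3 interpretations. The numbers are invented purely for illustration: all 3 scenes are assumed to produce the same retinal image (equal likelihoods), so the choice is driven entirely by the made-up priors. The names `PRIORS`, `LIKELIHOODS`, and `best_interpretation` are hypothetical.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: posterior is proportional to prior x likelihood."""
    unnormalised = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def best_interpretation(priors, likelihoods):
    """Pick the interpretation with the highest posterior probability."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)


# Illustrative priors only: how plausible each scene is before looking.
PRIORS = {
    "nearer coin occludes the other": 0.90,  # LHS
    "giant coin far away": 0.05,             # Centre
    "smaller coin with a piece cut out": 0.05,  # RHS
}
# Assumption: every scene explains the retinal image equally well.
LIKELIHOODS = {h: 1.0 for h in PRIORS}

print(posterior(PRIORS, LIKELIHOODS))
print(best_interpretation(PRIORS, LIKELIHOODS))
```

B/c the image itself cannot separate the 3 scenes, prior knowledge (coins of one kind have the same size, and coins are rarely cut) decides which interpretation wins.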
What if there’s no depth information?
- Imagine you are in a dark room, and seeing things w/ 1 eye
- And there are LEDs
- You don’t know the size of the LEDs and how far away the LEDs are
- Even though we have no depth info, our visual system tends to guess the distance as 2-4 m -> specific distance tendency
- Grey dot: what we perceive
- Black dot = physical stimuli
- Specific distance tendency: When a simple object is presented in an otherwise dark environment, observers usually judge it to be at a distance of 2-4 m.
- Scenario 2
- In the dark room, you see LEDs closer to you, and another one farther away from you at the same time
- Equidistance tendency: Under the same conditions, an object is usually judged to be at about the same distance from the observer as neighbouring objects. (grey dots)
- Ex: Starry sky
- Why do we have these tendencies?
- This has to do w/ our prior knowledge
- Statistics of natural scenes: in reality, most things we see in the world are 2-4 m away
- Some things are closer to us (there’s usually space b/w you and the object)
- Other objects are beyond 4 m, but they get occluded
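One way to picture the specific distance tendency is as a prior that takes over when the depth cue carries no information. The Python sketch below assumes a Gaussian prior centred at 3 m (the midpoint of 2-4 m) and a standard precision-weighted average of prior and cue; this model, the function name `estimate_distance`, and all the numbers are assumptions for illustration, not something stated in the lecture.

```python
import math


def estimate_distance(prior_mean, prior_var, cue_mean, cue_var):
    """Precision-weighted average of a Gaussian prior and a Gaussian cue.

    The less reliable the cue (larger cue_var), the more the estimate
    falls back on the prior.
    """
    w_prior = 1.0 / prior_var
    w_cue = 1.0 / cue_var
    return (w_prior * prior_mean + w_cue * cue_mean) / (w_prior + w_cue)


PRIOR_MEAN_M = 3.0  # assumed centre of the 2-4 m range
PRIOR_VAR = 1.0     # assumed spread of the prior

# Dark room, 1 eye: the cue says nothing about distance (infinite variance).
print(estimate_distance(PRIOR_MEAN_M, PRIOR_VAR, cue_mean=10.0, cue_var=math.inf))
# Well-lit scene: a reliable cue dominates the prior.
print(estimate_distance(PRIOR_MEAN_M, PRIOR_VAR, cue_mean=10.0, cue_var=0.01))
```

With an uninformative cue the estimate is just the prior mean, which is the pattern the specific distance tendency describes.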
- What happens when our guesses are wrong? – Illusions
- The man in the upper image seems bigger, but both men are the same size