Week 7 and 8 - Stereo and Epipolar geometry Flashcards
What is the goal of stereo geometry
Recovering 3D structure from 2D images
Structure and depth are inherently ambiguous from single views
What visual cues does an image give us for 3D recovery
shading
texture
focus
perspective
motion
What is stereo vision
Ability of the brain to perceive depth by processing two slightly different images captured by each eye
Based on this, we use a calibrated binocular stereo pair of images
How do we estimate a scene shape using triangulation
Follow the ray from each camera centre through its image point; the two rays intersect at the real-life scene point
Gives reconstruction as intersection of two rays
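A minimal sketch of the ray intersection, assuming both camera centres and ray directions are already known in one shared coordinate frame. With noisy data the two rays rarely meet exactly, so this version returns the midpoint of the shortest segment between them (a common substitute, not something the cards specify). The function name `triangulate_midpoint` is hypothetical.

```python
def triangulate_midpoint(o1, d1, o2, d2):
    """Estimate the scene point from two rays.

    o1, o2: camera centres (3-tuples).
    d1, d2: ray directions through the matched image points (3-tuples).
    Returns the midpoint of the shortest segment joining the two rays.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = tuple(p - q for p, q in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel, no unique intersection")

    # Parameters of the closest points on ray 1 and ray 2.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))
```

If either image point is mismatched, one ray direction changes and the returned point moves with it, which is why correct correspondences matter.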
What does triangulation require us to know
1) Camera pose (calibration - camera parameters)
2) point correspondence - matching of image points
What is the focal length
Distance from the centre of projection O to the image plane
What is (X,Y,Z)
Scene coordinates
What is (x,y,z)
Image coordinates
What is P
The point in the scene (its projection in the image is written with a lowercase p)
What is the centre of projection
optical/camera centre
refers to the point in space from which the camera’s perspective projection emanates
What is Z
Distance of the camera (O) to the real world point P
same as
Depth (distance from viewer to the point on the object)
What is baseline
The distance between the left and right camera centres (centres of projection)
Where is the scene origin placed in the general camera system diagram
centre of the baseline (same y as camera centres)
What are xl and xr
The offsets along the x axis between each camera centre and the projected point in its image
xl: between c0 and pl
xr: between c1 and pr
What are pl and pr
The point p projected in the left and right images
What is the formula for Z
Z = bf / (xl - xr)
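A small worked example of the depth formula, with the whole disparity xl - xr as the denominator. The numbers are made up for illustration, and the function name `depth_from_disparity` is hypothetical. It assumes b is in metres and f, xl, xr are in pixels, so Z comes out in metres.

```python
def depth_from_disparity(b, f, xl, xr):
    """Depth Z = b * f / (xl - xr) for a rectified stereo pair."""
    disparity = xl - xr
    if disparity <= 0:
        # Zero disparity means a point at infinity; negative means a bad match.
        raise ValueError("disparity must be positive for a finite depth")
    return b * f / disparity


# Illustrative values: 0.25 m baseline, 800 px focal length, 20 px disparity.
z = depth_from_disparity(b=0.25, f=800.0, xl=35.0, xr=15.0)
print(z)  # 10.0
```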
What is disparity
xl - xr
Displacement (along x axis) between conjugate (corresponding) points in left and right images
How does disparity change with object distance to camera
Objects closer to camera → higher disparity → brighter in disparity map
Objects further from camera → lower disparity → darker in disparity map
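A rough illustration of the inverse relationship, rearranging the depth formula to disparity = bf / Z and scaling the result to grey levels. Scaling by the largest disparity is an assumed display convention, and both function names are hypothetical.

```python
def disparity_for_depth(b, f, z):
    """Disparity (xl - xr) of a point at depth z, from Z = bf / (xl - xr)."""
    if z <= 0:
        raise ValueError("depth must be positive")
    return b * f / z


def to_grey_levels(disparities):
    """Scale disparities so that the largest one maps to 255 (brightest)."""
    d_max = max(disparities)
    if d_max <= 0:
        raise ValueError("need at least one positive disparity")
    return [round(255 * d / d_max) for d in disparities]


depths = [1.0, 2.0, 4.0, 8.0]  # metres, nearest first
disparities = [disparity_for_depth(0.25, 800.0, z) for z in depths]
grey = to_grey_levels(disparities)
```

Doubling the depth halves the disparity, so the nearest point gets grey level 255 and each further point is darker.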
Why is it important to correctly match points in the left and right image plane
We pass one ray from each camera centre through pl and pr. If either point is incorrect, the rays will intersect at a completely wrong point P
What are the 3 components of stereo analysis
- Find correspondences
  - conjugate pairs of points
  - use interesting points
- Reconstruction
  - calculate scene coordinates (X,Y,Z)
  - easy, once you have done calibration
- Calibration
  - calculate parameters of cameras (eg b, f…)
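A sketch of the reconstruction step for one conjugate pair, assuming the scene origin sits at the centre of the baseline (camera centres at X = -b/2 and X = +b/2), parallel optical axes, and image coordinates measured from each image centre. The `project` helper uses the pinhole model and is only there to check the round trip. Both function names are hypothetical.

```python
def project(point, b, f):
    """Pinhole projection of a scene point into the left and right images."""
    X, Y, Z = point
    xl = f * (X + b / 2) / Z  # left camera centre is at X = -b/2
    xr = f * (X - b / 2) / Z  # right camera centre is at X = +b/2
    y = f * Y / Z             # same row in both images
    return xl, xr, y


def reconstruct(xl, xr, y, b, f):
    """Recover (X, Y, Z) from a conjugate pair, given calibration b and f."""
    d = xl - xr  # disparity
    if d <= 0:
        raise ValueError("disparity must be positive")
    Z = b * f / d
    X = b * (xl + xr) / (2 * d)
    Y = b * y / d
    return X, Y, Z
```

Projecting a point and passing the result back through `reconstruct` returns the original coordinates, which shows that with b and f known the step is plain arithmetic.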
What do we assume about finding correspondences in the two images
Assume most scene points are visible in both views
Assume corresponding points are similar
What is the Epipolar constraint
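Without a constraint, each candidate point has a potentially large 2D search space in the other image
The epipolar constraint limits where a point from one view can be imaged in the other: it must lie on the epipolar line
We align the cameras so that their centres share the same y coordinate, which makes the epipolar lines horizontal image rows
So it becomes a 1D search problem along the epipolar line
(still lots of candidate matches)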
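A minimal sketch of the 1D search, assuming rectified images stored as lists of rows, so the epipolar line of a left-image point is the same row in the right image. Similarity is measured with a sum of squared differences (SSD) over a small window, which is one common choice rather than the only one. The function names and the `max_disparity` parameter are hypothetical.

```python
def ssd(left, right, row, xl, xr, half):
    """Sum of squared differences between two (2*half+1)^2 windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[row + dy][xl + dx] - right[row + dy][xr + dx]
            total += diff * diff
    return total


def match_along_row(left, right, row, xl, half=1, max_disparity=16):
    """Search one row of the right image for the best match to (xl, row).

    Assumes (xl, row) is far enough from the border for the window to fit.
    Returns (xr, disparity) for the lowest-cost candidate.
    """
    best_x, best_cost = None, None
    for d in range(max_disparity + 1):
        xr = xl - d  # the match can only shift along the row
        if xr - half < 0:
            break  # window would leave the image
        cost = ssd(left, right, row, xl, xr, half)
        if best_cost is None or cost < best_cost:
            best_x, best_cost = xr, cost
    return best_x, xl - best_x
```

Only positions on a single row are compared, instead of every pixel in the right image.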
What are the advantages of using edges for correspondence
They correspond to a significant feature
There aren't usually too many of them
We can use image features (polarity, direction) to verify matches
We can locate them accurately
Multi-scale location (coarse to fine search)
What are the disadvantages of using edges for correspondence
Not all significant structures lie on edges
Edge magnitude features may not be reliable for matching - gradients at corresponding points can be different due to illumination difference
Near-horizontal edges do not provide good localisation - every point along the edge will match with every other point
What detector can we use for finding edges for correspondence
Canny detector
What detector can we use for finding interest points
- Moravec operator
- Harris Corners
- LoG (DoG)
What is the Moravec Operator
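Non-linear filter
Compares a neighbourhood with shifted copies of itself in several directions (sum of squared differences)
Output value is the minimum over the shift directions (eliminates edges)
Suppresses non-maxima
Finds points where intensity varies very quickly in every direction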
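A rough sketch of the response at a single pixel, assuming a greyscale image stored as a list of rows and eight unit shifts (the exact shift set and window size vary between descriptions). Non-maximum suppression and thresholding are left out. The function name `moravec_response` is hypothetical.

```python
SHIFTS = ((1, 0), (-1, 0), (0, 1), (0, -1),
          (1, 1), (-1, -1), (1, -1), (-1, 1))


def moravec_response(img, x, y, half=1):
    """Minimum SSD between the window at (x, y) and its shifted copies.

    Assumes (x, y) is at least half + 1 pixels away from every border.
    """
    best = None
    for sx, sy in SHIFTS:
        total = 0
        for dy in range(-half, half + 1):
            for dx in range(-half, half + 1):
                a = img[y + dy][x + dx]
                b = img[y + dy + sy][x + dx + sx]
                total += (a - b) ** 2
        # Taking the minimum means one "free" direction is enough
        # to drive the response to zero.
        if best is None or total < best:
            best = total
    return best


# 9x9 test image: a bright square in the lower-right, corner at (4, 4).
img = [[10 if x >= 4 and y >= 4 else 0 for x in range(9)] for y in range(9)]
flat = moravec_response(img, 2, 2)    # uniform region
edge = moravec_response(img, 4, 6)    # on the vertical edge
corner = moravec_response(img, 4, 4)  # at the corner
```

Sliding along an edge leaves the window unchanged, so the minimum is zero there. Only at the corner does every shift change the window, which is how the minimum eliminates edges.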
What assumption do we make about the cameras
they are calibrated
(know extrinsic parameters relating their poses)