Week 7 and 8 - Stereo and Epipolar geometry Flashcards

1
Q

What is the goal of stereo geometry

A

Recovering 3D structure from 2D images
Structure and depth are inherently ambiguous from a single view

2
Q

What visual cues does an image give us for 3D recovery

A

shading
texture
focus
perspective
motion

3
Q

What is stereo vision

A

The ability of the brain to perceive depth by processing two slightly different images captured by each eye

Based on this, we use a calibrated binocular stereo pair of images

4
Q

How do we estimate a scene shape using triangulation

A

By triangulation: follow the rays from the two image planes until they intersect at the real-life scene point
This gives the reconstruction as the intersection of two rays

5
Q

What does triangulation require us to know

A

1) Camera pose (calibration - camera parameters)

2) point correspondence - matching of image points

6
Q

What is the focal length

A

Distance from the centre of projection O to the image plane

7
Q

What is (X,Y,Z)

A

Scene coordinates

8
Q

What is (x,y,z)

A

Image coordinates

9
Q

What is P

A

The point in the scene and its corresponding projected point in the image

10
Q

What is the centre of projection

A

optical/camera centre
refers to the point in space from which the camera’s perspective projection emanates

11
Q

What is Z

A

The distance from the camera (O) to the real-world point P
i.e. the depth (the distance from the viewer to the point on the object)

12
Q

What is baseline

A

The distance between the left and right camera centres (center of projection)

13
Q

Where is the scene origin placed in the general camera system diagram

A

centre of the baseline (same y as camera centres)

14
Q

What are xl and xr

A

The displacement along the x axis between the camera centre and the projected point in each image:
xl between c0 and pl
xr between c1 and pr

15
Q

What are pl and pr

A

The projections of the point P into the left and right images

16
Q

What is the formula for Z

A

Z = bf / (xl - xr), where b is the baseline and f the focal length

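A quick worked example with illustrative numbers (not from the slides): with baseline b = 0.1 m, focal length f = 500 pixels and disparity xl - xr = 10 pixels, Z = (0.1 × 500) / 10 = 5 m; halving the disparity to 5 pixels doubles the depth to 10 m, reflecting the inverse relationship between disparity and distance.
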
17
Q

What is disparity

A

xl - xr
Displacement (along x axis) between conjugate (corresponding) points in left and right images

18
Q

How does disparity change with object distance to camera

A

Objects closer to camera → higher disparity → brighter in disparity map

Objects further from camera → lower disparity → darker in disparity map

19
Q

Why is it important to correctly match points in the left and right image plane

A

We cast the two rays from the camera centres through p1 and p2; if these matches are incorrect, the rays will intersect at completely the wrong point P

20
Q

What are the 3 components of stereo analysis

A
  • Find correspondences
    • conjugate pairs of points
    • use interest points
  • Reconstruction
    • calculate scene coordinates (X,Y,Z)
    • Easy, once you have done calibration
  • Calibration
    • Calculate parameters of cameras (eg b, f…)
21
Q

What do we assume about finding correspondences in the two images

A

Assume most scene points are visible in both views
Assume corresponding points are similar

22
Q

What is the Epipolar constraint

A

Without it, there is a potentially large (2D) search space for each candidate point
The epipolar constraint limits where points from one view can be imaged in the other: the match must lie on the corresponding epipolar line
This turns correspondence into a 1D search problem along the epipolar line
(there are still lots of possible matches)

23
Q

What are the advantages of using edges for correspondence

A

They correspond to a significant feature
There aren't usually too many of them
We can use image features (polarity, direction) to verify matches
We can locate them accurately
Multi-scale location (coarse-to-fine search)

24
Q

What are the disadvantages of using edges for correspondence

A

Not all significant structures lie on edges

Edge magnitude features may not be reliable for matching - gradients at corresponding points can differ due to illumination differences

Near-horizontal edges do not provide good localisation - every point along the edge will match with every other point

25
Q

What detector can we use for finding edges for correspondence

A

The Canny edge detector

26
Q

What detector can we use for finding interest points

A

- Moravec operator
- Harris corners
- LoG (DoG)

27
Q

What is the Moravec Operator

A

A non-linear filter that finds points where the intensity varies very quickly.
Over some neighbourhood, the output value is the minimum of the intensity variation across shifts of the window (this eliminates edges).
Non-maxima are then suppressed.

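A minimal numpy sketch of this idea (my own illustration, not course code; the window radius and shift directions are arbitrary choices):

    import numpy as np

    def moravec(img, r=1, shifts=((1, 0), (0, 1), (1, 1), (1, -1))):
        # For each pixel: SSD between the window around it and the window shifted
        # in a few directions; the score is the minimum over shifts, so pixels on
        # an edge (little variation along the edge direction) score low.
        img = img.astype(np.float64)
        h, w = img.shape
        score = np.zeros_like(img)
        for y in range(r + 1, h - r - 1):
            for x in range(r + 1, w - r - 1):
                patch = img[y - r:y + r + 1, x - r:x + r + 1]
                ssds = [np.sum((patch - img[y + dy - r:y + dy + r + 1,
                                            x + dx - r:x + dx + r + 1]) ** 2)
                        for dy, dx in shifts]
                score[y, x] = min(ssds)
        # Crude non-maximum suppression: keep only local maxima of the score
        keep = np.zeros_like(score)
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                if score[y, x] > 0 and score[y, x] == score[y - 1:y + 2, x - 1:x + 2].max():
                    keep[y, x] = score[y, x]
        return keep
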
28
Q

What assumption do we make about the cameras

A

They are calibrated (we know the extrinsic parameters relating their poses)

29
Q

Do we normally have the triangulated stereo system in real life

A

No, that is rare: usually the camera axes are not parallel and the cameras are on completely different axes.
Correspondence is then much harder to find.

30
Q

How do we constrain the correspondence in the non-parallel camera axis situation

A

Again using epipolar lines: the pair of epipolar lines (one in each image) reduces the problem to a 1D search space

31
Q

What is the epipolar plane

A

The plane containing the baseline (connecting the two camera centres) and the world point P.
An epipolar plane intersects the left and right image planes in epipolar lines.

32
Q

What is the epipole

A

The point of intersection between the baseline and the image plane

33
Q

What is the epipolar line

A

The intersection of an epipolar plane with the image plane.
All epipolar lines intersect at the epipole.

34
Q

Where do the potential matches for p lie

A

On the corresponding epipolar line l' (and vice versa: potential matches for p' lie on l)

35
Q

What are the 4 main features in generalised stereo geometry

A

- Cameras are at arbitrary orientations (image planes are not parallel)
- The separation between the optical centres (the baseline) is not parallel to the image planes
- The camera coordinate systems are different from each other and from the scene coordinate system
- The coordinate systems are related by rotation matrices and translation vectors, which define each camera's origin and coordinate system in relation to the scene system

36
Q

What are Rr, Rl and Tr, Tl and fl, fr

A

The rotation matrices, translation vectors and focal lengths of the right and left cameras respectively

37
Q

What is pl (or pr)

A

The vector from the origin Ol to the point pl in the image plane; it can be multiplied by some scalar value al to reach the real-world point P

38
Q

What can be said about fl and fr

A

They are not necessarily the same

39
Q

What can we write pl and pr in terms of

A

pl = (xl, yl, fl): the coordinates of pl in the image and the focal length (the distance from the camera to the image plane); similarly pr = (xr, yr, fr)

40
Q

What are Pl and Pr

A

Pl = al·pl (the image vector pl scaled to reach the real-world point P); al and ar are scalar values, and similarly Pr = ar·pr

41
Q

What are P'l and P'r

A

P'l = Tl + al·Rl·pl
P'r = Tr + ar·Rr·pr

42
Q

Where do P'l and P'r intersect

A

Where Tl + al·Rl·pl = Tr + ar·Rr·pr, i.e. at the real-world point P

43
Q

Why is it not simple to mathematically solve for P

A

Because of common measurement inaccuracies, the two rays do not intersect exactly

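A minimal numpy sketch of one common way around this (my addition, not spelled out on the cards): find the scalars al, ar that bring the two rays closest together and take the midpoint.

    import numpy as np

    def triangulate_midpoint(Tl, Rl, pl, Tr, Rr, pr):
        # The rays P'l = Tl + al*Rl*pl and P'r = Tr + ar*Rr*pr rarely meet exactly,
        # so solve for the least-squares (al, ar) and return the midpoint of the
        # two closest points.
        dl = Rl @ pl                      # left ray direction in scene coordinates
        dr = Rr @ pr                      # right ray direction
        A = np.column_stack((dl, -dr))    # al*dl - ar*dr ≈ Tr - Tl
        (al, ar), *_ = np.linalg.lstsq(A, Tr - Tl, rcond=None)
        return 0.5 * ((Tl + al * dl) + (Tr + ar * dr))
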
44
Q

What is the Essential matrix

A

E = [Tx]R (where [Tx] is the cross-product matrix of the translation T)
It relates corresponding image points between the cameras using the translation and rotation between them

45
Q

How can we write X' and X using the essential matrix

A

From: X' . (T x RX) = 0
i.e. X' . ([Tx] RX) = 0
we can write X'^T E X = 0

46
Q

How can we use the essential matrix to solve the parallel camera system

A

R = I
T = [-d, 0, 0] (d is the distance between the two camera centres, i.e. the baseline)
E = [Tx]R
p'^T E p = 0 then leads to y = y'
(the image of any point must lie along the same horizontal line in both images)

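A small numpy check of this special case (illustrative numbers, my own sketch): with R = I and T = (-d, 0, 0), p'^T E p is zero exactly when y = y'.

    import numpy as np

    def skew(t):
        # Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)
        return np.array([[0, -t[2], t[1]],
                         [t[2], 0, -t[0]],
                         [-t[1], t[0], 0]])

    d, f = 0.1, 500.0
    E = skew(np.array([-d, 0.0, 0.0])) @ np.eye(3)   # E = [Tx]R with R = I

    p = np.array([120.0, 75.0, f])                    # (x, y, f) in the left image
    p_same_row = np.array([95.0, 75.0, f])            # y' == y
    p_other_row = np.array([95.0, 80.0, f])           # y' != y
    print(p_same_row @ E @ p)                         # 0 -> satisfies the constraint
    print(p_other_row @ E @ p)                        # non-zero -> violates it
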
47
Q

What is Rectification

A

Knowing Rr and Rl means we can transform (warp) the images so that the image planes are parallel.
The search space is then just along the parallel epipolar lines.

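A hedged OpenCV sketch of rectifying a calibrated pair (the functions are standard OpenCV; the calibration values and file names are made up for illustration):

    import cv2
    import numpy as np

    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)     # placeholder file names
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    # Made-up intrinsics/extrinsics; in practice these come from calibration
    K = np.array([[500.0, 0, 320.0], [0, 500.0, 240.0], [0, 0, 1.0]])
    dist = np.zeros(5)                 # assume no lens distortion
    R = np.eye(3)                      # rotation between the cameras
    T = np.array([-0.1, 0.0, 0.0])     # translation (baseline) between the cameras
    size = (640, 480)

    # Rotations/projections that make the image planes parallel, then warp
    R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K, dist, K, dist, size, R, T)
    m1x, m1y = cv2.initUndistortRectifyMap(K, dist, R1, P1, size, cv2.CV_32FC1)
    m2x, m2y = cv2.initUndistortRectifyMap(K, dist, R2, P2, size, cv2.CV_32FC1)
    left_rect = cv2.remap(left, m1x, m1y, cv2.INTER_LINEAR)
    right_rect = cv2.remap(right, m2x, m2y, cv2.INTER_LINEAR)
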
48
Q

What does rectification mean about the epipoles

A

The epipoles are at infinity

49
Q

What does rectification mean about the epipolar lines

A

The epipolar lines are parallel to the horizontal image axis

50
Q

What does the epipolar constraint make faster

A

The search for correspondences

51
Q

What are the 4 main steps of stereo reconstruction

A

- Calibrate cameras
- Rectify images
- Compute disparity
- Estimate depth

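A minimal OpenCV sketch of the last two steps on an already-rectified pair (the block-matching parameters, baseline and focal length are illustrative, not from the slides):

    import cv2
    import numpy as np

    left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)    # placeholder file names
    right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)

    # Block matching along horizontal scanlines (rectified epipolar lines)
    stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = stereo.compute(left, right).astype(np.float32) / 16.0   # StereoBM returns fixed-point values

    # Depth from disparity: Z = b*f / (xl - xr)
    b, f = 0.1, 500.0
    depth = np.where(disparity > 0, b * f / disparity, 0.0)
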
52
Q

What are the soft constraints of correspondences

A

- Similarity
- Uniqueness
- Ordering
(these help further reduce the possible matches)

53
Q

To find matches in the image pair, what do we assume

A

- Most scene points are visible from both views
- Image regions for the matches are similar in appearance

54
Q

What is the Dense Correspondence Search

A

For each pixel in the first image:
- find the corresponding epipolar line in the other image
- examine all pixels on the line and pick the best match (eg SSD)
- triangulate the matches to get depth information

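A minimal numpy sketch of this per-pixel search (my own illustration, assuming rectified images so the epipolar line is simply the same row; the window size and disparity range are arbitrary):

    import numpy as np

    def scanline_ssd_disparity(left, right, max_disp=64, r=3):
        # For each left-image pixel, slide a (2r+1) x (2r+1) window along the
        # same row of the right image and keep the disparity with the lowest SSD.
        left = left.astype(np.float64)
        right = right.astype(np.float64)
        h, w = left.shape
        disp = np.zeros((h, w))
        for y in range(r, h - r):
            for x in range(r + max_disp, w - r):
                patch = left[y - r:y + r + 1, x - r:x + r + 1]
                best_ssd, best_d = np.inf, 0
                for d in range(max_disp):
                    cand = right[y - r:y + r + 1, x - d - r:x - d + r + 1]
                    ssd = np.sum((patch - cand) ** 2)
                    if ssd < best_ssd:
                        best_ssd, best_d = ssd, d
                disp[y, x] = best_d
        return disp
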
55
Q

When is dense correspondence search easiest

A

When the epipolar lines are scanlines -> rectify the images first

56
Q

What is window-based correspondence search

A

Slide a window along the epipolar lines to find the corresponding pixel; this localises the search

57
Q

What is the effect of window size on correspondence search

A

We want the window to be:
- large enough to have sufficient intensity variation
- small enough to contain only pixels with the same(ish) disparity
Large window -> the disparity image appears blotchy and blurred
Small window -> fine-grained noise

58
Q

What is sparse correspondence search

A

Restrict the search to a sparse set of detected features (dense search, finding a correspondence for every pixel, was too noisy).
Only a small set of important pixels is used, matched using a feature descriptor and an associated feature distance.
(The epipolar constraint still applies: only search along the particular epipolar line)

59
Q

What are the advantages and disadvantages of dense correspondence search

A

Advantages:
- simple process
- more depth estimates, which can be useful for surface reconstruction
Disadvantages:
- breaks down in textureless regions
- raw pixels can be brittle
- not good with very different viewpoints

60
Q

What are the advantages and disadvantages of sparse correspondence search

A

Advantages:
- efficient
- can have more reliable feature matches (less sensitive to illumination)
Disadvantages:
- you have to know enough to pick good features

61
Q

What are the difficulties with the similarity constraint

A

- Textureless regions in an image lack distinct features or patterns
- Occlusions
- Flat or homogeneous regions contain pixels with similar intensity values and little or no gradient information
It is hard to match pixels in these areas

62
Q

Why can raw pixel distances be brittle

A

They are sensitive to noise, illumination and perspective changes, so they are not robust

63
Q

What is the ordering constraint

A

Points on the same surface (of an opaque object) will appear in the same order in both views.
It is violated by transparent objects.

64
Q

What are possible sources of error

A

- Low contrast / textureless regions
- Occlusions
- Camera calibration errors
- Violations of brightness constancy (specular reflections)
- Large motions

65
Q

What are 3 main applications of 3D scene reconstruction

A

- Depth for segmentation
- View interpolation: synthesising new views of a scene from existing views captured by a stereo camera setup or multiple cameras
- Virtual viewpoint video: allows viewers to interactively navigate and explore a scene from different viewpoints in real time (3D virtual tours)

66
Q

What are intrinsic camera parameters

A

They relate pixel coordinates to image coordinates:
- Pixel size (sx, sy): pixels may not be square
- Origin offset (dx, dy): the pixel origin may not be on the optic axis
- Focal length f
These are not totally independent: we just need the ratio between the pixel grid and the image plane, so the parameters needed are dx, dy, f and sx/sy

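For reference (standard pinhole form, my addition rather than something written on the card), these are often collected into a single intrinsic matrix K:

    K = [ f/sx   0      dx ]
        [ 0      f/sy   dy ]
        [ 0      0      1  ]

so that pixel coordinates are u = (f/sx)·(X/Z) + dx and v = (f/sy)·(Y/Z) + dy.
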
67
Q

What are extrinsic camera parameters

A

- Rotation matrix R (3x3, 3 free parameters)
- Translation vector (Tx, Ty, Tz)

68
Q

What is a calibration target

A

A checkerboard used to calibrate the camera

69
Q

What can we do with an uncalibrated stereo system

A

- Calibration is necessary to determine absolute 3D positions
- We can determine relative 3D positions (up to a scale factor) without calibration
- If at least 8 correspondences in the scene are known, that is sufficient: the camera parameters can be estimated
(Human stereopsis just uses relative depth in this way)

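A minimal OpenCV sketch of the 8-correspondence case (my addition; with an uncalibrated pair the 8-point algorithm estimates the fundamental matrix F rather than E, and the point arrays below are random placeholders):

    import cv2
    import numpy as np

    # N x 2 arrays of matching pixel coordinates (placeholder data; N >= 8 required)
    pts_left = (np.random.rand(8, 2) * 640).astype(np.float32)
    pts_right = (np.random.rand(8, 2) * 640).astype(np.float32)

    # 8-point estimate of the matrix relating the two uncalibrated views
    F, mask = cv2.findFundamentalMat(pts_left, pts_right, cv2.FM_8POINT)
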
70
Q

What is the tradeoff in calibration systems

A

Accuracy versus how long the calibration takes