Week 7 and 8 - Stereo and Epipolar geometry Flashcards

1
Q

What is the goal of stereo geometry

A

Recovering a 3D structure from a 2D image
Structure and depth are inherently ambiguous from single views

2
Q

What visual cues does an image give us for 3D recovery

A

shading
texture
focus
perspective
motion

3
Q

What is stereo vision

A

Ability of the brain to perceive depth by processing two slightly different images captured by each eye

Based on this, we use a calibrated binocular stereo pair of images

4
Q

How do we estimate a scene shape using triangulation

A

Cast a ray from each camera centre through the corresponding image point; the rays intersect at the real-life scene point
Gives the reconstruction as the intersection of the two rays

5
Q

What does triangulation require us to know

A

1) Camera pose (calibration - camera parameters)

2) point correspondence - matching of image points

6
Q

What is the focal length

A

Distance from the centre of projection O to the image plane

7
Q

What is (X,Y,Z)

A

Scene coordinates

8
Q

What is (x,y,z)

A

Image coordinates

9
Q

What is P

A

The point in the scene; p (lower case) denotes the corresponding projected point in the image

10
Q

What is the centre of projection

A

optical/camera centre
refers to the point in space from which the camera’s perspective projection emanates

11
Q

What is Z

A

Distance from the camera centre (O) to the real-world point P
same as
Depth (distance from viewer to the point on the object)

12
Q

What is baseline

A

The distance between the left and right camera centres (center of projection)

13
Q

Where is the scene origin placed in the general camera system diagram

A

centre of the baseline (same y as camera centres)

14
Q

What are xl and xr

A

The offset along the x-axis between each camera centre and the projected point in its image
(c0 and pl for the left image)
(c1 and pr for the right image)

15
Q

What are pl and pr

A

The point p projected in the left and right images

16
Q

What is the formula for Z

A

Z = bf / (xl - xr)

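A minimal numeric sketch of this formula; the baseline, focal length and image x-coordinates below are made-up illustration values:

```python
# Depth from disparity: Z = b*f / (xl - xr)
b = 0.1                  # baseline in metres (assumed)
f = 700.0                # focal length in pixels (assumed)
xl, xr = 320.0, 285.0    # x-coordinates of the same point in the left/right images (assumed)

disparity = xl - xr      # 35 pixels
Z = b * f / disparity    # 0.1 * 700 / 35 = 2.0 metres
print(Z)
```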
17
Q

What is disparity

A

xl - xr
Displacement (along x axis) between conjugate (corresponding) points in left and right images

18
Q

How does disparity change with object distance to camera

A

Objects closer to camera → higher disparity → brighter in disparity map

Objects further from camera → lower disparity → darker in disparity map

19
Q

Why is it important to correctly match points in the left and right image plane

A

We cast a ray from each camera centre through pl or pr; if the match is incorrect, the rays will intersect at the completely wrong point P

20
Q

What are the 3 components of stereo analysis

A
  • Find correspondences
    • conjugate pairs of points
    • use interest points
  • Reconstruction
    • calculate scene coordinates (X,Y,Z)
    • easy, once calibration is done
  • Calibration
    • calculate the parameters of the cameras (e.g. b, f, …)

21
Q

What do we assume about finding correspondences in the two images

A

Assume most scene points are visible in both views
Assume corresponding points are similar

22
Q

What is the Epipolar constraint

A

Without it there is a potentially large 2D search space for each candidate point
The epipolar constraint limits where points from one view can be imaged in the other view
so it becomes a 1D search problem along the epipolar line
(in the simple setup, with the cameras side by side, the epipolar line is the horizontal line at the same y)
(still lots of possible matches)

23
Q

What are the advantages of using edges for correspondence

A

They correspond to significant features
There aren't usually too many of them
We can use image features (polarity, direction) to verify matches
We can locate them accurately
Multi-scale location (coarse-to-fine search)

24
Q

What are the disadvantages of using edges for correspondence

A

Not all significant structures lie on edges

Edge magnitude features may not be reliable for matching - gradients at corresponding points can be different due to illumination difference

Near-horizontal edges do not provide good localisation - every point along the edge will match with every other point

25
Q

What detector can we use for finding edges for correspondence

A

The Canny edge detector

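A one-line OpenCV sketch of this; the filename and hysteresis thresholds are placeholder values:

```python
import cv2

gray = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)  # hypothetical image file
edges = cv2.Canny(gray, 100, 200)                     # low/high hysteresis thresholds (assumed values)
```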
26
Q

What detector can we use for finding interest points

A
  • Moravec operator
  • Harris Corners
  • LoG (DoG)
27
Q

What is the Moravec Operator

A

Non-linear filter
over some neighbourhood, measure the intensity change for small shifts in several directions
the output value is the minimum over those directions (eliminates edges, where a shift along the edge changes little)
suppresses non-maxima
finds points where intensity varies very quickly in all directions

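A rough NumPy sketch of the Moravec idea, assuming a 3×3 window, four shift directions, and a simple threshold in place of full non-maximum suppression:

```python
import numpy as np

def moravec(img, window=3, threshold=500.0):
    """Return a boolean map of Moravec interest points (minimum SSD over shifts)."""
    img = img.astype(np.float64)
    h, w = img.shape
    r = window // 2
    shifts = [(1, 0), (0, 1), (1, 1), (1, -1)]        # shift directions (assumed)
    interest = np.zeros_like(img)
    for y in range(r + 1, h - r - 1):
        for x in range(r + 1, w - r - 1):
            win = img[y - r:y + r + 1, x - r:x + r + 1]
            ssds = []
            for dy, dx in shifts:
                shifted = img[y + dy - r:y + dy + r + 1, x + dx - r:x + dx + r + 1]
                ssds.append(np.sum((win - shifted) ** 2))
            interest[y, x] = min(ssds)                 # minimum over directions suppresses edges
    return interest > threshold                       # keep only strong corner-like points
```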
28
Q

What assumption do we make about the cameras

A

they are calibrated
(know extrinsic parameters relating their poses)

29
Q

Do we normally have the simple triangulation stereo setup in real life

A

No, that is rare
usually the camera axes are not parallel; the cameras are often oriented along completely different axes
correspondence is much harder to find

30
Q

How do we constrain the correspondence in the non parallel camera axis situation

A

Again using epipolar lines
each point has a corresponding epipolar line in the other image
reduces the problem to a 1D search space

31
Q

What is the epipolar plane

A

plane containing baseline(connecting two camera centres) and world point p

An epipolar plane intersects with the left and right image planes in epipolar lines

32
Q

What is the epipole

A

point of intersection between baseline and image plane

33
Q

what is the epipolar line

A

intersection of epipolar plane with the image plane

All epipolar lines intersect at the epipole

34
Q

where do the Potential matches for p lie

A

on the corresponding epipolar line l’

and vice versa for p’ on l

35
Q

What are the 4 main features in generalised stereo geometry

A

- cameras are at arbitrary orientations (image planes are not parallel)
- the separation between the optical centres (the baseline) is not parallel to the image planes
- the camera coordinate systems are different from each other and from the scene coordinate system
- the coordinate systems are related by rotation matrices and translation vectors, which define each camera's origin and orientation relative to the scene system

36
Q

What are Rr, Rl, Tr, Tl and fl, fr

A

The rotation matrices, translation vectors and focal lengths, respectively, for the right and left cameras

37
Q

What is pl (or pr)

A

The vector from the camera origin Ol to the point pl in the image plane; multiplying it by some scalar al reaches the real-world point P

38
Q

what can be said about fl and fr

A

they are not necessarily the same

39
Q

What can we write pl and pr in terms of

A

pl = (xl, yl, fl)

the coordinates of pl in the image plane and the focal length (distance from the camera centre to the image plane)

40
Q

What are Pl and Pr

A

Pl = al·pl
(scaling the image vector pl by al is how to reach the real-world point P)
al, ar are scalar values

41
Q

What is P’l and P’r

A

P’l = Tl + alRlpl
P’r = Tr + arRrpr

42
Q

Where do P’l and P’r intersect

A

Where
Tl + alRlpl = Tr + arRrpr
real world point P

43
Q

Why is it not simple to mathematically solve for P

A

because of common measurement inaccuracies
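One common workaround (a sketch, not necessarily the method from the lectures) is to solve Tl + al·Rl·pl ≈ Tr + ar·Rr·pr for al and ar in the least-squares sense and take the midpoint of the two closest ray points:

```python
import numpy as np

def triangulate_midpoint(Tl, Rl, pl, Tr, Rr, pr):
    """Approximate the intersection of two (generally skew) back-projected rays."""
    dl = Rl @ pl                      # direction of the left ray in scene coordinates
    dr = Rr @ pr                      # direction of the right ray
    # Solve [dl, -dr] [al, ar]^T ~= Tr - Tl in the least-squares sense
    A = np.stack([dl, -dr], axis=1)   # 3x2 system
    b = Tr - Tl
    (al, ar), *_ = np.linalg.lstsq(A, b, rcond=None)
    P_left = Tl + al * dl             # closest point on the left ray
    P_right = Tr + ar * dr            # closest point on the right ray
    return 0.5 * (P_left + P_right)   # midpoint as the estimate of P
```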

44
Q

What is the Essential matrix

A

E = [Tx]R, where [Tx] is the skew-symmetric matrix formed from the translation T
It relates corresponding image points between the two cameras using their relative translation and rotation

45
Q

How can we write X’ and X using the essential matrix

A

from:
X' . (T x RX) = 0
X' . ([Tx] RX) = 0

We can write
X'^T E X = 0

46
Q

How can we use the essential matrix to solve the parallel camera system

A

R = I
T = [-d, 0, 0]
(a pure translation along the x-axis; d is the separation between the cameras, i.e. the baseline)
E = [Tx]R

p'^T E p = 0

leads us to y = y' (the image of any point must lie along the same horizontal line in both views)
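A small NumPy check of this special case; d, f and the test point coordinates are arbitrary values:

```python
import numpy as np

def skew(t):
    """Skew-symmetric matrix [t]x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

d, f = 0.1, 1.0                        # assumed baseline and focal length
R = np.eye(3)
T = np.array([-d, 0.0, 0.0])
E = skew(T) @ R                        # essential matrix for the parallel setup

p  = np.array([0.3, 0.2, f])           # point in one image: (x, y, f)
p2 = np.array([0.1, 0.2, f])           # point in the other image with the same y
print(p2 @ E @ p)                      # ~0: the constraint holds only when y == y'
```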

47
Q

What is Rectification

A

Knowing Rr and Rl means we can transform (warp) the images so that the image planes are parallel

Means the search space is now just along the parallel epipolar lines

48
Q

What does rectification mean about the epipoles

A

epipoles are at infinity

49
Q

What does rectification mean about the epipolar lines

A

epipolar lines are parallel to the horizontal image axis

50
Q

what does the epipolar constraint make faster

A

the search for correspondences

51
Q

What are the 4 main steps of stereo reconstruction

A

-calibrate cameras
-rectify images
-compute disparity
-estimate depth
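A hedged OpenCV sketch of these four steps; the calibration values, filenames and block-matching parameters are all placeholders, not values from the course:

```python
import cv2
import numpy as np

# 1) Calibrate cameras -- placeholder results (in practice from a chessboard
#    calibration, e.g. cv2.calibrateCamera / cv2.stereoCalibrate)
K1 = K2 = np.array([[700.0, 0, 320], [0, 700.0, 240], [0, 0, 1]])   # assumed intrinsics
dist1 = dist2 = np.zeros(5)                                          # assume no lens distortion
R, T = np.eye(3), np.array([-0.1, 0.0, 0.0])                         # assumed relative pose
image_size = (640, 480)
img_l = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)                 # hypothetical files
img_r = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# 2) Rectify: warp both images so epipolar lines become horizontal scanlines
R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K1, dist1, K2, dist2, image_size, R, T)
map_l = cv2.initUndistortRectifyMap(K1, dist1, R1, P1, image_size, cv2.CV_32FC1)
map_r = cv2.initUndistortRectifyMap(K2, dist2, R2, P2, image_size, cv2.CV_32FC1)
rect_l = cv2.remap(img_l, *map_l, cv2.INTER_LINEAR)
rect_r = cv2.remap(img_r, *map_r, cv2.INTER_LINEAR)

# 3) Compute disparity (block-matching parameters are assumptions)
sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
disparity = sgbm.compute(rect_l, rect_r).astype(np.float32) / 16.0   # SGBM output is scaled by 16

# 4) Estimate depth: reproject the disparity map to 3D using Q from rectification
points_3d = cv2.reprojectImageTo3D(disparity, Q)
```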

52
Q

What are the soft constraints of correspondences

A

similarity
uniqueness
ordering

(help further reduce the possible matches)

53
Q

To find matches in the image pair, what do we assume

A

-most scene points are visible from both views
-image regions for the matches are similar in appearance

54
Q

What is the Dense Correspondence Search

A

For each pixel in first image:
-find corresponding epipolar line in other image
-examine all pixels on line and pick best match (eg SSD)
-triangulate the matches to get depth information
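A minimal sketch of this loop on already-rectified images, using SSD over a small window; the window size and disparity range are assumptions:

```python
import numpy as np

def ssd_disparity(left, right, max_disp=64, half=3):
    """Brute-force SSD matching along each scanline of rectified images."""
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    disp = np.zeros((h, w))
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = []
            for d in range(max_disp):            # candidate matches on the same scanline
                cand = right[y - half:y + half + 1, x - d - half:x - d + half + 1]
                costs.append(np.sum((patch - cand) ** 2))   # SSD cost
            disp[y, x] = int(np.argmin(costs))   # best match -> disparity for this pixel
    return disp
```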

55
Q

When is dense correspondence search easiest

A

When the epipolar lines are scanlines
-> rectify images first

56
Q

What is window-based correspondence search

A

Slide a window along epipolar lines to find the corresponding pixel
localises the search

57
Q

What is the effect of window size on correspondence search

A

Want it to be large enough to have sufficient intensity variation
small enough to contain only pixels with the same(ish) disparity
large window -> image appears blotchy and blurred
small window -> fine grain noise

58
Q

What is sparse correspondence search

A

Restrict the search to a sparse set of detected features
(dense search, finding a correspondence for every pixel, was too noisy)
only use a small set of important pixels
use a feature descriptor and an associated feature distance
(the epipolar constraint still applies: only search along the corresponding epipolar line)

59
Q

What are the advantages and disadvantages of dense correspondence search

A

-simple process
-more depth estimates, can be useful for surface reconstruction

But
-breaks down in textureless regions
-raw pixels can be brittle
-not good with very different viewpoints

60
Q

What are the advantages and disadvantages of sparse correspondence search

A

-efficient
-can give more reliable feature matches (less sensitive to illumination)

But
-you have to know how to pick good features
61
Q

Difficulties in similarity constraint

A

-Textureless regions in an image lack distinct features or patterns
-occlusions
- Flat or homogeneous regions contain pixels with similar intensity values and little or no gradient information

Hard to match pixels in these areas

62
Q

why can raw pixel distances be brittle

A

sensitive to noise, illumination, perspective
not robust

63
Q

What is the ordering constraint

A

points on the same surface (of an opaque object) will be in the same order in both views

violated by transparent objects

64
Q

What are possible sources of error

A

-low contrast/ textureless
-occlusions
-camera calibration errors
-violations of brightness constancy (specular reflections)
-large motions

65
Q

What are 3 main applications of 3D scene reconstructions

A

Depth for segmentation

View Interpolation
- synthesizing new views of a scene from existing views captured by a stereo camera setup or multiple cameras

Virtual Viewpoint video
- allows viewers to interactively navigate and explore a scene from different viewpoints in real-time (3D virtual tours)

66
Q

What are Intrinsic camera parameters

A

-Relate pixel coordinates to image coordinates
-Pixel size (sx, sy): pixels may not be square
-Origin offset (dx, dy): pixel origin may not be on optic axis
-We just need the ratio between the pixel grid and the image plane
-Focal length, f.
-Not totally independent. (Need dx, dy, f, sx/sy )
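One common way to collect these parameters into a 3×3 intrinsic matrix (a sketch using a standard convention, which may differ in notation from the course slides; all numbers are made up):

```python
import numpy as np

f = 0.008            # focal length in metres (assumed)
sx, sy = 1e-5, 1e-5  # pixel size in metres (assumed, square pixels here)
dx, dy = 320, 240    # origin offset of the pixel grid in pixels (assumed)

# Intrinsic matrix: maps camera-frame rays to pixel coordinates.
# Only f/sx, f/sy, dx, dy matter, matching "need dx, dy, f, sx/sy".
K = np.array([[f / sx, 0,      dx],
              [0,      f / sy, dy],
              [0,      0,      1]])
```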

67
Q

What are extrinsic camera parameters

A
  • Rotation matrix R (3X3) (3 free parameters)
  • Translation vector (Tx, Ty, Tz)
68
Q

What is a calibration target

A

A checkerboard pattern used to calibrate the camera
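A hedged OpenCV sketch of using such a target; the board size and filenames are placeholders:

```python
import cv2
import numpy as np

pattern = (9, 6)                               # inner corners of the checkerboard (assumed)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)   # board coords in square units

obj_points, img_points = [], []
for fname in ["calib_00.png", "calib_01.png"]:           # hypothetical calibration images
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Recover intrinsics (K, distortion) and per-image extrinsics (rvecs, tvecs)
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```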

69
Q

What can we do with an uncalibrated stereo system

A
  • Calibration is necessary to determine absolute 3D positions
  • We can determine relative 3D positions (up to a scale factor) without calibration
  • If at least 8 point correspondences in the scene are known, that is sufficient: the camera parameters can be estimated (see the sketch below)
    (Human stereopsis just uses relative depth in this way)
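A hedged sketch of the 8-correspondence idea using OpenCV's 8-point implementation; the point arrays below are placeholders standing in for real matched pixel coordinates:

```python
import cv2
import numpy as np

# At least 8 corresponding points (Nx2 pixel coordinates) -- placeholder values only
pts_left = np.random.rand(8, 2).astype(np.float32) * 640
pts_right = np.random.rand(8, 2).astype(np.float32) * 640

# Estimate the fundamental matrix with the 8-point algorithm; from F (plus the
# intrinsics, if known) the relative camera geometry can be recovered up to scale.
F, mask = cv2.findFundamentalMat(pts_left, pts_right, cv2.FM_8POINT)
```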
70
Q

What is the tradeoff in calibration systems

A

Accuracy versus how long the calibration takes