Robot Learning Flashcards
What is epipolar geometry?
Epipolar geometry is the intrinsic projective geometry between two views. It is
independent of scene structure, and only depends on the cameras’ internal parameters
and relative pose.
What is the fundamental matrix in relation to the geometry of two views?
The fundamental matrix F encapsulates this intrinsic geometry. It is a 3×3 matrix of rank 2. If a point X in 3-space is imaged as x in the first view and as x′ in the second, then the image points satisfy the relation x′ᵀ F x = 0.
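The relation above can be checked numerically. The following is a minimal sketch, assuming cameras P = K[I | 0] and P′ = K[R | t], for which F = K⁻ᵀ [t]× R K⁻¹. The values of `K`, `R`, `t`, and `X` are made up for illustration and do not come from any real calibration.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]x such that skew(v) @ u equals np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical intrinsics and relative pose (illustrative values only).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(5.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-0.5, 0.0, 0.1])

# Camera matrices P = K[I | 0] and P' = K[R | t].
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t.reshape(3, 1)])

# Fundamental matrix for this camera pair (same K in both views).
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

# Project one homogeneous space point X into both views.
X = np.array([0.3, -0.2, 4.0, 1.0])
x1 = P1 @ X
x2 = P2 @ X

residual = x2 @ F @ x1                              # x'^T F x
singular_values = np.linalg.svd(F, compute_uv=False)

print("x'^T F x =", residual)
print("singular values of F:", singular_values)
```

The residual is zero up to floating-point error, and the smallest singular value of F is negligible next to the other two, which is what rank 2 means in practice.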
What is the relation between the corresponding image points x and x′?
The image points x and x′, the space point X, and the camera centres are coplanar. Denote this plane as π. The rays back-projected from x and x′ intersect at X, so both rays lie in π.
How is the corresponding point x′ constrained?
The plane π is determined by the baseline and the ray defined by x. The ray corresponding to the (unknown) point x′ lies in π, hence x′ lies on the line of intersection l′ of π with the second image plane.
What is the epipolar line corresponding to x?
It is the line l′ in which the plane π intersects the second image plane, and the point x′ must lie on it. This line l′ is the image in the second view of the ray back-projected from x.
What is the benefit of utilizing epipolar geometry for a stereo correspondence algorithm?
The search for the point corresponding to x is restricted to the epipolar line l′, instead of covering the entire image plane.
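A small sketch of this one-dimensional search, using the standard result that the epipolar line of x is l′ = F x. The matrix `F` below is an assumed one for a rectified pair (identity intrinsics, pure translation along the x-axis), and the helper names are hypothetical.

```python
import numpy as np

def epipolar_line(F, x):
    """Epipolar line l' = F x in the second image, as (a, b, c) with a*u + b*v + c = 0."""
    return F @ x

def point_line_distance(line, x):
    """Perpendicular distance from the homogeneous image point x to the line."""
    a, b, c = line
    u, v = x[0] / x[2], x[1] / x[2]
    return abs(a * u + b * v + c) / np.hypot(a, b)

# Assumed F for a rectified stereo pair: F = [t]x with t = (1, 0, 0).
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

x = np.array([120.0, 75.0, 1.0])        # point in the first image
line = epipolar_line(F, x)              # the scanline v = 75 in the second image

on_line = np.array([90.0, 75.0, 1.0])   # candidate on the epipolar line
off_line = np.array([90.0, 80.0, 1.0])  # candidate five pixels off it

d_on = point_line_distance(line, on_line)
d_off = point_line_distance(line, off_line)
print(line, d_on, d_off)
```

For this rectified setup the epipolar line is simply the same image row, so a matcher only needs to scan along that row; candidates with a nonzero distance to l′ can be rejected outright.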
What is the epipole?
The point of intersection of the line joining the camera centres (the baseline) with the image plane. Equivalently, the epipole is the image in one view of the camera centre of the other view. It is also the vanishing point of the baseline (translation) direction.
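Both descriptions of the epipole can be compared numerically. This sketch assumes normalized cameras P = [I | 0] and P′ = [R | t] (so F = [t]× R) and uses the standard facts F e = 0 and Fᵀ e′ = 0. The pose values and helper names are illustrative assumptions.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]x such that skew(v) @ u equals np.cross(v, u)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def null_vector(M):
    """Unit right null vector of M: the right singular vector of the smallest singular value."""
    _, _, vt = np.linalg.svd(M)
    return vt[-1]

# Hypothetical relative pose with identity intrinsics (illustrative values only).
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.2, 0.3])
F = skew(t) @ R

# Algebraic route: the epipoles are the null vectors of F and F^T.
e1 = null_vector(F)      # epipole in the first image,  F e = 0
e2 = null_vector(F.T)    # epipole in the second image, F^T e' = 0

# Geometric route: project each camera centre into the other view.
e1_geo = -R.T @ t        # P C', where C' = -R^T t is the second camera centre
e2_geo = t               # P' C, where C is the origin (first camera centre)

# Homogeneous points are equal up to scale, so compare directions via the cross product.
err1 = np.linalg.norm(np.cross(e1, e1_geo))
err2 = np.linalg.norm(np.cross(e2, e2_geo))
print(err1, err2)
```

Both errors come out at floating-point noise level, so the null vectors of F and the projected camera centres describe the same image points.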
What is the epipolar plane?
A plane containing the baseline (the line joining the camera centres). There is a one-parameter family of epipolar planes, all sharing the baseline.
What is the epipolar line?
The intersection of an epipolar plane with the image plane.
What are deformable parts models (DPM)?
Detectors that use a sliding-window approach, in which a classifier is run at evenly spaced locations over the entire image.
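The sliding-window idea on its own can be sketched as follows. This is not a DPM implementation: the scoring function `mean_brightness` is a hypothetical stand-in for a trained classifier, and the toy image, window size, and stride are assumptions chosen for illustration.

```python
import numpy as np

def sliding_windows(height, width, window, stride):
    """Yield (top, left) corners of square windows at evenly spaced locations."""
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            yield top, left

def mean_brightness(patch):
    """Stand-in classifier (hypothetical): score a patch by its mean intensity."""
    return float(patch.mean())

# Toy 8x8 image with a bright 4x4 "object" in the bottom-right corner.
image = np.zeros((8, 8))
image[4:8, 4:8] = 1.0

# Run the classifier on every window and keep the highest-scoring location.
scores = {
    (top, left): mean_brightness(image[top:top + 4, left:left + 4])
    for top, left in sliding_windows(8, 8, window=4, stride=2)
}
best = max(scores, key=scores.get)
print(len(scores), best, scores[best])
```

Even this toy case evaluates the classifier nine times for one small image; the cost grows quickly with image size, number of scales, and smaller strides.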
What are region proposal methods?
Methods that first generate potential bounding boxes in an image and then run a classifier on these proposed boxes. After classification, post-processing is used to refine the bounding boxes, eliminate duplicate detections, and rescore the boxes based on other objects in the scene. R-CNN is an example.
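One common way to eliminate duplicate detections in the post-processing step is greedy non-maximum suppression (NMS). The sketch below is a generic version, not the code of any particular detector; the box format (x1, y1, x2, y2), the 0.5 threshold, and the sample boxes are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union between one box and an array of boxes, all (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much, repeat."""
    order = np.argsort(scores)[::-1]      # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep

# Two near-duplicate detections of one object, plus a separate detection.
boxes = np.array([[10, 10, 50, 50],
                  [12, 12, 52, 52],
                  [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])

kept = nms(boxes, scores)
print(kept)
```

The second box overlaps the first with an IoU of about 0.82, so it is suppressed, while the third box does not overlap either and survives.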
What’s the key weakness of region proposal methods?
The pipelines are slow and hard to optimize, because each component of the pipeline must be trained separately.
What architecture does the YOLO paper (You Only Look Once, by Redmon, Divvala, Girshick, and Farhadi) present?
It proposes a unified architecture in which a single CNN predicts both the bounding boxes and the associated class probabilities. YOLO trains on full images and directly optimizes detection performance.
What are the main strengths proposed by the YOLO paper?
The architecture is fast, learns generalizable representations of objects, and reasons globally about the whole image, unlike region proposal methods and DPMs.
What are the main weaknesses with the YOLO architecture?
YOLO lags behind state-of-the-art detectors in accuracy and struggles to localize objects precisely, particularly small ones.