L3: VSLAM 1 Flashcards

Question 1

Q

What is Windowed BA?

Answer

A

Iteratively refine the camera poses over the last m frames to obtain a more accurate estimate of the local trajectory.

Question 2

Q

❗️❗️❗️What defines VO?

Answer

A

Process of incrementally estimating the pose of the camera by
examining the changes that motion induces on the images.

Question 3

Q

What is VSLAM?

Question 4

Q

❗️❗️❗️What is the difference between VO and VSLAM?

Answer

A

VO:
1. aims at LOCAL consistency of the trajectory
2. building block of SLAM
3. VO is SLAM before closing the loop!
4. There is drift (“solved” with windowed BA)

VSLAM:
1. aims at GLOBAL consistency of the trajectory
2. uses loop closure (LC)

Question 5

Q

What is the difference between VO and SFM?

Question 6

Q

What different motion estimation methods exist?

Answer

A

2D-2D
3D-3D
3D-2D

Question 7

Q

What is Loop Closure (LC)?

Question 8

Q

What is Structure from Motion (SFM)?

Question 9

Q

What properties are important when performing VO?

Answer

A

Sufficient illumination in the environment
Dominance of static scene over moving objects (stationary objects are preferred over, e.g., moving cars)
Enough texture to allow apparent motion to be extracted
Sufficient scene overlap between consecutive frames

Question 10

Q

Advantages of VO

Answer

A

VO > wheel odometry → Not affected by wheel slip

Question 11

Q

Flow chart of VO

Answer

A

Image sequence
↓
Feature detection
↓
Feature matching (tracking)
↓
Motion estimation
(2D-2D, 3D-3D, 3D-2D)
↓
Local optimization

Question 12

Q

Why would we use BA and not just VO?

Answer

A

VO computes the camera path incrementally (pose after pose), so the errors introduced by each new frame-to-frame motion accumulate over time. This generates drift of the estimated trajectory from the real path.
To keep the drift as small as possible, BA is needed: it refines the estimate by minimizing the reprojection error.

Windowed BA acts as a local solution to VO drift
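
The drift argument above can be illustrated with a minimal planar sketch. It assumes poses are 3x3 homogeneous SE(2) matrices and that each frame-to-frame motion carries a small constant heading bias; the helper names (`se2`, `integrate`, `end_position_error`) and the bias value are hypothetical and not part of the lecture material.

```python
import numpy as np

def se2(dx, dy, dtheta):
    """Pose of frame k expressed in frame k-1 as a 3x3 homogeneous matrix."""
    c, s = np.cos(dtheta), np.sin(dtheta)
    return np.array([[c, -s, dx],
                     [s,  c, dy],
                     [0.0, 0.0, 1.0]])

def integrate(relative_motions):
    """Concatenate frame-to-frame motions T_k into poses C_k = C_{k-1} @ T_k."""
    pose = np.eye(3)
    poses = [pose]
    for T in relative_motions:
        pose = pose @ T
        poses.append(pose)
    return poses

def end_position_error(n_steps, heading_bias):
    """Distance between true and estimated end point after driving straight."""
    true_end = integrate([se2(1.0, 0.0, 0.0)] * n_steps)[-1][:2, 2]
    est_end = integrate([se2(1.0, 0.0, heading_bias)] * n_steps)[-1][:2, 2]
    return float(np.linalg.norm(true_end - est_end))
```

Because every pose is built on the previous one, the same per-frame error produces a larger end-point error on a longer trajectory, which is the drift that windowed BA is meant to limit.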

Question 13

Q

Which two approaches can be used to estimate the relative motion (T_k) between frames?

Under input sequence process

Answer

A

Appearance-based → Uses the intensity information of all pixels in both images. Slow, computationally heavy, less accurate, and dense.
Feature-based → Repeatable features extracted across the images. Faster and more accurate, and sparse.

Question 14

Q

What is 2D-2D?

Answer

A

Both features are defined in 2D
Mostly used in monocular VO
Minimal-case solution involves 5-point correspondences (Nister).
Or you can use 8-point correspondences (Longuet-Higgins). Uses SVD at the end.
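
As a rough illustration of the 8-point approach, the sketch below estimates the essential matrix E with two SVDs. It assumes calibrated (normalized) image coordinates, at least 8 outlier-free correspondences, and no coordinate pre-normalization; the function name `eight_point_essential` is hypothetical.

```python
import numpy as np

def eight_point_essential(x1, x2):
    """Estimate E from N >= 8 correspondences in normalized image coordinates.

    x1, x2: (N, 2) arrays of matched points in the first and second image.
    Returns a 3x3 matrix E with x2_h^T @ E @ x1_h ~ 0 (up to scale and sign).
    """
    x1_h = np.column_stack([x1, np.ones(len(x1))])
    x2_h = np.column_stack([x2, np.ones(len(x2))])
    # Each correspondence gives one linear equation in the 9 entries of E.
    A = np.einsum("ni,nj->nij", x2_h, x1_h).reshape(-1, 9)
    # Null vector of A: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto valid essential matrices: two equal singular values, one zero.
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

The relative rotation and translation (up to scale) can then be extracted from E; that decomposition step is left out of this sketch.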

Question 15

Q

What is 3D-3D?

Answer

A

Both features are specified in 3D
Triangulate 3D points
Minimal-case solution involves 3 non-collinear correspondences
Solution is found by determining the aligning transformation that minimizes the 3D-3D distance
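
One common closed-form way to compute such an aligning transformation is the SVD-based method often attributed to Arun et al. (also known as the Kabsch algorithm). The sketch below is an illustration under that assumption and not necessarily the method used in the lecture; `align_3d_3d` is a hypothetical name.

```python
import numpy as np

def align_3d_3d(P, Q):
    """Find R, t minimizing sum_i ||Q_i - (R @ P_i + t)||^2.

    P, Q: (N, 3) arrays of corresponding 3D points, N >= 3 and non-collinear.
    """
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) instead of a proper rotation.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

With exactly three non-collinear correspondences this reproduces the minimal case; with more points it gives the least-squares alignment.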

Question 16

Q

What is 3D-2D?

Answer

A

Previous frame in 3D and current frame in 2D
Known as PnP problem
Minimal-case solution involves 3 correspondences
Solution is found by determining the transformation that minimizes reprojection error
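
The reprojection error mentioned above can be written down in a few lines. This is a minimal sketch assuming a pinhole camera with intrinsic matrix K and no lens distortion; the function names and any numeric values used with them are hypothetical.

```python
import numpy as np

def project(K, R, t, X):
    """Project (N, 3) points X into the image of a camera with pose (R, t)."""
    Xc = X @ R.T + t            # points expressed in the camera frame
    uv = Xc @ K.T               # apply the intrinsics
    return uv[:, :2] / uv[:, 2:3]

def reprojection_error(K, R, t, X, observed):
    """Sum of squared pixel distances between projected and observed points."""
    return float(np.sum((project(K, R, t, X) - observed) ** 2))
```

A PnP solver searches for the (R, t) that makes this quantity as small as possible for the given 3D-2D correspondences.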

Question 17

Q

What affects triangulation uncertainty and how can the uncertainty be reduced?

Answer

A

Question 18

Q

What advantages are there to using stereo vision?

Answer

A

Stereo vision has an advantage over monocular vision as it has less drift.
- However, when the distance to the scene is much larger than the stereo baseline, stereo VO becomes ineffective and you should use monocular VO

Windowed BA should always be used for an accurate estimate of the trajectory.

Keyframes should be selected carefully to reduce drift.
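
The remark above about scene distance versus baseline can be made concrete with the standard rectified-stereo relation Z = f * b / d. The sketch below assumes an ideal rectified stereo pair and a first-order error model; the function names and the example numbers are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * b / d for an ideal rectified stereo pair."""
    return focal_px * baseline_m / disparity_px

def depth_error(focal_px, baseline_m, depth_m, disparity_error_px):
    """First-order depth error for a given disparity error: dZ ~ Z^2 / (f * b) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px
```

For example, with an assumed focal length of 700 px, a 0.12 m baseline and a 0.5 px disparity error, the depth error is about 0.15 m at 5 m but about 15 m at 50 m. The error grows with the square of the depth, so distant scenes give almost no usable stereo depth.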

Question 19

Q

What causes outliers among feature points?

Answer

A

Wrong data associations
Wrong measurements

Causes:
- image noise
- occlusions
- blur
- change in view and illumination (depending on the feature extractor/descriptor)

Question 20

Q

What is Robust Estimation?

Answer

A

Removing outliers so that the camera motion can be estimated accurately

Question 21

Q

How can outliers be removed from estimation?

Answer

A

RANSAC
Estimates fraction of inliers adaptively, iteratively.
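
A minimal sketch of adaptive RANSAC, shown on 2D line fitting instead of camera motion to keep it short. It assumes the usual stopping rule N = log(1 - p) / log(1 - w^s), with confidence p, estimated inlier fraction w and sample size s = 2; the function name, threshold and defaults are hypothetical.

```python
import math
import numpy as np

def ransac_line(x, y, threshold=0.05, confidence=0.99, max_iters=1000, seed=0):
    """Fit y = slope * x + intercept while ignoring outliers."""
    rng = np.random.default_rng(seed)
    n = len(x)
    best_inliers = np.zeros(n, dtype=bool)
    needed = max_iters          # shrinks as the inlier estimate improves
    i = 0
    while i < needed:
        i += 1
        a, b = rng.choice(n, size=2, replace=False)   # minimal sample
        if x[a] == x[b]:
            continue                                  # vertical line, skip
        slope = (y[b] - y[a]) / (x[b] - x[a])
        intercept = y[a] - slope * x[a]
        # Inliers: points whose vertical residual is below the threshold.
        inliers = np.abs(y - (slope * x + intercept)) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
            w = inliers.sum() / n                     # current inlier fraction
            if w >= 1.0:
                needed = i
            else:
                n_est = math.log(1.0 - confidence) / math.log(1.0 - w ** 2)
                needed = min(max_iters, math.ceil(n_est))
    # Refit on all inliers of the best model.
    slope, intercept = np.polyfit(x[best_inliers], y[best_inliers], 1)
    return slope, intercept, best_inliers
```

In VO the same loop would sample minimal sets of feature correspondences (for example 5 or 8 for 2D-2D) and count inliers by their epipolar or reprojection residual.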

Question 22

Q

How can global consistency be achieved?

Answer

A

After loop closures are detected, use bundle adjustment to optimize the camera path!!!

(22 cards)