Lecture 6 Flashcards

1
Q

What can we do to estimate structure?

A

If a point is seen in two images and the relative pose between the cameras is known, we can triangulate it. Take the first camera as the origin; given the rotation and translation to the second camera and both calibration matrices, we can intersect the rays through the point in each image to find its 3D location. Each observation gives 3 equations, of which only 2 are independent, so two views give 4 independent equations. Stacking these gives the homogeneous system Aw = 0, where w is the 3D point in homogeneous coordinates, which we solve using the SVD.
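The Aw = 0 step can be sketched in NumPy as below. This is a minimal illustration, not the lecture's own code: the calibration matrix `K`, the one-unit sideways baseline, and the helper names `triangulate` and `project` are all assumptions made up for the example.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one point seen at pixel x1 in view 1 and x2 in view 2."""
    # Each view contributes its two independent equations as rows of A.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution of Aw = 0 is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    w = Vt[-1]
    return w[:3] / w[3]  # homogeneous -> Euclidean coordinates

def project(P, X):
    """Project a 3D point X with the 3x4 camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Assumed setup: identical calibration, first camera at the origin,
# second camera related to it by rotation R and translation t.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([[-1.0], [0.0], [0.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])

X_true = np.array([0.5, -0.2, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noise-free observations `X_est` recovers `X_true` up to numerical precision; with noisy pixels the SVD gives a least-squares solution in the algebraic sense instead.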

2
Q

What is the SLAM problem formulation?

A

Given a sequence of observations, we want to estimate both a map and our location within it. The two parts depend on each other: given the map, what is the pose, and given the pose, what is the map? Often, the update is based only on the latest measurements.

3
Q

How does Fast SLAM work?

A

FastSLAM first uses a particle filter to predict the new pose, then runs a Kalman filter on each particle to estimate the map given that particle's pose.

4
Q

How does PTAM work?

A

Parallel Tracking and Mapping (PTAM) splits tracking and mapping into two threads: tracking runs in real time, while mapping runs when it can. Matching of FAST features is pyramid-based and uses affine transforms.

5
Q

How does ORB-SLAM work? Why is it useful?

A

It uses ORB features to build a sparse feature map, and only the local area of the map is updated each frame. It is useful because it can relocalise after tracking failures and perform loop closure.

6
Q

What are some main features of LSD-SLAM, SLAM++ and KinectFusion?

A

LSD-SLAM: uses whole images rather than extracted features, estimates depth by stereo methods, uses keyframes as tracking references, and gives much denser (semi-dense) maps than feature-based methods.
SLAM++: detects known objects, matched against a database of 3D object models, and uses them as features, creating a compact map with semantic meaning.
KinectFusion: uses a depth camera to fill a voxel grid in which each voxel stores the (truncated, signed) distance to the nearest surface.

7
Q

What is drift? What fixes this?

A

Drift is the accumulation of errors in a SLAM map over time. It can be fixed with loop closure, which can be performed when a previously visited location is returned to.
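A toy illustration of drift and its correction, under loud assumptions: a 2D robot drives around a square using dead reckoning with a small constant heading bias (an invented error model), and the "loop closure" here simply spreads the end-point error linearly along the path. That is a naive stand-in for the pose-graph optimisation a real system would use; the function name and numbers are hypothetical.

```python
import numpy as np

def dead_reckon(steps, heading_bias):
    """Integrate (distance, turn) odometry steps; heading_bias is the error added per step."""
    position = np.zeros(2)
    heading = 0.0
    path = [position.copy()]
    for distance, turn in steps:
        heading += turn + heading_bias
        position = position + distance * np.array([np.cos(heading), np.sin(heading)])
        path.append(position.copy())
    return np.array(path)

# A square loop: four sides of ten unit steps, turning 90 degrees at each corner.
steps = []
for side in range(4):
    for i in range(10):
        turn = np.pi / 2 if (i == 0 and side > 0) else 0.0
        steps.append((1.0, turn))

true_path = dead_reckon(steps, heading_bias=0.0)    # ends back at the start
drifted = dead_reckon(steps, heading_bias=0.01)     # small errors accumulate

# Drift: how far the estimate is from the start when the robot has really returned.
drift = np.linalg.norm(drifted[-1] - drifted[0])

# Naive loop closure: we know the last pose equals the first, so distribute
# the end-point error linearly over the whole trajectory.
error = drifted[-1] - drifted[0]
weights = np.linspace(0.0, 1.0, len(drifted))[:, None]
corrected = drifted - weights * error
```

After the correction the trajectory closes on itself again, although the intermediate poses are only approximately right.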

8
Q

What is a bag of words?

A

In a bag of words, an index is assigned to each word we care about, and a document is represented as an array recording the frequency of each word. Typically, related words are grouped together.
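A small sketch of this representation, assuming a hand-picked vocabulary and simple whitespace tokenisation (both chosen only for illustration; the grouping of related words is left out):

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Return the frequency of each vocabulary word in text, in vocabulary order."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

vocabulary = ["camera", "map", "pose", "robot"]  # index 0..3, one per word we care about
document = "The robot updates the map and the robot pose"
vector = bag_of_words(document, vocabulary)
print(vector)  # [0, 1, 1, 2]
```

Word order is discarded: any document with the same word counts maps to the same array.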

9
Q

What is a visual bag of words? How are these generated?

A

This works the same way as a bag of words, but uses clusters of feature descriptors (similar features) as the words.
The steps for this are:
Given a collection of descriptors, apply k-means with some large k to get a vocabulary of k visual words. Each image is then described by how often each word occurs in it, and similar images can be found by comparing these frequencies.
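The steps above can be sketched as follows. This is a minimal version under stated assumptions: descriptors are real-valued vectors compared with Euclidean distance (binary descriptors such as ORB would normally use Hamming distance instead), the training data is random noise standing in for real descriptors, k is kept tiny, and the function names are made up for the example.

```python
import numpy as np

def assign_words(descriptors, centres):
    """Index of the nearest cluster centre (visual word) for each descriptor."""
    distances = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
    return distances.argmin(axis=1)

def build_vocabulary(descriptors, k, iterations=20, seed=0):
    """Plain k-means: returns k cluster centres, i.e. the visual vocabulary."""
    rng = np.random.default_rng(seed)
    centres = descriptors[rng.choice(len(descriptors), size=k, replace=False)]
    for _ in range(iterations):
        labels = assign_words(descriptors, centres)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members) > 0:  # leave an empty cluster's centre unchanged
                centres[j] = members.mean(axis=0)
    return centres

def bow_histogram(descriptors, centres):
    """Describe an image by the normalised frequency of each visual word."""
    words = assign_words(descriptors, centres)
    histogram = np.bincount(words, minlength=len(centres)).astype(float)
    return histogram / histogram.sum()

# Stand-in data: 300 random 8-dimensional "descriptors" pooled from many images.
rng = np.random.default_rng(1)
training_descriptors = rng.normal(size=(300, 8))
vocabulary = build_vocabulary(training_descriptors, k=5)

# One image's descriptors become a length-k frequency vector.
image_descriptors = rng.normal(size=(40, 8))
histogram = bow_histogram(image_descriptors, vocabulary)
```

Two images can then be compared by a distance or similarity between their histograms, for example the cosine similarity.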
