03-05 - VSLAM, BA, SfM Flashcards
What is the general bundle block adjustment problem & how are BA problems usually solved?
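A non-linear least-squares problem: we try to minimize the reprojection error.
Each observation satisfies $x + v = \lambda P X$, where $x$ is the observed image point, $v$ its correction, $P$ the projection matrix, $X$ the 3D point and $\lambda$ a scale factor.
Numerically, the problem is linearized and solved iteratively (e.g. Gauss-Newton or Levenberg-Marquardt), which leads to:
observations + corrections = A * unknowns
where the unknowns are split into the 3D points and the 6D camera parameters, and A is split into C (observations x points) and B (observations x images).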
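A minimal sketch of the reprojection residual that BA minimizes, assuming a pinhole camera described by a 3x4 projection matrix. The function names, the focal length of 100 and the example point are made up for illustration.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (3,) with a 3x4 projection matrix P to pixel coordinates (2,)."""
    x_h = P @ np.append(X, 1.0)   # homogeneous image point, equal to lambda * x
    return x_h[:2] / x_h[2]       # divide out the scale factor lambda

def reprojection_residual(P, X, x_obs):
    """Difference between the projected point and the observed image point."""
    return project(P, X) - x_obs

# Hypothetical camera at the origin: focal length 100, principal point (0, 0).
K = np.diag([100.0, 100.0, 1.0])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

X = np.array([0.5, -0.2, 2.0])      # 3D point
x_obs = np.array([25.5, -10.0])     # observed image point
r = reprojection_residual(P, X, x_obs)
cost = float(r @ r)                 # this point's contribution to the sum of squares
```

BA stacks such residuals for all observations and adjusts both the points and the camera parameters to make the total cost small.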
What is a gauge-freedom, where does it usually appear in BA problems and how can it be fixed?
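It means the problem has infinitely many equivalent solutions: without an absolute reference, the reconstruction is only determined up to a similarity transform (translation, rotation and scale, 7 degrees of freedom).
It usually appears when we do not have any known control points.
To fix it: add constraints and priors, control points, loop closure.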
How do gross errors affect BA problems?
Since we minimize the sum of squares, a single outlier can ruin the whole result. It is very important to remove outliers before running BA.
Outliers can, for example, be caused by wrong feature matches.
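A toy illustration of this sensitivity, not a BA example: the least-squares estimate of a constant is the mean of the measurements, so a single gross error pulls it far away, while a robust statistic such as the median barely moves. The measurement values are made up.

```python
import statistics

clean = [10.0, 10.1, 9.9, 10.0, 10.0]
with_outlier = [10.0, 10.1, 9.9, 10.0, 100.0]   # last measurement is a gross error

# Least-squares estimate of a constant = arithmetic mean.
mean_clean = statistics.mean(clean)
mean_outlier = statistics.mean(with_outlier)

# The median is a simple robust alternative, used here only for comparison.
median_outlier = statistics.median(with_outlier)

print(mean_clean, mean_outlier, median_outlier)
```

One bad value out of five moves the least-squares estimate from about 10 to about 28, while the median stays at 10.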
When and why do we need a sparse solver for BA?
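When we have large data sets, to save computation time and memory (so mostly when using global BA). The Jacobian and the normal equations are sparse because each observation depends on only one camera and one 3D point.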
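Why sparsity appears can be seen from the structure of the Jacobian. The sketch below builds only the zero/non-zero pattern, assuming 6 parameters per camera, 3 per point and 2 residual rows per observation; `jacobian_pattern` and the problem size are hypothetical.

```python
import numpy as np

def jacobian_pattern(observations, n_cams, n_pts):
    """0/1 pattern of the BA Jacobian; observations is a list of (cam_idx, pt_idx)."""
    J = np.zeros((2 * len(observations), 6 * n_cams + 3 * n_pts), dtype=int)
    for k, (c, p) in enumerate(observations):
        rows = slice(2 * k, 2 * k + 2)
        J[rows, 6 * c: 6 * c + 6] = 1          # block of the observing camera
        off = 6 * n_cams + 3 * p
        J[rows, off: off + 3] = 1              # block of the observed point
    return J

n_cams, n_pts = 3, 4
obs = [(c, p) for c in range(n_cams) for p in range(n_pts)]  # every camera sees every point
J = jacobian_pattern(obs, n_cams, n_pts)

fill = J.sum() / J.size                 # fraction of non-zero entries
H = (J.T @ J) > 0                       # pattern of the normal equations
H_pp = H[6 * n_cams:, 6 * n_cams:]      # point-point part
```

Even in this tiny, fully connected problem only 30 % of the Jacobian is non-zero, and the point-point part of the normal equations is block diagonal. In large problems the fill drops much further, which is what sparse solvers exploit.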
How and why do we derive Jacobians for BA problems?
The Jacobian contains the partial derivatives of the reprojection error with respect to the parameters being optimized.
It provides information about how changes in the parameters affect the reprojection error, which is crucial for finding the optimal parameter values that minimize the error.
It can be derived analytically or approximated numerically.
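Both ways of obtaining a Jacobian can be compared on a small case. The sketch assumes a simplified pinhole projection with a hypothetical focal length `F` and differentiates only with respect to the 3D point; a full BA Jacobian also contains the derivatives with respect to the camera parameters.

```python
import numpy as np

F = 100.0  # hypothetical focal length

def project(X):
    """Pinhole projection of a 3D point given in the camera frame."""
    return F * X[:2] / X[2]

def analytic_jacobian(X):
    """d(project)/dX, derived by hand."""
    x, y, z = X
    return (F / z) * np.array([[1.0, 0.0, -x / z],
                               [0.0, 1.0, -y / z]])

def numeric_jacobian(f, X, eps=1e-6):
    """Central finite differences, one parameter at a time."""
    J = np.zeros((len(f(X)), len(X)))
    for i in range(len(X)):
        d = np.zeros_like(X)
        d[i] = eps
        J[:, i] = (f(X + d) - f(X - d)) / (2 * eps)
    return J

X = np.array([0.5, -0.2, 2.0])
J_a = analytic_jacobian(X)
J_n = numeric_jacobian(project, X)
```

The numerical version is handy for checking a hand-derived Jacobian; the analytic version is usually the one used inside the optimizer because it is cheaper and more precise.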
What defines visual odometry?
Motion estimation with the help of visual input.
It may use local optimization but not global.
Why do VO solutions tend to drift?
The pose is obtained by chaining relative motion estimates, so the small error of each step accumulates over time.
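A toy 2D sketch of this accumulation, assuming a constant heading error of 0.01 rad per step (real VO errors are random, not constant). The function name and all numbers are illustrative.

```python
import math

def integrate(steps, step_len, heading_err):
    """Dead-reckon a straight walk along x with a heading error added at every step."""
    x = y = theta = 0.0
    traj = []
    for _ in range(steps):
        theta += heading_err              # small error made in this step
        x += step_len * math.cos(theta)
        y += step_len * math.sin(theta)
        traj.append((x, y))
    return traj

truth = integrate(100, 1.0, 0.0)          # error-free motion
est = integrate(100, 1.0, 0.01)           # motion with a tiny per-step error

# Position error after each step.
err = [math.hypot(ex - tx, ey - ty) for (ex, ey), (tx, ty) in zip(est, truth)]
print(err[9], err[99])
```

The per-step error never changes, yet the position error keeps growing because every new pose is built on top of the previous, already wrong, pose.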
Which kinds of correspondences can be used in VO?
2D-2D - reprojection error
3D-2D - reprojection error
3D-3D - 3D point difference
3D-3D has the disadvantage that the triangulated 3D points are uncertain. On the other hand, stereo has the advantage over monocular that the scale factor is known, so there is no scale drift. Local BA should be used no matter which method is chosen.
How is the essential matrix estimated from consecutive frames?
5-point method
8-point method (Longuet-Higgins): the epipolar constraint $p_2^T E p_1 = 0$ is written for all 8 point pairs with E vectorized, and the equations are stacked.
Now we have an Ax = 0 problem where we want to find x (which is E).
We are not interested in the trivial solution where E is all zeros, therefore we do not use e.g. Gaussian elimination but SVD: x is the right singular vector belonging to the smallest singular value.
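The 8-point steps can be sketched as follows, assuming noise-free, normalized (calibrated) image coordinates and the convention $p_2^T E p_1 = 0$. The synthetic scene, the motion `R`, `t` and the function names are made up for the test; the final rank-2 projection is a common extra step that the card does not mention.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x."""
    return np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])

def eight_point(p1, p2):
    """Estimate E from N >= 8 pairs of homogeneous normalized points (N x 3 each)."""
    # One row per pair: kron(p2, p1) multiplies the row-major vectorized E.
    A = np.array([np.kron(b, a) for a, b in zip(p1, p2)])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)                 # null vector of A, reshaped to 3x3
    U, _, Vt_e = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt_e   # enforce singular values (1, 1, 0)

# Synthetic test scene (assumption): 12 points in front of both cameras.
rng = np.random.default_rng(0)
X1 = np.column_stack([rng.uniform(-2, 2, 12), rng.uniform(-2, 2, 12), rng.uniform(3, 8, 12)])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([1.0, 0.0, 0.2])
X2 = X1 @ R.T + t                            # same points in the second camera frame

p1 = X1 / X1[:, 2:3]
p2 = X2 / X2[:, 2:3]
E = eight_point(p1, p2)
residuals = np.einsum('ni,ij,nj->n', p2, E, p1)   # p2^T E p1 for every pair
E_true = skew(t) @ R
```

With real, noisy matches the residuals are not exactly zero, and the points are usually normalized before building A to keep the system well conditioned.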
How is relative motion computed from the essential matrix?
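With an SVD, $E = U \Sigma V^T$: the rotation is $R = U W V^T$ or $R = U W^T V^T$, where $W$ is a 90° rotation about the z-axis, and the translation is $t = \pm u_3$ (the last column of $U$, known only up to scale). Of the four candidate solutions, the one that puts the triangulated points in front of both cameras is chosen.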
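A sketch of the SVD-based decomposition, assuming the convention $E = [t]_\times R$. It returns the four candidate (R, t) pairs; selecting the right one by checking that triangulated points lie in front of both cameras is left out. The test motion and function names are made up.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x."""
    return np.array([[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]])

def decompose_essential(E):
    """Return the four candidate (R, t) pairs; t is a unit vector (scale is unknown)."""
    U, _, Vt = np.linalg.svd(E)
    # E is only defined up to sign, so make both factors proper rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    R1 = U @ W @ Vt
    R2 = U @ W.T @ Vt
    t = U[:, 2]                          # left null vector of E
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]

# Made-up ground-truth motion for checking the decomposition.
a = 0.1
R_true = np.array([[np.cos(a), 0.0, np.sin(a)], [0.0, 1.0, 0.0], [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([1.0, 0.0, 0.2])
candidates = decompose_essential(skew(t_true) @ R_true)
t_unit = t_true / np.linalg.norm(t_true)
```

The true motion is one of the four candidates, with the translation recovered only as a direction.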
When should robust estimation (e.g. RANSAC) be used in VO?
To remove outliers from the correspondences before they are used in BA.
Causes for outliers can be image noise, occlusion, blur and changes in viewpoint/illumination that the mathematical model of feature descriptors does not account for.
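The RANSAC idea can be sketched on a stand-in model. In VO the model would be, for example, an essential matrix estimated from a minimal set of matches; here it is a 2D line fitted to two points so the example stays short. The threshold, iteration count, seed and data are assumptions.

```python
import numpy as np

def ransac_line(pts, n_iters=200, thresh=0.1, seed=0):
    """Return a boolean inlier mask for the best line found by RANSAC."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(pts), size=2, replace=False)   # minimal sample
        d = pts[j] - pts[i]
        norm = np.hypot(d[0], d[1])
        if norm == 0:
            continue
        n = np.array([-d[1], d[0]]) / norm                   # unit normal of the line
        dist = np.abs((pts - pts[i]) @ n)                    # point-to-line distances
        inliers = dist < thresh
        if inliers.sum() > best.sum():                       # keep the largest consensus set
            best = inliers
    return best

# 20 points on y = 2x + 1 followed by 5 gross outliers (made-up data).
xs = 0.5 * np.arange(20)
line_pts = np.column_stack([xs, 2 * xs + 1])
outliers = np.array([[1.0, 20.0], [3.0, -10.0], [5.0, 30.0], [7.0, -5.0], [9.0, 40.0]])
pts = np.vstack([line_pts, outliers])

mask = ransac_line(pts)
```

Only the points flagged in `mask` would then be passed on to the final estimation step.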
What is loop detection and closure?
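Loop detection: recognizing, after some time, features that were seen before, i.e. noticing that we have returned to a known place. Loop closure: adding this constraint to the map and correcting the accumulated drift so that the loop is actually closed in the map.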
Explain the Graph-SLAM approach.
The graph represents the problem: every node is a pose of the robot during mapping (the states). The edges correspond to spatial constraints between the poses (relative transforms, but very uncertain). An edge exists between two nodes if the robot either moves from one pose to the other (an odometry measurement) or observes the same part of the environment from both poses.
Even though we see the same thing, the camera is not in exactly the same position, so we need to find that last transform to be able to close the loop:
$X_i^{-1}X_j$, where $X_i$ is the transformation from the origin to $x_i$ and $X_i^{-1}$ is its inverse.
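The relative transform can be tried out with 2D poses written as homogeneous 3x3 matrices. This is a minimal sketch; the helper `se2` and the two example poses are assumptions.

```python
import numpy as np

def se2(x, y, theta):
    """Homogeneous transform from the origin to the pose (x, y, theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

X_i = se2(1.0, 0.0, 0.0)            # pose i
X_j = se2(1.0, 2.0, np.pi / 2)      # pose j

# Pose j expressed in the frame of pose i: the quantity an edge constrains.
rel = np.linalg.inv(X_i) @ X_j

# Applying the relative transform to pose i must give pose j again.
X_j_check = X_i @ rel
```

Seen from pose i, pose j lies 2 units along the y-axis and is rotated by 90°.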
Appearance based SLAM vs feature based SLAM
Appearance based:
- uses intensity information of all pixels
- computationally heavy, less accurate
- Global
Feature based:
- uses only salient and repeatable features across images
- fast, accurate, requires the ability to match across frames
- local
What is the purpose of front-end and back-end in SLAM?
To make the system usable in real time: VO runs in the front end, BA in the back end.