w3 w gemini Flashcards
What is a coordinate transformation (in the context of 3D space)?
A change of basis between different reference frames.
What is another term for a coordinate transformation?
A change of basis.
Define a reference frame (or coordinate system) in 3D space.
An arbitrary set of orthogonal axes (X, Y, Z) used to measure the position and orientation of points or items.
What is Visual SLAM (Simultaneous Localisation And Mapping)?
A process in which robots and drones use camera images to localise themselves in the real world while simultaneously building a map of their surroundings.
In Visual SLAM, what coordinate system does the map typically start with?
The robot base’s coordinate system.
In Visual SLAM, what transformation does the robot perform for each new image?
Transforms detected scene points from the camera coordinate system to the robot’s, and then to the world coordinate system.
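The camera-to-robot-to-world chain above can be sketched with 4x4 homogeneous matrices. This is a minimal illustration, not a SLAM implementation: the helper names (`make_transform`, `rot_z`) and all pose values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def make_transform(R, t):
    """Pack a 3x3 rotation R and a 3-vector translation t into a 4x4 homogeneous matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(theta):
    """Rotation by theta radians about the Z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical poses (illustrative values only).
# Camera mounted 0.1 m forward and 0.5 m up on the robot, with the same orientation.
T_robot_from_camera = make_transform(np.eye(3), [0.1, 0.0, 0.5])
# Robot at (2, 1, 0) in the world, turned 90 degrees about Z.
T_world_from_robot = make_transform(rot_z(np.pi / 2), [2.0, 1.0, 0.0])

# A scene point detected 1 m along the camera X axis (homogeneous coordinates).
p_camera = np.array([1.0, 0.0, 0.0, 1.0])

p_robot = T_robot_from_camera @ p_camera   # camera frame -> robot frame
p_world = T_world_from_robot @ p_robot     # robot frame -> world frame

print(p_world[:3])
```

Multiplying the two matrices once (`T_world_from_robot @ T_robot_from_camera`) gives a single camera-to-world transform that can be reused for every point in the image.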
In multi-view pose estimation, what do we estimate from each camera?
The complete 2D body pose (e.g., skeleton with 2D joints) for each person.
In multi-view pose estimation, how do we obtain an estimate of the 3D body pose?
By converting the 2D estimate into an accurate 3D pose with respect to the camera’s reference frame.
In multi-view pose estimation, what can be chosen as the world reference frame?
The reference frame of one of the cameras, or any arbitrary origin.
What is a convolution mask (or kernel)?
A small matrix used in image processing to perform operations like blurring, sharpening, or edge detection.
What is the purpose of padding an image with zeros before convolution?
To produce an output image of the same size as the input image.
What is a rotation matrix in the context of coordinate transformations?
A 3x3 matrix that describes the rotation needed to align the orientation of one frame to another.
What is a translation vector in the context of coordinate transformations?
A 3x1 vector that describes the translation needed to align the origin of one frame to another.
Explain the concept of ‘extrinsic’ and ‘intrinsic’ rotations (briefly).
Rotations around world axes (extrinsic) vs. rotations around the object’s local axes (intrinsic).
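One way to see the difference is in how the matrices compose: an extrinsic rotation multiplies on the left, an intrinsic one on the right. The sketch below checks this numerically with arbitrary angles. The helper `axis_angle` (Rodrigues' formula) is introduced here only for the demonstration.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def axis_angle(axis, angle):
    """Rodrigues' formula: rotation by `angle` about `axis`, given in world coordinates."""
    x, y, z = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

a, b = 0.3, 1.1  # arbitrary angles in radians

# Extrinsic: rotate about world X by a, then about world Z by b.
# Each new rotation is applied on the left (pre-multiplication).
R_extrinsic = rot_z(b) @ rot_x(a)

# Intrinsic: rotate about Z by b, then about the object's own (already rotated) X axis by a.
R_first = rot_z(b)
local_x_in_world = R_first[:, 0]            # where the object's X axis now points
R_intrinsic = axis_angle(local_x_in_world, a) @ R_first

# Rotating about a local axis is the same as post-multiplying by the elementary rotation.
print(np.allclose(R_intrinsic, R_first @ rot_x(a)))
# Extrinsic X-then-Z gives the same matrix as intrinsic Z-then-X'.
print(np.allclose(R_intrinsic, R_extrinsic))
```

The second check reflects the general rule that an extrinsic sequence equals the intrinsic sequence with the same angles applied in reverse order.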
What is aliasing in the context of image down-sampling?
Distortion or misrepresentation of the image caused by sampling too sparsely to capture its fine (high-frequency) detail. It is the main problem that arises when down-sampling.
How is aliasing avoided when down-sampling images for an image pyramid?
By smoothing the image (convolving with a Gaussian mask) prior to down-sampling.
What is a Gaussian image pyramid?
A multiscale representation of an image at different resolutions obtained by iteratively convolving with a Gaussian filter and down-sampling.
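The smooth-then-down-sample loop can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions: it uses the 5-tap binomial kernel [1, 4, 6, 4, 1]/16 as a stand-in for a Gaussian mask, zero padding at the borders, and a down-sampling factor of 2. The function names are hypothetical.

```python
import numpy as np

def smooth(image, kernel_1d):
    """Separable smoothing: convolve every row, then every column (zero padded, same size)."""
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel_1d, mode="same"), 1, image)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel_1d, mode="same"), 0, rows)

def gaussian_pyramid(image, levels):
    """Return a list of images, each half the side length of the previous one."""
    # Binomial approximation to a Gaussian (an assumption of this sketch).
    kernel_1d = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        blurred = smooth(pyramid[-1], kernel_1d)   # smooth first to avoid aliasing
        pyramid.append(blurred[::2, ::2])          # then keep every second row and column
    return pyramid

image = np.random.default_rng(0).random((64, 64))  # synthetic test image
pyramid = gaussian_pyramid(image, levels=4)
shapes = [level.shape for level in pyramid]
print(shapes)
```

Skipping the `smooth` call and slicing with `[::2, ::2]` directly would still produce smaller images, but fine detail would alias into the coarser levels.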
What is a Laplacian image pyramid?
A multiscale representation of an image highlighting intensity discontinuities, obtained at each level by subtracting a Gaussian-smoothed copy of the image from the image at that level, then down-sampling to form the next level.
Why is it computationally cheaper to keep the mask size fixed and vary the image size in multiscale feature analysis?
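Because the number of multiplications is roughly (number of image pixels) × (number of mask elements), so shrinking the image with a fixed small mask costs far less than growing the mask on a full-size image.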
Give an example of the computational advantage of varying image size over varying mask size in multiscale analysis.
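Assuming a same-size output, convolving a 100x100 image with a 6x6 mask takes 100 × 100 × 36 = 360,000 multiplications, whereas convolving a 50x50 image with a 3x3 mask takes 50 × 50 × 9 = 22,500, which is 16 times fewer.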
What is the formula for a 2D Gaussian?
G(x,y) = (1 / (2πσ²)) * exp(-(x² + y²) / (2σ²))
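As a rough illustration, the formula can be sampled on an integer grid to build a convolution mask. The function name `gaussian_kernel`, the odd mask size, and the final renormalisation (so the truncated mask sums to exactly 1) are assumptions of this sketch.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Sample G(x, y) on a size x size grid centred at (0, 0); size is assumed odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    # Truncating the Gaussian loses a little mass, so renormalise to keep image brightness.
    return g / g.sum()

kernel = gaussian_kernel(5, 1.0)
print(kernel.round(4))
```

A common rule of thumb is to choose the mask size so that it spans about three standard deviations on each side of the centre.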
What are the four main categories of image features that can produce intensity-level discontinuities?
Depth discontinuities, Orientation discontinuities, Reflectance discontinuities, Illumination discontinuities.
What is the mathematical operation for convolution?
I’(i, j) = Σ_{k,l} I(i+k, j+l) H(−k, −l)
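The sum above can be written directly as nested loops. The sketch below is deliberately unoptimised. It assumes an odd-sized mask whose centre is H(0, 0) and uses zero padding so the output has the same size as the input. The function name `convolve2d_same` is hypothetical.

```python
import numpy as np

def convolve2d_same(image, mask):
    """Direct form of I'(i, j) = sum over k, l of I(i+k, j+l) * H(-k, -l), zero padded."""
    mh, mw = mask.shape                      # assumed odd so the mask has a centre
    ph, pw = mh // 2, mw // 2
    # Zero padding keeps the output the same size as the input.
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))
    # H(-k, -l): flipping the mask in both directions is what makes this a convolution
    # rather than a correlation.
    flipped = mask[::-1, ::-1]
    out = np.zeros(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + mh, j:j + mw]
            out[i, j] = np.sum(window * flipped)
    return out

# Convolving a single bright pixel with a mask reproduces the mask around that pixel.
impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
mask = np.arange(9, dtype=float).reshape(3, 3)
result = convolve2d_same(impulse, mask)
print(result[1:4, 1:4])
```

The impulse test is a quick way to tell convolution from correlation: without the flip, the mask would come out rotated by 180 degrees.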
What is the purpose of convolving an image with a Laplacian mask?
To detect edges and regions of rapid intensity change.
What is a Laplacian of Gaussian (LoG) mask?
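A convolution mask created by convolving a Laplacian mask with a Gaussian mask (equivalently, sampling the Laplacian of a Gaussian), so that smoothing and second-derivative edge detection are applied in a single pass.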
Why is combining a Laplacian mask with a Gaussian mask advantageous for edge detection?
The Gaussian smooths noise, making the Laplacian less sensitive to it while still detecting edges.
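A LoG mask can also be built by sampling the analytic Laplacian of the 2D Gaussian, ∇²G(x, y) = ((x² + y² − 2σ²) / σ⁴) · G(x, y). The sketch below does this under a few assumptions: an odd mask size, a hypothetical function name `log_kernel`, and a mean subtraction step so that the truncated mask sums to zero and gives no response in flat regions.

```python
import numpy as np

def log_kernel(size, sigma):
    """Sample the Laplacian of a Gaussian on a size x size grid; size is assumed odd."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x**2 + y**2
    gaussian = np.exp(-r2 / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    log = (r2 - 2.0 * sigma**2) / sigma**4 * gaussian   # analytic Laplacian of G
    # Truncation leaves a small non-zero sum; remove it so constant regions map to zero.
    return log - log.mean()

kernel = log_kernel(9, 1.4)
print(kernel.round(4))
```

With this sign convention the centre of the mask is negative and the surrounding ring is positive. Edges show up as zero crossings in the filtered image.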