Week 5 and 6 - Local Features Flashcards
What is the motivation for using local features
Global representations have limitations
Local features only describe and match local regions
What are local invariant descriptors
another term for local features
What is Image-based object recognition
objects are recognized based on their appearance in images rather than explicit geometric models
What is Model-based object recognition
objects are recognized by comparing them to predefined models or templates
What is a feature
- Local, meaningful, detectable parts of the image
- Location of a sudden change
- ‘salient patches’
What is a salient patch
region or area within an image that stands out due to its significance
Why do we use features
- Information Content High
- Invariant to change of view point, illumination
- Reduces computational burden
What is Visual SLAM
An application of features
“Simultaneous Localisation and Mapping”
Estimating the local geometry and fusing it into a 3D model
Used in augmented reality and autonomous vehicles
E.g. holding up a phone camera and having it map the geometry of the scene on screen
What is Image matching
An application of features
E.g. reverse Google image search to identify a monument
uses local image features
How is NASA using features
Local features are currently being used on the NASA Mars Rover - stitching images together to form a panorama of the Mars landscape
What are feature points used for
- Image alignment (homography, fundamental matrix)
- 3D reconstruction
- Motion tracking
- Indexing and database retrieval
- Robot navigation
- … other
What is Image stitching
Procedure:
- Detect feature points in both images
- Find corresponding pairs
- Use these pairs to align the images
General approach to feature-based image matching
- Find a set of distinctive keypoints
- Define a region around each keypoint
- Extract and normalise the region content
- Compute a local descriptor from the normalised region
- Has to be unique
- Match local descriptors
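The final matching step can be sketched as nearest-neighbour search with Lowe's ratio test (a common choice, not named in the cards). This is an illustrative NumPy sketch; `match_descriptors` and the toy descriptor values are my own:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbour in desc2,
    keeping only matches that pass the distance-ratio test."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)  # distance to every candidate
        order = np.argsort(dists)
        best, second = dists[order[0]], dists[order[1]]
        if best < ratio * second:                  # distinctive enough to trust
            matches.append((i, order[0]))
    return matches

# Toy descriptors: each row is a feature vector
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.9, 0.1], [0.1, 0.9], [5.0, 5.0]])
print(match_descriptors(a, b))   # each row of a matches its close row in b
```

The ratio test rejects ambiguous matches: a descriptor whose best and second-best candidates are nearly equidistant is not distinctive, echoing the "has to be unique" requirement above.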
What does it mean to normalise the region content
Make the descriptor invariant to certain transformations (e.g., scale, rotation, illumination)
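A minimal sketch of illumination normalisation, assuming NumPy; `normalise_patch` is a made-up helper showing the standard zero-mean, unit-variance trick:

```python
import numpy as np

def normalise_patch(patch):
    """Normalise a patch to zero mean and unit variance, removing
    additive brightness and multiplicative contrast changes."""
    p = patch.astype(float)
    return (p - p.mean()) / (p.std() + 1e-8)

patch = np.array([[10., 20.], [30., 40.]])
brighter = 2.0 * patch + 50.0    # same patch under an affine lighting change
print(np.allclose(normalise_patch(patch), normalise_patch(brighter)))  # True
```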
What are the 2 main problems in feature matching
1) Detect the same point independently in both images
2) For each point, correctly recognise the corresponding one
What is a repeatable detector
A detector that reliably finds the same interest points in different images of the same scene, despite changes in viewpoint or imaging conditions
What are the 4 requirements of region extraction
- Repeatable
- Invariant to translation, rotation and scale changes
- Robust or covariant to out-of-plane transformations
- Robust to lighting variations, noise, blur and quantisation
What are the 5 requirements of local features
-requirements of region extraction fulfilled
- Locality
- Quantity
- Distinctiveness
- Efficiency
What are the 3 main types of detector:
Harris
Laplacian
DoG (difference of gaussians)
What is the important first step in feature detection
Finding candidate locations
Why are edges not ideal candidate locations
Edges only localise a point in one direction - along the edge, the position is ambiguous
What are corners
Repeatable points, good candidate locations
Around a corner, image gradient has two or more dominant directions
How does the Harris Corner Detector work
Iterates through an image with a window
Calculate the change in intensity for the shift [u,v]
Subtract the original intensity from the shifted (new) intensity
Approximate this shift using a matrix M
Calculate the “corner response”
What are the two types of window function for harris detector
Option 1) Basic Harris Corner detector
binary: 1 in window, 0 outside
compute M and its eigenvalues (via SVD)
Option 2) Smooth with Gaussian Kernel
Bell curve over window
Use smoothed derivatives in M
What is the matrix M in harris corner detection
M = Σ w(x,y) [Ix² IxIy; IxIy Iy²]  (a 2×2 matrix, summed over the window)
Where IxIy is gradient w.r.t x times gradient w.r.t y
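A sketch of building M over a binary window with NumPy and central-difference gradients; `structure_tensor` and the synthetic test image are my own illustration:

```python
import numpy as np

def structure_tensor(img, x, y, half=1):
    """Compute the Harris matrix M over a (2*half+1)^2 binary window
    centred at (y, x), using central-difference gradients."""
    Iy, Ix = np.gradient(img.astype(float))
    win = np.s_[y - half:y + half + 1, x - half:x + half + 1]
    Ix2 = (Ix[win] ** 2).sum()
    Iy2 = (Iy[win] ** 2).sum()
    Ixy = (Ix[win] * Iy[win]).sum()
    return np.array([[Ix2, Ixy], [Ixy, Iy2]])

# A synthetic corner: a bright quadrant in the lower-right
img = np.zeros((7, 7))
img[3:, 3:] = 1.0
M = structure_tensor(img, x=3, y=3)
print(np.linalg.eigvalsh(M))   # both eigenvalues positive at a corner
```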
What is the equation for approximating the shift in harris corner detection
E(u,v) ≈ [u, v] M [u, v]ᵀ
Second [u,v] is a column vector
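Evaluating the quadratic form for a toy M (the numbers are invented for illustration):

```python
import numpy as np

M = np.array([[10.0, 2.0],
              [2.0, 8.0]])    # an example structure tensor
u = np.array([1.0, 0.5])      # a candidate shift [u, v]
E = u @ M @ u                 # E(u,v) ≈ [u, v] M [u, v]^T
print(E)                      # 14.0
```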
What do we assume about the datapoints in harris detection
They are coming from a Multivariate Gaussian model
So we can use the standard estimators for the mean (μ̂) and covariance (Σ̂)
Why do we want to find the covariance matrix in harris corner detection
The covariance matrix describes how the intensity changes are distributed across these different directions
This matrix gives us Eigenvectors (directions that remain unchanged except for a scaling factor by the corresponding eigenvalue)
We essentially rotate the data so its axes align with the long and short axes of the ellipse (the eigenvectors) in order to read off the variances
Working with the full (non-diagonal) covariance matrix directly is not ideal, so we diagonalise it
Why are eigenvectors important for changes in intensity calculations
eigenvectors = principal axes along which the intensity changes have the most significant variance
What is Singular value decomposition
SVD
A method of decomposing any matrix into 3 simpler matrices
What 3 matrices are used in SVD in harris corner detection and what do they represent
A = U · D · Uᵀ  (for the symmetric matrix M, the SVD coincides with the eigendecomposition)
u1, u2 (columns of U) capture the axes of the ellipse - Eigenvectors
d1, d2 (diagonal entries of D) determine the scale and variance of the data - Eigenvalues
What is the purpose of SVD in Harris detection
SVD = eigenvalue decomposition of matrix M
Tells us the direction and scales of the variation
gives us a new coordinate system u1,u2
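A quick NumPy demonstration of this decomposition, for an example tensor of my own choosing:

```python
import numpy as np

# An example structure tensor M (symmetric, positive semi-definite).
M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For such matrices the SVD coincides with the eigendecomposition M = U D U^T.
U, d, Vt = np.linalg.svd(M)
print(d)                                      # singular values = eigenvalues
print(np.allclose(U @ np.diag(d) @ U.T, M))   # M reconstructed from U D U^T
```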
How do we use eigenvalues λ1,λ2 (d1,d2) to find a corner
- If both eigenvalues are large, it suggests the presence of a corner (E increases in all directions)
- If one eigenvalue is large and the other is small, it indicates an edge
- If both eigenvalues are small, it represents a flat region
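The three cases above can be written as a toy classifier (the threshold is illustrative, not from the cards):

```python
def classify(l1, l2, thresh=1.0):
    """Classify a point from the eigenvalues of M (illustrative threshold)."""
    big1, big2 = l1 > thresh, l2 > thresh
    if big1 and big2:
        return "corner"   # E increases in all directions
    if big1 or big2:
        return "edge"     # E increases in only one direction
    return "flat"

print(classify(10.0, 8.0))   # corner
print(classify(10.0, 0.1))   # edge
print(classify(0.1, 0.05))   # flat
```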
What is the equation for R to measure the corner response
R = det(M) - k · (trace(M))²
R = λ1λ2 - α(λ1 + λ2)²
What is detM
λ1λ2
What is traceM
λ1 + λ2
In the equation for R, what are the best values for constant k
k (also written α) is a constant typically set between 0.04 and 0.06
How do we interpret the value of R
R is large → corner
R is negative with large magnitude → edge
|R| is small → flat region
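This interpretation can be checked numerically with NumPy; the three example tensors are my own toy values:

```python
import numpy as np

def corner_response(M, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2."""
    return np.linalg.det(M) - k * np.trace(M) ** 2

corner = np.array([[10.0, 0.0], [0.0, 8.0]])   # two large eigenvalues
edge   = np.array([[10.0, 0.0], [0.0, 0.1]])   # one large, one small
flat   = np.array([[0.1, 0.0], [0.0, 0.1]])    # two small eigenvalues
print(corner_response(corner))  # large positive -> corner
print(corner_response(edge))    # negative -> edge
print(corner_response(flat))    # near zero -> flat
```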
What is the problem with using the basic Harris detector (no Gaussian)
Not rotation invariant
uniform window (1 in window, 0 outside)
How can we use Gaussian smoothing instead over window (option 2)
M = g(σ) * [Ix² IxIy, IxIy Iy²]
instead of Σw(x,y)
The result is rotation invariant
How does Gaussian Harris detection (fast approximation) work (option 2)
1) Compute image derivatives Ix, Iy (blur first with σD)
2) Compute the products Ix², IxIy, Iy²
3) Gaussian filter the products: g(Ix²), g(IxIy), …
4) M(σI, σD) = g(σI) * [Ix²(σD) IxIy(σD); IxIy(σD) Iy²(σD)]
Then use M(σI, σD) in R as before:
R = det(M(σI, σD)) - k · (trace(M(σI, σD)))²
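The whole fast-approximation pipeline can be sketched in NumPy; `blur`, `harris_response`, the scales, and the synthetic image are illustrative choices of mine, not a reference implementation:

```python
import numpy as np

def gaussian_kernel1d(sigma):
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def blur(img, sigma):
    """Separable Gaussian smoothing: row-wise then column-wise convolution."""
    k = gaussian_kernel1d(sigma)
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)

def harris_response(img, sigma_d=1.0, sigma_i=2.0, k=0.04):
    """Compute R everywhere: derivatives at scale sigma_d, products
    averaged with a Gaussian at integration scale sigma_i."""
    smooth = blur(img.astype(float), sigma_d)   # 1) blur first
    Iy, Ix = np.gradient(smooth)                #    then differentiate
    Ix2 = blur(Ix * Ix, sigma_i)                # 2)+3) products, then g(sigma_i)
    Iy2 = blur(Iy * Iy, sigma_i)
    Ixy = blur(Ix * Iy, sigma_i)
    return (Ix2 * Iy2 - Ixy**2) - k * (Ix2 + Iy2) ** 2   # det(M) - k*trace(M)^2

img = np.zeros((25, 25))
img[12:, 12:] = 1.0              # one bright quadrant: a corner at (12, 12)
R = harris_response(img)
print(R[12, 12] > 0)             # corner: positive response
print(R[12, 18] < 0)             # point on the quadrant's top edge: negative
```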
What are the properties of Harris operator
Rotation invariant
Not scale invariant → scaling up will cause corners to be classed as edges
The Harris operator does not tell us how much of the surrounding region is involved
precise localisation
high repeatability
What is locality of features
Features are local, therefore robust to occlusion and clutter
What is quantity of features
We need a sufficient number of regions to cover the object
What is distinctiveness of features
The region should contain ‘interesting’ structure
What is efficiency of features
close to real-time performance
What are local features more robust to than global features
-Occlusions
-Intra-category variations
σI
scale of the Gaussian window used to average (integrate) the gradient products around each point - the integration scale
σD
scale of the Gaussian smoothing applied to the image before computing the derivatives - the differentiation scale