Week 6 and 7 - Local Features Flashcards
What is Exhaustive search in corner detection
Naive approach
Can change the scale (vary patch size) and compare surrounding features
What is wrong with exhaustive search
Computationally inefficient
Unfeasible for large collections of data
Ineffective for recognition tasks
How can we use Automatic scale selection
We need scale invariant feature detection
We want a function that is scale invariant when applied to a region
We consider the function to be any that involves all pixels in the region (eg intensity, texture)
We take the local maximum of this function
We scale the image (squish and expand along x axes)
x = region size
Again find the local maximum (will correspond to the same feature - not same coordinates)
What is “blob detection”
The Laplacian of Gaussian
What is The Laplacian of Gaussian for feature detection
it is scale invariant
Essentially the 2nd derivative of either x or y (partial second derivative)
How do you carry out LoG on an image
Convolve the image with a Gaussian kernel followed by the Laplacian kernel, capturing regions of rapid intensity change
combine the two to make the Laplacian of gaussian kernel (one operation)
What are the characteristics of LoG
Found using a single mask
Orientation information is lost
very noise sensitive (taking derivatives increases noise)
What is the characteristic scale for LoG
The scale that produces the extreme value (peak) of the LoG response
Largest blob
The size of the blob is proportional to the characteristic scale
How are blobs useful for local interest points
They give us the area to use
How do we detect interest points at different scales
Image is convolved with the LoG filter at various levels of blur, determined by the standard deviation (σ) of the Gaussian kernel
Allows for feature detection at different scales
What is the Harris-Laplace method
Multiscale harris corner detection: Identify points that are likely to be corners at different scales
Scale Selection based on Laplacian: Compute LoG at different scales
We do both because harris detection only finds corners and blob detection will find other interest points (combine to variety of interest points)
How can we approximate LoG with DoG for efficient implementation
DoG = G(x, y, kσ)- G(x,y,σ)
(difference of gaussians - from two different levels of blur)
This is more efficient as we do not need to compute 2nd derivative
Gaussians are already automatically being computed in the gaussian pyramid
If we choose the correct values for sigma, we will get a very efficient computation
How does a Gaussian pyramid work
The more Gaussian layers you use, the sigma adds up
large sigma -> more blurring
Increasing scale, lowers the resolution
So we subsample (do not need as many pixels to present a lower resolution image)
How does interest point localisation with DoG work
construct DoG pyramid
detect local maxima in scale space
Thresholding, reject points with low contrast
edge elimination (ratio of principle curvatures)
Note: at this point we are still at candidate points and candidate patches -> have not actually begun to use interest regions
What are the two strategies for scale-invariant detection
LoG
DoG as fast approximation
These can be used on their own or in combinations with single scale key-point detectors (eg harris corner detector)
General Summary: Scale Invariant Detection
● Given: Two images of the same scene with a large scalemdifference between them.
● Goal: Find the same interest points independently in each image
● Solution: Search for maxima of suitable functions in scale and in space (over the image)
What are the two main components of achieving feature invariance
1) Detector is invariant to translation, rotation and scale
Can find interest point and remove the effects of different scales
2) Design an invariant feature descriptor
This captures the information in a region around a detected interest point
What is the simplest local feature descriptor
A square window of pixels
We write a region A as vector a and Region b as vector b
Each vector is a list of intensities within a patch (or some characteristic)
Calculate vector similarities
What is the simplest local feature descriptor invariant to
Translation
but intensity, 3D viewpoint change and rotation will cause big differences
How do detectors remove the effect of different scales for feature matching
Ensure they are scale invariant
use σ which is proportional to scale
normalise scales
How do we use gradients as a feature descriptor
Take the intensity gradient of each pixel in the patch
Take the orientation (ignore magnitude as that is affected by illumination)
And build a histogram (after normalising using dominant direction)
Why is it important to have rotation invariant descriptors
Want to find the dominant direction of gradient for the image patch
Then we rotate the patch according to this angle
This puts the patches into a canonical orientation (normalising)
What is SIFT
Scale Invariant Feature Transform
A descriptor computation
How does SIFT work
1) Localisation, uses DoG to find candidate interest points (takes note of scale)
2) Normalises to predefined size (16x16)
3) check if it is an edge and reject
4) Then finds the gradient orientation over a 16x16 pixel region around interest point
5) Computes histogram of image gradient orientations for all pixels within the patch