Feature points Flashcards
Name three methods for Image Matching.
1) By pixels
2) By edges
3) By feature (interest) points
How can we detect corners by looking at the neighborhood of a pixel in an image?
If we have a corner and move the neighborhood in any direction, the pixel values in the neighborhood should change a lot (high gradient in all directions).
How can we approximate E(u,v), the error when shifting the neighborhood by (u,v) pixels?
E(u,v) ≈ [u, v] M [u, v]^T, where M is the matrix (summed over the neighborhood):
M = [Ix^2   Ix*Iy
     Ix*Iy  Iy^2]
and Ix, Iy are the image derivatives in x and y.
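The quadratic form comes from a first-order Taylor expansion of the shifted image. A short derivation sketch, assuming a window function w(x,y) that selects the neighborhood (the window is not named on the card) and writing the mixed term as I_x I_y:

```latex
\begin{aligned}
E(u,v) &= \sum_{x,y} w(x,y)\,\bigl[I(x+u,\,y+v) - I(x,y)\bigr]^2 \\
       &\approx \sum_{x,y} w(x,y)\,\bigl[I_x u + I_y v\bigr]^2 \\
       &= \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix},
\qquad
M = \sum_{x,y} w(x,y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}
\end{aligned}
```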
If l1, l2 are the eigenvalues of the M matrix, what kind of image structures are we looking at when:
1) l1 >> l2
2) l1 and l2 are small
3) l1 and l2 are large
1) An edge
2) A flat region
3) A corner
Given the M matrix, what metrics can be used to determine if we have a corner?
1) R = min(lambda_1, lambda_2)
2) R = lambda_1*lambda_2 / (lambda_1 + lambda_2 + epsilon)
3) R = lambda_1*lambda_2 - k*(lambda_1 + lambda_2)^2
4) R= det(M) - k*trace(M)^2
where lambda_1 and lambda_2 are the eigenvalues
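The four metrics can be compared on a single 2x2 matrix. The sketch below is illustrative only: the function names, the default k = 0.04 and the value of epsilon are assumptions, and M is passed as its three distinct entries [[a, b], [b, c]].

```python
import math

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]], largest first."""
    mean = (a + c) / 2.0
    radius = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + radius, mean - radius

def corner_scores(a, b, c, k=0.04, eps=1e-12):
    """Evaluate the four corner metrics for M = [[a, b], [b, c]]."""
    l1, l2 = eigenvalues_2x2(a, b, c)
    det, trace = a * c - b * b, a + c
    return {
        "min_eig": min(l1, l2),
        "harmonic": l1 * l2 / (l1 + l2 + eps),
        "harris_eig": l1 * l2 - k * (l1 + l2) ** 2,
        # Same value as harris_eig, but needs no eigenvalue computation.
        "harris_det_trace": det - k * trace ** 2,
    }

corner = corner_scores(2.0, 0.0, 2.0)   # both eigenvalues large
edge = corner_scores(10.0, 0.0, 0.0)    # one eigenvalue dominates
```

Metrics 3 and 4 agree because det(M) = lambda_1*lambda_2 and trace(M) = lambda_1 + lambda_2.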
Describe the Harris Corner Detector
1) Compute Ix and Iy
2) Create M
3) Calculate the eigenvalues lambda_1, lambda_2
4) Calculate the response R = lambda_1*lambda_2 - k*(lambda_1 + lambda_2)^2 (equivalently det(M) - k*trace(M)^2, which avoids computing the eigenvalues)
5) Threshold and apply non-maximum suppression with respect to R
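The steps above can be sketched with NumPy. This is a minimal illustration, not a reference implementation. It assumes a plain box window of radius 2, k = 0.04, a threshold relative to the maximum response, and 3x3 non-maximum suppression. It uses the det/trace form of R, so the eigenvalues are never computed explicitly.

```python
import numpy as np

def window_reduce(a, radius, op):
    """Combine each pixel's (2r+1)x(2r+1) neighborhood with op (np.add or np.maximum)."""
    h, w = a.shape
    fill = 0.0 if op is np.add else -np.inf
    padded = np.pad(a, radius, constant_values=fill)
    out = np.full((h, w), fill)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out = op(out, padded[dy:dy + h, dx:dx + w])
    return out

def harris_corners(img, k=0.04, radius=2, rel_thresh=0.5):
    img = img.astype(float)
    # 1) Derivatives (np.gradient returns the axis-0 derivative first).
    Iy, Ix = np.gradient(img)
    # 2) Entries of M, summed over the window around each pixel.
    Sxx = window_reduce(Ix * Ix, radius, np.add)
    Syy = window_reduce(Iy * Iy, radius, np.add)
    Sxy = window_reduce(Ix * Iy, radius, np.add)
    # 3-4) Response R = det(M) - k * trace(M)^2.
    R = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
    # 5) Threshold, then keep only 3x3 local maxima.
    is_local_max = R == window_reduce(R, 1, np.maximum)
    ys, xs = np.nonzero(is_local_max & (R > rel_thresh * R.max()))
    return R, list(zip(ys.tolist(), xs.tolist()))

# Synthetic test image: a bright square on a dark background.
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0
R, corners = harris_corners(img)
```

On this synthetic square the response is positive near the four corners, negative along the edges and zero in flat regions.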
What other structures than corners can the Harris detector detect?
Textures and blobs
Is the Harris detector invariant to rotation?
Yes
How can we make the Harris detector invariant to scale?
Perform Gaussian filtering at different scales and use the highest response. (The derivatives have to be multiplied by sigma to make the responses comparable across scales.)
What is the Harris Laplace detector?
For each pixel, find the optimal scale using the LoG response and calculate the Harris response at that scale.
How are DoG and LoG filters related?
DoG(x,y,k,sigma) = I * G(k*sigma) - I * G(sigma) ≈ (k-1) * sigma^2 * LoG(x,y,sigma)
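The approximation can be motivated by a finite difference in sigma. A sketch of the argument, assuming G is the 2D Gaussian kernel and using the fact that it satisfies the diffusion equation (convolving both sides with the image I gives the relation on the card):

```latex
\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G
\quad\Longrightarrow\quad
G(x,y,k\sigma) - G(x,y,\sigma)
\approx (k\sigma - \sigma)\,\frac{\partial G}{\partial \sigma}
= (k-1)\,\sigma^2\,\nabla^2 G
```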
How does the SIFT algorithm find the optimal response?
By using DoGs with different scales and downsampling the image. The extremum indicates the optimal scale for each pixel.
Describe the difference between the Harris-Laplace detector and the SIFT detector
Harris-Laplace:
- Use the LoG at different scales to find an optimal scale
- Find local maxima by using the Harris response
SIFT:
- Use DoG at different scales to find the optimal scale
- Find local maxima using the DoG response
What is the simplest feature descriptor, and what are the disadvantages?
Use the pixel value.
It is sensitive to brightness changes, and many pixels will have the same descriptor.
What are the disadvantages of using a local patch to describe feature points?
1) Sensitive to brightness
2) Sensitive to rotation
What are the advantages/disadvantages of using a local patch of gradients to describe feature points?
Advantage: Not sensitive to (additive) brightness
Disadvantage: Still sensitive to other brightness changes and not rotation invariant.
What are the advantages/disadvantages of using a histogram of a local patch of gradients to describe feature points?
Advantage: Rotation invariant
Disadvantage: Sensitive to changes in brightness.
Describe the SIFT algorithm for keypoint detection
1) Find scale-space extrema using the DoG response
2) Fit a quadratic function to each extremum and estimate a refined keypoint location (which can lie between pixels)
3) Threshold keypoint responses
Describe how we can determine the canonical orientation of each patch in the SIFT algorithm
1) Create a histogram of 36 bins for degrees from 0-360
2) Each pixel in a neighborhood votes for an orientation, weighted by its gradient magnitude.
3) The keypoint is assigned an orientation corresponding to the largest bin.
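The voting scheme above can be sketched in plain Python. The function name and the input format (a list of (gx, gy) gradient pairs from the neighborhood) are assumptions for illustration, and returning the bin centre as the orientation is a simplification.

```python
import math

def dominant_orientation(gradients, n_bins=36):
    """Magnitude-weighted orientation histogram; returns (angle_deg, histogram)."""
    hist = [0.0] * n_bins
    bin_width = 360.0 / n_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        angle = math.degrees(math.atan2(gy, gx)) % 360.0
        # The extra modulo guards against angle rounding up to exactly 360.0.
        hist[int(angle // bin_width) % n_bins] += magnitude
    best = max(range(n_bins), key=hist.__getitem__)
    # Report the centre of the winning bin as the canonical orientation.
    return best * bin_width + bin_width / 2.0, hist

# Three strong votes near 45 degrees, one at 135, one weak vote at 315.
grads = [(1, 1), (1, 1), (1, 1), (-1, 1), (0.1, -0.1)]
angle, hist = dominant_orientation(grads)
```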
How is the SIFT descriptor created?
1) Take a small window around the keypoint
2) Weight gradients near the center more.
3) Calculate the gradient orientation and magnitude after using a Gaussian filter.
4) Calculate the canonical orientation
5) Rotate all gradient directions relative to the canonical orientation
6) Create a histogram of gradient orientations for each subregion in the local window (originally 4 x 4 = 16 subregions with 8-bin histograms).
7) These 16 histograms are concatenated into a 16 x 8 = 128-dimensional vector, which is used as the descriptor.
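A simplified sketch of steps 2 and 5 to 7, assuming the gradient magnitudes and orientations (in degrees) of a 16x16 window are already available as NumPy arrays. The function name, the Gaussian sigma of half the window width and the hard binning without interpolation are assumptions of this sketch.

```python
import numpy as np

def sift_like_descriptor(magnitude, orientation_deg, canonical_deg,
                         cells=4, n_bins=8):
    size = magnitude.shape[0]
    cell = size // cells
    # Weight gradients near the window centre more (Gaussian, sigma = size / 2).
    ys, xs = np.mgrid[0:size, 0:size] - (size - 1) / 2.0
    weight = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * (size / 2.0) ** 2))
    weighted = magnitude * weight
    # Express every gradient direction relative to the canonical orientation.
    relative = (orientation_deg - canonical_deg) % 360.0
    bins = (relative // (360.0 / n_bins)).astype(int) % n_bins
    histograms = []
    for cy in range(cells):
        for cx in range(cells):
            region = (slice(cy * cell, (cy + 1) * cell),
                      slice(cx * cell, (cx + 1) * cell))
            histograms.append(np.bincount(bins[region].ravel(),
                                          weights=weighted[region].ravel(),
                                          minlength=n_bins))
    return np.concatenate(histograms)   # 16 histograms x 8 bins = 128 values

mag = np.ones((16, 16))
desc = sift_like_descriptor(mag, np.full((16, 16), 80.0), canonical_deg=20.0)
# Rotating the patch and its canonical orientation together changes nothing.
desc_rot = sift_like_descriptor(mag, np.full((16, 16), 100.0), canonical_deg=40.0)
```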
How can we determine what keypoints match in different images?
Use nearest-neighbor search on the descriptors, or an approximate nearest-neighbor method (e.g. FLANN).
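A brute-force version of the exact nearest-neighbor search can be sketched as follows. The function name and the use of Euclidean distance are assumptions. Approximate methods such as FLANN trade exactness for speed on large descriptor sets.

```python
def match_nearest(desc_a, desc_b):
    """For each descriptor in desc_a, find the closest descriptor in desc_b.

    Returns a list of (index_a, index_b, euclidean_distance) tuples.
    """
    matches = []
    for i, a in enumerate(desc_a):
        sq_dists = [sum((x - y) ** 2 for x, y in zip(a, b)) for b in desc_b]
        j = min(range(len(desc_b)), key=sq_dists.__getitem__)
        matches.append((i, j, sq_dists[j] ** 0.5))
    return matches

# Toy 2-dimensional descriptors; real SIFT descriptors have 128 dimensions.
matches = match_nearest([[0, 0], [10, 10]], [[9, 11], [1, 0], [5, 5]])
```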
How can we use keypoints to match images?
Set up an affine equation for each keypoint match and solve the resulting system with least squares.
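A minimal sketch of the least-squares fit with NumPy, assuming the matched keypoint coordinates are given as N x 2 arrays `src` and `dst` (hypothetical names). Each match contributes two linear equations in the six affine parameters.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine map with dst ≈ src @ A.T + t, from N >= 3 matches."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates [x, y, 1], one row per keypoint match.
    X = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)   # shape (3, 2)
    A = params[:2].T   # 2x2 linear part
    t = params[2]      # translation
    return A, t

A_true = np.array([[2.0, 0.5], [-0.5, 1.5]])
t_true = np.array([3.0, -1.0])
src = np.array([[0, 0], [1, 0], [0, 1], [2, 3]], dtype=float)
dst = src @ A_true.T + t_true
A, t = fit_affine(src, dst)
```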
How can we make image matching more robust to outliers?
Use RANSAC (RANdom SAmple Consensus).
1) Randomly choose the minimal number of points needed to determine the transformation
2) Check how many points are inliers
3) If this model has the most inliers, save this model.
4) Continue until termination.
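The loop above can be sketched for an affine model, where the minimal sample is 3 matches. The function names, the fixed iteration count, the 1-pixel inlier tolerance and the seed are assumptions of this sketch.

```python
import numpy as np

def to_homogeneous(pts):
    return np.hstack([pts, np.ones((len(pts), 1))])

def fit_affine(src, dst):
    """Affine parameters as a 3x2 matrix P with dst ≈ [x, y, 1] @ P."""
    params, *_ = np.linalg.lstsq(to_homogeneous(src), dst, rcond=None)
    return params

def ransac_affine(src, dst, n_iters=200, tol=1.0, seed=0):
    rng = np.random.default_rng(seed)
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    best_params, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        # 1) Minimal sample: 3 matches determine an affine transformation.
        idx = rng.choice(len(src), size=3, replace=False)
        params = fit_affine(src[idx], dst[idx])
        # 2) Count the matches that the model maps to within tol pixels.
        residuals = np.linalg.norm(to_homogeneous(src) @ params - dst, axis=1)
        inliers = residuals < tol
        # 3) Keep the model with the most inliers seen so far.
        if inliers.sum() > best_inliers.sum():
            best_params, best_inliers = params, inliers
    # 4) Termination here is simply a fixed number of iterations.
    return best_params, best_inliers
```

A usage example would pass the matched keypoint coordinates of the two images as `src` and `dst`, then keep only the matches flagged in the returned inlier mask.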
Describe the main advantages of SIFT
1) Robust to intensity changes
2) Invariant to scale
3) Invariant to rotation
What is the main difference between SURF and SIFT?
SURF is faster since it only considers horizontal and vertical gradients, computed with Haar wavelets. It uses 4 values for each subregion (sum dx, sum dy, sum |dx|, sum |dy|), and the total descriptor has length 16*4 = 64.
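The per-subregion sums can be sketched as follows, assuming the Haar wavelet responses dx and dy of each subregion have already been computed and flattened into lists (the function names are hypothetical).

```python
def surf_subregion(dx, dy):
    """The four values for one subregion: (sum dx, sum dy, sum |dx|, sum |dy|)."""
    return (sum(dx), sum(dy), sum(abs(v) for v in dx), sum(abs(v) for v in dy))

def surf_descriptor(subregions):
    """Concatenate the 4 values of each (dx, dy) subregion into one vector."""
    descriptor = []
    for dx, dy in subregions:
        descriptor.extend(surf_subregion(dx, dy))
    return descriptor

one = surf_subregion([1, -1, 2], [0, 3, -3])
full = surf_descriptor([([1, -1, 2], [0, 3, -3])] * 16)   # 16 subregions
```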
Briefly describe the main idea behind binary descriptors
Each sampled point in a neighborhood is compared to only one other point. Descriptor element i is 0 if the first point is larger and 1 if the second point is larger. We can use the Hamming distance (XOR) for matching, which makes it extremely fast, and we only need 1 bit for each element in the descriptor.
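A small sketch of the idea in plain Python. The patch, the point pairs and the function names are made up for illustration, and the descriptor is packed into a single integer so that matching reduces to XOR plus a bit count.

```python
def binary_descriptor(patch, pairs):
    """Bit i is 1 if the second point of pair i is brighter than the first."""
    bits = 0
    for i, ((r1, c1), (r2, c2)) in enumerate(pairs):
        if patch[r2][c2] > patch[r1][c1]:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two packed descriptors."""
    return bin(a ^ b).count("1")

# Hypothetical fixed sampling pattern: four point pairs inside a 3x3 patch.
pairs = [((0, 0), (0, 1)), ((1, 0), (1, 1)), ((2, 2), (0, 2)), ((1, 2), (2, 0))]
patch_a = [[1, 5, 2], [7, 3, 9], [4, 8, 6]]
patch_b = [[9, 5, 2], [1, 3, 9], [4, 8, 6]]
brighter_a = [[v + 10 for v in row] for row in patch_a]

d_a = binary_descriptor(patch_a, pairs)
d_b = binary_descriptor(patch_b, pairs)
```

Because only the ordering of intensities matters, adding a constant brightness to the patch leaves the descriptor unchanged.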