Week 6 and 7 - Local Features Flashcards

1
Q

What is Exhaustive search in corner detection

A

The naive approach
Vary the scale (patch size) and compare the surrounding features at every candidate scale

2
Q

What is wrong with exhaustive search

A

Computationally inefficient
Unfeasible for large collections of data
Ineffective for recognition tasks

3
Q

How can we use Automatic scale selection

A

We need scale-invariant feature detection
We want a function that is scale invariant when applied to a region: it should pick out the same structure regardless of the image scale
The function can be any function of all the pixels in the region (e.g. average intensity, texture)
We plot the function against region size (x-axis = region size) and take its local maximum
If the image is rescaled, this plot is squashed or stretched along the x-axis
Finding the local maximum again picks out the corresponding region size for the same feature (not the same coordinates)
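A minimal sketch of this in Python (assuming SciPy; the function name and the choice of the scale-normalised LoG as the region function are mine): evaluate the function at one point over a range of region sizes σ and take the maximum.

    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def characteristic_scale(image, y, x, sigmas=(1.0, 1.4, 2.0, 2.8, 4.0)):
        # Scale-normalised LoG response at one point, evaluated over region sizes.
        # (Filtering the whole image per sigma is wasteful, but fine for a sketch.)
        responses = [(s ** 2) * abs(gaussian_laplace(image.astype(float), sigma=s)[y, x])
                     for s in sigmas]
        # The sigma giving the (local) maximum is the characteristic scale at (y, x).
        return sigmas[int(np.argmax(responses))]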

4
Q

What is “blob detection”

A

Detecting blob-like regions, typically done with the Laplacian of Gaussian (LoG)

5
Q

What is The Laplacian of Gaussian for feature detection

A

It is scale invariant (when normalised by σ²)
It is the sum of the second partial derivatives in x and y (the Laplacian) of the Gaussian-smoothed image, not just one of them
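For reference, the standard scale-normalised form written out (the σ² factor is what makes responses comparable across scales):

    \nabla^2_{\text{norm}} L(x, y, \sigma) = \sigma^2 \left( \frac{\partial^2 L}{\partial x^2} + \frac{\partial^2 L}{\partial y^2} \right), \qquad L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)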

6
Q

How do you carry out LoG on an image

A

Convolve the image with a Gaussian kernel and then with the Laplacian kernel, capturing regions of rapid intensity change

Because convolution is associative, the two kernels can be combined into a single Laplacian of Gaussian kernel (one operation)
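A hedged sketch (helper names are mine, assuming NumPy/SciPy) of building the combined kernel from the closed-form LoG expression and applying it in one convolution:

    import numpy as np
    from scipy.ndimage import convolve

    def log_kernel(sigma):
        # Closed-form Laplacian-of-Gaussian kernel, so only one convolution is needed.
        half = int(np.ceil(3 * sigma))                 # cover roughly +/- 3 sigma
        ax = np.arange(-half, half + 1)
        xx, yy = np.meshgrid(ax, ax)
        r2 = xx ** 2 + yy ** 2
        k = (r2 - 2 * sigma ** 2) / (2 * np.pi * sigma ** 6) * np.exp(-r2 / (2 * sigma ** 2))
        return k - k.mean()                            # zero mean: flat regions give no response

    def log_filter(image, sigma):
        return convolve(image.astype(float), log_kernel(sigma))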

7
Q

What are the characteristics of LoG

A

Found using a single mask
Orientation information is lost
Very sensitive to noise (taking derivatives amplifies noise)

8
Q

What is the characteristic scale for LoG

A

The scale that produces the extreme value (peak) of the scale-normalised LoG response at that point
The peak occurs when the filter matches the size of the blob
The size of the blob is proportional to the characteristic scale
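A standard result from scale-space theory (background fact, not from the slides): for a circular blob of radius r, the scale-normalised LoG response peaks at

    \sigma = r / \sqrt{2}, \quad \text{i.e.}\quad r = \sqrt{2}\,\sigma

which is the sense in which blob size is proportional to the characteristic scale.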

9
Q

How are blobs useful for local interest points

A

They give us the region (location and size) to use around each interest point

10
Q

How do we detect interest points at different scales

A

Image is convolved with the LoG filter at various levels of blur, determined by the standard deviation (σ) of the Gaussian kernel
Allows for feature detection at different scales
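A minimal sketch (assuming SciPy; names are mine): one scale-normalised LoG response map per σ, stacked into a scale space in which interest points are later found as extrema over (x, y, σ).

    import numpy as np
    from scipy.ndimage import gaussian_laplace

    def log_scale_space(image, sigmas=(1.0, 1.4, 2.0, 2.8, 4.0)):
        image = image.astype(float)
        # Multiply by sigma^2 so responses at different scales are comparable.
        return np.stack([(s ** 2) * gaussian_laplace(image, sigma=s) for s in sigmas])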

11
Q

What is the Harris-Laplace method

A

Multiscale Harris corner detection: identify points that are likely to be corners at different scales
Scale selection based on the Laplacian: compute the LoG at different scales and keep the scale where the response peaks

We use both because Harris detection localises corners well in space, while the Laplacian (blob) response selects the characteristic scale, giving scale-adapted interest points

12
Q

How can we approximate LoG with DoG for efficient implementation

A

DoG = G(x, y, kσ) - G(x, y, σ)
(difference of Gaussians: the same image blurred at two different levels)
This is more efficient because we do not need to compute second derivatives
The Gaussians are already being computed in the Gaussian pyramid anyway

If we choose suitable values for σ (and k), the computation is very efficient
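A minimal sketch (assuming SciPy; the ratio k = 1.6 is a commonly quoted choice, not fixed by the slides):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def difference_of_gaussians(image, sigma, k=1.6):
        image = image.astype(float)
        # DoG = (G(k*sigma) * I) - (G(sigma) * I): two blurs and one subtraction,
        # no second derivatives needed.
        return gaussian_filter(image, k * sigma) - gaussian_filter(image, sigma)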

13
Q

How does a Gaussian pyramid work

A

Each additional Gaussian layer adds to the total blur (the variances add: σ_total² = σ1² + σ2²)
Large σ -> more blurring
Increasing the scale lowers the effective resolution
So we subsample (fewer pixels are needed to represent a lower-resolution image)
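A minimal sketch of a Gaussian pyramid (assuming SciPy): blur, then subsample by a factor of 2 at each level.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gaussian_pyramid(image, levels=4, sigma=1.0):
        pyramid = [image.astype(float)]
        for _ in range(levels - 1):
            # Repeated blurs accumulate: sigma_total^2 = sigma_1^2 + sigma_2^2.
            blurred = gaussian_filter(pyramid[-1], sigma)
            # Subsample: a lower-resolution image needs fewer pixels.
            pyramid.append(blurred[::2, ::2])
        return pyramid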

14
Q

How does interest point localisation with DoG work

A

Construct the DoG pyramid
Detect local maxima in scale space
Thresholding: reject points with low contrast
Edge elimination (using the ratio of principal curvatures)

Note: at this point we still only have candidate points and candidate patches; we have not yet begun to use the interest regions
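A hedged sketch of the last two steps on a single DoG level (the thresholds are typical defaults, e.g. 0.03 and r = 10 as in the SIFT paper for pixel values in [0, 1], not values given in the slides):

    import numpy as np

    def keep_candidate(dog, y, x, contrast_thresh=0.03, r=10.0):
        # Reject low-contrast candidates.
        if abs(dog[y, x]) < contrast_thresh:
            return False
        # 2x2 spatial Hessian via finite differences.
        dxx = dog[y, x + 1] - 2 * dog[y, x] + dog[y, x - 1]
        dyy = dog[y + 1, x] - 2 * dog[y, x] + dog[y - 1, x]
        dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
               - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
        tr, det = dxx + dyy, dxx * dyy - dxy ** 2
        # Edge-like points have one large and one small principal curvature:
        # reject if trace^2 / det exceeds (r + 1)^2 / r (or if det <= 0).
        return det > 0 and (tr ** 2) / det < ((r + 1) ** 2) / r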

15
Q

What are the two strategies for scale-invariant detection

A

LoG
DoG as a fast approximation

These can be used on their own or in combination with a single-scale keypoint detector (e.g. the Harris corner detector)

16
Q

General Summary: Scale Invariant Detection

A

● Given: Two images of the same scene with a large scale difference between them.
● Goal: Find the same interest points independently in each image
● Solution: Search for maxima of suitable functions in scale and in space (over the image)

17
Q

What are the two main components of achieving feature invariance

A

1) Detector is invariant to translation, rotation and scale
Can find interest points and remove the effects of different scales

2) Design an invariant feature descriptor
This captures the information in a region around a detected interest point

18
Q

What is the simplest local feature descriptor

A

A square window of pixels
Write region A as vector a and region B as vector b
Each vector is a list of the intensities (or some other characteristic) within the patch
Calculate the similarity between the two vectors
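A minimal sketch (helper names are mine): flatten two square windows into vectors and compare them (normalised correlation here; SSD is another common choice).

    import numpy as np

    def patch_vector(image, y, x, half=8):
        # Square window of pixels around (y, x), flattened into one vector.
        return image[y - half:y + half, x - half:x + half].astype(float).ravel()

    def similarity(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))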

19
Q

What is the simplest local feature descriptor invariant to

A

Translation
But changes in intensity, 3D viewpoint and rotation will cause big differences

20
Q

How do detectors remove the effect of different scales for feature matching

A

Ensure the detectors are scale invariant
Use σ, which is proportional to the scale of the detected region
Normalise the scales (resample each region to a common patch size)

21
Q

How do we use gradients as a feature descriptor

A

Take the intensity gradient at each pixel in the patch
Keep the orientation (ignore the magnitude, as it is affected by illumination)
Build a histogram of the orientations (after normalising using the dominant direction)
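A minimal sketch (assuming NumPy; magnitude is ignored, as above):

    import numpy as np

    def orientation_histogram(patch, bins=8):
        gy, gx = np.gradient(patch.astype(float))        # intensity gradients per pixel
        angles = np.degrees(np.arctan2(gy, gx)) % 360     # keep only the orientation
        hist, _ = np.histogram(angles, bins=bins, range=(0, 360))
        return hist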

22
Q

Why is it important to have rotation invariant descriptors

A

So that the same patch can be matched even when the image has been rotated
To achieve this, we find the dominant gradient direction of the image patch
Then we rotate the patch according to this angle
This puts the patches into a canonical orientation (normalising)
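A minimal sketch of this normalisation (assuming SciPy; the sign of the rotation depends on the image coordinate convention):

    import numpy as np
    from scipy.ndimage import rotate

    def canonical_patch(patch):
        gy, gx = np.gradient(patch.astype(float))
        angles = np.degrees(np.arctan2(gy, gx)) % 360
        hist, edges = np.histogram(angles, bins=36, range=(0, 360))
        i = int(np.argmax(hist))
        dominant = 0.5 * (edges[i] + edges[i + 1])         # centre of the dominant bin
        # Rotate so the dominant gradient direction becomes the reference direction.
        return rotate(patch, angle=dominant, reshape=False)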

23
Q

What is SIFT

A

Scale Invariant Feature Transform
A method for computing scale- and rotation-invariant local feature descriptors

24
Q

How does SIFT work

A

1) Localisation: uses DoG to find candidate interest points (recording the scale of each)

2) Normalises each region to a predefined size (16x16)

3) Checks whether the candidate lies on an edge and rejects it if so

4) Finds the gradient orientation over the 16x16 pixel region around the interest point

5) Computes a histogram of the image gradient orientations for all pixels within the patch
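In practice the whole pipeline is available off the shelf; a minimal example assuming OpenCV 4.4+ (where SIFT is in the main package) and a hypothetical image file:

    import cv2

    img = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)        # hypothetical filename
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)  # n keypoints, n x 128 descriptors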

25
Q

How does Orientation Normalisation work

A

The gradient orientations of the patch are quantised into 8 bins over 0-360 degrees

  • compute the orientation histogram
  • select the dominant orientation
  • normalise: rotate the patch to this fixed orientation
26
Q

How does SIFT form vectors

A

It divides the initial 16x16 window into a 4x4 grid of sub-patches (16 cells of 4x4 pixels each)
For each sub-patch, compute a histogram of gradient orientations (8 reference angles/bins)

So each sub-patch gives a vector of 8 dimensions (all invariant to scale and rotation)
This finer sub-patch computation captures more spatial detail

(128 dimensions: 4x4x8)
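A minimal sketch of the 4x4x8 layout (no Gaussian weighting or interpolation, unlike full SIFT; assumes a 16x16 scale- and rotation-normalised patch):

    import numpy as np

    def sift_like_descriptor(patch16):
        gy, gx = np.gradient(patch16.astype(float))
        angles = np.degrees(np.arctan2(gy, gx)) % 360
        desc = []
        for i in range(0, 16, 4):
            for j in range(0, 16, 4):                    # 4x4 grid of 4x4-pixel cells
                cell = angles[i:i + 4, j:j + 4]
                hist, _ = np.histogram(cell, bins=8, range=(0, 360))
                desc.append(hist)
        desc = np.concatenate(desc).astype(float)        # 16 cells x 8 bins = 128 dims
        return desc / (np.linalg.norm(desc) + 1e-8)      # normalise for illumination robustness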

27
Q

What are the advantages of SIFT

A
  • can handle changes in viewpoint of up to ~60 degrees
  • can handle significant changes in illumination
  • fast and efficient
28
Q

What does one image computed with SIFT yield

A

n 128-dimensional descriptors
n scale parameters specifying size of each patch
n orientation parameters specifying angle of each patch
n 2D points giving positions of patches
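Continuing the OpenCV sketch from the previous card (so keypoints and descriptors already exist), the four outputs correspond to:

    print(descriptors.shape)                     # (n, 128): n 128-dimensional descriptors
    print([kp.size for kp in keypoints][:3])     # scale (patch size) per keypoint
    print([kp.angle for kp in keypoints][:3])    # orientation in degrees per keypoint
    print([kp.pt for kp in keypoints][:3])       # (x, y) position per keypoint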

29
Q

What are the two main steps in matching a feature from I1 to I2

A
  1. Define a distance function that compares the two descriptors
  2. Test all the features in I2, find the one with min distance
30
Q

How do we define a distance between two features f1, f2 (in descriptor I1, I2)

A

SSD(f1, f2)
The sum of squared differences between the entries of the two descriptors
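In symbols, SSD(f1, f2) = \sum_i (f1[i] - f2[i])^2; as a one-liner (assuming NumPy):

    import numpy as np

    def ssd(f1, f2):
        # Sum of squared differences between two descriptor vectors.
        return float(np.sum((np.asarray(f1, float) - np.asarray(f2, float)) ** 2))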

31
Q

Why is SSD on its own not a good distance for feature matching

A

Can give a very good score for a bad match (ambiguous)

32
Q

How do we improve on the SSD distance for feature matching

A

Ratio distance = SSD(f1, f2) / SSD(f1, f2’)

f2 = best SSD match to f1
f2’ = 2nd best SSD match to f1
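A minimal sketch of ratio-test matching with NumPy (the 0.8 cut-off is an illustrative value, not one given in the slides):

    import numpy as np

    def match_ratio_test(desc1, desc2, max_ratio=0.8):
        # For each feature in I1, find the two closest features in I2 by SSD
        # and keep the match only if best / second-best is small enough.
        matches = []
        for i, f1 in enumerate(desc1):
            d = np.sum((desc2 - f1) ** 2, axis=1)        # SSD to every descriptor in I2
            best, second = np.argsort(d)[:2]
            if d[best] / (d[second] + 1e-12) < max_ratio:
                matches.append((i, int(best)))
        return matches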

33
Q

What will the ratio distance be for ambiguous matches

A

Large values (~1)

34
Q

How do we eliminate bad matches in feature matching

A

Thresholding
Throw out features whose distance is larger than the threshold

Higher threshold -> more false positives
Lower threshold -> fewer true positives

35
Q

What is a ROC Curve

A

Plots the true positive rate (y axis, i.e. recall) against the false positive rate (x axis)

We want to maximise the area under the curve
useful for comparing different feature matching methods
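A minimal sketch using scikit-learn (assuming it is installed; the labels and scores are made-up illustrations):

    import numpy as np
    from sklearn.metrics import roc_curve, auc

    y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])                   # 1 = correct match (hypothetical)
    scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.85, 0.3])  # higher = more confident match
    fpr, tpr, thresholds = roc_curve(y_true, scores)
    print(auc(fpr, tpr))                                          # area under the ROC curve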

36
Q

Advantages of local features

A
  • Critical to find distinctive and repeatable local regions for multi-view matching
  • Complexity reduction via selection of distinctive points
  • Describe images, objects and parts without requiring segmentation; robust to clutter and occlusion
  • Robustness: similar descriptors in spite of moderate view changes, noise, blur, etc.