Week 6 and 7 - Local Features Flashcards
What is Exhaustive search in corner detection
Naive approach
Vary the patch size (scale) and compare the surrounding features at each size
What is wrong with exhaustive search
Computationally inefficient
Infeasible for large collections of data
Ineffective for recognition tasks
How can we use Automatic scale selection
We need scale invariant feature detection
We want a function of region size whose response is scale invariant
The function can be any that involves all pixels in the region (eg intensity, texture)
Plot the response against region size (x = region size) and take the local maximum
When the image is rescaled, the response curve squishes or expands along the x axis
Find the local maximum again: it corresponds to the same feature, though not the same x coordinate
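A minimal sketch of this idea in Python, using the scale-normalised Laplacian of Gaussian (introduced in the next cards) as the region function; the point location and the list of sigmas to try are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def characteristic_scale(image, row, col, sigmas):
    """Evaluate a scale-normalised response at one point over many
    region sizes (sigmas) and return the sigma giving the peak."""
    responses = [
        # sigma**2 normalisation keeps responses comparable across scales
        abs(sigma**2 * gaussian_laplace(image, sigma)[row, col])
        for sigma in sigmas
    ]
    return sigmas[int(np.argmax(responses))]
```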
What is “blob detection”
The Laplacian of Gaussian
What is the Laplacian of Gaussian for feature detection
it is scale invariant
Essentially the sum of the second partial derivatives in x and y: ∇²f = ∂²f/∂x² + ∂²f/∂y²
How do you carry out LoG on an image
Convolve the image with a Gaussian kernel followed by the Laplacian kernel, capturing regions of rapid intensity change
Combine the two into a single Laplacian of Gaussian kernel (one operation)
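A minimal sketch with SciPy, which also offers the combined operation directly (gaussian_laplace); the random array stands in for a grayscale image:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, laplace, gaussian_laplace

image = np.random.rand(256, 256)  # stand-in for a grayscale image

# Two-step version: Gaussian smoothing followed by the Laplacian
two_step = laplace(gaussian_filter(image, sigma=2.0))

# One-step version: convolve with a single LoG kernel
combined = gaussian_laplace(image, sigma=2.0)
```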
What are the characteristics of LoG
Found using a single mask
Orientation information is lost
very noise sensitive (taking derivatives increases noise)
What is the characteristic scale for LoG
The scale that produces the extreme value (peak) of the LoG response
This is the scale at which the filter best matches the blob
The size of the blob is proportional to the characteristic scale
How are blobs useful for local interest points
They give us the region (patch size) to use around the interest point
How do we detect interest points at different scales
Image is convolved with the LoG filter at various levels of blur, determined by the standard deviation (σ) of the Gaussian kernel
Allows for feature detection at different scales
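A sketch using scikit-image's ready-made multiscale LoG detector; the sigma range and threshold are assumptions:

```python
import numpy as np
from skimage.feature import blob_log

image = np.random.rand(256, 256)  # stand-in for a grayscale image

# Convolve with LoG at num_sigma blur levels and keep local maxima,
# returned as (row, col, sigma) triples
blobs = blob_log(image, min_sigma=1, max_sigma=10, num_sigma=10, threshold=0.1)
for row, col, sigma in blobs:
    radius = sigma * np.sqrt(2)  # blob size implied by the detected scale
    print(f"blob at ({row:.0f}, {col:.0f}), radius {radius:.1f}")
```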
What is the Harris-Laplace method
Multiscale harris corner detection: Identify points that are likely to be corners at different scales
Scale Selection based on Laplacian: Compute LoG at different scales
We do both because Harris detection only finds corners while blob detection finds other interest points (combine to get a variety of interest points)
How can we approximate LoG with DoG for efficient implementation
DoG(x, y, σ) = G(x, y, kσ) - G(x, y, σ)
(difference of Gaussians, from two different levels of blur)
This is more efficient as we do not need to compute 2nd derivative
Gaussians are already automatically being computed in the Gaussian pyramid
If we choose the correct values for sigma, we will get a very efficient computation
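A minimal sketch of the approximation, with k = √2 between adjacent blur levels as an assumed value:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(256, 256)  # stand-in for a grayscale image

sigma = 1.6
k = np.sqrt(2)  # assumed ratio between adjacent blur levels

# DoG: subtract two Gaussian-blurred images; no 2nd derivative needed
dog = gaussian_filter(image, k * sigma) - gaussian_filter(image, sigma)
```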
How does a Gaussian pyramid work
Repeated Gaussian blurs accumulate: blurring with σ1 then σ2 equals one blur with σ = √(σ1² + σ2²)
large sigma -> more blurring
Increasing scale, lowers the resolution
So we subsample (we do not need as many pixels to represent a lower resolution image)
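A minimal sketch of the blur-then-subsample loop; the per-level sigma is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Blur, then drop every other pixel to halve the resolution."""
    out = [image]
    for _ in range(levels - 1):
        blurred = gaussian_filter(out[-1], sigma)
        out.append(blurred[::2, ::2])  # subsample the blurred level
    return out

pyr = gaussian_pyramid(np.random.rand(256, 256))
print([level.shape for level in pyr])  # (256,256) (128,128) (64,64) (32,32)
```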
How does interest point localisation with DoG work
construct DoG pyramid
detect local maxima in scale space
Thresholding, reject points with low contrast
edge elimination (ratio of principal curvatures)
Note: at this point we are still at candidate points and candidate patches -> have not actually begun to use interest regions
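A sketch of the scale-space maxima and contrast-threshold steps, assuming the DoG pyramid is already stacked into a 3D array (the edge-elimination test is omitted; the threshold value is an assumption):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def dog_candidates(dog_stack, contrast_threshold=0.03):
    """dog_stack: 3D array (scale, row, col) of DoG responses.
    Keep points that are maxima of their 3x3x3 neighbourhood
    (in scale and space) and exceed the contrast threshold."""
    local_max = maximum_filter(dog_stack, size=3) == dog_stack
    strong = np.abs(dog_stack) > contrast_threshold
    return np.argwhere(local_max & strong)  # candidate (scale, row, col)
```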
What are the two strategies for scale-invariant detection
LoG
DoG as fast approximation
These can be used on their own or in combination with single-scale key-point detectors (eg Harris corner detector)
General Summary: Scale Invariant Detection
● Given: Two images of the same scene with a large scale difference between them.
● Goal: Find the same interest points independently in each image
● Solution: Search for maxima of suitable functions in scale and in space (over the image)
What are the two main components of achieving feature invariance
1) Detector is invariant to translation, rotation and scale
Can find interest point and remove the effects of different scales
2) Design an invariant feature descriptor
This captures the information in a region around a detected interest point
What is the simplest local feature descriptor
A square window of pixels
We write region A as vector a and region B as vector b
Each vector is a list of intensities within a patch (or some characteristic)
Calculate vector similarities
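A minimal sketch, flattening two equal-sized patches into vectors and comparing them; normalised correlation is used here as the assumed similarity measure:

```python
import numpy as np

def patch_similarity(patch_a, patch_b):
    """Flatten each patch into a vector of intensities and compare."""
    a = patch_a.ravel().astype(float)
    b = patch_b.ravel().astype(float)
    a = (a - a.mean()) / (a.std() + 1e-7)  # normalise each vector
    b = (b - b.mean()) / (b.std() + 1e-7)
    return float(a @ b) / a.size  # normalised correlation in [-1, 1]
```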
What is the simplest local feature descriptor invariant to
Translation
but intensity changes, 3D viewpoint change and rotation will cause big differences
How do detectors remove the effect of different scales for feature matching
Ensure they are scale invariant
use σ which is proportional to scale
normalise scales
How do we use gradients as a feature descriptor
Take the intensity gradient of each pixel in the patch
Take the orientation (ignore magnitude as that is affected by illumination)
And build a histogram (after normalising using dominant direction)
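A sketch of one such histogram, keeping orientation and ignoring magnitude (the 8-bin quantisation matches the SIFT cards below; dominant-direction normalisation is assumed to happen beforehand):

```python
import numpy as np

def orientation_histogram(patch, n_bins=8):
    """Histogram of gradient orientations over a patch, ignoring magnitude."""
    gy, gx = np.gradient(patch.astype(float))
    angles = np.arctan2(gy, gx) % (2 * np.pi)  # orientations in [0, 2*pi)
    hist, _ = np.histogram(angles, bins=n_bins, range=(0, 2 * np.pi))
    return hist
```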
Why is it important to have rotation invariant descriptors
Want to find the dominant direction of gradient for the image patch
Then we rotate the patch according to this angle
This puts the patches into a canonical orientation (normalising)
What is SIFT
Scale Invariant Feature Transform
A descriptor computation
How does SIFT work
1) Localisation, uses DoG to find candidate interest points (takes note of scale)
2) Normalises to predefined size (16x16)
3) Checks if it is an edge and rejects it if so
4) Then finds the gradient orientation over a 16x16 pixel region around interest point
5) Computes histogram of image gradient orientations for all pixels within the patch
How does Orientation Normalisation work
The gradient orientations in the patch are quantised into 8 bins over 0-360 degrees
- compute the orientation histogram
- select the dominant orientation
- normalise: rotate the patch to a fixed orientation
How does SIFT form vectors
It divides the initial 16x16 window into a 4x4 grid of sub-patches: 16 cells, each 4x4 pixels
Compute a histogram of gradient orientations for the pixels in each sub-patch (8 reference angles/bins)
So for each sub-patch we have a vector of 8 dimensions (all invariant to scale and rotation)
The per-sub-patch computation captures finer spatial detail than one histogram over the whole window
Concatenating gives a 128-dimensional descriptor: 4x4x8
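A minimal sketch of the 4x4x8 assembly, assuming a 16x16 patch that is already scale- and rotation-normalised and reusing the orientation_histogram helper sketched above:

```python
import numpy as np

def sift_like_descriptor(patch):
    """16x16 normalised patch -> 128-d vector (4x4 cells x 8 bins)."""
    assert patch.shape == (16, 16)
    cells = []
    for i in range(0, 16, 4):
        for j in range(0, 16, 4):
            cells.append(orientation_histogram(patch[i:i + 4, j:j + 4]))
    desc = np.concatenate(cells).astype(float)  # 16 cells x 8 bins = 128
    return desc / (np.linalg.norm(desc) + 1e-7)  # robustness to illumination
```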
What are the advantages of SIFT
- can handle changes in viewpoint up to ~60 degrees
- can handle significant changes in illumination
- fast and efficient
What does one image computed with SIFT yield
n 128-dimensional descriptors
n scale parameters specifying size of each patch
n orientation parameters specifying angle of each patch
n 2D points giving positions of patches
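A sketch of obtaining exactly these outputs with OpenCV's SIFT implementation (available in opencv-python 4.4+); the image path is a placeholder:

```python
import cv2

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(image, None)

print(descriptors.shape)         # (n, 128): n 128-dimensional descriptors
kp = keypoints[0]
print(kp.size, kp.angle, kp.pt)  # scale, orientation, 2D position per patch
```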
What are the two main components of matching a feature in I1 to I2
- Define a distance function that compares the two descriptors
- Test all the features in I2, find the one with min distance
How do we define a distance between two features f1, f2 (in images I1, I2)
SSD(f1, f2) = Σ_i (f1_i - f2_i)²
the sum of squared differences between the entries of the two descriptors
Why is SSD not a good distance for feature matching
Can give a very good score for a bad match (ambiguous)
How do we improve the SSD feature matching distance calculation
Ratio distance = SSD(f1, f2) / SSD(f1, f2’)
f2 = best SSD match to f1
f2’ = 2nd best SSD match to f1
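A sketch of the ratio test with brute-force SSD over NumPy descriptor arrays; the 0.8 cutoff is a commonly used value, assumed here:

```python
import numpy as np

def match_ratio_test(desc1, desc2, max_ratio=0.8):
    """desc1: (n1, d), desc2: (n2, d). Return (i, j) pairs whose
    best/second-best SSD ratio is below max_ratio."""
    matches = []
    for i, f1 in enumerate(desc1):
        ssd = np.sum((desc2 - f1) ** 2, axis=1)  # SSD to every f2
        best, second = np.argsort(ssd)[:2]
        if ssd[best] / (ssd[second] + 1e-12) < max_ratio:
            matches.append((i, best))
    return matches
```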
What will the ratio distance be for ambiguous matches
Large values (~1)
How do we eliminate bad matches in feature matching
Thresholding
Throw out features with larger distance than threshold
high threshold -> more false positives
low threshold -> fewer true positives
What is a ROC Curve
plots the true positive rate (y axis, also called recall) against the false positive rate (x axis)
We want to maximise the area under the curve
useful for comparing different feature matching methods
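A sketch with scikit-learn, treating negated match distance as the score; the labels and distances are made-up illustration data:

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Made-up ground truth (1 = correct match) and match distances
labels = np.array([1, 1, 0, 1, 0, 0, 1, 0])
distances = np.array([0.2, 0.3, 0.8, 0.4, 0.6, 0.9, 0.5, 0.7])

# Smaller distance = better match, so negate to get a score
fpr, tpr, _ = roc_curve(labels, -distances)
print("area under ROC curve:", auc(fpr, tpr))
```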
Advantages of local features
- Critical to find distinctive and repeatable local regions for multi-view matching
- Complexity reduction via selection of distinctive points.
- Describe images, objects, parts without requiring segmentation; robustness to clutter & occlusion
- Robustness: similar descriptors in spite of moderate view changes, noise, blur, etc.