Mid-Level Vision Flashcards
What are the 5 steps of Canny edge detection?
1) Gaussian Blurring: reduce noise by applying a Gaussian filter
2) Gradient Calculation: apply the Sobel filter in the horizontal and vertical directions to get the first derivatives in x and y. Then use these to calculate the gradient magnitude and direction for each pixel
3) Non-Maximum suppression: for each pixel, check if it’s the local maximum along the gradient direction. If it is, retain it as an edge, otherwise suppress it (set it to 0)
4) Double Thresholding: choose two thresholds. Pixels with gradient magnitudes above the higher threshold are strong edges, pixels with gradient magnitudes between the thresholds are weak edges, and pixels with gradient magnitudes below the lower threshold are discarded.
5) Edge tracking by hysteresis: only retain weak edges if they’re connected to strong edges.
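Steps 4 and 5 above can be sketched in plain Python as follows. This is a minimal illustration, not a full Canny implementation: it assumes the gradient magnitudes have already been computed and thinned by steps 1 to 3. The function name `hysteresis`, the 8-connected neighbourhood and the sample values are assumptions made for the example.

```python
from collections import deque

def hysteresis(magnitude, low, high):
    """Return a 0/1 edge map from a 2D list of gradient magnitudes."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Step 4: pixels above the high threshold are strong edges.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] > high:
                edges[r][c] = 1
                queue.append((r, c))

    # Step 5: grow outward from strong edges, keeping weak pixels
    # (magnitude >= low) only when they touch an already kept pixel.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edges[nr][nc] and magnitude[nr][nc] >= low:
                    edges[nr][nc] = 1
                    queue.append((nr, nc))
    return edges

# Illustrative magnitudes: one strong pixel (100), three weak ones (60, 50, 40).
magnitude = [
    [0, 10, 0, 0, 0],
    [0, 60, 0, 0, 40],
    [0, 100, 0, 0, 0],
    [0, 50, 0, 0, 0],
]
edge_map = hysteresis(magnitude, low=30, high=80)
for row in edge_map:
    print(row)
```

The weak pixels 60 and 50 are kept because they touch the strong pixel, while the isolated weak pixel 40 is dropped.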
How do computers find similarities/the same feature in two images?
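They describe each point of interest with a feature descriptor and calculate the distances between the descriptors of the points in the two images: a small distance means the points are likely to be the same feature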
What does invariance mean?
Resistant to variation: the feature stays the same regardless of changes in conditions or transformations.
What quality do features have if a computer is able to find the same feature in two images?
The feature should be invariant, e.g. to rotation and translation, so that it can be recognised as the same feature in both images
Is it possible to change a histogram back into the original image?
No, because histograms don’t store any pixel location information.
- Secondly, multiple different images can produce the same histogram.
How do we calculate a normalised histogram?
We divide each bin count by the total number of pixels in the image: e.g. if there are 86 0’s in an 8-bit 8x16 image (128 pixels), the normalised value for bin 0 will be 86/128
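The 86/128 example can be reproduced with a short sketch. The function name `normalised_histogram` is hypothetical, and the remaining 42 pixels are set to 255 purely for illustration, since the card only specifies the number of 0’s.

```python
def normalised_histogram(image, levels=256):
    """Return the fraction of pixels at each grey level (8-bit by default)."""
    counts = [0] * levels
    total = 0
    for row in image:
        for value in row:
            counts[value] += 1
            total += 1
    # Divide every bin count by the total number of pixels.
    return [count / total for count in counts]

# 8 rows x 16 columns = 128 pixels: 86 zeros, the rest (assumed) 255.
pixels = [0] * 86 + [255] * 42
image = [pixels[r * 16:(r + 1) * 16] for r in range(8)]

hist = normalised_histogram(image)
print(hist[0])      # fraction of pixels equal to 0, i.e. 86/128
print(sum(hist))    # the normalised bins sum to 1
```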
How do you create a histogram from a coloured image?
You have to create 3 different histograms, one for each colour channel: red, green and blue (RGB)
What are histograms used for?
They are used as features for image classification or recognition
Which geometric operations are histograms invariant to?
- Rotation
- scaling
- mirroring
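A small sketch can make these invariances concrete. It assumes a tiny greyscale image stored as nested lists, a 90 degree rotation, a horizontal mirror, and nearest-neighbour upscaling by an integer factor of 2. With interpolated scaling the match would generally only be approximate. All function names are hypothetical.

```python
from collections import Counter

def normalised_histogram(image):
    """Map each pixel value to the fraction of pixels that have it."""
    counts = Counter(value for row in image for value in row)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}

def rotate90(image):
    # Rotate clockwise: reverse the row order, then transpose.
    return [list(row) for row in zip(*image[::-1])]

def mirror(image):
    # Flip each row left to right.
    return [row[::-1] for row in image]

def upscale2(image):
    # Nearest-neighbour upscaling: every pixel becomes a 2x2 block.
    return [[v for v in row for _ in range(2)] for row in image for _ in range(2)]

image = [[0, 1, 2],
         [3, 3, 4]]
reference = normalised_histogram(image)

print(normalised_histogram(rotate90(image)) == reference)
print(normalised_histogram(mirror(image)) == reference)
print(normalised_histogram(upscale2(image)) == reference)
```

The rotated and mirrored images are different images with the same histogram, which is also why a histogram cannot be turned back into the original image.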
What are LBP descriptors and how do they work?
Local binary pattern descriptors describe the surroundings of a pixel.
- They generate a bit-code:
- The threshold is the central pixel: moving in a clockwise direction, each pixel is compared to the threshold.
- If the pixel value is equal to or above the threshold it’s set to 1. If it’s below the threshold it’s set to 0
What is the LBP code of an image?
How many values will this code always hold?
It is the vector of 1’s and 0’s created by moving clockwise around a pixel and comparing each neighbouring pixel’s value to the central pixel’s value. Since a pixel has 8 neighbours in its 3x3 neighbourhood, the code will always hold 8 values.
e.g. [0 1 1 1 0 1 0 1]
What is the LBP index?
What is the LBP index of the following LBP code? [0, 1, 1, 1, 0, 1, 0, 1]
- The LBP index is calculated by reading the LBP code as a binary number and converting it to base 10, i.e. summing the power of two for every bit that is 1:
- [0 1 1 1 0 1 0 1] = 0 + 64 + 32 + 16 + 0 + 4 + 0 + 1 = 117
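The LBP code and index can be sketched for a single 3x3 patch as below. The cards do not say where the clockwise walk starts, so starting at the top-left neighbour is an assumption, as are the function names and the sample patch. The first bit is treated as the most significant one, which matches the worked example above.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    # Clockwise neighbour positions, starting at the top-left (assumed start).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    # 1 if the neighbour is equal to or above the centre value, else 0.
    return [1 if patch[r][c] >= centre else 0 for r, c in order]

def lbp_index(code):
    """Read the code as a binary number, first bit most significant."""
    index = 0
    for bit in code:
        index = index * 2 + bit
    return index

# Hypothetical patch chosen so that its code matches the card's example.
patch = [
    [3, 9, 7],
    [8, 5, 6],
    [2, 5, 1],
]
code = lbp_code(patch)
print(code)             # [0, 1, 1, 1, 0, 1, 0, 1]
print(lbp_index(code))  # 117
```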
What is image classification?
given an input image, assign a label to it from a fixed set of categories
How do we perform image classification?
What is the output?
We extract features from the image and train a classifier to categorise the image.
The output is a vector of probabilities, one per category, each giving the probability that the image contains that category.
What are the challenges of image classification?
- viewpoint variation
- illumination variation
- scale variation
- deformations
- occlusion
- background clutter
- intra-class variation
What is a traditional approach to training an image classifier?
1) find the edges of the image
2) find the corners in the edges
3) classify the object
Describe the steps of the machine learning image classification method?
1) construct a training dataset of images and predefined labels
2) train the classifier to predict, for each label, the probability that it is the correct label for the image
3) evaluate the classifier on new images that have no predefined labels
What does a training dataset used by a ML image classifier consist of?
A collection of example images for each of the labels, with every image paired with its label
What does the image classifier consist of in a ML image classification method?
It can be a neural network or a traditional classifier (e.g. the edges method)
What are the steps of the K-nearest neighbour image classification method?
1) we have a number (e.g. 3) of existing categories and a new point
2) calculate the distance between the new point and all the existing points
3) rank the points in increasing order of distance
4) choose the k nearest neighbours and label the new point with the category that most of those neighbours belong to
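The four steps above can be sketched as follows. This is a minimal illustration assuming 2D points, Euclidean distance and three made-up categories A, B and C. The function name `knn_predict` and all data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, new_point, k=3):
    """Label new_point by majority vote among its k nearest training points."""
    # Step 2: distance from the new point to every existing point.
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Step 3: rank in increasing order of distance.
    distances.sort(key=lambda pair: pair[0])
    # Step 4: vote among the k nearest neighbours.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Step 1: existing points in three categories, plus new points to label.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7), (1, 7), (2, 8)]
train_labels = ["A", "A", "A", "B", "B", "B", "C", "C"]

print(knn_predict(train_points, train_labels, (2, 2), k=3))  # A
print(knn_predict(train_points, train_labels, (5, 6), k=1))  # B
```

With k=1 the vote reduces to copying the label of the single closest training point.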
How does the 1-NN (1-nearest neighbour) method work?
The classifier takes a test image, compares it to all the training images and predicts the label of the single closest training image.
What are the two types of distance measures?
L1 (Manhattan) distance and L2 (Euclidean) distance
Describe how L1 distance is calculated for two images?
L1 is the sum of the absolute values of the differences between the two images’ pixel values
Describe how L2 distance is calculated for two images?
AKA Euclidean distance: the square root of the sum of the squared differences between the two images’ pixel values
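Both distances can be sketched for two small greyscale images of the same size. The function names and the 2x2 sample images are assumptions chosen so that the arithmetic is easy to check by hand.

```python
import math

def l1_distance(image_a, image_b):
    """Sum of absolute pixel-wise differences."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )

def l2_distance(image_a, image_b):
    """Square root of the sum of squared pixel-wise differences."""
    return math.sqrt(sum(
        (a - b) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    ))

image_a = [[10, 20], [30, 40]]
image_b = [[13, 16], [30, 40]]

# Pixel differences are 3, 4, 0 and 0.
print(l1_distance(image_a, image_b))  # 3 + 4 = 7
print(l2_distance(image_a, image_b))  # sqrt(9 + 16) = 5.0
```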