Exam Flashcards

1
Q

Describe the steps of image classification

A

1) Feature extraction
2) Feature description
3) Classification

2
Q

Describe the typical pre-processing of digit recognition in general images

A
  1. Detect the digits in the large image
  2. Normalize the size of the digit, for example to 28x28 pixels
  3. Normalize the location, e.g. place the mass center in the middle
  4. Deslant the digit so that its orientation is canonical
3
Q

Name some advantages and disadvantages of K-Nearest Neighbour

A
  + It works reasonably well
  + No training required
  + Nonlinear decision boundaries
  + Handles multi-class problems naturally
  - All training data must be stored in memory
  - Long evaluation time
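The properties above (no training step, nonlinear boundaries, multi-class by majority vote) show up directly in a minimal numpy sketch; the data and the choice of k are made up for illustration:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify one point by majority vote among its k nearest training points."""
    # No training step: we just store X_train/y_train and search at query time,
    # which is also why all training data must stay in memory.
    dists = np.linalg.norm(X_train - x, axis=1)   # distance to every stored sample
    nearest = np.argsort(dists)[:k]               # indices of the k closest samples
    votes = y_train[nearest]
    return np.bincount(votes).argmax()            # majority class label

# Two well-separated 2-D classes.
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([0.15, 0.1])))   # query near the first cluster
```

Note that every query scans all stored samples, which is the "long evaluation time" drawback.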
4
Q

Name the three conditions Canny proposed for a good edge detector

A

1) Good detection; should detect all real edges
2) Good localization; detected edges should be located where the true edges are
3) Single response; each edge should be detected only once

5
Q

Describe Canny’s algorithm for edge detection

A

1) Gaussian filtering
2) Calculate gradient magnitude and direction
3) Perform non-maximum suppression
4) Perform hysteresis thresholding
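Steps 2 and 4 of the algorithm can be sketched in numpy on a synthetic step edge; Gaussian smoothing and non-maximum suppression are omitted to keep the sketch short, and all names and threshold values here are illustrative:

```python
import numpy as np

def gradient_magnitude_direction(img):
    """Step 2 of Canny: central-difference gradients, magnitude and direction."""
    Iy, Ix = np.gradient(img.astype(float))   # np.gradient returns d/drow, d/dcol
    mag = np.hypot(Ix, Iy)
    direction = np.arctan2(Iy, Ix)
    return mag, direction

def double_threshold(mag, low, high):
    """Step 4 (simplified): strong edges, plus weak candidates for hysteresis."""
    strong = mag >= high
    weak = (mag >= low) & (mag < high)
    return strong, weak

# Synthetic vertical step edge between columns 4 and 5.
img = np.zeros((9, 9))
img[:, 5:] = 1.0
mag, _ = gradient_magnitude_direction(img)
strong, weak = double_threshold(mag, 0.1, 0.4)
print(np.where(strong.any(axis=0))[0])   # columns containing strong responses
```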

6
Q

How can we approximate E(u,v), the error when shifting the neighborhood by (u,v) pixels?

A

E(u,v) ≈ [u, v] M [u, v]^T, where M is the second-moment matrix:

M = [ sum(Ix^2)   sum(Ix*Iy)
      sum(Ix*Iy)  sum(Iy^2) ]

and Ix, Iy are the image derivatives, with the sums taken (often Gaussian-weighted) over the neighborhood.

7
Q

Given the M matrix, what metrics can be used to determine if we have a corner?

A

1) R = min(lambda_1, lambda_2)
2) R = lambda_1*lambda_2 / (lambda_1 + lambda_2 + epsilon)
3) R = lambda_1*lambda_2 - k(lambda_1 + lambda_2)^2
4) R = det(M) - k*trace(M)^2

where lambda_1 and lambda_2 are the eigenvalues of M. Note that 3) and 4) are the same quantity, since det(M) = lambda_1*lambda_2 and trace(M) = lambda_1 + lambda_2.
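The equality of metrics 3) and 4) can be checked numerically; the matrix and the value of k below are arbitrary test choices:

```python
import numpy as np

def corner_responses(l1, l2, k=0.05, eps=1e-9):
    """Metrics 1-3, written directly in terms of the eigenvalues."""
    r_min = min(l1, l2)                            # metric 1
    r_harmonic = l1 * l2 / (l1 + l2 + eps)         # metric 2
    r_harris = l1 * l2 - k * (l1 + l2) ** 2        # metric 3, Harris via eigenvalues
    return r_min, r_harmonic, r_harris

def harris_from_M(M, k=0.05):
    """Metric 4: the same Harris response, without any eigendecomposition."""
    return np.linalg.det(M) - k * np.trace(M) ** 2

# det(M) = l1*l2 and trace(M) = l1 + l2, so metrics 3 and 4 must agree.
M = np.array([[2.0, 0.5], [0.5, 1.0]])
l1, l2 = np.linalg.eigvalsh(M)
print(corner_responses(l1, l2)[2], harris_from_M(M))
```

Metric 4 is the one usually used in practice, precisely because it avoids computing eigenvalues per pixel.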

8
Q

Describe the Harris Corner Detector

A

1) Compute the image derivatives Ix and Iy
2) Create the second-moment matrix M
3) Calculate its eigenvalues lambda_1, lambda_2
4) Calculate the response R = lambda_1*lambda_2 - k(lambda_1 + lambda_2)^2
5) Threshold and perform non-maximum suppression with respect to R
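A minimal numpy sketch of these steps on a synthetic corner; the window size, k, and the plain box window are arbitrary simplifications (real implementations typically use Gaussian weighting, and the thresholding/suppression step is left out):

```python
import numpy as np

def harris_response(img, k=0.05, win=2):
    """Steps 1-4: gradients, per-pixel M over a box window, R = det - k*trace^2."""
    Iy, Ix = np.gradient(img.astype(float))       # step 1: image derivatives
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy     # products entering M
    def box_sum(A):                               # sum over a (2*win+1)^2 window
        out = np.zeros_like(A)
        for dy in range(-win, win + 1):
            for dx in range(-win, win + 1):
                out += np.roll(np.roll(A, dy, axis=0), dx, axis=1)
        return out
    Sxx, Syy, Sxy = box_sum(Ixx), box_sum(Iyy), box_sum(Ixy)
    det = Sxx * Syy - Sxy * Sxy                   # step 4, via det and trace
    trace = Sxx + Syy
    return det - k * trace ** 2

# White quarter-plane: one corner at (10, 10), edges along row 10 and column 10.
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
R = harris_response(img)
corner_R = R[8:13, 8:13].max()   # response near the true corner: large positive
edge_R = R[10, 16]               # response on the edge, far from the corner: negative
print(corner_R > edge_R)
```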

9
Q

How are the DoG and LoG filters related?

A
DoG(x, y, k, sigma) = I * G(k*sigma) - I * G(sigma)
                    approx= (k-1) * sigma^2 * LoG(x, y, sigma)
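The approximation can be verified numerically in 1-D, where the Laplacian reduces to the second derivative of the Gaussian; sigma and k below are arbitrary test values:

```python
import numpy as np

def gaussian(x, s):
    """Normalized 1-D Gaussian."""
    return np.exp(-x**2 / (2 * s**2)) / (s * np.sqrt(2 * np.pi))

def log_1d(x, s):
    """Second derivative of the 1-D Gaussian (the 1-D analogue of LoG)."""
    return gaussian(x, s) * (x**2 - s**2) / s**4

x = np.linspace(-10, 10, 1001)
sigma, k = 2.0, 1.1
dog = gaussian(x, k * sigma) - gaussian(x, sigma)
approx = (k - 1) * sigma**2 * log_1d(x, sigma)
rel_err = np.max(np.abs(dog - approx)) / np.max(np.abs(dog))
print(rel_err)   # small, and shrinking as k approaches 1
```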
10
Q

Describe the SIFT algorithm for keypoint detection

A

1) Find scale-space extrema using the DoG response
2) Fit a quadratic function around each extremum and estimate a refined keypoint location (which can lie between pixels…)
3) Threshold the keypoint responses to discard weak keypoints

11
Q

Describe how we can determine the canonical orientation of each patch in the SIFT algorithm

A

1) Create a histogram of 36 bins covering orientations from 0-360 degrees
2) Each pixel in a neighborhood around the keypoint votes for an orientation, weighted by its gradient magnitude
3) The keypoint is assigned the orientation corresponding to the largest bin
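These three steps fit in a few lines of numpy; the patch below is a synthetic horizontal ramp, so the dominant orientation should fall in the bin around 0 degrees (SIFT additionally applies Gaussian weighting and peak interpolation, omitted here):

```python
import numpy as np

def dominant_orientation(patch, bins=36):
    """Magnitude-weighted histogram of gradient orientations; return the peak bin center."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0          # orientation in [0, 360)
    hist, edges = np.histogram(ang, bins=bins, range=(0.0, 360.0), weights=mag)
    peak = hist.argmax()
    return (edges[peak] + edges[peak + 1]) / 2.0          # center of the largest bin

# Horizontal ramp: the gradient points along +x everywhere.
patch = np.tile(np.arange(16, dtype=float), (16, 1))
print(dominant_orientation(patch))
```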

12
Q

How is the SIFT descriptor created?

A

1) Take a small window around the keypoint
2) Calculate the gradient orientation and magnitude at each pixel after Gaussian filtering
3) Weigh gradients near the center more
4) Calculate the canonical orientation
5) Rotate all gradient directions relative to the canonical orientation
6) Create a histogram of gradient orientations for each subregion of the local window (originally 4x4 = 16 subregions with 8-bin orientation histograms)
7) Concatenate these 16 histograms into a 128-dimensional feature vector, which is used as the descriptor

13
Q

Describe the main advantages of SIFT

A

1) Robust to intensity changes
2) Invariant to scale
3) Invariant to rotation

14
Q

What is the main difference between SURF and SIFT?

A

SURF is faster since it only considers horizontal and vertical gradients, computed with Haar wavelets. It uses 4 descriptor values per subregion, (sum dx, sum dy, sum |dx|, sum |dy|), so with 16 subregions the total descriptor has length 16*4 = 64.

15
Q

Name some improvements to neural networks over the past 30 years

A

1) Better hardware
2) Deeper networks
3) Larger datasets
4) Other changes: better activation functions, different layer types, …

16
Q

How can we adapt gradient descent to fix the vanishing and exploding gradient problems?

A

1) We can use adaptive step sizes.

2) We can clip the gradient, either element-wise with a threshold or by rescaling its L2 norm.
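Both clipping variants can be sketched in a few lines of numpy; the gradient vector and thresholds below are arbitrary illustrative values:

```python
import numpy as np

def clip_by_value(grad, threshold):
    """Element-wise clipping: each component is forced into [-threshold, threshold]."""
    return np.clip(grad, -threshold, threshold)

def clip_by_norm(grad, max_norm):
    """L2 clipping: rescale the whole gradient only if its norm exceeds max_norm."""
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

g = np.array([3.0, 4.0])        # a gradient with L2 norm 5
print(clip_by_value(g, 2.0))    # components capped, direction changes
print(clip_by_norm(g, 1.0))     # norm capped, direction preserved
```

Note the difference: norm clipping preserves the gradient direction, while value clipping can change it.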

17
Q

What is the R-CNN method?

A

Uses selective search for region proposals, an SVM for classification, and linear regression for localization. Both the classifier and the regressor use CNN features.

18
Q

What is the Fast R-CNN method?

A

Uses selective search for region proposals, and a CNN for both localization and classification.

19
Q

What is a RoI pooling layer?

A

Converts convolutional feature maps into a fixed size. It is used because region proposals can be of arbitrary size.
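A single-channel sketch of the idea in numpy; real RoI pooling first maps RoI coordinates onto the feature map and works per channel, and the sizes below are illustrative:

```python
import numpy as np

def roi_max_pool(feat, out_h, out_w):
    """Max-pool an arbitrary-sized region into a fixed out_h x out_w grid."""
    h, w = feat.shape
    # Bin edges split the region into out_h x out_w roughly equal cells.
    ys = np.linspace(0, h, out_h + 1).astype(int)
    xs = np.linspace(0, w, out_w + 1).astype(int)
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = feat[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out

# An 8x8 region cropped from a feature map, pooled to a fixed 2x2 output.
roi = np.arange(64, dtype=float).reshape(8, 8)
print(roi_max_pool(roi, 2, 2))
```

Whatever the input region's size, the output is always out_h x out_w, which is what lets a fully connected head follow.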

20
Q

What is the main difference between Fast R-CNN and Faster R-CNN?

A

Faster R-CNN uses a proposal CNN instead of selective search. The proposal and detection networks share feature maps.

21
Q

How does the Region Proposal Network work in Faster R-CNN?

A

It slides a 3x3 window over the convolutional feature map and, at each anchor position, proposes a number of bounding boxes with different scales and aspect ratios.

22
Q

Name some advantages/disadvantages of K-means clustering for segmentation

A

+ Fast and easy to implement

- No semantics and supervised methods often perform better

23
Q

How can we use K-means clustering for image segmentation?

A

1) On the histogram of the pixel intensities
2) On colour similarity
3) On position and colour similarity
4) Others…
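Option 1), clustering the raw intensities, in a minimal numpy sketch; the deterministic initialization and the synthetic two-intensity image are illustrative simplifications of real K-means:

```python
import numpy as np

def kmeans_1d(values, k=2, iters=10):
    """Plain Lloyd's algorithm on scalar values (e.g. pixel intensities)."""
    centers = np.linspace(values.min(), values.max(), k)  # simple deterministic init
    for _ in range(iters):
        # Assign each value to its nearest center, then recompute the means.
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = values[labels == c].mean()
    return labels, centers

# Synthetic image: a dark background with a bright square.
img = np.full((16, 16), 0.1)
img[4:12, 4:12] = 0.9
labels, centers = kmeans_1d(img.ravel(), k=2)
seg = labels.reshape(img.shape)   # the segmentation: one cluster label per pixel
print(sorted(np.round(centers, 2)))
```

As the answer above notes, the labels carry no semantics: cluster 0 is just "dark pixels", not "background".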

24
Q

How can we use a classification convo net for segmentation?

A

Change the fully connected layers at the end into convolutional layers. This gives a heat map of class probabilities. Then use transposed convolution layers to upsample this heat map.

25
Q

What are the basic assumptions behind optical flow?

A

Brightness constancy and small displacement

26
Q

What do we know about I(x+u, y+v, t+1) given displacement (u,v) and the brightness constancy assumption?

A

It is the same as I(x,y,t)

27
Q

What is the optical flow constraint?

A

Ix*u + Iy*v + It = 0, where Ix, Iy and It are the partial derivatives of I with respect to x, y and t.

28
Q

What assumption does the Lucas-Kanade method make?

A

Flow, or displacement, is constant in a local neighbourhood
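Under that assumption, each pixel in the neighbourhood contributes one optical flow constraint Ix*u + Iy*v = -It, and the overdetermined system is solved by least squares. A toy numpy sketch, treating the whole synthetic image as a single window (the shift and test function are made up):

```python
import numpy as np

def lucas_kanade_window(I0, I1):
    """Estimate one (u, v) for the whole window by least squares."""
    Iy, Ix = np.gradient(I0)                         # spatial derivatives
    It = I1 - I0                                     # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)   # one constraint row per pixel
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic pair: the second frame is the first shifted by (u, v) = (0.3, 0.2).
y, x = np.mgrid[0:32, 0:32].astype(float)
f = lambda x, y: np.sin(x / 3.0) + np.cos(y / 4.0)
I0 = f(x, y)
I1 = f(x - 0.3, y - 0.2)   # brightness constancy: I1(x, y) = I0(x - u, y - v)
print(lucas_kanade_window(I0, I1))
```

The estimate is only approximate because the constraint comes from a first-order Taylor expansion, which is also why the displacement must be small.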

29
Q

What is the condition number of a 2-d matrix?

A

abs(lambda_max) / abs(lambda_min), the ratio between the largest and smallest eigenvalue magnitudes

30
Q

How can the condition number be interpreted?

A

A large condition number makes the matrix inverse sensitive to noise and to small changes in the values.
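This sensitivity is easy to demonstrate with an ill-conditioned 2x2 matrix; the matrix and perturbation below are arbitrary illustrative values:

```python
import numpy as np

# Condition number as the ratio of eigenvalue magnitudes (symmetric matrix).
M = np.array([[1.0, 0.0],
              [0.0, 1e-4]])
lams = np.linalg.eigvalsh(M)
cond = np.abs(lams).max() / np.abs(lams).min()
print(cond)   # large: M is ill-conditioned

b = np.array([1.0, 0.0])
x = np.linalg.solve(M, b)
x_noisy = np.linalg.solve(M, b + np.array([0.0, 1e-3]))  # tiny change in b...
print(np.linalg.norm(x_noisy - x))                       # ...large change in x
```

In the Lucas-Kanade context, a large condition number of the gradient matrix signals the aperture problem: the flow estimate is unreliable.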

31
Q

What is the idea behind the Horn-Schunck method?

A

Define a global energy function:

E = Integral_(x,y) (Ix*u + Iy*v + It)^2 +
alpha*(abs(gradient(u))^2 + abs(gradient(v))^2) dxdy

and minimize the energy with respect to the flow fields u(x,y), v(x,y).

32
Q

What can we do if the small displacement condition does not hold?

A

Use spatial pyramids of downsampled images. Estimate the motion on the coarsest (smallest) level first, then iteratively refine the estimate on the larger versions.

33
Q

Derive the optical flow constraint

A

Constant brightness: I(x+u, y+v, t+1) = I(x,y,t)
Taylor expansion: I(x+u, y+v, t+1) ≈ I(x,y,t) + Ix*u + Iy*v + It
Combining the two: Ix*u + Iy*v + It = 0

34
Q

What is the:

1) formula for calculating the output size of a convolutional layer
2) formula for calculating the effective kernel size of a dilated convolution

A

1) Xout = (Xin + 2p - k) / s + 1, where p is the padding, k the kernel size and s the stride

2) k_eff = k + (k-1)(d-1), where d is the dilation factor
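Both formulas in code form, with a couple of common sanity checks (the example layer sizes are illustrative):

```python
def conv_output_size(x_in, k, p, s):
    """Output size of a convolutional layer: (Xin + 2p - k) / s + 1."""
    return (x_in + 2 * p - k) // s + 1

def effective_kernel_size(k, d):
    """Effective kernel size of a dilated convolution: k + (k-1)(d-1)."""
    return k + (k - 1) * (d - 1)

# A 7x7 stride-2 convolution with padding 3 halves a 224-pixel input.
print(conv_output_size(224, k=7, p=3, s=2))   # 112
# A 3x3 kernel with dilation 2 covers the same extent as a 5x5 kernel.
print(effective_kernel_size(3, 2))            # 5
```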