4. Image Processing Flashcards
What is image processing / filtering? Why is it done?
It is converting one image into a different one according to some algorithm
- Reduce noise (smartphones always have noise)
- Fill in missing information (e.g. demosaicing the Bayer grid)
- Extract image features (edges, corners)
What are images mathematically?
It is a function f: [a, b] × [c, d] → [0, m]
where [a, b] and [c, d] are the dimensions of the image (start and end of the rows/columns) and m is the maximum intensity value (e.g. 0-255 in grayscale).
For color images the range is [0, m]^3 (one value each for r, g, b).
What are the properties of linear filters?
- Homogeneity: amplifying the image first and then applying the filter gives the same result as filtering first and then amplifying: F(a·I) = a·F(I)
- Additivity: applying the filter to the sum of two images is the same as filtering each image separately and then summing the results: F(I1 + I2) = F(I1) + F(I2)
- Superposition: the combination of the two: F(a·I1 + b·I2) = a·F(I1) + b·F(I2)
How can linear filters be done using vector-matrix operations?
We first flatten the image matrix into one long vector of all pixel values (e.g. 2,000,000 entries) and then multiply it with a 2,000,000 × 2,000,000 filter matrix.
This is very expensive for a simple linear filter.
Problem: a general matrix is not necessarily shift-invariant: shifting the image by a few pixels and then filtering can give a different result than filtering first and then shifting.
Convolution is better
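A minimal sketch (NumPy, using a short 1D signal for readability) of how the same linear filter can be written either as a convolution or as a multiplication with a large, mostly-zero matrix whose rows are shifted copies of the kernel; the signal and kernel values are arbitrary example choices:

```python
import numpy as np

signal = np.array([1., 4., 2., 7., 3.])
kernel = np.array([0.25, 0.5, 0.25])   # small symmetric smoothing filter

# Build the equivalent filter matrix: row i holds the kernel centered on sample i.
# (Boundary rows are simply truncated, which corresponds to zero padding.)
n, k = len(signal), len(kernel)
M = np.zeros((n, n))
for i in range(n):
    for j, w in enumerate(kernel):
        col = i + j - k // 2
        if 0 <= col < n:
            M[i, col] = w

# Matrix-vector product and direct convolution give the same result
# (the kernel is symmetric here, so flipping it would change nothing).
print(M @ signal)
print(np.convolve(signal, kernel, mode="same"))
```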
What is convolution? How is it performed?
It is a way to perform a linear filter on the image.
We have a filter (kernel). We slide a window over the whole image and multiply each image value with the kernel value at the OPPOSITE (mirrored) position, then sum the products.
f*g
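A minimal NumPy sketch of the sliding-window idea, assuming zero padding at the borders; the kernel flip is exactly what distinguishes convolution from correlation:

```python
import numpy as np

def convolve2d_naive(image, kernel):
    """Direct 2D convolution with zero padding (illustrative, not optimized)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]                    # mirror the kernel in both axes
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))    # zero padding by default
    out = np.zeros_like(image, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            window = padded[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * flipped)    # elementwise multiply, then sum
    return out

# Example: 3x3 box blur of a random image
image = np.random.rand(5, 5)
box = np.ones((3, 3)) / 9.0
print(convolve2d_naive(image, box))
```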
What are the properties of convolution?
- can be represented as matrix-vector product
- linear
- associative
- commutative
- shift-invariant: shifting the image and then convolving gives the same result as convolving and then shifting
What is correlation?
It is similar to convolution but we do not mirror the filter (we multiply the pixels with the kernel on the EXACT position)
What are the properties of noise-pixel? What could be the cause of those noise-pixels?
A noise pixel is an outlier compared to its neighbours (it has a much lower or higher intensity).
- Light fluctuations (more photons hit one sensor cell than another), sensor noise (fluctuations when reading out the voltage levels), quantization effects (a continuous scene is quantized onto a finite grid, with a finite (integer) intensity)…
How to deal with noise using linear filters?
- Average filter: we use a 3x3 filter with all values 1/9 (they sum to 1), so the resulting pixel intensity is the average of its 3x3 neighbourhood. This is also called a box filter.
- Gaussian filter: weighted average -> weights nearby pixels more than distant ones
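A short sketch of both filters, assuming scipy.ndimage is available; the kernel size and sigma are arbitrary example values:

```python
import numpy as np
from scipy import ndimage

noisy = np.random.rand(100, 100)

# Box (average) filter: every pixel becomes the mean of its 3x3 neighbourhood.
box_smoothed = ndimage.uniform_filter(noisy, size=3)

# Gaussian filter: weighted average, nearby pixels count more than distant ones.
gauss_smoothed = ndimage.gaussian_filter(noisy, sigma=1.0)
```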
What is a box filter?
It is a linear filter that calculates the pixel intensity as the average of its neighbourhood. It tends to produce box-shaped artifacts.
What is a gaussian filter?
Essentially it is a linear filter that weights nearby pixels more than distant ones, according to the gaussian distribution. The 2D kernel is built by combining one gaussian along the x-axis with one gaussian along the y-axis.
This removes the box artifacts of the box filter and gives smoother results.
How to make box filter and gaussian filter more efficient?
Since they are separable (can be represented as a convolution of two 1D filters): the gaussian as a horizontal followed by a vertical 1D gaussian, and the box filter as a horizontal followed by a vertical 1D average.
Think of it as passing over the image twice: once along all rows and once along all columns.
This way, instead of 9 multiplications per pixel (3x3 kernel), we only need 6 (3 + 3), as the sketch below demonstrates.
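A small NumPy/SciPy sketch, using a simple 3-tap gaussian approximation as the example kernel, showing that two 1D passes give the same result as one 2D pass with the outer-product kernel:

```python
import numpy as np
from scipy import ndimage

image = np.random.rand(64, 64)

# 1D gaussian (a crude 3-tap approximation) and its 2D outer product
g1 = np.array([0.25, 0.5, 0.25])
g2 = np.outer(g1, g1)                      # full 3x3 gaussian kernel (9 weights)

# One 2D pass ...
full_pass = ndimage.convolve(image, g2, mode="constant")

# ... versus two 1D passes: once along the rows, once along the columns
horiz = ndimage.convolve(image, g1[np.newaxis, :], mode="constant")
separable = ndimage.convolve(horiz, g1[:, np.newaxis], mode="constant")

print(np.allclose(full_pass, separable))   # True: 6 multiplications per pixel instead of 9
```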
How to deal with boundaries in convolution?
Since convolution reduces the image size (a 3x3 filter shrinks it by 2 pixels in each dimension), we need to add some padding to the original image before the convolution. How to choose the padding?
- Leave them as 0 (black pixels): this introduces dark edges in the blurred image
- Wrap: pretend the image is periodic and continues on the other side: pixel n+1 takes the value of pixel 1, n+2 that of pixel 2, …
- Clamp: just repeat the last pixel as far as needed
- Mirror: reflect the image at the border: pixel n+1 takes the value of pixel n-1, n+2 that of pixel n-2, …
The wrap method has a higher chance of introducing artifacts, but the choice depends on the use case and the images we have.
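As an illustration, NumPy's np.pad implements these strategies directly; a 1D row makes the behaviour easy to read:

```python
import numpy as np

row = np.array([1, 2, 3, 4])

print(np.pad(row, 2, mode="constant"))   # [0 0 1 2 3 4 0 0]  zeros -> dark borders
print(np.pad(row, 2, mode="wrap"))       # [3 4 1 2 3 4 1 2]  image repeats around
print(np.pad(row, 2, mode="edge"))       # [1 1 1 2 3 4 4 4]  clamp: repeat last pixel
print(np.pad(row, 2, mode="reflect"))    # [3 2 1 2 3 4 3 2]  mirror at the border
```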
How to remove noise but while preserving edges?
Using a non-linear filter called the median filter. It is similar to the average filter, but instead of the mean we take the median of the neighbourhood (sort the values and take the middle one). This preserves the sharp jump at an edge while still smoothing the image somewhat.
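A minimal sketch with SciPy's median filter, assuming a single bright outlier pixel as the noise:

```python
import numpy as np
from scipy import ndimage

image = np.random.rand(50, 50)
image[25, 25] = 10.0                     # a single "salt" outlier pixel

# Median of each 3x3 neighbourhood: the outlier is replaced, edges stay sharp
denoised = ndimage.median_filter(image, size=3)
```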
What are morphological filters?
They are filters that usually apply only to binary images. There are two types:
- Dilation: if there is at least one 1 in the image part under the filter, the result is 1. This expands the foreground.
- Erosion: Results in 1 only if all image pixels of the kernel are 1. If there is at least one 0, it results in 0.
This can be generalized for grayscale images
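A small sketch of both operations on a binary image, using SciPy's morphology functions with an assumed 3x3 structuring element:

```python
import numpy as np
from scipy import ndimage

binary = np.zeros((7, 7), dtype=bool)
binary[3, 3] = True                      # a single foreground pixel

# Dilation: 1 if ANY pixel under the 3x3 structuring element is 1 -> foreground grows
dilated = ndimage.binary_dilation(binary, structure=np.ones((3, 3)))

# Erosion: 1 only if ALL pixels under the structuring element are 1 -> foreground shrinks
eroded = ndimage.binary_erosion(dilated, structure=np.ones((3, 3)))
```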
What are image pyramids? What is the general process?
It is representing one image at multiple scales (resolutions). It creates several lower-resolution versions of the image so that there is a smaller search space in which to find an object. Once the object is found in a small image, this information is propagated to the larger images and the object is pinpointed in the original image.
What are gaussian pyramids?
It is a pyramid where first we apply the gaussian filter and then do downsampling (take every 2nd pixel both row-wise and column-wise)
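A minimal sketch of the blur-then-subsample loop, assuming SciPy's gaussian filter and an arbitrary sigma of 1.0:

```python
import numpy as np
from scipy import ndimage

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Blur, then keep every 2nd pixel in each direction, repeatedly."""
    pyramid = [image]
    for _ in range(levels - 1):
        blurred = ndimage.gaussian_filter(pyramid[-1], sigma=sigma)
        pyramid.append(blurred[::2, ::2])          # downsample by a factor of 2
    return pyramid

levels = gaussian_pyramid(np.random.rand(256, 256))
print([l.shape for l in levels])                   # (256,256), (128,128), (64,64), (32,32)
```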
Why do we have to apply gaussian filter before downsampling in gaussian pyramid?
Because high frequencies (sharp edges, sharp color transitions) cannot be represented at the lower resolution anymore.
Without the smoothing we experience aliasing: small structures might appear as larger ones in the downsampled image.
What is aliasing in the context of pyramids?
It appears when we do not apply any gaussian filtering before downsampling in the gaussian pyramid. It can cause small objects to appear larger in the low-resolution (downsampled) images, and it can also introduce patterns that misrepresent the image.
What is edge detection and why do we need it? What are edges?
It is finding the edges of objects in the image. Edges are basically the lines that outline an object. We need it because humans can recognize objects from their drawings/edges alone, so we can assume that finding objects via their edges is also easier in image processing.
Edges are fast changes of intensity of pixels in the image (big color contrast).
What are the goals of edge detection?
- Good detection: the result corresponds to the edge of the object, not some noise
- Good localization: Edge is near the true edge of the object
- Single response: One line per edge
How to detect an edge in 1D? A line of the image?
The idea is that the edges correspond to fast changes (derivative is large)
- Apply gaussian smoothing
- Calculate the derivative of this curve
- Find local optima of the derivative that are above some preset threshold (ignore small changes)
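A rough sketch of these three steps on a synthetic 1D step edge; sigma, noise level, and threshold are arbitrary example values:

```python
import numpy as np
from scipy import ndimage

line = np.concatenate([np.full(20, 10.0), np.full(20, 200.0)])    # a step edge at index 20
line += np.random.normal(0, 2, line.shape)                         # some noise

smoothed = ndimage.gaussian_filter1d(line, sigma=2)                # 1. gaussian smoothing
deriv = np.convolve(smoothed, [0.5, 0, -0.5], mode="same")         # 2. central-difference derivative
edges = np.flatnonzero(np.abs(deriv) > 20)                         # 3. keep only strong responses
print(edges)   # indices around the step; a full detector would also keep only local maxima
```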
How to compute derivative of the 1D image-lines for edge detection?
- We can compute the 1st derivative and find its local extrema. This can be implemented as the linear filter 1 -1, but the derivative is then computed in between the pixels (the kernel has an even number of cells)
- It can also be implemented using the filter 1/2 * 1 0 -1, which is better because we do not have to shift the image: the derivative is computed at the pixel centers
- We can calculate the 2nd derivative and find zero crossings (not just where the derivative is 0, but where it actually crosses the y = 0 line)
How can we simplify edge detection when having a gaussian smoothing and an edge-detection filtering?
By first convolving the edge-detection filter with the gaussian filter, and then applying the result to the image. This reduces the computation. It works because of the associativity of convolution.
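A small 1D sketch of this associativity: smoothing and then differentiating equals one convolution with a precomputed "derivative of gaussian" kernel (the kernels below are arbitrary example values):

```python
import numpy as np

signal = np.random.rand(100)
gauss = np.array([0.05, 0.25, 0.4, 0.25, 0.05])    # small smoothing kernel
deriv = np.array([0.5, 0.0, -0.5])                  # central-difference kernel

# (signal * gauss) * deriv  ==  signal * (gauss * deriv)  thanks to associativity
two_passes = np.convolve(np.convolve(signal, gauss), deriv)
one_pass = np.convolve(signal, np.convolve(gauss, deriv))   # "derivative of gaussian" filter
print(np.allclose(two_passes, one_pass))                    # True
```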
Why the 1st derivative might be problematic in the real-world images?
Because small variations in pixel intensity already lead to large derivatives. Problem: where do we set the threshold on the 1st derivative?
How to detect either vertical or horizontal edges in 2D images?
Similarly to 1D: we compute the partial derivatives in both the x and y directions. They are approximated using linear filters:
x-direction:
1 0 -1
1/6 * 1 0 -1
1 0 -1
y-direction:
1 1 1
1/6 * 0 0 0
-1 -1 -1
This combines edge detection with smoothing (a box filter perpendicular to the derivative direction), but the box filter has its usual problem: it introduces box artifacts (poor smoothing).
We can instead use the Sobel filter (gaussian-like smoothing combined with edge detection). The idea of the gaussian: closer pixels have more influence than ones further away (see the sketch after the kernels below).
x-direction:
1 0 -1
1/8 * 2 0 -2
1 0 -1
y-direction:
1 2 1
1/8 * 0 0 0
-1 -2 -1
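A minimal sketch applying both Sobel kernels and combining them into the gradient magnitude (scipy.ndimage assumed; this anticipates steps 2 and 3 of the Canny process below):

```python
import numpy as np
from scipy import ndimage

image = np.random.rand(128, 128)

sobel_x = np.array([[1, 0, -1],
                    [2, 0, -2],
                    [1, 0, -1]]) / 8.0
sobel_y = np.array([[ 1,  2,  1],
                    [ 0,  0,  0],
                    [-1, -2, -1]]) / 8.0

Ix = ndimage.convolve(image, sobel_x)      # responds to vertical edges
Iy = ndimage.convolve(image, sobel_y)      # responds to horizontal edges
magnitude = np.hypot(Ix, Iy)               # gradient magnitude = combined edge strength
```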
What filter was used in this image?
x-direction 2D edge detection, because it responds to differences in the x direction (it fires where white is on the left and black on the right, and vice versa). That means we get the vertical edges, while the horizontal ones are detected poorly.
How can we detect edges in 2D images? Explain the whole Canny process
- We have an image and we apply a gaussian filter to reduce the noise.
- Compute the partial derivatives of the image (x and y directions) using Sobel filters. The Sobel filters also include gaussian-like smoothing. This results in two images: horizontal and vertical edge responses (Ix and Iy).
- Calculate the gradient magnitude sqrt(Ix^2 + Iy^2), combining Ix and Iy from step 2 into one image of edge strength.
- Apply some thresholding to keep only high gradient magnitudes. Consider hysteresis so that edges stay connected.
- Thinning: perform non-maximum suppression. This improves the localization: only pixels that are local maxima along the gradient direction are kept, which makes the edges thin.
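As an end-to-end sketch: if scikit-image is available, skimage.feature.canny wraps all of these steps in one call; the sigma and thresholds below are arbitrary example values:

```python
import numpy as np
from skimage import feature

image = np.random.rand(128, 128)

# Gaussian smoothing, Sobel gradients, non-maximum suppression and hysteresis
# thresholding are all performed inside this single call.
edges = feature.canny(image, sigma=1.5, low_threshold=0.1, high_threshold=0.3)
```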
What are Sobel filters?
They are convolution filters that approximate the partial derivatives. They also perform gaussian-like smoothing perpendicular to the derivative direction. They are used in edge detection; the x-direction filter detects vertical edges.
What is hysteresis?
When we do edge detection with a single threshold, the detected edges often look like dashed lines: they stop and continue. This is due to a high threshold, but if we lower the threshold, we might introduce fake edges / noise. To fix this, we use hysteresis with two thresholds: 1. a high threshold that should detect true edges only; 2. a slightly lower one, and pixels that pass the second but not the first are kept only if a neighbouring pixel passed the 1st threshold.
Do this iteratively
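If scikit-image is available, this double-threshold rule is exposed directly; the input below is a stand-in for a real gradient-magnitude image and the thresholds are arbitrary example values:

```python
import numpy as np
from skimage import filters

gradient_magnitude = np.random.rand(128, 128)      # stand-in for the Sobel magnitude image

# Pixels above 0.6 count as edges; pixels above 0.2 are kept only if they are
# connected to a pixel that passed the high threshold.
edges = filters.apply_hysteresis_threshold(gradient_magnitude, low=0.2, high=0.6)
```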
What is non-maximum suppression?
When we have computed the magnitudes of the gradient and we performed some thresholding (optionally), multiple pixels along the edge can pass the threshold (thick edge). To make this edge thin (perform better localization), we do the non-maximum suppression.
It basically checks whether a pixel is a local maximum along the gradient direction. For each pixel, the gradient direction is determined. Two points p and r are taken along this direction where it intersects the next row or column, and we check if the selected pixel is larger than both of them. If it is, we keep it; if not, we remove it. p and r are approximated using linear interpolation (averaging the two pixels each of them falls between).
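A simplified sketch of this check: instead of interpolating p and r, the gradient direction is snapped to the nearest neighbouring pixel, which is a common shortcut:

```python
import numpy as np

def non_maximum_suppression(magnitude, angle):
    """Keep a pixel only if it is the largest along its gradient direction.
    Simplification: the direction is quantized to the nearest neighbour pixel
    instead of interpolating between two pixels as described above."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    angle = np.rad2deg(angle) % 180                       # gradient orientation in [0, 180)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = angle[y, x]
            if a < 22.5 or a >= 157.5:                    # gradient points left/right
                p, r = magnitude[y, x - 1], magnitude[y, x + 1]
            elif a < 67.5:                                # one diagonal
                p, r = magnitude[y - 1, x + 1], magnitude[y + 1, x - 1]
            elif a < 112.5:                               # gradient points up/down
                p, r = magnitude[y - 1, x], magnitude[y + 1, x]
            else:                                         # the other diagonal
                p, r = magnitude[y - 1, x - 1], magnitude[y + 1, x + 1]
            if magnitude[y, x] >= p and magnitude[y, x] >= r:
                out[y, x] = magnitude[y, x]               # local maximum: keep it
    return out
```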
How does a 1st derivative linear filter look like and how does the 2nd in 1D?
1 -1
1 -2 1
How to do edge detection using second derivative? How about smoothing?
Idea is to find zero crossings of the second derivative. The second derivative can be approximated using linear filter 1 -2 1. In 2D, this is
0 1 0
1 -4 1
0 1 0
or
1 1 1
1 -8 1
1 1 1
To perform second derivative with gaussian smoothing, we use Laplacian of Gaussian (LoG) which is just a convolution of gaussian filter and laplacian filter.
The LoG in 2D has a Mexican-hat shape and can be approximated using the DoG (difference of gaussians).
How can LoG be approximated?
The Laplacian of Gaussian can be approximated using the DoG (difference of gaussians). Usually one sigma is 1.6 times the other (sigma1 = 1.6 · sigma2), and subtracting the two gaussian functions gives an approximation of the LoG.
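A small SciPy sketch comparing the direct LoG with its DoG approximation; the sigma values are example choices, and the two results agree only up to a sign/scale convention:

```python
import numpy as np
from scipy import ndimage

image = np.random.rand(128, 128)
sigma = 1.0

# Laplacian of Gaussian directly ...
log = ndimage.gaussian_laplace(image, sigma=sigma)

# ... and its difference-of-gaussians approximation with the second sigma = 1.6 * the first
dog = (ndimage.gaussian_filter(image, sigma=sigma)
       - ndimage.gaussian_filter(image, sigma=1.6 * sigma))
```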
Explain the Laplacian pyramid.
- We compute the Gaussian pyramid. Then we expand each layer Gi+1 of the Gaussian pyramid (each pixel is copied 4 times), so that we have a high-resolution version of the lower-resolution information. To get the Laplacian pyramid, we subtract the expanded Gi+1 layer from the Gaussian layer Gi; this gives the layer Li of the Laplacian pyramid.
This represents the information that is lost when going from Gi to Gi+1. At the lower levels of the pyramid, high-frequency information is lost (and captured in the Laplacian pyramid); as we move upwards, lower and lower frequencies are captured. We can reconstruct the original image from the Laplacian pyramid, since the top-most level is a copy: Gn = Ln, and the process can be reversed.
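A minimal sketch of building and inverting the pyramid, assuming the pixel-copy expansion described above and SciPy's gaussian filter:

```python
import numpy as np
from scipy import ndimage

def expand(image):
    """Pixel-copy upsampling: each pixel is repeated 4 times (2x along each axis)."""
    return np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)

def laplacian_pyramid(image, levels=4, sigma=1.0):
    gaussians = [image]
    for _ in range(levels - 1):
        blurred = ndimage.gaussian_filter(gaussians[-1], sigma=sigma)
        gaussians.append(blurred[::2, ::2])
    laplacians = [gaussians[i] - expand(gaussians[i + 1]) for i in range(levels - 1)]
    laplacians.append(gaussians[-1])                  # top level: Ln = Gn
    return laplacians

def reconstruct(laplacians):
    image = laplacians[-1]
    for lap in reversed(laplacians[:-1]):
        image = lap + expand(image)                   # reverse the subtraction step
    return image

original = np.random.rand(64, 64)
print(np.allclose(reconstruct(laplacian_pyramid(original)), original))   # True
```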
What is the application of the Laplacian pyramid?
Image sharpening. For example, if we want to make the edges crisper, we can multiply some of the laplacian layers with some constant (like 1.1) and reconstruct the image. This should make those details more visible.