Computer Vision Flashcards

Question 1

Q

Image derivatives

Answer

A

An image derivative measures how quickly pixel values change from one pixel to the next. The rate of change of a function is defined by differentiation from first principles (AS Pure maths): f'(x) = lim h->0 (f(x+h) - f(x)) / h. Images are discrete, so the limit is approximated by differences between neighbouring pixels, which are computed by applying a derivative kernel/mask to the image.
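
As a rough illustration of the idea above, the sketch below applies a central difference to a single made-up row of pixel values. The array contents and the choice of a central difference (equivalent to the mask [-1, 0, 1] / 2) are assumptions for illustration only.

```python
import numpy as np

# Hypothetical single row of an image: dark on the left, bright on the right.
row = np.array([10, 10, 10, 50, 90, 90, 90], dtype=float)

# Central difference: (f(x+1) - f(x-1)) / 2, the discrete analogue of the
# first-principles limit with a step of one pixel.
# This is the same as sliding the derivative mask [-1, 0, 1] / 2 along the row.
derivative = (row[2:] - row[:-2]) / 2.0

print(derivative)  # largest values sit where the brightness changes fastest
```

The output has two fewer entries than the input because the first and last pixels have no neighbour on one side.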

Question 2

Q

Kernels

Answer

A

The terms mask, kernel and filter are used interchangeably. A kernel is a small matrix of numbers (usually square) used to compute various properties or characteristics of an image, e.g. for edge detection or image blurring.

Question 3

Q

Convolution

Answer

A

In the context of image processing, convolution is the sum of the products of the corresponding elements of a kernel matrix and an image patch of the same size. Repeating this for every pixel in the original image gives the image convolution.
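
A minimal sketch of that definition, written with plain NumPy loops so each step is visible. The function name convolve2d, the lack of padding ("valid" mode) and the box-blur kernel are assumptions for illustration; library routines are much faster.

```python
import numpy as np

def convolve2d(image, kernel):
    """Valid-mode 2D convolution: no padding, so the output is smaller."""
    # True convolution flips the kernel; for symmetric kernels this is a no-op.
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for y in range(out_h):
        for x in range(out_w):
            patch = image[y:y + kh, x:x + kw]
            # Sum of the products of corresponding elements.
            out[y, x] = np.sum(patch * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
box = np.ones((3, 3)) / 9.0                       # 3x3 mean (box blur) kernel
result = convolve2d(image, box)
print(result)
```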

Question 4

Q

Gaussian blur

Answer

A

One of the most widely used filters in image processing. It is based on the bell curve of the Gaussian distribution. When the kernel follows a Gaussian distribution, the centre pixel (the one being modified) gets the highest weight during convolution, and the weight decreases for pixels further away. The Gaussian distribution formula is a continuous function whereas images are discrete. Hence, we discretize the values from the Gaussian distribution before making a kernel matrix out of it.

from PIL import Image, ImageFilter

# Open the image and apply a Gaussian blur with a radius of 5 pixels
img = Image.open("image.png")
blur_img = img.filter(ImageFilter.GaussianBlur(5))
blur_img.show()

_____

from skimage import filters, io

# Same blur with scikit-image; sigma is the standard deviation of the Gaussian
img = io.imread("image.png")
out = filters.gaussian(img, sigma=5)
io.imshow(out)
io.show()
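
The discretization step described in this card can be sketched as follows. The function name gaussian_kernel, the 5x5 size and sigma = 1.0 are assumptions chosen for illustration; the libraries above build their kernels internally and may differ in detail.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Sample a 2D Gaussian on a size x size grid (size assumed odd)."""
    half = size // 2
    coords = np.arange(-half, half + 1)   # e.g. [-2, -1, 0, 1, 2]
    xx, yy = np.meshgrid(coords, coords)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    # Normalise so the weights sum to 1 and overall brightness is preserved.
    return kernel / kernel.sum()

kernel = gaussian_kernel(size=5, sigma=1.0)
print(np.round(kernel, 3))  # largest weight in the centre, falling off outwards
```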

Question 5

Q

Morphological operations

Answer

A

Operations that use the inherent structure or features of an image and process the image while maintaining its overall structure. Examples: erosion and dilation.

Question 6

Q

Erosion

Answer

A

Removes pixels from the boundaries of objects in the image. Applying erosion makes the objects in the image shrink while maintaining their overall structure and shape.

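from skimage import io, morphology

# binary_erosion expects a binary image, so load as grayscale and
# binarize first (the 0.5 cut-off is an assumption, adjust as needed)
img = io.imread('image.png', as_gray=True) > 0.5
eroded_img = morphology.binary_erosion(img)
io.imshow(eroded_img)
io.show()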

Question 7

Q

Dilation

Answer

A

The opposite of erosion: it expands the objects in the image. Useful for magnifying small details, and for filling unwanted gaps/holes in an image.

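from skimage import io, morphology

# binary_dilation expects a binary image, so load as grayscale and
# binarize first (the 0.5 cut-off is an assumption, adjust as needed)
img = io.imread('image.png', as_gray=True) > 0.5
dilated_img = morphology.binary_dilation(img)
io.imshow(dilated_img)
io.show()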
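
To make the shrinking and expanding behaviour of erosion and dilation concrete, here is a small NumPy-only sketch on a toy binary image. The helper names and the cross-shaped (4-neighbour) footprint are assumptions for illustration, and pixels outside the image are treated as background, which may differ from a library's border handling.

```python
import numpy as np

def _shifted_stack(binary):
    """Stack each pixel with its 4 neighbours (cross-shaped footprint)."""
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    return np.stack([
        padded[1:-1, 1:-1],  # centre
        padded[:-2, 1:-1],   # neighbour above
        padded[2:, 1:-1],    # neighbour below
        padded[1:-1, :-2],   # neighbour to the left
        padded[1:-1, 2:],    # neighbour to the right
    ])

def erode(binary):
    # A pixel survives only if it and all four neighbours are set.
    return _shifted_stack(binary).all(axis=0)

def dilate(binary):
    # A pixel is set if it or any of its four neighbours is set.
    return _shifted_stack(binary).any(axis=0)

square = np.zeros((5, 5), dtype=bool)
square[1:4, 1:4] = True          # 3x3 white square on a black background

print(erode(square).astype(int))   # the square shrinks
print(dilate(square).astype(int))  # the square grows
```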

Question 8

Q

Image thresholding

Answer

A

Thresholding in image processing means setting each pixel to either white or black according to a threshold value. If the pixel value is greater than the threshold, the pixel is set to white; otherwise it is set to black. Inverse thresholding flips the comparison from greater than to less than, and everything else remains the same.

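from skimage.filters import threshold_otsu
from skimage.io import imread
from skimage.color import rgb2gray

img = imread('image.png')
img = rgb2gray(img)
# Otsu's method picks the threshold automatically from the histogram
thresh_value = threshold_otsu(img)
# Boolean image: True (white) where the pixel is brighter than the threshold
thresh_img = img > thresh_value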
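
For comparison, a fixed threshold and its inverse can be sketched in plain NumPy. The pixel values and the threshold of 128 are made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 grayscale image with 8-bit values.
gray = np.array([[12, 200, 90],
                 [130, 40, 255]], dtype=np.uint8)
threshold = 128  # assumed fixed threshold

# Normal thresholding: brighter than the threshold -> white (255), else black (0).
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

# Inverse thresholding: the comparison is flipped to "less than".
binary_inv = np.where(gray < threshold, 255, 0).astype(np.uint8)

print(binary)
print(binary_inv)
```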

Question 9

Q

Image features

Answer

A

Image gradients and edges give us information about the shapes of the different objects in an image. But these are weak features and cannot always be relied upon: they are very sensitive to variations in brightness, contrast and background. More stable features are needed, such as corners, Local Binary Patterns (LBP), BRISK and ORB (Oriented FAST and Rotated BRIEF).

A good feature descriptor should be invariant to changes in scale, rotation and translation, which makes it more robust to variations in the image.

With machine learning, neural networks can be used to extract features from an image.

Question 10

Q

Scaling with OpenCV

Answer

A

To resize an image, OpenCV provides the cv2.resize() function, which takes the image, the target dimensions as (width, height) and the interpolation algorithm as input.

Interpolation algorithms:

cv2.INTER_AREA - preferred for shrinking the image
cv2.INTER_CUBIC - preferred for zooming (slow)
cv2.INTER_LINEAR - preferred for zooming (faster than INTER_CUBIC); the default

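import cv2

img = cv2.imread('image.png')
r, c = img.shape[:2]  # rows (height), columns (width)
# cv2.resize expects the target size as (width, height)
new_img = cv2.resize(img, (2*c, 2*r), interpolation=cv2.INTER_CUBIC)
cv2.imwrite('resize_image.png', new_img)
cv2.imshow('resize', new_img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()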

Question 11

Q

Cropping the image with OpenCV

Answer

A

Done by slicing the image array. Slicing an array means taking the values that lie within a given range of indices. For an image, the row (y) range comes first, followed by the column (x) range.

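import cv2

img = cv2.imread('image.png')
# Rows (y) come first, then columns (x): rows 0-199, columns 150-349
img_crop = img[0:200, 150:350]
cv2.imwrite('crop_img.jpg', img_crop)
cv2.imshow('crop', img_crop)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()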
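
Because the row range comes before the column range, it can help to wrap the slice in a small helper that takes x, y, width and height. The helper name crop and the blank 400x600 array are hypothetical; the array stands in for an image loaded with OpenCV.

```python
import numpy as np

def crop(img, x, y, width, height):
    """Return the width x height region whose top-left corner is (x, y)."""
    # Rows are indexed by y, columns by x.
    return img[y:y + height, x:x + width]

# Stand-in for a loaded colour image: 400 rows, 600 columns, 3 channels.
img = np.zeros((400, 600, 3), dtype=np.uint8)

patch = crop(img, x=150, y=0, width=200, height=200)  # same region as img[0:200, 150:350]
print(patch.shape)
```

A slice is a view into the original array, so call .copy() on the result if the crop must be modified independently of the source image.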

Question 12

Q

Template matching

Answer

A

Locates a template image inside a larger image. The cv2.matchTemplate() function slides the template over the image and computes a similarity score at each position. cv2.minMaxLoc() then returns the smallest and largest scores together with their locations. With cv2.TM_CCOEFF_NORMED the best match is the maximum, so max_loc gives the (x, y) coordinates of the top left corner of the matching rectangle. To find the bottom right coordinates, add the width and height of the template image to the top left coordinates.

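import cv2

# Load the image and the template, and convert both to grayscale
img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_temp = cv2.imread('template.jpg')
gray_temp = cv2.cvtColor(img_temp, cv2.COLOR_BGR2GRAY)

# shape is (rows, columns), so reverse it to get (width, height)
w, h = gray_temp.shape[::-1]
output = cv2.matchTemplate(gray, gray_temp, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(output)

# For TM_CCOEFF_NORMED the best match is at the maximum score
top = max_loc
bottom = (top[0] + w, top[1] + h)
cv2.rectangle(img, top, bottom, 255, 2)
cv2.imshow('image', img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.destroyAllWindows()
cv2.imwrite('img.jpg', img)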
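
The sliding comparison can also be sketched by hand. Note that this sketch swaps in a simpler score, the sum of squared differences, rather than the normalized correlation coefficient used above, so the best match is the minimum instead of the maximum. The function name and the random test image are assumptions for illustration.

```python
import numpy as np

def match_template_ssd(image, template):
    """Return (x, y) of the top-left corner with the smallest squared difference."""
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_xy = np.inf, (0, 0)
    # Slide the template over every position where it fits entirely.
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            score = np.sum((patch - template) ** 2)
            if score < best_score:
                best_score, best_xy = score, (x, y)
    return best_xy

rng = np.random.default_rng(0)
image = rng.random((40, 60))             # stand-in grayscale image
template = image[10:18, 25:37].copy()    # 8 rows x 12 columns cut from the image

top_left = match_template_ssd(image, template)
h, w = template.shape
# Bottom right corner = top left corner + (width, height) of the template.
bottom_right = (top_left[0] + w, top_left[1] + h)
print(top_left, bottom_right)
```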

Question 13

Q

Laplacian of Gaussian

Answer

A

Used to detect edges precisely in an image. The Laplacian of Gaussian (LoG) computes the second-order derivative of the image, which locates the edge and corner points that can serve as potential keypoints. Since the second-order derivative is extremely sensitive to noise, the image is first smoothed with a Gaussian blur, which stabilizes the derivative.
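
Blurring and taking the second derivative can be combined into a single kernel by sampling the LoG function directly. The sketch below assumes a 9x9 kernel with sigma = 1.4 and a function name log_kernel, all chosen for illustration; subtracting the mean so the weights sum to zero is a common practical adjustment, not part of the definition.

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Sample the Laplacian of Gaussian on a size x size grid (size assumed odd)."""
    half = size // 2
    coords = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(coords, coords)
    r2 = xx**2 + yy**2
    # LoG(x, y) = -1/(pi*sigma^4) * (1 - r^2/(2*sigma^2)) * exp(-r^2/(2*sigma^2))
    kernel = (-1.0 / (np.pi * sigma**4)) * (1.0 - r2 / (2.0 * sigma**2)) \
        * np.exp(-r2 / (2.0 * sigma**2))
    # Force the weights to sum to zero so flat regions give no response.
    return kernel - kernel.mean()

kernel = log_kernel()
print(np.round(kernel, 4))
```

Convolving an image with this kernel gives a response that changes sign across an edge, so edges are located at the zero crossings.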

(13 cards)