Computer Vision Flashcards
Image derivatives
An image derivative measures how quickly pixel values change across an image. It is the discrete counterpart of the rate of change defined by differentiating from first principles (AS Pure maths): the limit is replaced by the difference between neighbouring pixels. For images, the derivative is calculated by applying a derivative kernel (mask) to the image.
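A minimal sketch of the idea, assuming NumPy and a made-up row of grayscale values. The central difference used here corresponds to the mask [-1, 0, 1] scaled by 1/2, which is one common choice and not the only one.

```python
import numpy as np

# Hypothetical row of grayscale pixels with a sharp step from dark to bright.
row = np.array([10, 10, 10, 200, 200, 200], dtype=float)

def horizontal_derivative(values):
    """Approximate d/dx with the central difference (f(x+1) - f(x-1)) / 2."""
    deriv = np.zeros_like(values)
    # Interior pixels only; the two border pixels are left at 0.
    deriv[1:-1] = (values[2:] - values[:-2]) / 2.0
    return deriv

dx = horizontal_derivative(row)
# dx is 95 at the two pixels next to the step and 0 everywhere else.
print(dx)
```

The derivative is large only where the pixel value changes, which is why derivative masks are used to find edges.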
Kernels
The terms mask, kernel and filter are used interchangeably. A kernel is a small matrix of numbers, usually square, used to compute various properties or characteristics of an image, e.g. for edge detection or image blurring.
Convolution
In image processing, convolution places a kernel matrix over a region of the image matrix, multiplies the corresponding elements and sums the products to give one output pixel. Repeating this for every pixel in the original image gives the image convolution.
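The definition above can be sketched in plain NumPy. The function name convolve2d and the 4x4 test image are made up for illustration, and no padding is applied, so the output is smaller than the input. Strictly, convolution flips the kernel before sliding it; for symmetric kernels such as a blur this makes no difference.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    # Flipping the kernel is what separates convolution from cross-correlation.
    kernel = np.flipud(np.fliplr(kernel))
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + kh, x:x + kw]
            out[y, x] = np.sum(window * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)  # values 0..15
box = np.ones((3, 3)) / 9.0                       # 3x3 averaging kernel
result = convolve2d(image, box)
print(result)
```

Each output value here is the mean of a 3x3 neighbourhood, i.e. a simple box blur.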
Gaussian blur
One of the most used filters in image processing. It uses the bell curve of the Gaussian distribution. In a kernel that follows a Gaussian distribution, the centre pixel (the one being modified) gets the highest weight during convolution, and the weight decreases for pixels that are further away. The Gaussian distribution formula is a continuous function whereas images are discrete. Hence, we discretize (sample) the values from the Gaussian distribution before making a kernel matrix out of them.
from PIL import Image
from PIL import ImageFilter
img = Image.open("image.png")
# The argument to GaussianBlur is the blur radius; larger values blur more
blur_img = img.filter(ImageFilter.GaussianBlur(5))
blur_img.show()
_____
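from skimage import filters, io
img = io.imread("image.png")
# sigma is the standard deviation of the Gaussian; larger values blur more
out = filters.gaussian(img, sigma=5)
# io.imshow/io.show are deprecated in recent scikit-image releases; matplotlib is the usual replacement
io.imshow(out)
io.show()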
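To make the discretization step concrete, here is a small sketch that samples the Gaussian on an integer grid and normalizes it. The function name gaussian_kernel and its defaults are illustrative, and an odd kernel size is assumed so that there is a single centre pixel.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Sample a 2D Gaussian on a size x size integer grid (size must be odd)."""
    half = size // 2
    ax = np.arange(-half, half + 1)          # e.g. [-2, -1, 0, 1, 2]
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return kernel / kernel.sum()

k = gaussian_kernel(5, 1.0)
print(np.round(k, 3))
```

The centre entry is the largest and the weights fall off symmetrically with distance, as described above.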
Morphological operations
Operations that use the inherent structure or features of an image and process the image while maintaining its overall structure. Examples: erosion and dilation.
Erosion
Removes pixels from the boundaries of objects in the image. Applying erosion to an image makes the objects in the image shrink while maintaining the overall structure and shape of the image.
from skimage import morphology
from skimage import io
img = io.imread('image.png')
# binary_erosion treats every non-zero pixel as foreground, so a black/white image is expected
eroded_img = morphology.binary_erosion(img)
io.imshow(eroded_img)
io.show()
Dilation
The opposite of erosion: it expands the objects in the image. Useful for magnifying small details of the image, and for filling unwanted gaps/holes in an image.
from skimage import morphology
from skimage import io
img = io.imread('image.png')
# binary_dilation treats every non-zero pixel as foreground, so a black/white image is expected
dilated_img = morphology.binary_dilation(img)
io.imshow(dilated_img)
io.show()
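A NumPy-only sketch of what erosion and dilation do to a small binary image. The helper binary_morph and the 3x3 square neighbourhood are assumptions made for this illustration; library functions may use a different default neighbourhood.

```python
import numpy as np

def binary_morph(img, op):
    """Apply 3x3 erosion ("erode") or dilation ("dilate") to a boolean image.

    Erosion keeps a pixel only if all pixels in its 3x3 neighbourhood are set;
    dilation sets a pixel if any pixel in the neighbourhood is set.
    """
    # Pad with True for erosion and False for dilation so the padding
    # itself never removes or adds pixels.
    pad_value = (op == "erode")
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    h, w = img.shape
    # Nine shifted views of the image, one per neighbourhood offset.
    shifts = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    stacked = np.stack(shifts)
    return stacked.all(axis=0) if op == "erode" else stacked.any(axis=0)

img = np.zeros((7, 7), dtype=bool)
img[2:5, 2:5] = True                     # a 3x3 white square

eroded = binary_morph(img, "erode")      # shrinks to the single centre pixel
dilated = binary_morph(img, "dilate")    # grows to a 5x5 square
print(eroded.sum(), img.sum(), dilated.sum())  # 1 9 25
```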
Image thresholding
Thresholding in image processing means setting each pixel to either white or black according to a threshold value. If the pixel value is greater than the threshold value, the pixel is set to white; otherwise it is set to black. Inverse thresholding flips the comparison from greater than to less than; everything else remains the same.
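from skimage.filters import threshold_otsu
from skimage.io import imread
from skimage.color import rgb2gray
img = imread('image.png')
img = rgb2gray(img)
# Otsu's method picks the threshold automatically from the image histogram
thresh_value = threshold_otsu(img)
# Boolean mask: True (white) where the pixel is brighter than the threshold
thresh_img = img > thresh_value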
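The rule itself, including the inverse variant, can be sketched on a tiny made-up array. The fixed threshold of 127 is an assumption for illustration; the snippet above instead lets Otsu's method choose the value.

```python
import numpy as np

# Hypothetical 2x3 grayscale image with values in 0..255.
gray = np.array([[ 12, 200,  90],
                 [130,  45, 255]], dtype=np.uint8)
threshold = 127  # assumed fixed threshold

# Normal thresholding: white (255) where pixel > threshold, else black (0).
binary = np.where(gray > threshold, 255, 0).astype(np.uint8)

# Inverse thresholding: white where pixel < threshold, else black.
binary_inv = np.where(gray < threshold, 255, 0).astype(np.uint8)

print(binary)
print(binary_inv)
```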
Image features
Image gradients and edges give us information about the shapes of different objects in the image. But these are weak features and cannot be relied upon all the time: they are very sensitive to variations in brightness, contrast and background. More stable features are needed, from more sophisticated feature detectors and descriptors such as corners, Local Binary Pattern (LBP), BRISK, and ORB (Oriented FAST and Rotated BRIEF).
A good feature descriptor should be invariant to changes in scale, rotation and translation, which makes it more robust to variations in the image.
With machine learning, neural networks (NNs) can be used to extract features from an image.
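As one illustration of a more stable feature, here is a sketch of the basic 3x3 Local Binary Pattern: each pixel is encoded by which of its eight neighbours are at least as bright as it is. The function name lbp_3x3, the bit ordering and the test patch are choices made for this example; library implementations offer more variants.

```python
import numpy as np

def lbp_3x3(gray):
    """Basic 8-neighbour Local Binary Pattern code for each interior pixel."""
    g = gray.astype(int)
    center = g[1:-1, 1:-1]
    h, w = center.shape
    # Neighbour offsets, clockwise starting from the top-left pixel.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[dy:dy + h, dx:dx + w]
        # Set this bit where the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(int) << bit
    return codes

patch = np.array([[5, 9, 1],
                  [4, 6, 7],
                  [2, 3, 8]])
code = lbp_3x3(patch)
code_brighter = lbp_3x3(patch + 50)   # same patch, uniformly brighter
print(code, code_brighter)
```

Because only comparisons between neighbours are stored, adding a constant brightness to the whole patch leaves the code unchanged, unlike the raw pixel values or gradients of a contrast-scaled image.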
Scaling with OpenCV
To resize an image, OpenCV has a resize() function, which takes the image, the target dimensions as (width, height) and the interpolation algorithm as input.
Interpolation algorithms:
- cv2.INTER_AREA - preferred for shrinking the image
- cv2.INTER_CUBIC - preferred for zooming (slow)
- cv2.INTER_LINEAR - preferred for zooming; this is the default
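import cv2
img = cv2.imread('image.png')
r, c = img.shape[:2]  # rows (height), columns (width)
# cv2.resize expects the target size as (width, height)
new_img = cv2.resize(img, (2*c, 2*r), interpolation=cv2.INTER_CUBIC)
cv2.imwrite('resize_image.png', new_img)
cv2.imshow('resize', new_img)
cv2.waitKey(0)  # keep the window open until a key is pressed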
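To show what an interpolation algorithm has to do, here is a NumPy sketch of the simplest one, nearest-neighbour lookup. This is not one of the methods listed above (OpenCV exposes it as cv2.INTER_NEAREST); the function name resize_nearest is made up, and it takes width before height to mirror the argument order of cv2.resize.

```python
import numpy as np

def resize_nearest(img, new_w, new_h):
    """Resize by nearest-neighbour lookup; target size is (width, height)."""
    h, w = img.shape[:2]
    # Map every output row/column index back to a source row/column index.
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    # Broadcasting the two index arrays picks one source pixel per output pixel.
    return img[rows[:, None], cols]

img = np.array([[1, 2],
                [3, 4]])
big = resize_nearest(img, new_w=4, new_h=6)
print(big.shape)   # (6, 4): the array shape is (rows, columns)
print(big)
```

Linear and cubic interpolation follow the same mapping but blend several neighbouring source pixels instead of copying one, which gives smoother results when zooming.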
Cropping the image with OpenCV
Done by slicing the image array, i.e. taking the array values between particular row and column indices. Rows (y) come first, then columns (x): img[y_start:y_end, x_start:x_end].
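import cv2
img = cv2.imread('image.png')
# Keep rows 0..199 and columns 150..349
img_crop = img[0:200, 150:350]
cv2.imwrite('crop_img.jpg', img_crop)
cv2.imshow('crop', img_crop)
cv2.waitKey(0)  # keep the window open until a key is pressed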
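The same slicing can be tried without an image file. In this sketch a blank NumPy array stands in for a loaded image; the 400x600 size is an arbitrary assumption.

```python
import numpy as np

# Stand-in for a loaded image: 400 rows (height) x 600 columns (width) x 3 channels.
img = np.zeros((400, 600, 3), dtype=np.uint8)

# Rows (y) first, then columns (x).
crop = img[0:200, 150:350]
print(crop.shape)   # (200, 200, 3)

# Slicing returns a view onto the original array, so take a copy
# if the crop will be modified independently.
crop_copy = crop.copy()
crop_copy[:] = 255
print(img.max())    # 0: the original image is unchanged
```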
Template matching
Locates a template image within a larger image. The cv2.matchTemplate() function slides the template over the image and compares it with the image patch at each position, producing a map of match scores. cv2.minMaxLoc() then gives the location of the best match. With a method such as cv2.TM_CCOEFF_NORMED, where higher scores mean better matches, max_loc gives the coordinates of the top left corner of the rectangle. To find the bottom right coordinates, add the width and height of the template image to the top left coordinates.
import cv2
img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_temp = cv2.imread('template.jpg')
gray_temp = cv2.cvtColor(img_temp, cv2.COLOR_BGR2GRAY)
# shape is (rows, cols); reversing it gives (width, height)
w, h = gray_temp.shape[::-1]
output = cv2.matchTemplate(gray, gray_temp, cv2.TM_CCOEFF_NORMED)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(output)
# max_loc is the (x, y) top left corner of the best match
top = max_loc
bottom = (top[0] + w, top[1] + h)
cv2.rectangle(img, top, bottom, 255, 2)
cv2.imshow('image', img)
cv2.waitKey(0)  # keep the window open until a key is pressed
cv2.imwrite('img.jpg', img)
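What the sliding comparison does can be sketched in NumPy with a zero-mean normalized cross-correlation score, the idea behind cv2.TM_CCOEFF_NORMED. The function match_template_ncc and the random test image are made up for illustration; this is a slow reference loop, not how OpenCV implements it.

```python
import numpy as np

def match_template_ncc(image, template):
    """Return a map of zero-mean normalized cross-correlation scores."""
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    out_h = image.shape[0] - th + 1
    out_w = image.shape[1] - tw + 1
    scores = np.zeros((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            window = image[y:y + th, x:x + tw]
            wz = window - window.mean()
            denom = t_norm * np.sqrt((wz ** 2).sum())
            # Flat windows have zero variance; their score stays at 0.
            if denom > 0:
                scores[y, x] = (wz * t).sum() / denom
    return scores

rng = np.random.default_rng(0)
image = rng.random((20, 30))
template = image[5:9, 12:18].copy()   # 4 rows x 6 columns cut out of the image

scores = match_template_ncc(image, template)
y, x = np.unravel_index(np.argmax(scores), scores.shape)
top_left = (int(x), int(y))           # (x, y) order, as returned by cv2.minMaxLoc
h, w = template.shape
bottom_right = (top_left[0] + w, top_left[1] + h)
print(top_left, bottom_right)
```

The best score is found where the template was cut from, and the bottom right corner follows by adding the template width and height.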
Laplacian of Gaussian
Used to detect edges in an image precisely. The Laplacian of Gaussian (LoG) calculates the second-order derivative of the image, which locates the edge and corner points that can be used as potential keypoints of the image. Since the second-order derivative is extremely sensitive to noise, the image is smoothed with a Gaussian blur first, which helps to stabilize the derivative.
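A sketch of the idea in NumPy: build a LoG kernel by sampling the second derivative of a Gaussian, filter a synthetic step edge with it, and look for the sign change (zero crossing) that marks the edge. The kernel size, sigma, the helper names and the test image are all assumptions for illustration, and the constant factor of the Gaussian is omitted.

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Sample the Laplacian of Gaussian (up to a constant factor) on a grid."""
    half = size // 2
    ax = np.arange(-half, half + 1)
    xx, yy = np.meshgrid(ax, ax)
    r2 = xx**2 + yy**2
    kernel = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    # Shift so the weights sum to zero: flat regions then give no response.
    return kernel - kernel.mean()

def filter_valid(image, kernel):
    """Sliding sum of products without padding (the kernel is symmetric)."""
    kh, kw = kernel.shape
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = (image[y:y + kh, x:x + kw] * kernel).sum()
    return out

# Vertical step edge: columns 0..9 are dark, columns 10..19 are bright.
image = np.zeros((15, 20))
image[:, 10:] = 1.0

kernel = log_kernel()
row = filter_valid(image, kernel)[0]

# An edge appears where the response changes sign between neighbouring pixels.
crossings = [x for x in range(len(row) - 1) if row[x] * row[x + 1] < -1e-6]
# Response column x is centred on image column x + kernel.shape[1] // 2.
edge_after_col = [x + kernel.shape[1] // 2 for x in crossings]
print(edge_after_col)  # [9]: the edge lies between image columns 9 and 10
```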