Introduction Flashcards
What is computer vision?
Computer vision is the field that aims to create software that can interpret images and report what they represent.
What is an image mathematically speaking?
An image is a function I(x,y) that gives back the intensity at position (x,y).
I: R² -> R, where the output is usually an intensity in a discrete range, e.g., [0, 255].
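As a rough illustration, a digital image can be sketched as a small grid of intensities with a lookup function. The pixel values and the helper name `intensity` are assumptions made up for this example.

```python
# A tiny 3x4 grayscale image stored as rows of 8-bit intensities.
# The values are invented purely for illustration.
image = [
    [0,   64, 128, 255],
    [10,  80, 160, 240],
    [20, 100, 200, 220],
]

def intensity(img, x, y):
    """Return I(x, y): x is the column index, y is the row index."""
    return img[y][x]

print(intensity(image, 2, 1))  # 160
```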
Name at least six uses of computer vision.
Optical Character Recognition (OCR) - Used to read characters from an image
Facial recognition - Used in cameras with smile shutters and to unlock your phone
Object recognition - Detect objects in images; can be used to detect theft in stores
3D modelling - Convert images into 3D models
Motion capture - Record an actor's movements and map them onto a digital character (Davy Jones in Pirates of the Caribbean)
Structure from motion - Reconstructs 3D structure from a series of 2D images
Smart cars - Computer vision is used in car collision detection systems
Sports - Detect who’s first at the finish line
Vision based interaction - Computer vision is used in the Wii controller and in Kinect
Security / Surveillance - Computer vision is used by security cameras to detect thieves
Medical imaging - Computer vision and augmented reality can help surgeons operate
Define a color image as a vector-valued function.
I(x,y) = [ r(x,y) g(x,y) b(x,y) ]
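A minimal sketch of a vector-valued image function follows. The gradient pattern and the name `color_image` are hypothetical; a real image would look the channels up in stored pixel data.

```python
def color_image(x, y):
    """Return [r, g, b] at (x, y) for a made-up gradient image."""
    r = x % 256               # red ramps up from left to right
    g = y % 256               # green ramps up from top to bottom
    b = ((x + y) // 2) % 256  # blue is the average of the two ramps
    return [r, g, b]

print(color_image(10, 20))  # [10, 20, 15]
```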
What is noise in an image mathematically speaking?
Noise, like an image, is a function η(x,y). A noisy image can be defined as the sum of the original image and the noise.
I’(x,y) = I(x,y) + η(x,y)
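The additive model can be sketched as an element-wise sum of two equally sized grids. The helper name `add_noise` and the sample values are assumptions for illustration; a real 8-bit pipeline would typically also clip the result to [0, 255], which is left out here.

```python
def add_noise(image, noise):
    """Return I'(x, y) = I(x, y) + eta(x, y) for two equally sized grids."""
    return [
        [pixel + n for pixel, n in zip(img_row, noise_row)]
        for img_row, noise_row in zip(image, noise)
    ]

image = [[100, 100], [100, 100]]
noise = [[5, -3], [0, 12]]
noisy = add_noise(image, noise)
print(noisy)  # [[105, 97], [100, 112]]
```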
What is salt and pepper noise?
Salt and pepper noise as a function returns black and white pixels at random positions. The noise is sparsely distributed.
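One way to sketch salt and pepper noise is to overwrite a small random fraction of pixels with black (0) or white (255). The function name, the `amount` parameter and the 8-bit value range are assumptions for this example.

```python
import random

def salt_and_pepper(image, amount=0.05, seed=None):
    """Return a copy of image with roughly `amount` of its pixels set to 0 or 255."""
    rng = random.Random(seed)
    noisy = [row[:] for row in image]  # copy so the input stays untouched
    for row in noisy:
        for x in range(len(row)):
            if rng.random() < amount:          # sparse: only a few pixels are hit
                row[x] = rng.choice([0, 255])  # pepper (black) or salt (white)
    return noisy

gray = [[128] * 8 for _ in range(8)]
for row in salt_and_pepper(gray, amount=0.1, seed=0):
    print(row)
```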
What is impulse noise?
Impulse noise as a function returns white pixels at random positions.
What is Gaussian noise?
Gaussian noise as a function returns intensities that are picked from a normal distribution.
What makes computer vision difficult?
Viewpoint - It’s difficult for computer vision algorithms to deal with objects seen from an unfamiliar angle.
Illumination - Light in different levels of brightness and from different angles can complicate the detection of features.
Scale - The distance between the camera and the object changes the apparent size of the same object.
Motion - Moving objects or cameras can complicate matters.
Intra-class variation - The object you are looking for may appear in different colors or shapes (there are many car types, for example).
Occlusion - There may be objects in front of the objects you are trying to identify.
Background clutter - The object you are looking for might blend into the background if the two look very similar (lack of contrast).
Local ambiguity - A feature may be present in multiple objects in an image.
What does sigma determine in the context of Gaussian noise?
Sigma is the standard deviation of the noise: it is the factor that standard normal samples are multiplied by. A larger sigma value results in more visible noise in the resulting image.
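The role of sigma can be sketched by scaling standard normal samples. The function name `gaussian_noise` and its parameters are assumptions for this example; with the same seed, a sigma five times larger yields noise values five times larger.

```python
import random

def gaussian_noise(width, height, sigma, seed=None):
    """Return a height x width grid of zero-mean Gaussian noise scaled by sigma."""
    rng = random.Random(seed)
    return [
        [sigma * rng.gauss(0.0, 1.0) for _ in range(width)]
        for _ in range(height)
    ]

def mean_abs(grid):
    """Average absolute value, used here as a rough measure of noise strength."""
    values = [abs(v) for row in grid for v in row]
    return sum(values) / len(values)

weak = gaussian_noise(32, 32, sigma=2.0, seed=7)
strong = gaussian_noise(32, 32, sigma=10.0, seed=7)
print(mean_abs(weak) < mean_abs(strong))  # True
```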
What does sigma determine in the context of a Gaussian smoothing filter?
Sigma determines the standard deviation of the kernel. A larger sigma value results in more blur in the resulting image.
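To make the effect of sigma concrete, here is a minimal sketch that builds a normalized Gaussian kernel. The function name and the choice of a square, odd-sized kernel are assumptions. A small sigma concentrates the weight on the center pixel (little blur), while a large sigma spreads it out (more blur).

```python
import math

def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel whose weights sum to 1."""
    center = (size - 1) / 2
    weights = [
        [
            math.exp(-((x - center) ** 2 + (y - center) ** 2) / (2 * sigma ** 2))
            for x in range(size)
        ]
        for y in range(size)
    ]
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]

narrow = gaussian_kernel(5, 0.5)
wide = gaussian_kernel(5, 3.0)
print(narrow[2][2] > wide[2][2])  # True: small sigma keeps weight at the center
```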
Why are non-uniform Gaussian kernels preferred over uniform kernels when smoothing images?
With a Gaussian kernel, the center pixel and its nearest neighbors contribute the most to the weighted average, and the weights fall off with distance. This results in a smoother-looking blur than a uniform (box) kernel.
Give an example of a Gaussian kernel.
1/16 * [ 1 2 1
         2 4 2
         1 2 1 ]
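Two properties of the 3x3 kernel above can be checked with exact fractions: its weights sum to 1, and it equals the outer product of the 1D weights [1 2 1] / 4 with themselves. The variable names are chosen for this sketch only.

```python
from fractions import Fraction

# 1D weights [1 2 1] / 4
row = [Fraction(1, 4), Fraction(2, 4), Fraction(1, 4)]

# Outer product of the 1D weights with themselves
kernel = [[a * b for b in row] for a in row]

expected = [
    [Fraction(n, 16) for n in r]
    for r in ([1, 2, 1], [2, 4, 2], [1, 2, 1])
]

print(kernel == expected)           # True
print(sum(sum(r) for r in kernel))  # 1
```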
What is a linear operator?
An operator is linear if two properties hold:
Additivity: H( f1 + f2 ) = H( f1 ) + H( f2 )
Multiplicative scaling: H( a * f1 ) = a * H( f1 )
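The two properties can be spot-checked numerically. In this sketch a 3-tap moving average serves as the linear operator and pointwise squaring as a counterexample; both operators and the helper `is_linear_on` are assumptions for illustration. Passing the check on one pair of signals does not prove linearity, but failing it does disprove it.

```python
def moving_average(f):
    """3-tap moving average over the interior of a 1D signal (a linear operator)."""
    return [(f[i - 1] + f[i] + f[i + 1]) / 3 for i in range(1, len(f) - 1)]

def square(f):
    """Pointwise squaring (not linear), included as a counterexample."""
    return [v * v for v in f]

def is_linear_on(H, f1, f2, a, tol=1e-9):
    """Check additivity and scaling for H on one pair of signals and one scalar."""
    added = [u + v for u, v in zip(f1, f2)]
    scaled = [a * u for u in f1]
    additive = all(
        abs(p - (q + r)) <= tol for p, q, r in zip(H(added), H(f1), H(f2))
    )
    homogeneous = all(abs(p - a * q) <= tol for p, q in zip(H(scaled), H(f1)))
    return additive and homogeneous

f1 = [1.0, 4.0, 2.0, 8.0, 5.0]
f2 = [3.0, 0.0, 7.0, 1.0, 6.0]
print(is_linear_on(moving_average, f1, f2, 2.5))  # True
print(is_linear_on(square, f1, f2, 2.5))          # False
```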
What’s the difference between cross-correlation (G = H ⊗ F) and convolution (G = H * F)?
Convolution is very similar to cross-correlation, but in convolution the kernel is flipped by 180 degrees (in both dimensions) before it is applied. For a symmetric kernel the two give the same result.
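The flip is easiest to see on an impulse image (a single 1 surrounded by zeros). The sketch below uses plain nested lists and computes only the "valid" region where the kernel fits entirely inside the image; the function names are assumptions for this example. Convolution reproduces the kernel, while cross-correlation returns it rotated by 180 degrees.

```python
def cross_correlate(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel as-is over the image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[u][v] * image[y + u][x + v]
                for u in range(kh)
                for v in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]

def flip_180(kernel):
    """Rotate the kernel by 180 degrees (reverse the rows and the columns)."""
    return [row[::-1] for row in kernel[::-1]]

def convolve(image, kernel):
    """'Valid' 2D convolution: cross-correlation with the flipped kernel."""
    return cross_correlate(image, flip_180(kernel))

# An impulse image: a single 1 in the center of a 5x5 grid of zeros.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 1

kernel = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

print(convolve(impulse, kernel))         # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(cross_correlate(impulse, kernel))  # [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
```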