wk4 Flashcards
What is the issue with treating vision as an objective medium
-Vision is inferential and highly dependent on context
-changes in illumination can shift the perceived colour of surrounding regions of the image
What should selected features be invariant with respect to
- illumination
- scale
- rotation
- affine transformations (stretching/shearing of the image) and perspective changes
How do we help with illumination, scale and rotation invariance
illumination: histogram normalization or difference-based shift
Scale: reduce the image to smaller versions with an averaging (mean) filter (similar to max pooling), or use a Difference of Gaussians
Rotation: take a histogram of all feature directions and rotate all features to the most dominant direction
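A minimal NumPy sketch of these three steps, assuming a grayscale float image (and, for the rotation step, precomputed image gradients gx and gy); the function names, the 2x2 block averaging, and the 36-bin histogram are my own illustrative choices, not from the notes:

```python
import numpy as np

def normalize_illumination(img):
    # Illumination step: stretch intensities to the full [0, 1] range
    img = img.astype(float)
    return (img - img.min()) / (img.max() - img.min() + 1e-8)

def downscale(img):
    # Scale step: average (mean) filter over 2x2 blocks, similar in spirit to pooling
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))

def dominant_orientation(gx, gy):
    # Rotation step: histogram of gradient directions; features are then rotated
    # to align with the peak (most dominant) direction
    angles = np.arctan2(gy, gx).ravel()
    hist, edges = np.histogram(angles, bins=36, range=(-np.pi, np.pi))
    peak = hist.argmax()
    return 0.5 * (edges[peak] + edges[peak + 1])  # dominant direction in radians
```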
How could you compress a dynamic video
look for static elements in the video (e.g. a background that does not change) and encode those pixels once instead of in every frame
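A rough sketch of the idea, assuming the video is a NumPy array of shape (frames, height, width); the variance threshold is an arbitrary illustrative value:

```python
import numpy as np

def static_mask(video, var_threshold=1.0):
    # Pixels whose intensity barely changes across frames are treated as static
    # background; only the remaining (dynamic) pixels need storing per frame.
    variance = video.astype(float).var(axis=0)   # per-pixel variance over time
    return variance < var_threshold              # True where the pixel is static
```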
what is optical flow
visualisation of motion between two images:
-identifies features in t and t+1 which correspond to one another
-plots a vector of movement of each pixel or feature from t to t+1
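For illustration, dense optical flow between two grayscale frames can be computed with OpenCV's Farnebäck method (OpenCV is not mentioned in the notes; the parameter values below are the usual ones from its documentation examples):

```python
import cv2

def dense_flow(frame_t, frame_t1):
    """Dense optical flow from frame t to t+1 (both grayscale uint8 images)."""
    flow = cv2.calcOpticalFlowFarneback(frame_t, frame_t1, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)
    # flow[y, x] = (dx, dy): how far the pixel at (x, y) moved between the frames
    return flow
```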
What is a simplistic way to track motion in two images t and t+1
(t+1) - (t) -> if the resulting pixel difference is greater than some threshold, flag motion; otherwise nothing
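A minimal NumPy version of this thresholded difference (the threshold value of 30 is arbitrary):

```python
import numpy as np

def motion_mask(frame_t, frame_t1, threshold=30):
    # Absolute per-pixel difference between consecutive frames;
    # pixels above the threshold are flagged as motion.
    diff = np.abs(frame_t1.astype(int) - frame_t.astype(int))
    return diff > threshold
```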
What is the problem with simply subtracting two images to determine motion and what should the solution be
-Two images could be noisy
-noise (being random by nature) will not correspond between the two images
-hence subtracting the two images will show differences due to noise that are mistaken for motion
-Gaussian filtering would suppress genuine features as well, so it is not appropriate
-the solution is to use connectedness
What is 4- and 8-neighbour adjacency and connectedness
4-neighbour:
- the 4 pixels that share an edge with a pixel (above, below, left, right); a candidate feature pixel is kept only if its 4-neighbours are also feature pixels
8-neighbour:
- the 8 surrounding pixels, including the 4 that share only a corner; a candidate feature pixel is kept only if these 8 neighbours are also feature pixels
Connectedness:
-P and Q are connected if a path of pixels can be traced from P to Q in which each consecutive pair of pixels is adjacent (e.g. 8-adjacent)
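A sketch of using connectedness to suppress noise in a motion mask, based on SciPy's connected-component labelling (SciPy and the minimum region size are my own assumptions; the 3x3 structuring element selects 8-neighbour connectivity):

```python
import numpy as np
from scipy import ndimage

def clean_motion_mask(mask, min_size=20):
    # Label 8-connected regions of candidate motion pixels.
    structure = np.ones((3, 3), dtype=int)          # 8-neighbour connectivity
    labels, n = ndimage.label(mask, structure=structure)
    # Isolated noise pixels form tiny components; keep only sizeable regions.
    sizes = ndimage.sum(mask, labels, range(1, n + 1))
    keep_labels = np.flatnonzero(sizes >= min_size) + 1
    return np.isin(labels, keep_labels)
```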
What is the aperture problem
If we only have access to, say, a slit or a pinhole so that we see only a local patch of the image, we cannot truly determine the overall motion of the image, its direction, or the proper features when looking locally
As a solution we shouldn't rely on arbitrary local features but instead identify interesting features
What interesting feature should we investigate rather than edges
Corners
which detector is used for detecting corners
Moravec operator
How does Moravec operator work
-Take a filter e.g. 5x5, 7x7 etc.
- Place this filter at some point in the image; call it filter A
- Shift the filter in each of the component directions (left, right, up, down and the 4 diagonals); call each shifted filter B
- for each B_i where i = 1..8, compute the sum of squared differences (A - B_i)^2 over the window
- take the minimum of these sums over the 8 directions as the cornerness (an edge changes little along its own direction, so it scores low)
- threshold the resulting values; the largest ones are the points with the greatest intensity change in all directions, which correspond to corners
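A minimal sketch of the Moravec operator in NumPy, assuming a grayscale float image; the window size and threshold are illustrative, and SciPy's uniform_filter is used only as a convenient way to sum squared differences over the window:

```python
import numpy as np
from scipy import ndimage

def moravec(img, window=5, threshold=500.0):
    """Boolean corner map from the Moravec operator (illustrative threshold)."""
    img = img.astype(float)
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]          # the 8 shift directions
    ssd_maps = []
    for dy, dx in shifts:
        # Squared difference between filter A (in place) and filter B (shifted)
        diff2 = (img - np.roll(np.roll(img, dy, axis=0), dx, axis=1)) ** 2
        # Sum the squared differences over the window around each pixel
        ssd = ndimage.uniform_filter(diff2, size=window) * window ** 2
        ssd_maps.append(ssd)
    # A corner changes strongly in every direction, so even the minimum SSD is large
    cornerness = np.min(ssd_maps, axis=0)
    return cornerness > threshold
```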
what are the 3 principles of motion correspondence
- Distinctiveness: individual points must be distinct from one another
- Similarity: two points should resemble each other if they are the same point affected by motion i.e. point 1 in t and t+1 should look similar
- Consistency: two matches should have moved in analogous ways to other matches
outline of the algorithm for motion correspondence
-find points of interest using Moravec
-pair features of img1 at t and img2 at t+1
- calculate the degree of similarity between each point of interest, i, in img1 and each candidate point in img2
-calculate the likelihood of each match by computing similarity weights and converting them to probabilities
-do this for each patch in img1 against each patch in img2; the highest probability = most likely match
what is the equation for similarity weight from patch i to patch j where patch i is in some image at t and j is in some image at t+1
w_i,j = 1 / (1 + alpha * S_i,j)
where S_i,j is the difference between patch i and patch j (e.g. the sum of squared pixel differences), so more similar patches get a larger weight
and alpha is a constant
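A small sketch of these weights and their conversion to match probabilities, assuming the patches are equal-size NumPy arrays and using the sum of squared differences for S_i,j; the value of alpha and the row-wise normalization are illustrative choices:

```python
import numpy as np

def match_probabilities(patches_t, patches_t1, alpha=0.01):
    """Rows: interest points in frame t; columns: candidate matches in frame t+1."""
    n, m = len(patches_t), len(patches_t1)
    w = np.zeros((n, m))
    for i, a in enumerate(patches_t):
        for j, b in enumerate(patches_t1):
            s_ij = np.sum((a.astype(float) - b.astype(float)) ** 2)  # difference S_i,j
            w[i, j] = 1.0 / (1.0 + alpha * s_ij)                     # weight w_i,j
    # Convert each row of weights into probabilities; the largest entry in row i
    # is the most likely match for point i.
    return w / w.sum(axis=1, keepdims=True)
```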