Video Processing Flashcards
What is Video?
- a sequence of frames captured over time
- image data is a function of space and time, I(x, y, t)
(Motion Analysis) What information can be extracted from time-varying sequences of images?
– Camouflaged objects are easily visible when moving
– Size and position of objects are more easily determined when the objects move
– Even simple image differencing provides an edge detector for objects moving over any static background
Into how many stages can analysis of visual motion be divided?
2
– measurement of the motion
– use of motion data to segment the scene into objects and to extract information about shape and motion
How many types of motion are there to consider?
2
– movement in the scene = static camera
– movement of the camera = ego motion
These should be the same (motion is relative), but not always: if the scene moves, illumination, shadow, and specular effects need to be considered.
What are we interested in (Optical Flow and Motion)?
We are interested in finding the movement of scene objects from time-varying images (videos).
What are the uses of motion analysis?
-Track object behavior
-Align images (mosaics)
-3D shape reconstruction
-Correct for camera jitter (stabilization)
-Special effects
What is the motion field?
- projection of the 3D scene motion onto the image plane
- length of flow vectors is inversely proportional to the depth Z of the 3D point (points closer to the camera move more quickly across the image plane)
What is the Optical flow?
- apparent motion of brightness patterns (or colors) in the image
Should optical flow ideally be the same as the motion field?
Yes
How can apparent motion be caused?
It can be caused by lighting changes without any actual motion.
What is necessary to estimate pixel motion from image?
We have to solve the pixel correspondence problem.
What is the pixel correspondence problem?
Given a pixel in frame t, look for nearby pixels with the same characteristics (colour, brightness, …) in frame t − 1.
How to estimate pixel motion from image H to image I? (pixel correspondence problem)
We need to find pixel correspondences:
- Given a pixel in H, look for nearby pixels of the same color in I
What are the key assumptions in order to solve the pixel correspondence problem?
- color constancy: a point in H looks “the same” in image I (for example, in grayscale images this is brightness constancy)
- small motion: points do not move very far
(these two assumptions combine into a single constraint equation, sketched below)
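A sketch of how the two assumptions combine (a standard derivation, not stated on these cards; subscripts denote partial derivatives of the image intensity I):

```latex
% Color/brightness constancy: the point keeps its appearance as it moves
I(x+u,\; y+v,\; t+1) = I(x,\; y,\; t)
% Small motion justifies a first-order Taylor expansion:
I(x+u,\; y+v,\; t+1) \approx I(x,y,t) + I_x u + I_y v + I_t
% Combining the two gives the optical flow constraint equation,
% one equation in the two unknowns (u, v) per pixel:
I_x u + I_y v + I_t = 0
```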
What is the main goal of the Lucas-Kanade method (the most popular optical flow algorithm), and how is it achieved?
To get more equations for a pixel.
The basic idea is to impose additional constraints.
What is the most common Lucas-Kanade additional constraint?
It is to assume that the flow field is locally smooth, by pretending the pixel's neighbours have the same (u, v).
If we use a 5x5 window, that gives us 25 equations per pixel!
What is the problem that comes from the most common Lucas-Kanade additional constraint?
Having more equations than unknowns.
How do we solve the problem that comes from the most common Lucas-Kanade additional constraint?
Solve a least-squares problem.
- The sums are over all pixels in the K x K window (see the sketch below)
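A minimal NumPy sketch of this least-squares step (my own illustration under the assumptions above; the function name and the simple finite-difference derivatives are choices, not from the cards):

```python
import numpy as np

def lucas_kanade_pixel(I0, I1, x, y, k=5):
    """Estimate the flow (u, v) at pixel (x, y) from frame I0 to frame I1
    by solving the least-squares problem over a k x k window."""
    # Spatial and temporal derivatives (simple finite differences).
    Iy, Ix = np.gradient(I0.astype(float))
    It = I1.astype(float) - I0.astype(float)

    r = k // 2
    win = np.s_[y - r:y + r + 1, x - r:x + r + 1]

    # A is (k*k, 2): one brightness-constancy equation per window pixel.
    A = np.stack([Ix[win].ravel(), Iy[win].ravel()], axis=1)
    b = -It[win].ravel()

    # Normal equations: (A^T A) [u, v]^T = A^T b.
    # np.linalg.solve fails when A^T A is not invertible (see the next card).
    u, v = np.linalg.solve(A.T @ A, A.T @ b)
    return u, v
```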
When is the problem that comes from the most common Lucas-Kanade additional constraint solvable?
A^T A should be:
- invertible
- not too small due to noise (the eigenvalues λ1 and λ2 of A^T A should not be too small)
- well-conditioned (the ratio λ1/λ2 should not be too large, where λ1 is the larger eigenvalue)
(a small check of these conditions is sketched below)
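A small companion check for these conditions (the thresholds tau and max_ratio are hypothetical values, not from the cards):

```python
import numpy as np

def is_solvable(ATA, tau=1e-2, max_ratio=100.0):
    """Check whether the 2x2 matrix A^T A yields a reliable flow estimate."""
    lam2, lam1 = np.linalg.eigvalsh(ATA)   # ascending order, so lam1 is the larger
    if lam2 <= tau:                        # too small: noise dominates / not invertible
        return False
    return lam1 / lam2 <= max_ratio        # too large a ratio: ill-conditioned
```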
What is the goal of optical flow algorithms?
They try to approximate the true motion field on the image plane.
What is background subtraction?
- it allows looking at video data as a spatio-temporal volume
- if the camera is stationary, each line through time corresponds to a single ray in space
What are background subtraction techniques commonly used for?
They are commonly used for segmenting out objects of interest in a static camera scene:
- surveillance
- robot vision
- object tracking
- traffic applications
- human motion capture
- augmented reality
How does background subtraction allow the segmentation of objects of interest in a static camera scene?
Through the foreground mask (a binary image) that it creates, containing the moving objects in a static camera setup:
- by subtracting the observed image from the estimated background image and thresholding the result
What is Foreground detection?
How the object areas are distinguished from the background
What is Background maintenance?
How the background is maintained over time
What is Post-processing?
How the segmented object areas are refined (e.g., by filtering out noise and small regions)
What is the generic algorithm for a static background?
- create an image of the stationary background
- subtract the known background frame from the current frame and threshold (sketched below)
Such motion detection algorithms only work if the camera is stationary and the objects are moving against a fixed background.
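A minimal OpenCV sketch of this algorithm (illustrative only; the threshold value 30 is an assumption):

```python
import cv2

def static_background_mask(frame, background, th=30):
    """Foreground mask for a static camera: subtract the known
    background frame from the current frame and threshold."""
    diff = cv2.absdiff(frame, background)          # per-pixel |frame - background|
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, th, 255, cv2.THRESH_BINARY)
    return mask
```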
What is the generic algorithm with frame differencing?
- the background is estimated to be the previous frame (sketched below)
- depending on the object structure, speed, frame rate, and global threshold, this may be useful (usually it is not)
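The same sketch adapted to frame differencing, where the "background" is simply the previous frame (again illustrative, with an assumed threshold):

```python
import cv2

def frame_differencing_mask(frame, prev_frame, th=30):
    """Frame differencing: treat the previous frame as the background."""
    diff = cv2.absdiff(frame, prev_frame)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, th, 255, cv2.THRESH_BINARY)
    return mask  # the caller keeps the current frame as prev_frame for the next call
```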
What is another approach to model the background?
Using a running average: a pixel is marked as foreground if its difference from the running-average background exceeds a threshold.
The threshold (th) is predefined, and thresholding is often followed by morphological closing with a 3x3 kernel and the discarding of small regions.
In the background update, why is α kept small?
In order to prevent artificial tails forming behind moving objects (see the sketch below).
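A minimal sketch of one running-average update together with the post-processing from the earlier card (the alpha and th values are assumptions; discarding small regions is omitted for brevity):

```python
import cv2
import numpy as np

def running_average_step(frame, background, alpha=0.05, th=30):
    """One step of the running-average background model (background is float32)."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Foreground detection: mark pixels far from the background estimate.
    mask = (np.abs(gray - background) > th).astype(np.uint8) * 255

    # Post-processing: morphological closing with a 3x3 kernel.
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((3, 3), np.uint8))

    # Background maintenance: B <- alpha * F + (1 - alpha) * B.
    # alpha is kept small to avoid artificial tails behind moving objects.
    background = alpha * gray + (1 - alpha) * background
    return mask, background
```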
What is Tracking?
- a crucial issue in CV
- we could compute optical flow from one frame to the next, but flow is only reliable for small motions
Where can tracking be applied?
-Body pose tracking, activity recognition
-Censusing a bat population
-Video-based interfaces
-Medical apps
-Surveillance
With more than just a pair of frames, why can't we use optical flow from one frame to the next?
Because flow is only reliable for small motions, and we may have:
- occlusions;
- textureless regions that yield bad estimates
What is the difference between detection and tracking?
- Detection: we detect the object independently in each frame and can record its position over time
- Tracking: we use image measurements to estimate the position of the object but also incorporate position predicted by dynamics
What defines the Kalman filter?
- hidden state consists of the true parameters we care about
- the measurement is our noisy observation
- at each step, state changes and we get a new observation
What form do the predicted/corrected state distributions have?
Gaussian
What are the only parameters that need to be maintained in the Kalman Filter?
The mean and covariance
What is the method for tracking linear dynamical models in gaussian noise?
The Kalman Filter
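A minimal NumPy sketch of one Kalman predict/correct cycle (my own illustration; the matrix names F, H, Q, R follow the usual convention, and the constant-velocity example values are assumptions):

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/correct cycle: only the mean x and covariance P are kept."""
    # Predict: propagate the Gaussian state through the linear dynamics.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q

    # Correct: blend the prediction with the noisy measurement z.
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Usage: constant-velocity tracking in 1D, state = [position, velocity],
# observing only the (noisy) position.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[1.0]])
x, P = np.array([0.0, 0.0]), np.eye(2)
for z in [np.array([1.1]), np.array([2.0]), np.array([2.9])]:
    x, P = kalman_step(x, P, z, F, H, Q, R)
```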