Week 6 Flashcards
In the Facial Action Coding System, what are action units and what are they for?
1) The smallest visually discriminable facial movements; each represents
the muscular activity that produces a change in facial appearance
2) For detecting & measuring a large number of facial
expressions via a small set of AUs
3) Accurate detection of AUs depends upon proper
identification & tracking of different facial muscles
irrespective of pose, face shape, illumination & image
resolution
4) Detection of all facial fiducial points is even more
challenging than expression recognition itself
Difference between additive and non-additive AUs
Additive - appearance of each AU is independent
Non-additive - AUs modify each other's appearance
In the context of FACS, what is an event?
1) A set of AUs that overlap in time & defines a perceptually
meaningful unit of facial action
2) Constitute a single display
3) Guiding assumption: facial behaviour occurs not continuously but as
discrete episodes (events); AUs that occur together are related & form an
event
4) Event coding can be more efficient than single AUs
5) Addresses problem that some AUs may linger & merge into the background
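As a rough illustration of point 1, the sketch below groups AUs whose time intervals overlap into events. The `group_events` helper, the (AU, onset, offset) tuple layout and the frame numbers are assumptions made for this example; they are not part of FACS.

```python
def group_events(au_intervals):
    """Group AUs whose [onset, offset] intervals overlap in time into events."""
    events = []
    # Process AUs in order of onset so overlaps can be found in one pass.
    for au, onset, offset in sorted(au_intervals, key=lambda item: item[1]):
        if events and onset <= events[-1]["offset"]:
            # This AU starts before the current event ends: same event.
            events[-1]["aus"].append(au)
            events[-1]["offset"] = max(events[-1]["offset"], offset)
        else:
            # No overlap with the current event: start a new one.
            events.append({"aus": [au], "onset": onset, "offset": offset})
    return events


# Hypothetical AU codings, with onset and offset given as frame numbers.
intervals = [("AU6", 10, 40), ("AU12", 12, 45), ("AU4", 80, 95)]
events = group_events(intervals)
for event in events:
    print(event)
```

Here AU6 and AU12 overlap and form one event, while AU4 occurs later and forms its own.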
Why is a prior model required in AU recognition?
1) Most current AU recognition techniques ignore semantic
relationships among AUs, & dynamics of AUs
2) Combining prior models of spatial-temporal relationships among AUs with image measurements ⇒ robust AU recognition
3) Especially useful when image measurement is unreliable
4) Use a knowledge-driven method to learn a prior AU model from different types of qualitative knowledge; no training data is required, which avoids the unreliability of manual scoring and the built-in database bias that stops a data-trained model from generalising
What is a Bayesian network prior model? Illustrate it with an example.
1) Bayesian network (BN) - a directed acyclic graph (DAG) that represents a joint
probability distribution among a set of random variables
2) Can be used as prior model to capture AU knowledge
3) Prior model probabilistically encodes constraints to capture AU
occurrence frequency
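One possible example, sketched under assumptions: a two-node network AU6 -> AU12, where the joint distribution factorises as P(AU6, AU12) = P(AU6) P(AU12 | AU6). The choice of AUs and every probability value below are hypothetical and only illustrate the mechanics.

```python
# Hypothetical two-node Bayesian network: AU6 -> AU12.
p_au6 = 0.3                           # assumed prior P(AU6 = 1)
p_au12_given_au6 = {1: 0.8, 0: 0.1}   # assumed P(AU12 = 1 | AU6)


def joint(au6, au12):
    """P(AU6 = au6, AU12 = au12) from the DAG factorisation."""
    p6 = p_au6 if au6 else 1 - p_au6
    p12_on = p_au12_given_au6[au6]
    p12 = p12_on if au12 else 1 - p12_on
    return p6 * p12


# The four joint probabilities sum to 1.
total = sum(joint(a, b) for a in (0, 1) for b in (0, 1))

# Marginal P(AU12 = 1), then posterior P(AU6 = 1 | AU12 = 1) by Bayes' rule.
p_au12 = joint(0, 1) + joint(1, 1)
posterior_au6 = joint(1, 1) / p_au12
print(round(p_au12, 3), round(posterior_au6, 3))
```

With these assumed numbers, observing AU12 raises the belief in AU6 above its prior, which is the kind of co-occurrence constraint a prior model is meant to encode.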
What is a dynamic Bayesian network model? Illustrate it with an example.
Models the evolution of a set of random variables X over time to capture dynamic dependencies
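A minimal sketch of the simplest case, assuming a single binary AU whose state at time t depends only on its state at time t-1. The transition probabilities and the `predict` helper are hypothetical; a full dynamic Bayesian network would also link different AUs within and across time slices.

```python
# Assumed transition model P(AU_t = 1 | AU_{t-1}) for one binary AU.
p_on_given_prev = {1: 0.9, 0: 0.2}


def predict(p_prev_on):
    """Propagate the belief P(AU = 1) forward by one time slice."""
    return (p_on_given_prev[1] * p_prev_on
            + p_on_given_prev[0] * (1 - p_prev_on))


belief = 0.0          # assume the AU is known to be absent at t = 0
beliefs = []
for _ in range(3):
    belief = predict(belief)
    beliefs.append(belief)
print([round(b, 3) for b in beliefs])
```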
What is the motivation for using a 2D Gabor filter for image measurement?
1) good models of the receptive fields of a large number of cells in the
mammalian primary visual cortex
2) invariant to translation, scale &
rotation
3) band-pass filters, i.e., pass frequencies within a certain frequency range
4) Form is compatible with multiresolution (i.e., multiscale) analysis
What is a Gabor filter?
product of a sinusoidal plane wave & a bivariate elliptic Gaussian
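The definition can be sketched in code as the real part of a 2D Gabor kernel: a cosine plane wave multiplied by an elliptical Gaussian envelope. The function name, parameter names and default values below are assumptions chosen for illustration.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2D Gabor filter sampled on a size x size grid.

    theta rotates the wave, sigma sets the Gaussian width and gamma its
    aspect ratio (gamma != 1 makes the envelope elliptical).
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the wave travels along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=9, wavelength=4.0, theta=0.0, sigma=2.0)
```

Varying `theta` and `wavelength` over several values gives a filter bank with multiple orientations and scales.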
Process of AU measurement extraction
1) Detect eyes through a boosted eye detector
2) Normalise image into 64x64 sub-image based on eye positions
3) Apply a bank of Gabor filters at 6 orientations & 5 scales to give a
6x5x64x64 = 122,880-dimensional feature vector for each image.
4) Use AdaBoost classifier to obtain measurement for each AU:
- In training, increase weights of wrongly classified examples in
each iteration to force AdaBoost to focus on the most difficult
samples in training set
- Utilises around 200 Gabor features for each AU
5) Based on the image measurement ei & ground truth AUi, train a
likelihood function, i.e., the conditional probability of the AU measurement
given the actual AU value, P(ei | AUi)
6) This step needs training data
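Steps 3 and 5 can be checked with a short sketch. It assumes binary (present/absent) measurements and ground truth, and estimates the likelihood by simple counting; the `estimate_likelihood` helper and the label lists are hypothetical.

```python
from collections import Counter

# Step 3: 6 orientations x 5 scales, each filter response is 64 x 64.
n_features = 6 * 5 * 64 * 64


def estimate_likelihood(measurements, ground_truth):
    """Return table[(e, au)] = P(measurement = e | AU = au) from counts."""
    pair_counts = Counter(zip(measurements, ground_truth))
    au_counts = Counter(ground_truth)
    return {(e, au): pair_counts[(e, au)] / au_counts[au]
            for au in au_counts for e in (0, 1)}


# Hypothetical classifier outputs and manually coded ground truth for one AU.
measured = [1, 1, 0, 1, 0, 0, 1, 0]
actual = [1, 1, 1, 1, 0, 0, 0, 0]
likelihood = estimate_likelihood(measured, actual)
print(n_features, likelihood[(1, 1)], likelihood[(1, 0)])
```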
Outline the principles of an AdaBoost classifier
1) Creates accurate prediction rule by combining many relatively weak & inaccurate rules
2) Combines properties of an efficient classifier & feature selection
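A minimal sketch of these principles, assuming decision stumps (one feature, one threshold) as the weak rules and labels in {-1, +1}. The function names and toy data are hypothetical. Because each round picks a single feature, the rounds also act as feature selection.

```python
import math


def train_adaboost(X, y, n_rounds):
    """Train AdaBoost with decision stumps; returns (alpha, feature, thr, pol)."""
    n = len(X)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        best = None
        # Weak learner: the stump with the lowest weighted error.
        for f in range(len(X[0])):
            for thr in sorted({row[f] for row in X}):
                for pol in (1, -1):
                    preds = [pol if row[f] >= thr else -pol for row in X]
                    err = sum(w for w, p, t in zip(weights, preds, y) if p != t)
                    if best is None or err < best[0]:
                        best = (err, f, thr, pol, preds)
        err, f, thr, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, thr, pol))
        # Increase weights of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * t * p)
                   for w, t, p in zip(weights, y, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if row[f] >= thr else -pol)
                for alpha, f, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy 1-D data that no single stump can separate.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
model = train_adaboost(X, y, n_rounds=3)
```

On this toy set a single stump always misclassifies some samples, while the weighted vote of three stumps fits all six.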
Two types of facial expression recognition
Geometric based and appearance based
Geometric based expression recognition
1) tracks shape & size of face, and facial components
2) categorises expressions based on relative position of facial components
3) shape models based on characteristic points on face require: accurate detection & tracking of facial landmarks
4) but distances between facial landmarks vary from person to person
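One common way to reduce the person-to-person variation in point 4 is to divide landmark distances by the inter-ocular distance. This normalisation is an assumption added here for illustration, not something the notes prescribe, and the landmark coordinates are hypothetical.

```python
import math


def normalised_distance(p, q, left_eye, right_eye):
    """Distance between landmarks p and q in units of inter-ocular distance."""
    return math.dist(p, q) / math.dist(left_eye, right_eye)


# Hypothetical (x, y) landmarks: the second face is the first scaled by 2.
small = normalised_distance((35, 80), (65, 80), (30, 40), (70, 40))
large = normalised_distance((70, 160), (130, 160), (60, 80), (140, 80))
print(small, large)
```

Both faces give the same mouth width once it is expressed relative to eye spacing, so the feature no longer depends on face size in the image.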
Appearance based expression recognition
1) facial expressions involve change in local texture
2) uses a bank of filters, e.g., Gabor wavelets, Local Binary Pattern (LBP), to
encode texture
3) high-dimensional features ⇒ applies dimensionality reduction techniques,
e.g., principal component analysis (PCA), linear discriminant analysis
(LDA)
4) preservation of discriminative information ⇒ popular approach
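As an illustration of the texture encoding in point 2, here is a sketch of the basic 8-neighbour Local Binary Pattern code for one pixel. The clockwise bit ordering starting at the top-left neighbour is an assumed convention, and the patch values are made up.

```python
def lbp_code(patch):
    """Basic LBP code for the centre pixel of a 3x3 patch."""
    centre = patch[1][1]
    # Neighbours visited clockwise, starting from the top-left corner.
    neighbours = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                  patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    # Bit i is set when neighbour i is at least as bright as the centre.
    return sum(1 << i for i, value in enumerate(neighbours) if value >= centre)


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(patch)
print(code)
```

A histogram of these codes over an image region would give the texture feature vector that dimensionality reduction is then applied to.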
Outline an algorithm for detecting eyebrow corners
1) Use positions of eyes to select coarse ROIs of eyebrows
2) Detect eyebrows using method similar to upper lip detection
3) Perform adaptive thresholding before applying horizontal Sobel operator
to improve accuracy of eyebrow corner localisation
4) Use horizontal edge detector to reduce false detection of eyebrow
positions due to partial occlusion by hair
Outline an algorithm for detecting lip corners given aligned face ROI and nose position
1) Select coarse lips ROI using face width & nose position
2) Apply Gaussian blur to the lips ROI (to remove noise)
3) Apply horizontal Sobel edge detector (to detect upper lip)
4) Apply Otsu-thresholding (automatic clustering based image thresholding)
to remove spurious edges
5) Apply morphological dilation operation (to close gaps along edges)
6) Find the connected components
7) Remove spurious connected components with thresholding
8) Scan image from top & select the largest connected component below
nose as upper lip
9) Locate the left- & right-most positions of connected component as lip
corners
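The core of these steps can be sketched in plain Python. This is a simplified illustration, not the exact method from the notes: it omits the Gaussian blur, the dilation and the removal of small components, and it assumes the ROI is already cropped below the nose. All function names and the toy ROI are hypothetical.

```python
from collections import deque


def horizontal_sobel(img):
    """Absolute response of the Sobel kernel that detects horizontal edges."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            top = img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]
            bottom = img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
            out[y][x] = abs(bottom - top)
    return out


def otsu_threshold(img):
    """Threshold that maximises the between-class variance of pixel values."""
    values = [v for row in img for v in row]
    best_t, best_var = 0, -1.0
    for t in sorted(set(values)):
        low = [v for v in values if v <= t]
        high = [v for v in values if v > t]
        if not high:
            continue
        w0, w1 = len(low) / len(values), len(high) / len(values)
        m0, m1 = sum(low) / len(low), sum(high) / len(high)
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def largest_component(mask):
    """Pixels (y, x) of the largest 4-connected component, scanning from the top."""
    h, w = len(mask), len(mask[0])
    seen, best = set(), []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or (sy, sx) in seen:
                continue
            comp, queue = [], deque([(sy, sx)])
            seen.add((sy, sx))
            while queue:
                y, x = queue.popleft()
                comp.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(comp) > len(best):   # ties keep the topmost component
                best = comp
    return best


def lip_corners(roi):
    """Left-most and right-most (y, x) pixels of the strongest lip edge."""
    edges = horizontal_sobel(roi)
    threshold = otsu_threshold(edges)
    mask = [[v > threshold for v in row] for row in edges]
    component = largest_component(mask)
    left = min(component, key=lambda p: p[1])
    right = max(component, key=lambda p: p[1])
    return left, right


# Toy ROI: bright skin (200) with a dark lip line (50) in row 3, columns 2-6.
roi = [[200] * 9 for _ in range(7)]
for x in range(2, 7):
    roi[3][x] = 50
left, right = lip_corners(roi)
print(left, right)
```

In practice an image-processing library such as OpenCV would typically handle the blur, Sobel, Otsu, dilation and connected-component steps; the pure-Python version only makes the logic explicit.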