Week 5 Flashcards

1
Q

How does the 3-dimensional human observer-based approach to emotion measurement differ from message-based measurement and sign-based measurement? State the advantages and disadvantages of the 3-dimensional approach.

A

1) emphasises similarities between emotions
2) represents emotion in terms of 2 or 3 underlying dimensions:
- pleasantness-unpleasantness vs attention-rejection/arousal-sleepiness
- dominance-submissiveness as third dimension
3) advantages:
- positive & negative affect can be measured over intensity ranges of hundreds of points, with little expertise required
- supports multiple independent & unbiased ratings: scores are aggregated across multiple raters
4) disadvantages:
- not well suited to representing discrete emotions
- assumes emotion may be inferred from facial expressions
- signals involved in communicating emotion are unspecified
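
The aggregation of scores across raters can be sketched as a per-dimension average. This is only an illustration: the rating scale (-100 to +100), the rater values, and the `aggregate` helper are all assumptions, not part of the original notes.

```python
from statistics import mean

# Hypothetical ratings of one video segment by four independent raters,
# on an assumed scale from -100 (unpleasant / sleepy) to +100 (pleasant / aroused).
ratings = [
    {"valence": 40, "arousal": 10},
    {"valence": 55, "arousal": 20},
    {"valence": 35, "arousal": 5},
    {"valence": 50, "arousal": 25},
]

def aggregate(ratings):
    """Average each dimension across raters."""
    dimensions = ratings[0].keys()
    return {d: mean(r[d] for r in ratings) for d in dimensions}

consensus = aggregate(ratings)
```

Averaging is one simple choice; a median would be less sensitive to a single outlying rater.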

2
Q

What are the challenges of automated face analysis for emotion recognition?

A

1) non-frontal pose & moderate to large head motion ⇒ difficult image registration
2) many facial actions are inherently subtle ⇒ difficult to model
3) temporal dynamics of actions are highly variable
4) discrete AUs can modify each other’s appearance
5) individual differences in face shape & appearance ⇒ difficult to generalise
6) classifiers can suffer from overfitting when trained with insufficient examples

3
Q

What challenges does recent work on expression detection in naturalistic settings address?

A

1) partial occlusion
2) pose variation
3) rigid head movement
4) lip movements

4
Q

What is the pipeline for automated face analysis (AFA)?

A

1) input image/video
2) facial landmark detection/tracking
3) face alignment
4) feature extraction
5) dimensionality reduction
6) action unit classification
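
As an illustration of step 5, here is a minimal sketch of dimensionality reduction using principal component analysis (PCA). PCA is an assumed choice, since the notes do not name a specific method, and the feature matrix is synthetic data invented for the example.

```python
import numpy as np

def pca_reduce(features, n_components):
    """Project row-vector features onto their top principal components."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T, components, mean

rng = np.random.default_rng(0)
# Hypothetical data: 50 frames, each with a 200-dimensional feature vector
# that really only varies along 3 latent directions plus small noise.
latent = rng.normal(size=(50, 3))
mixing = rng.normal(size=(3, 200))
features = latent @ mixing + 0.01 * rng.normal(size=(50, 200))

reduced, components, mean = pca_reduce(features, n_components=3)

# Map the 3-dimensional codes back to 200 dimensions to check what was lost.
reconstructed = reduced @ components + mean
relative_error = np.linalg.norm(features - reconstructed) / np.linalg.norm(features)
```

The reduced vectors (3 values per frame instead of 200) are what would be passed on to the classifier in step 6.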

5
Q

What is the purpose of image registration in emotion recognition?

A

1) To remove the effects of spatial variation in face position, rotation & facial proportions
2) ⇒ Images are registered to a canonical size & orientation (the canonical perspective is our preferred way of viewing an object)
3) A 3D transformation can be estimated from a monocular camera (up to a scale factor) or from multiple cameras using structure-from-motion algorithms
4) For small to moderate out-of-plane rotation at a moderate distance from the camera, the 2D projected motion field of a 3D planar surface can be recovered with a 6-parameter affine model
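
The 6-parameter affine model in point 4 can be sketched as a least-squares fit between landmark sets. This is a minimal sketch under stated assumptions: the landmark coordinates and the helper names `estimate_affine` and `apply_affine` are hypothetical, and a real system would get landmarks from a detector or tracker.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 2D affine transform mapping src points onto dst points.

    Returns a 2x3 matrix [[a, b, tx], [c, d, ty]]: the 6 affine parameters.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous coordinates: each row is (x, y, 1).
    design = np.hstack([src, np.ones((len(src), 1))])
    # Solve design @ params ~= dst for the 3x2 parameter matrix.
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params.T

def apply_affine(matrix, points):
    """Apply a 2x3 affine matrix to an array of (x, y) points."""
    points = np.asarray(points, dtype=float)
    return points @ matrix[:, :2].T + matrix[:, 2]

# Hypothetical canonical landmarks (eye corners, nose tip, mouth corners).
canonical = np.array([[30.0, 40.0], [70.0, 40.0], [50.0, 60.0],
                      [35.0, 80.0], [65.0, 80.0]])

# Simulate an observed face: rotated 15 degrees, scaled by 1.2, shifted.
theta = np.deg2rad(15.0)
rotation = 1.2 * np.array([[np.cos(theta), -np.sin(theta)],
                           [np.sin(theta), np.cos(theta)]])
observed = canonical @ rotation.T + np.array([12.0, -5.0])

# Registration: estimate the transform that maps observed landmarks to canonical.
to_canonical = estimate_affine(observed, canonical)
registered = apply_affine(to_canonical, observed)
```

At least three non-collinear point pairs are needed to determine the six parameters; using more landmarks, as here, averages out localisation noise.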

6
Q

What is meant by appearance features and how can they be represented?

A

1) appearance features capture changes in skin texture, e.g., wrinkling
2) simplest representation: a vector of raw pixel-intensity values
- problem: lighting conditions affect texture
3) Gabor wavelets or magnitudes, histograms of oriented gradients (HOG) & Scale Invariant Feature Transform (SIFT) descriptors are more robust to registration error
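
To make the lighting problem in point 2 concrete, the sketch below compares raw pixel values with a gradient-orientation histogram under a simple lighting change. It is a simplified, single-cell stand-in for HOG (no cell grid or block normalisation), and the patch data and the gain-and-offset lighting model are assumptions made for the example.

```python
import numpy as np

def orientation_histogram(patch, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180] degrees.
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 1.0, size=(16, 16))  # hypothetical skin-texture patch
brighter = 1.5 * patch + 0.2                  # same texture, brighter lighting

# Raw pixel vectors change substantially under the lighting change.
raw_difference = np.abs(brighter.ravel() - patch.ravel()).max()

# The normalised orientation histogram is essentially unchanged.
hog_difference = np.abs(orientation_histogram(brighter)
                        - orientation_histogram(patch)).max()
```

The offset disappears when gradients are taken and the gain disappears when the histogram is normalised, which is why the descriptor barely moves while the raw pixel vector does.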

7
Q

With the aid of a figure, explain what is meant by the brightness constancy constraint that is exploited in optical flow for motion estimation.

A

The brightness of a point is assumed to stay constant as it moves between two image frames, so any difference in image brightness at the same image location in the two frames denotes displacement and thus motion.
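
The constraint is usually written as an equation. The standard textbook form is sketched below; the symbols are assumptions introduced here, with I the image brightness, (u, v) the displacement of a point between consecutive frames, and I_x, I_y, I_t the partial derivatives of I.

```latex
% Brightness constancy: a point keeps its brightness as it moves by (u, v) between frames.
I(x, y, t) = I(x + u,\; y + v,\; t + 1)

% First-order Taylor expansion of the right-hand side (small motion assumed):
I(x + u, y + v, t + 1) \approx I(x, y, t) + I_x u + I_y v + I_t

% Substituting into the constancy equation gives the optical flow constraint:
I_x u + I_y v + I_t = 0
```

This is one equation in two unknowns per pixel, so optical flow methods add a further assumption, such as constant motion within a small window, to solve for (u, v).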

8
Q

Describe the three types of supervised learning used in automated face analysis for emotion recognition.

A

1) Supervised learning: event categories (e.g., emotion labels or AUs) or dimensions are defined in advance in labelled training data
2) Static modelling
- each video frame is evaluated independently
- uses neural networks (NNs), SVM classifiers, boosting
3) Temporal modelling
- frames are segmented into sequences
- sequences are modelled with a variant of dynamic Bayesian networks, e.g., hidden Markov models (HMMs)
- HMMs temporally segment actions by establishing a correspondence between the action’s onset, peak & offset and an underlying latent state
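
The temporal segmentation in point 3 can be sketched with Viterbi decoding of a small HMM whose latent states are neutral, onset, peak and offset. All probabilities and the per-frame observation symbols below are hypothetical hand-set values for illustration; in practice they would be learned from labelled sequences.

```python
import math

STATES = ["neutral", "onset", "peak", "offset"]
SYMBOLS = ["low", "rising", "high", "falling"]

# Hypothetical hand-set parameters (assumptions, not learned values).
START = [0.97, 0.01, 0.01, 0.01]
TRANS = [
    [0.80, 0.20, 0.00, 0.00],  # neutral -> neutral / onset
    [0.00, 0.70, 0.30, 0.00],  # onset   -> onset / peak
    [0.00, 0.00, 0.70, 0.30],  # peak    -> peak / offset
    [0.30, 0.00, 0.00, 0.70],  # offset  -> neutral / offset
]
EMIT = [  # rows: states, columns: symbols; each state favours "its" symbol
    [0.85, 0.05, 0.05, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.05, 0.05, 0.85, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]

def log(p):
    """Log probability, with log(0) mapped to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(observations):
    """Most likely state sequence for a list of observation symbols."""
    obs = [SYMBOLS.index(o) for o in observations]
    n = len(STATES)
    score = [log(START[s]) + log(EMIT[s][obs[0]]) for s in range(n)]
    backpointers = []
    for o in obs[1:]:
        new_score, pointers = [], []
        for s in range(n):
            best_prev = max(range(n), key=lambda p: score[p] + log(TRANS[p][s]))
            new_score.append(score[best_prev] + log(TRANS[best_prev][s])
                             + log(EMIT[s][o]))
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return [STATES[s] for s in reversed(path)]

# Frame 5 is a noisy "low" reading in the middle of the peak.
frames = ["low", "low", "rising", "rising", "high",
          "low", "high", "falling", "falling", "low"]
labels = viterbi(frames)
```

A static, frame-by-frame classifier would label the noisy frame as neutral; the temporal model keeps it inside the peak because a jump from peak to neutral and back is not allowed by the transition structure.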

9
Q

What are the challenges encountered when creating a facial expressions database?

A

1) Variations in video: pose, illumination, resolution, occlusion, facial expression, actions (intensity & timing), individual differences in subjects
2) Most databases have used directed facial action tasks:
- posed and spontaneous facial actions differ (complicates pattern recognition approaches, e.g., HMMs)
- holistic expressions
3) Coder variability:
- “test-retest” unreliability: the same coder may assign different AUs to the same segment on different occasions
- “alternate-form” unreliability: different coders may assign different AUs
- ⇒ coders should be certified to minimise errors
- error due to manual data entry
- error in “ground truth” adversely affects classifier training & performance
- differences in manual coding between databases ⇒ impaired generalisability of classifiers from one database to another
