Week 7 Flashcards

1
Q

Applications of AFA for detection of physical pain

A

1) Pain assessment is inherently subjective
2) Patient self-report has limitations:
- idiosyncratic
- susceptible to suggestion
- susceptible to deception
- unsatisfactory for people incapable of articulating their feelings
3) Behavioural measures offer an alternative ⇒ facial indicators of pain: brow lowering (AU4), orbital tightening (AU6 & AU7), nose wrinkling & upper-lip raising (AU9 & AU10), eye closure (AU43)
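These action units are the basis of the Prkachin & Solomon Pain Intensity (PSPI) metric; a minimal sketch of the scoring, assuming AU intensities have already been extracted on the standard 0-5 FACS scale (AU43 is binary):

```python
def pspi(au4, au6, au7, au9, au10, au43):
    """Prkachin & Solomon Pain Intensity from FACS AU intensities.

    au4, au6, au7, au9, au10: intensities on the 0-5 FACS scale.
    au43: eye closure, binary (0 or 1).
    Returns a score in the range 0-16.
    """
    return au4 + max(au6, au7) + max(au9, au10) + au43

# Example: strong brow lowering with orbital tightening and eye closure
print(pspi(au4=4, au6=3, au7=2, au9=1, au10=0, au43=1))  # → 9
```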

2
Q

How could automated face analysis aid in the diagnosis and treatment of depression
and psychological distress?

A

1) facial expression & other nonverbal communication
⇒ indicators of disorder severity & response to treatment
2) depressed individuals:
- look less at conversation partners
- gesture less
- smile less, with more smile-suppressor movements
- show less facial animation
3) these findings have been replicated using AFA ⇒ useful for screening efforts in mental health

3
Q

What does a polygraph do?

A

Monitors uncontrolled changes in heart rate & electro-dermal response

4
Q

What are the limitations of the polygraph?

A

1) must be continuously connected to the subject's body
2) requires accurate calibration to establish baseline measurements
3) an overt system, i.e., the subject is aware of it
4) requires a trained operator ⇒ risk of human error, and interview length is limited because operators tire

5
Q

How can thermal imaging address the limitations of the polygraph?

A

1) Computes the mean temperature of the 10% hottest pixels within the periorbital region of interest (ROI)
- i.e., the mean temperature of the vasculature in the inner corners of the eyes
2) The resulting signal consists of:
- a low-frequency component indicative of the long-term trend of blood-flow levels
- a mid-frequency component associated with temporary disturbances in blood flow caused by stress in specific Question & Answer (Q&A) sessions
- a high-frequency component caused by tracker instability & systemic noise
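A minimal sketch of separating these three components with moving-average smoothing; the window sizes and the synthetic temperature signal are illustrative assumptions, not values from the lecture:

```python
import numpy as np

def decompose(signal, slow_win=201, mid_win=21):
    """Split a periorbital temperature series into low/mid/high-frequency
    components using two moving-average smoothers (window sizes illustrative)."""
    def smooth(x, win):
        kernel = np.ones(win) / win
        return np.convolve(x, kernel, mode="same")
    low = smooth(signal, slow_win)            # long-term blood-flow trend
    mid = smooth(signal, mid_win) - low       # stress-related disturbances
    high = signal - low - mid                 # tracker instability & noise
    return low, mid, high

t = np.linspace(0, 60, 1200)                  # 60 s of samples at 20 Hz
temp = 35 + 0.01 * t + 0.2 * np.sin(2 * np.pi * 0.1 * t) \
       + 0.05 * np.random.randn(t.size)
low, mid, high = decompose(temp)
# The three components sum back to the original signal by construction
```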

6
Q

How do stress and deceptive behaviour manifest during interrogation?

A

Peripherally, through various physiological signatures, e.g., perspiration, pulse & breathing rate; this is the basis of the polygraph

7
Q

How does thermal imaging of facial physiology work?

A

1) Increased blood perfusion in the orbital muscles correlates with stress levels
2) Periorbital perfusion can be quantified by processing thermal video, based on:
- skin temperature being modulated by superficial blood flow
- heat convected by blood flow in the ophthalmic arteriovenous complex being responsible for elevated temperature w.r.t. the rest of the periorbital region
- the supply of additional blood to the eye muscles being realised through this complex
- ⇒ monitor this conduit in the eye corners to detect stress

8
Q

How can the video-based detection of head motion, facial expression and body
motion be used to recognise deception?

A

1) tracks movements of the subject's hands & head relative to their body
2) analyses selected facial expressions & 3D head pose ⇒ motion profile
3) all motion is histogrammed into 5 bins, each bin having an exponentially increasing size,
i.e., bin 1 covers a very small range; bin 5 covers the largest range
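The binning scheme can be sketched as follows; the bin edges are illustrative assumptions, and only the exponential growth of bin size is taken from the description above:

```python
import numpy as np

# Exponentially growing bin edges: each bin spans ~3x the previous range.
# The edge values are illustrative, not from the original system.
edges = np.array([0.0, 1.0, 3.0, 9.0, 27.0, np.inf])

def motion_histogram(magnitudes):
    """Histogram motion magnitudes into 5 exponentially sized bins,
    normalised to form a motion profile."""
    hist, _ = np.histogram(magnitudes, bins=edges)
    return hist / hist.sum()

profile = motion_histogram([0.2, 0.5, 2.0, 4.0, 10.0, 0.1, 30.0, 0.3])
print(profile)  # most mass falls in bin 1 (small movements)
```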

9
Q

How can interpersonal coordination between a mother and an infant be monitored?

A

1) the pattern of association for head motion and AUs between mothers & infants is non-stationary (i.e., the mean & variance of the underlying process are not constant)
2) similar non-stationarity appears in the head-pose coordination of distressed intimate adults
3) head amplitude and velocity for pitch (nod) & yaw (turn) are strongly correlated between partners, with alternating periods of instability (low correlation) followed by brief stability in which one or the other partner led

10
Q

Describe windowed cross correlation between mother and infant head-pitch amplitude

A

1) Area above midline (Lag > 0) - relative magnitude of correlations for which the
mother’s head amplitude predicts her infant’s
2) Area below midline (Lag < 0) - the converse
3) Midline (Lag = 0) - both partners change their head amplitudes at the same time
4) Positive correlations (red) - head amplitudes of both partners change in the
same way (i.e., increasing together or decreasing together)
5) Negative correlation (blue) - head amplitudes of both partners change in the
opposite way (e.g., head amplitude of one partner increases as that of the other
partner decreases)
6) Direction of the correlations changes dynamically over time

11
Q

How can automated face analysis be used in marketing?

A

Using web-cam technology to record thousands of viewers in dozens of countries, and to process their facial
expression to infer liking or disliking of commercials &
products

12
Q

How can automated face analysis be used in instructional technology?

A

1) Interest, confusion, rapport, frustration, and other emotional & cognitive-emotional states are process variables in classrooms & in tutoring
2) Requires the ability to distinguish between closely related facial actions that signal a student's cognitive-emotional states

13
Q

How can automated face analysis be used in computational behavioural science?

A

1) In conversation, expectations about another person’s identity are closely involved with his or her actions
2) Over telephone, inferences are made from the sound of the voice about the other person’s gender, age & background
3) An individual has a characteristic & unified appearance,
head motions, facial expressions & vocal inflection

14
Q

How can automated face analysis be used in media arts?

A

1) Widely used in the entertainment industry
2) Movies, e.g., Avatar & Hobbit
3) Gaming, e.g., Sony’s Everquest II

15
Q

How does the body better communicate some affective expressions?

A

Body posture communicates the cause of a threat & the ensuing action, whereas the face communicates only the threat

16
Q

Discuss the advantages and disadvantages of optical motion capture

A

Advantages:
1) accurate numeric representation of the body in 3D space
2) anonymous data
Disadvantages:
3) not portable
4) marker occlusion
5) high cost

17
Q

Electromechanical motion capture system

A

1) uses potentiometers on plastic exoskeleton that the subject wears
2) exoskeleton tracks human joints &
angles between body segments to
reconstruct the 3D body

18
Q

Electromagnetic motion capture system

A

1) electromagnetic sensors placed on body
2) measures orientation & position of sensors relative to electromagnetic
field generated by a transmitter to reconstruct the 3D body
3) no problems with occlusion, but susceptible to EM interference

19
Q

Benefits of electromechanical and electromagnetic systems

A

1) accurate numeric representation of
body in 3D space
2) anonymous data ⇒ privacy
3) portable ⇒ can be used in almost any setting, indoors or outdoors
4) much cheaper than optical motion capture system

20
Q

Markerless vision-based systems

A

1) use video or web cameras to record movement
2) no mobility issues
3) non-intrusive
4) but environmental conditions pose challenges: variations in lighting, skin colour & clothing; body-part occlusion or touching

21
Q

Describe the operation principles of the Microsoft Kinect sensor, and how it could be used to recognise body expressions.

A

1) a depth sensor & a colour camera
2) depth sensor: IR projector & IR camera (a monochrome CMOS sensor):
- based on structured light principle
- IR projector - IR laser that passes through a diffraction grating & turns into a set of IR dots
- known relative geometry between IR projector, IR camera & projected IR dot pattern
- compute depth map using 3D triangulation of corresponding dots in
image & projector pattern
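The final step reduces to stereo triangulation between the projector and the IR camera; a minimal sketch, where the baseline and focal length are illustrative (roughly Kinect-v1-like) values, not specification figures:

```python
def depth_from_disparity(disparity_px, baseline_m=0.075, focal_px=580.0):
    """Depth of an IR dot from the disparity between its observed image
    position and its position in the reference projector pattern.

    baseline_m: projector-camera separation (~7.5 cm, illustrative).
    focal_px:   IR camera focal length in pixels (illustrative).
    """
    return baseline_m * focal_px / disparity_px

# A dot shifted 21.75 px corresponds to a surface about 2 m away
print(round(depth_from_disparity(21.75), 2))  # → 2.0
```

Nearer surfaces shift the dots more, so depth falls off as the inverse of disparity.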

22
Q

What are the cross-cultural similarities and differences in perceiving affect through
the human body?

A

1) differences due to social status being more important to Japanese (than American)
2) similarities found: sad/depressed postures (Japanese, Sri Lankan, American)
3) differences in assigning intensity ratings to emotions:
Japanese assigned higher intensity (Japanese, Sri Lankan, American)
4) dimensions required for seated posture:
- Japanese - 3 (arousal, valence, & dominance)
- British - 2 (arousal, valence)

23
Q

Discuss the usefulness of body features for emotion recognition

A

1) motion signals alone are sufficient for recognition
2) recognition accuracy is impaired when form information is disrupted, e.g., by inverting or reversing the motion
3) posture cues aid in discriminating between emotions with similar dynamic cues or movement activation
4) dynamic information is complementary & partially redundant to form
5) it is possible to classify numerous affective behaviours using only upper-body features
6) meaningful groups of emotions could be clustered in one of the 4 quadrants of the valence/arousal plane (Glowinski 2011)

24
Q

State the groups of body expressions in the Dael, Mortillaro & Scherer (2012) coding system that are useful in distinguishing between emotions perceived via body action and posture.

A

1) Head orientation
2) Head posture, e.g., lateral head turn towards left position
3) Trunk orientation, i.e., facing & averted
4) Trunk posture, e.g., trunk lean towards a forward position
5) Whole body posture, e.g., whole body moves or leans towards a backward position
6) Arms posture, e.g., left arm at side
7) Gaze, i.e., toward, upward, downward, averted sideways, eyes
closed
8) Head action, e.g., upward head tilt
9) Trunk action, e.g., spine bending
10) Arm action, e.g., left arm action away from the body
11) Other, i.e., touch, knee bend & leg movement
12) Action function, e.g., beat (i.e., repetitive action that accentuates
points in time, illustrating structural or rhythmic aspects of co-occurring speech)

25
Q

Cues for distinguishing emotions (Coulson 2004)

A

1) using avatars & body description of 6 joint rotations (e.g.,
abdomen twist, head bend, etc.)
2) ⇒ agreement between observers for angry, happy & sad postures

26
Q

Cues for distinguishing emotions (De Meijer 1989)

A

1) considered 7 movement dimensions (e.g., arm opening & closing, fast to slow velocity of movement, etc.)
2) rated dancers’ movements according to their compatibility with 9 emotions
3) ⇒ trunk movement is most predictive for all emotions except anger
4) ⇒ distinguish between positive & negative emotions

27
Q

Cues for distinguishing emotions (De Silva & Bianchi-Berthouze 2004)

A

1) used 24 features to describe upper-body joint positions & orientation of shoulders, head & feet to analyse affective postures
2) 2 to 4 principal components (PCA) covered ≈ 80% of the variability in form
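The PCA step can be sketched with NumPy; the synthetic 24-feature posture data below is an illustrative stand-in for the real dataset:

```python
import numpy as np

def explained_variance(X):
    """Fraction of total variance captured by each principal component,
    computed via SVD of the mean-centred data matrix."""
    Xc = X - X.mean(axis=0)
    _, s, _ = np.linalg.svd(Xc, full_matrices=False)
    var = s ** 2
    return var / var.sum()

rng = np.random.default_rng(0)
# 100 synthetic postures, 24 features driven by 3 latent factors plus noise
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 24)) + 0.1 * rng.normal(size=(100, 24))
ratios = explained_variance(X)
print(np.cumsum(ratios)[:4])  # the first few components dominate
```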

28
Q

Cues for distinguishing emotions (Roether 2009)

A

1) extracted & validated the minimal set of spatiotemporal motor primitives that drive the perception of particular emotions in gait
2) by creating walking patterns that reflect these primitives ⇒ emotions are perceived via specific changes of joint-angle amplitudes

29
Q

Considerations for building ground truth

A

1) Choose an emotion framework:
- discrete model: most widely adopted
- continuous model: more comprehensive description
2) Define the labelling process:
- the expresser's self-reported affective label is often not feasible or reliable
- use experts or naive observers to label the affective state conveyed by a body expression, but expect high variability, as there is no set of rules to apply and muscle activation is often not directly visible (clothing)
- to address variability:
  - most-frequent-label method: assign each body expression the label most frequently used by the observers; low cost, easy, very useful when the level of variability between observers is low
  - weight each label by the observer's ability to read others' emotions (evaluated via empathy-profile questionnaires)
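A minimal sketch of the most-frequent-label method, with the empathy-based weighting included as an optional argument; the labels and weight values are illustrative:

```python
from collections import Counter

def most_frequent_label(labels, weights=None):
    """Aggregate observer labels for one body expression.

    labels:  list of labels given by the observers.
    weights: optional per-observer reliability weights (e.g., derived from
             an empathy-profile questionnaire); defaults to equal weight.
    """
    weights = weights or [1.0] * len(labels)
    tally = Counter()
    for label, w in zip(labels, weights):
        tally[label] += w
    return tally.most_common(1)[0][0]

print(most_frequent_label(["fear", "anxiety", "fear"]))      # → fear
# With weighting, a reliable observer can outvote two others:
print(most_frequent_label(["fear", "anxiety", "anxiety"],
                          weights=[2.5, 1.0, 1.0]))          # → fear
```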

30
Q

Describe one application of automated body expression recognition system.

A

EMO & Pain project:
- detects emotional states related to fear of movement (e.g., anxiety, fear of pain, fear of injury) in people with chronic musculoskeletal pain, with a 70% recognition rate
- helps personalise the type of technology support used to motivate the patient to do physical activity
- makes the patient more aware

31
Q

Four applications of automated body recognition system

A

1) Recognising basic emotions from dance sequences
2) assess the engagement level of children playing chess with iCat robot
3) uses upper body expressions & head movement to identify
people suffering from depression
4) detects emotional states that are related to fear of
movement

32
Q

Speech production mechanism

A

1) Vocal tract:
- vocal cords, various articulators (e.g. jaw, tongue), lips
- repositioning of articulators
changes vocal tract shape ⇒
changes acoustic characteristics of the speech sound
2) Nasal tract:
- velum to nostril
- coupled to vocal tract by velum to produce nasal sound
3) Chest cavity - expands & contracts to force air from lungs
4) Vocal cords:
- vibrate & modulate air into discrete puffs or broad-spectrum pulses
when tensed
- air is unaffected when relaxed

33
Q

How is voiced sound generated in the speech production mechanism?

A

1) Force air through larynx, with tension of the vocal cords adjusted so that
they vibrate in a relaxed oscillation
2) ⇒ Quasi-periodic pulses of air which
are acoustically filtered
3) Voiced sounds are generated when a stream of air is forced to flow through tensed vocal cords
⇒ vocal cords vibrate
⇒ broad-spectrum pulses
4) Variation of the cross-sectional area along the vocal tract determines the resonant modes of the vocal tract, i.e., formants
5) Vowels are generated when the tongue makes a hump with the upper palate of the mouth cavity, and constriction is exerted on the hump

34
Q

How is unvoiced sound generated in the speech production mechanism?

A

1) Produced by turbulence, as air is forced through a constriction at some
point in the vocal tract
2) Noise-like quality
3) Smaller in amplitude
4) Oscillates much faster than voiced speech
5) produced by exciting the vocal tract with a steady airflow which becomes turbulent at a constriction along the vocal tract, location of which determines sound

35
Q

Comparison of voiced and unvoiced segment

A

1) voiced - highly periodic; the fundamental period (about 8.5 ms) is the pitch period
2) unvoiced - noisier waveform, smaller amplitude
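The periodicity contrast suggests a simple autocorrelation-based voiced/unvoiced test; a minimal sketch with an illustrative threshold and synthetic signals:

```python
import numpy as np

def is_voiced(frame, threshold=0.5):
    """Classify a speech frame as voiced if its normalised autocorrelation
    has a strong peak at a non-zero lag (i.e., the frame is periodic)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / ac[0]                      # normalise so the lag-0 value is 1
    return ac[20:].max() > threshold     # skip very short lags

fs = 8000
t = np.arange(0, 0.03, 1 / fs)           # one 30 ms frame
voiced = np.sin(2 * np.pi * 120 * t)     # 120 Hz ⇒ pitch period ≈ 8.3 ms
rng = np.random.default_rng(0)
unvoiced = rng.normal(size=t.size)       # noise-like, aperiodic
print(is_voiced(voiced), is_voiced(unvoiced))  # → True False
```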

36
Q

Source-filter theory (Fant 1970)

A

1) speech production consists of source activities (which generate airflow), e.g. the vocal cords; variation in pitch (frequency of vibration), intensity (airflow pressure) & voice-quality dynamics (degree of aperiodicity in the glottal cycle)
2) and vocal tract shape filtering, which modulates airflow
3) air stream through vocal cord is modulated by articulators ⇒ spectral changes in speech signal
4) Interaction & interplay between voice source activities & articulatory controls also contribute to speech sound modulation
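The source-filter idea can be sketched as a glottal pulse train (source) passed through a two-pole resonator standing in for one vocal-tract formant; the sampling rate, pitch, formant frequency and bandwidth are illustrative assumptions:

```python
import numpy as np

fs = 8000                                # sampling rate (Hz)
f0 = 120                                 # source: pitch (vocal-cord rate)
formant, bw = 700, 100                   # filter: one illustrative formant

# Source: glottal pulse train at the fundamental frequency (100 ms of signal)
n = np.arange(int(0.1 * fs))
source = np.zeros(n.size)
source[::fs // f0] = 1.0

# Filter: two-pole resonator modelling one vocal-tract resonance
r = np.exp(-np.pi * bw / fs)             # pole radius from bandwidth
theta = 2 * np.pi * formant / fs         # pole angle from formant frequency
a1, a2 = 2 * r * np.cos(theta), -r ** 2
speech = np.zeros(n.size)
for i in range(n.size):
    speech[i] = source[i] + a1 * speech[i - 1] + a2 * speech[i - 2]
# The output spectrum now peaks near the formant, not at the pitch
```

The source sets the harmonic spacing (pitch); the filter reshapes the spectral envelope, which is exactly the separation the theory describes.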

37
Q

Pitch

A

Rate of vibration of the vocal cords (i.e., fundamental frequency); inversely proportional to the size/length of the vocal cords ⇒ children & women produce higher-pitched sounds

38
Q

Intonation

A

1) a means of conveying information independent of the words & their sound
2) realised through modulation of pitch

39
Q

How can speaker and recording variability be removed?

A

1) Using z-score normalisation (L16 p3 slide 9)
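A minimal sketch of z-score normalisation applied per speaker, so that speaker-specific offset and scale are removed; the pitch values are illustrative:

```python
import numpy as np

def z_normalise(features):
    """Z-score normalisation: subtract the mean and divide by the standard
    deviation, removing speaker- and recording-specific offsets and scales."""
    features = np.asarray(features, dtype=float)
    return (features - features.mean(axis=0)) / features.std(axis=0)

pitch_speaker_a = [180, 190, 200, 210, 220]   # higher-pitched speaker
pitch_speaker_b = [100, 105, 110, 115, 120]   # lower-pitched speaker
za = z_normalise(pitch_speaker_a)
zb = z_normalise(pitch_speaker_b)
print(np.allclose(za, zb))  # → True: same relative contour after normalisation
```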