09 Social Signal Processing Flashcards

Question 1

Q

Define “Social Intelligence” for IT systems! (1 sentence)
Which parts of your definition apply to the field of Multi-Agent Systems and which parts
are related to Social Signal Processing?

Answer

A

The ability to express social signals/social behaviors and to recognize those of other
human and IT-agent individuals in order to “function” in a society of humans and
IT agents, with a view to (Pareto-)optimizing one's own, other IT agents', and
fellow humans' utility functions (survival, reproduction, …) via cooperation.
green -> Social Signal Processing
blue -> Multi-Agent Systems

Question 2

Q

Characterize Reality Mining! (1 sentence) What is the relation between Reality Mining and
Social Signal Processing?

Answer

A

Reality Mining analyzes all available traces of human behavior (social and
nonsocial) and derives models of this behavior to gain scientific knowledge and
build applications (e.g. predictions).
Reality Mining may use SSP techniques.

Question 3

Q

Name 3 examples for social signals/social behavior and name 3 examples for behavioral
cues (S)

Answer

A

Social Signals (Expressing attitude towards elements of a social setting):
mirroring (if mutual attraction)
aggressive turn taking behavior
expression of disapproval of something (e.g. via disapproving looks)
expression of sympathy/empathy
Behavioral Cues:
facial expressions
body posture / interaction geometry
gestures
expressives (laughter, …)
emotions reflected in speech prosody (rhythm, intonation, stress)

Question 4

Q

Define behavioral cue! (1 sentence) What is the relation between social signals and
behavioral cues?

Answer

A

Behavioral Cues are (series of/parallel/overlapping/single/…) time-series of
perceivable or measurable non-verbal physiological activity.
Multiple behavioral cues (vocal behavior, posture, mutual gaze, interpersonal
distance, …) combine to produce a social signal.

Question 5

Q

What is prosody? (1 sentence)

Answer

A

Prosody is the quality of the voice when someone speaks (how something is said
rather than what is said), e.g. pitch, tempo, energy, …
It is often used for social signal detection from audio.
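
A minimal Python sketch of two of the prosodic features named above (energy and pitch), assuming NumPy is available. The function names, the 75 to 400 Hz pitch search range, the frame sizes, and the synthetic 200 Hz test tone are illustrative assumptions and not part of the course material; tempo is not covered here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def pitch_autocorr(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame from the
    autocorrelation peak inside the allowed lag range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)
    lag_max = int(sr / fmin)
    lag = lag_min + np.argmax(ac[lag_min:lag_max + 1])
    return sr / lag

# Synthetic stand-in for voiced speech: a 200 Hz tone, 1 second at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 200.0 * t)

frames = frame_signal(tone, frame_len=640, hop=320)   # 40 ms frames, 20 ms hop
energy = short_time_energy(frames)                    # one value per frame
f0 = np.array([pitch_autocorr(f, sr) for f in frames])
```

The per-frame `energy` and `f0` tracks are the kind of time series a social signal detector could take as input; real speech would additionally need voiced/unvoiced handling, which this sketch omits.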

Question 6

Q

For SSP: What is the advantage of unconscious social signals vs. conscious social
signals? (1 sentence)

Answer

A

Unconscious signals are honest signals: they allow the actual/true state/social
attitude to be deduced, whereas conscious signals can be faked more easily.

Question 7

Q

Facial expressions: What are Action Units (AUs)? (1 sentence)

Answer

A

Action Units represent the smallest discernible facial movements, which are used
in the Facial Action Coding System (FACS) to describe facial expressions.

Question 8

Q

Name the 6 basic emotions (after Ekman)!

Answer

A

fear, sadness, happiness, anger, disgust, surprise

Question 9

Q

Vocal Behavior: What are Linguistic Vocalizations and Non-Linguistic Vocalizations? (For
each: 1 sentence plus 1 example) What is Backchanneling? (1 sentence)

Answer

A

Linguistic vocalizations (or segregates) are non-words used in place of words:
e.g. a prolonged “uhhhm” -> embarrassment/feeling uncomfortable in a social situation
Non-linguistic vocalizations are other non-verbal vocal sounds used as social signals
to express boredom, sexual interest, anxiety, …
e.g. laughter, crying, groaning
Backchanneling means that, during a conversation, listeners respond to what
is being said in a verbal or non-verbal way to signal their attention,
understanding, agreement, etc. (nodding, “yeah”, “hmmm”, …)

Question 10

Q

Vocal behavior: Name and explain in 1 short sentence each three classes of silence!

Answer

A

Hesitation silence: occurs when the speaker hesitates, e.g. while explaining
difficult concepts
Psycholinguistic silence: occurs when the speaker has difficulties encoding or
decoding language
Interactive silence: used to express respect or doubt, to ignore people, or to draw
attention to other forms of communication (e.g. gazes)

Question 11

Q

Name and explain in 1 sentence each 3 steps/sub-problems of Speaker Diarization!

Answer

A

Step 1: Segmentation into speech/non-speech
First, features are extracted by digital signal processing: Fourier and other
transforms plus mel filters yield mel-frequency cepstral coefficients (MFCCs).
Then several trained binary classifiers distinguish between speech and
non-speech on the computed features.
Step 2: Detection of speaker transitions
The speech parts are split into segments.
Statistical methods then decide whether two segments belong to the same
speaker, or whether one interval contains one or two speakers.
Step 3: Clustering of segments
The segments are clustered, e.g. with hierarchical bottom-up clustering that
merges the segments with the most similar models (Gaussians) and cuts the
dendrogram at maximum likelihood.
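
Steps 2 and 3 can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the method from the lecture: each segment is modeled by a single full-covariance Gaussian, the "same speaker?" decision uses a delta-BIC test, and the same test serves as a stopping rule that stands in for cutting the dendrogram at maximum likelihood. The synthetic 4-dimensional feature vectors are hypothetical stand-ins for MFCC frames.

```python
import numpy as np

def logdet_cov(x):
    """Log-determinant of the full covariance of the feature rows x (n, d)."""
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])
    return np.linalg.slogdet(cov)[1]

def delta_bic(a, b, lam=1.0):
    """Compare one shared Gaussian against two separate Gaussians.
    Positive: two speakers are more plausible. Negative: one speaker."""
    both = np.vstack([a, b])
    n, d = both.shape
    gain = 0.5 * (n * logdet_cov(both)
                  - len(a) * logdet_cov(a)
                  - len(b) * logdet_cov(b))
    penalty = 0.5 * lam * (d + d * (d + 1) / 2) * np.log(n)
    return gain - penalty

def cluster_segments(segments, lam=1.0):
    """Bottom-up clustering: repeatedly merge the most similar pair
    (lowest delta-BIC) until no pair looks like the same speaker."""
    clusters = [[i] for i in range(len(segments))]
    data = [np.asarray(s) for s in segments]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(data[i], data[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:          # every remaining pair looks like two speakers
            break
        clusters[i] += clusters[j]
        data[i] = np.vstack([data[i], data[j]])
        del clusters[j], data[j]
    return clusters

# Synthetic "feature frames": speakers A and B differ in their mean.
rng = np.random.default_rng(0)

def fake_segment(mean, n=200, d=4):
    return rng.normal(loc=mean, scale=1.0, size=(n, d))

segments = [fake_segment(0.0), fake_segment(5.0),
            fake_segment(0.0), fake_segment(5.0)]   # A, B, A, B
clusters = cluster_segments(segments)
```

On this toy data the segments with indices 0 and 2 should end up in one cluster and 1 and 3 in the other. Recomputing all pairwise scores in every iteration keeps the sketch short but would be too slow for long recordings.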

Question 12

Q

Coarsely define optical flow and derive the optical flow equation!

Answer

A

Optical flow is the apparent motion pattern of pixels between frames, represented
by a vector field of velocity V(x, y, t) of the image intensity.
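
The derivation asked for in the question is not preserved on this card. The standard textbook derivation is sketched below, assuming brightness constancy (a moving pixel keeps its intensity) and small displacements; the symbols I for intensity and (u, v) for the components of V are notation introduced here.

```latex
\begin{align*}
  % Brightness constancy: the intensity of a moving point does not change
  I(x, y, t) &= I(x + \delta x,\; y + \delta y,\; t + \delta t) \\
  % First-order Taylor expansion of the right-hand side
  I(x + \delta x,\; y + \delta y,\; t + \delta t)
    &\approx I(x, y, t)
      + \frac{\partial I}{\partial x}\,\delta x
      + \frac{\partial I}{\partial y}\,\delta y
      + \frac{\partial I}{\partial t}\,\delta t \\
  % Subtract I(x, y, t), divide by \delta t, let \delta t \to 0
  \frac{\partial I}{\partial x}\,u
    + \frac{\partial I}{\partial y}\,v
    + \frac{\partial I}{\partial t} &= 0,
  \qquad V = (u, v) = \left(\frac{dx}{dt}, \frac{dy}{dt}\right) \\
  % Compact form
  \nabla I \cdot V &= -\frac{\partial I}{\partial t}
\end{align*}
```

This is one equation in two unknowns (u, v), so additional constraints such as local smoothness are needed to solve for the flow.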

Question 13

Q

What is the role of context in Social Signal Processing?

Answer

A

Behavioral cues can have different meanings in different outer contexts, so
context is needed to interpret them correctly.
Multi-modal combination/fusion of social signals (e.g. audio and interaction
geometry)
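
One simple way to combine modalities is late fusion, sketched below in Python with NumPy. This is an illustrative assumption, not necessarily the fusion scheme meant on the card: the labels, the per-modality probabilities, and the weights are all made up, and each modality is assumed to already output class probabilities.

```python
import numpy as np

def late_fusion(prob_by_modality, weights):
    """Weighted average of the class probabilities from each modality."""
    total = sum(weights.values())
    fused = sum(weights[m] * np.asarray(p, dtype=float)
                for m, p in prob_by_modality.items())
    return fused / total

labels = ["agreement", "disagreement"]            # hypothetical classes
probs = {
    "audio": [0.6, 0.4],                          # e.g. from prosody
    "geometry": [0.2, 0.8],                       # e.g. from interaction geometry
}
weights = {"audio": 1.0, "geometry": 1.0}         # equal trust in both

fused = late_fusion(probs, weights)
decision = labels[int(np.argmax(fused))]
```

With equal weights the fused probabilities are the plain average, so the geometry cue outweighs the weaker audio cue in this toy case; context could be modeled by adjusting the weights per situation.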

(13 cards)