Week 10 Flashcards
What are the prerequisites for collecting a multimodal dataset and what are the
challenges involved?
1) quantity of data
2) high diversity w.r.t. subjects’ age, gender & culture, and situational context
3) balanced distribution of instances among classes, or along the range (for continuous models)
4) quality of data (i.e., adequate, realistic & naturalistic)
5) adhering to ideal capture conditions
Current state of multimodal datasets
1) smaller in size than unimodal datasets
2) more often recorded in the lab; most are bimodal (audiovisual)
3) physiological measures, or speech alongside depth images, are becoming available
What ethical issues must be addressed before creating a multimodal dataset?
1) affect can be very private ⇒ subjects might not always
agree with making genuine & spontaneous affect data
available for study, especially with video & audio
2) moral principles guiding research:
- how ethical issues influence selection & conduct
- subjects are informed & provide consent ⇒ might reduce spontaneity and naturalness of the data
3) different levels of release for different contained modalities
4) whether the research will be beneficial to subjects
5) subjects should not be harmed
Challenges of multimodal data collection
1) To obtain naturalistic display of affects
2) Complex setups for multimodal recordings require careful control of lab conditions; observer's paradox: the presence of the experimenter & awareness of being recorded may influence the subject
3) Synchronisation of multimodal capture streams across different devices, timescales & sampling rates
4) Sufficient number of independent labellers (or self-labelling):
- not all modalities’ recorded data are sufficiently informative
for human labellers to make affect judgments
- may require self-assessment ⇒ disruptive w.r.t. an
awareness of being in an experiment
10 steps to consider for multimodal datasets:
1) Step 1: Consider ethics to guide the data collection.
2) Step 2: Consider the type of new data and possible reuse of existing material.
3) Step 3: Consider collecting meta information, including demographic data.
4) Step 4: The challenges in collecting data from multiple devices.
5) Step 5: The choice of model or models, and temporal unit of analysis.
6) Step 6: The labelling method, for separate modalities or in combination.
7) Step 7: Standardising to foster compatibility of the meta data & the annotation.
8) Step 8: Partitioning data for modelling, optimising & testing.
9) Step 9: Verifying perception and baseline results, conducted individually per modality or for modality combinations.
10) Step 10: Releasing the data for the widest spread and usage.
What considerations should be made when choosing the appropriate model/s for
a multimodal dataset?
1) emotion model: continuous or categorical (influenced by modalities)
2) temporal unit of analysis:
- physiological measures & video - annotated on a per-frame basis
- acoustic parameters - extracted over larger chunks, e.g., words or turns (a turn is a time during which a subject speaks)
3) compromise: annotate continuously both in dimension (e.g., arousal & valence) and in time (e.g., every 100 ms):
- allows diverse mappings later, e.g., averaging over a certain chunk
4) use multiple models:
- enriches flexibility of database
- requires considerable extra effort
- could be applied for modality-specific annotation
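The chunk-averaging compromise in item 3 can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name, the 100 ms step, the arousal values and the turn boundaries are all hypothetical.

```python
from statistics import mean

def average_over_chunks(values, step_ms, chunks_ms):
    """Average a regularly sampled annotation trace over (start_ms, end_ms) chunks.

    Sample i is assumed to be taken at time i * step_ms. A chunk covers
    start_ms <= t < end_ms. Chunks without samples yield None.
    """
    means = []
    for start_ms, end_ms in chunks_ms:
        in_chunk = [v for i, v in enumerate(values)
                    if start_ms <= i * step_ms < end_ms]
        means.append(mean(in_chunk) if in_chunk else None)
    return means

# Hypothetical arousal trace annotated every 100 ms (1 s in total)
arousal = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
# Hypothetical speaker turns, in milliseconds
turns_ms = [(0, 500), (500, 1000)]

turn_means = average_over_chunks(arousal, 100, turns_ms)
print(turn_means)
```

Keeping the fine-grained trace and deriving turn-level values afterwards lets the same annotation serve both frame-based modalities and chunk-based acoustic features.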
What considerations should be made when recording/re-using for a multimodal dataset?
1) recording new data
2) reusing existing material: usually only sparsely available (especially for multimodal)
3) data cover acted, induced & naturalistic emotions
4) increasing use of mobile & wearable devices for naturalistic data
What considerations should be made when synchronising streams for a multimodal dataset?
1) audio & video - a challenge if using several microphones & cameras
2) worn physiological devices not routed via same computer
3) use aligned time stamps or markers for later synchronisation; markers may need to be repeated during a take (or trial) to compensate for temporal deviations
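One possible way to use repeated markers is sketched below, assuming the deviation between two clocks is roughly linear (a constant offset plus a constant drift). The marker times, the 0.1 % drift, the 4 Hz wearable signal and the 25 fps video are hypothetical; a linear fit is only one option, not the method the notes prescribe.

```python
import numpy as np

def fit_clock_map(marker_device_t, marker_ref_t):
    """Fit ref_t ~ slope * device_t + offset from markers seen on both clocks."""
    slope, offset = np.polyfit(marker_device_t, marker_ref_t, deg=1)
    return slope, offset

def to_reference_time(device_t, slope, offset):
    """Map device time stamps onto the reference clock."""
    return slope * np.asarray(device_t, dtype=float) + offset

# Hypothetical markers: the wearable clock starts 2 s late and runs 0.1 % fast
marker_ref_t = np.array([10.0, 70.0, 130.0])       # seconds on the reference clock
marker_device_t = (marker_ref_t - 2.0) * 1.001     # same markers on the device clock
slope, offset = fit_clock_map(marker_device_t, marker_ref_t)

# Hypothetical physiological signal sampled at 4 Hz on the device clock
device_t = np.arange(0.0, 140.0, 0.25)
signal = np.sin(device_t)
signal_ref_t = to_reference_time(device_t, slope, offset)

# Resample the signal at video frame times (25 fps) on the reference clock
frame_t = np.arange(10.0, 130.0, 0.04)
signal_at_frames = np.interp(frame_t, signal_ref_t, signal)
```

With only one marker the offset can be estimated but not the drift, which is why repeating markers during a take helps.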
What considerations should be made when labelling for a multimodal dataset?
1) not all modalities can be easily annotated by a human rater, e.g., physiological signals
2) self-assessment is not always an option:
- ⇒ several external labellers serve as “expertise of the mass”
- e.g., by majority voting or by taking mean & median (for continuous emotion models)
- number of labellers proportional to level of subjectivity or ambiguity of the labelling, & the complexity of the model
3) multimodal data can be annotated modality-wise or in combination:
- acoustic & physiological data - better in conveying arousal
- video or textual data - well suited to convey valence
- not all modalities are necessarily present at all times, e.g., speech
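The aggregation options in item 2 (majority voting for categories, mean or median for continuous labels) can be sketched as below. The labels are hypothetical, and returning None on a tie is an assumption; the notes do not say how ties should be handled.

```python
from collections import Counter
from statistics import mean, median

def majority_vote(labels):
    """Return the most frequent categorical label, or None if the top two tie."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # assumption: ties are left unresolved
    return counts[0][0]

# Hypothetical labels from five external labellers for one instance
categorical = ["happy", "happy", "neutral", "happy", "sad"]
continuous = [0.2, 0.4, 0.3, 0.9, 0.3]   # e.g., arousal ratings

vote = majority_vote(categorical)    # most frequent category
mean_label = mean(continuous)        # sensitive to the outlying 0.9
median_label = median(continuous)    # robust to the outlying 0.9
print(vote, mean_label, median_label)
```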
What considerations should be made when partitioning for a multimodal dataset?
1) divides data into partitions for modelling, optimising & testing
2) provides default or suggested form of partitioning, facilitates comparison of results & findings
3) development partitions in addition to training & testing partitions
4) use cross-validation to enable use of as much data as possible for all partitions
5) ensure independence of subjects, context, etc., e.g., by leaving out a subject or subject group at a time
6) keep good balance of all factors throughout the partitions
7) transparent & easy to reproduce, noting that random partitioning is
suboptimal
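Subject-independent partitioning (item 5) combined with cross-validation (item 4) can be sketched as leave-one-subject-out folds. The function name and subject IDs are hypothetical. The split is deterministic, so it is easy to reproduce.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical subject ID for each recorded instance
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))

for held_out, train, test in folds:
    print(held_out, "train:", train, "test:", test)
```

A development partition could be taken from the training indices in the same subject-wise way. Balancing other factors (item 6) is not handled in this sketch.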
What considerations should be made when verifying perception and baseline for a multimodal dataset?
1) independent perception test with individuals other than the annotators
2) conducted individually per modality or for modality
combinations
3) via crowd sourcing
4) include machine-based baseline recognition results
What is the role of the evaluator-weighted estimator in the creation of a multimodal dataset?
1) to reach rater-weighted gold standard
2) weighted average of individual evaluators’ responses, which takes into account that each evaluator is subject to an individual amount of disturbance during evaluation
3) weights measure the correlation between the individual annotator’s estimations & the average ratings of all evaluators
4) if the weights are constant among raters, the gold standard is the mean of the raters’ continuous labels
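A minimal sketch of the estimator described above, assuming each rater's weight is the Pearson correlation between that rater's labels and the mean labels of all raters, and that the weighted sum is normalised by the sum of the weights. The ratings are hypothetical. Constant raters (undefined correlation) and negative weights are not handled here.

```python
import numpy as np

def evaluator_weighted_estimator(ratings):
    """Return (gold_standard, weights) for ratings of shape (n_raters, n_instances)."""
    ratings = np.asarray(ratings, dtype=float)
    mean_rating = ratings.mean(axis=0)
    # Weight of each rater: correlation of their labels with the average labels
    weights = np.array([np.corrcoef(r, mean_rating)[0, 1] for r in ratings])
    gold = weights @ ratings / weights.sum()
    return gold, weights

# Hypothetical arousal ratings: raters 1 and 2 largely agree, rater 3 is noisier
ratings = [[0.1, 0.5, 0.9, 0.3],
           [0.2, 0.6, 0.8, 0.4],
           [0.3, 0.2, 0.7, 0.6]]
gold, weights = evaluator_weighted_estimator(ratings)
print(weights)   # the noisier rater receives the smallest weight
print(gold)
```

When all weights are equal the result reduces to the plain mean, as stated in item 4.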
Quality assessment for multimodal affect databases
1) Gold standard is practically never reliable:
- training & testing labels are ambiguous to a certain degree, as subject’s emotion is usually difficult to assess
- an emotion may not be mapped unambiguously to a single category or a point in space
2) Groundtruth - actual truth as measured
3) In interpreting results:
- the gold standard is only ideally the groundtruth ⇒ trained models that process affect data are error-prone; a classification error might not be so wrong in ambiguous cases
4) ⇒ use several annotators to achieve a reliable gold
standard close to the groundtruth
What methods measure reliability if affect is modelled continuously?
1) (mean) correlation coefficient (CC) or (average) mean linear/absolute error (MLE, MAE)
2) mean square error (MSE)
3) standard deviation
4) use correlation if using only one measure
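These measures can be sketched for a set of raters as below. Averaging over all rater pairs is an assumption; comparing each rater with the mean of the others would be another reasonable reading of "(mean)". The ratings are hypothetical.

```python
from itertools import combinations
import numpy as np

def pairwise_agreement(ratings):
    """Mean pairwise CC, MAE and MSE for ratings of shape (n_raters, n_instances)."""
    ratings = np.asarray(ratings, dtype=float)
    ccs, maes, mses = [], [], []
    for a, b in combinations(ratings, 2):
        ccs.append(np.corrcoef(a, b)[0, 1])   # Pearson correlation coefficient
        maes.append(np.mean(np.abs(a - b)))   # mean absolute error
        mses.append(np.mean((a - b) ** 2))    # mean square error
    return {"cc": float(np.mean(ccs)),
            "mae": float(np.mean(maes)),
            "mse": float(np.mean(mses))}

# Hypothetical valence ratings from three raters for five instances
ratings = [[0.1, 0.4, 0.6, 0.8, 0.3],
           [0.2, 0.5, 0.5, 0.9, 0.2],
           [0.0, 0.3, 0.7, 0.7, 0.4]]
scores = pairwise_agreement(ratings)
print(scores)
```

Correlation ignores a constant offset between raters, while MAE and MSE do not, so the measures can disagree.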
What method measures reliability if affect is modelled categorically?
Fleiss’ Kappa K (most frequently used):
- requires all raters to rate all data
- if labellers agree throughout ⇒ K equals 1
- if they agree only on the same level as chance would ⇒ K=0
- negative values ⇒ systematic disagreement
- values of 0.4 to 0.6 ⇒ moderate agreement
- values > 0.6 ⇒ good to excellent agreement
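A minimal sketch of Fleiss' Kappa computed from a table of counts, using the standard definition (observed agreement per instance, chance agreement from the overall category proportions). The input tables are hypothetical. The case where every rating falls into a single category (division by zero) is not handled.

```python
import numpy as np

def fleiss_kappa(counts):
    """counts[i, j] = number of raters who assigned instance i to category j."""
    counts = np.asarray(counts, dtype=float)
    n_items = counts.shape[0]
    n_raters = counts[0].sum()
    if not np.all(counts.sum(axis=1) == n_raters):
        raise ValueError("every instance must be rated by the same number of raters")
    # Proportion of all assignments that went to each category
    p_cat = counts.sum(axis=0) / (n_items * n_raters)
    # Agreement among rater pairs for each instance
    p_item = (np.sum(counts ** 2, axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_item.mean()         # mean observed agreement
    p_exp = np.sum(p_cat ** 2)    # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)

# Hypothetical: 3 raters, 2 categories, 4 instances, full agreement
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
# Hypothetical: 3 raters, 2 categories, 2 instances, raters split on both
split = [[2, 1], [1, 2]]

kappa_perfect = fleiss_kappa(perfect)
kappa_split = fleiss_kappa(split)
print(kappa_perfect, kappa_split)
```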
Why are ethical issues important in affective computing?
1) AC has profound moral significance because:
- it raises prospect of creating things that mimic human free will or impinge on it
- the public may feel that it is ethically unacceptable
2) Concerns to be countered involve moral principles:
- certain kinds of unnaturalness are bad
- a computer which seems to have emotions is unnatural
In what scenarios is it not necessary to obtain prior consent when recording subjects’ voices and images?
1) research consists solely of naturalistic observations in
public places, & the recording will not be used in a manner that could cause personal identification or harm
2) research design includes deception, & consent for use of recording is obtained during debriefing
Deception in research:
1) should not be conducted unless it is justified by the study’s significant prospective scientific, educational or applied
value, & there are no non-deceptive alternative procedures
2) do not deceive prospective participants about research that
is expected to cause physical pain or emotional distress
3) explain any deception that is an integral feature of an experiment to participants as early as is feasible, but no later than at the conclusion of the data collection, and permit participants to withdraw their data
In what scenarios is informed consent unnecessary?
1) Where research would not reasonably be assumed to
create distress or harm, & involves:
- study of normal educational practices, curricula or classroom
management methods conducted in educational settings
- only anonymous questionnaires, naturalistic observations, or archival research for which disclosure of responses would not place participants at risk of criminal or civil liability or damage their financial standing, employability or reputation,
& confidentiality is protected
- study of factors related to job or organisation effectiveness conducted in organisational settings for which there is no risk
to participants’ employability, & confidentiality is protected
What does Article 8 of the EU Charter of Fundamental Rights say about personal
data?
1) “Everyone has the right to the protection of personal data concerning him or her”
2) Data may not be processed at all unless the subject of the data
has unambiguously given his/her consent
3) Very severe restrictions on the use of data revealing racial or ethnic origin
What are the three ethical themes for affective computing?
Beneficence, deception, respect for autonomy
Discuss the beneficence ethical theme
1) researchers should have the welfare of the research
participant as a goal of any research study
2) morally positive goals - to make technology better able to furnish people with positive experiences and/or less likely to impose negative ones
3) objections to AC:
- unintended damage that might outweigh intended gains in happiness
- positive affect has no moral value
4) ⇒ implications:
- remedial, to spare people distress that would otherwise be caused by interactions with affectively incompetent systems
- countering misguided fears that might prevent the increase of the happiness of humanity
Discuss the deception ethical theme
1) general charge that AC is deceptive & cannot avoid being deceptive ⇒ AC systems only appear to ‘feel’ emotions
2) systems should not be deliberately engineered to make people believe something that is actually false; it is natural to fear that flawless logic, endless patience & no conscience, supplemented with the ability to manipulate emotion ⇒ an irresistible persuader
3) between the above two extremes:
- object to signs of emotion that mislead users about the way a system is likely to behave
- object to a system showing some behaviours associated
with an emotion.
Discuss the respect for autonomy ethical theme
1) for people to exercise autonomy, they must have procedural independence, i.e., freedom from factors that compromise or subvert their ability to achieve self-reflection & decide rationally
2) deception violates:
- A duty of honesty
- A duty not to infringe autonomy - if information about a person’s emotional state becomes available, it can restrict their options in ways that they would not choose
3) users need assurances that emotion-oriented systems will not undertake any actions that users do not or cannot endorse