Lecture 7 Flashcards
What do the key terms coherent behaviour and grouping processes mean?
Coherent Behavior – common behavior among acoustic components, which results in
perceptual fusion or grouping, and which is signaled by a small number of acoustic
cues.
Grouping Processes – a set of processes that result in the formation, integration,
and/or segregation of auditory images
What is an auditory image?
A metaphor
Psychological representation of a sound source.
Can arise from perception, memory, or imagination
May be a combination of sounds
3 auditory grouping processes
Concurrent: auditory fusion - organization of musical surface into acoustic events
Sequential: auditory streaming
Segmental: chunking
What does it mean when we say organization precedes attribute computation?
Two sources (oboe and snare drum) that are playing simultaneously create acoustic
waves that add together perfectly in the air, creating a complex waveform that arrives
at the eardrum. The 3D representation in the square brackets shows the time-varying
spectrum with frequency, time and amplitude. This gets transformed into a time-varying
neural spectrogram (the variation of nerve firings over time in each of the
frequency-specific channels of the auditory nerve). From this ‘raw’ representation,
the brain must collect together the bits of information that come from the same
sound source, form a mental image of that source, and only then can it compute the
perceptual attributes such as pitch, timbre, loudness, etc.
Concurrent grouping cues
auditory event formation
The principle of ecological probability states that it is highly unlikely for multiple
independent sound sources:
1. to have perfectly synchronous onsets of all their frequency components (onset asynchrony),
2. for all of those components to be related by a common period or fundamental frequency (harmonicity or common period),
3. for all of their frequency components to maintain perfect integer ratios among themselves when they change in pitch, as in glissandi, vibrato or note changes (coherent frequency behaviour),
4. and to all come from the same point in space (common spatial origin).
So a difference in any of these cues is likely to signal to the brain that there are two
or more sources present in the environment, and gives it the information necessary to
group together their respective frequency components.
Onset synchrony
When frequency components start at the same time, they tend to fuse into a single
event, and the attributes of pitch and timbre depend on all of them. If each
component starts at a different time, several events are heard in sequence, each with
its own (spectral) pitch and (pure) timbre.
i.e. events that start before other events and/or end after them will be more easily heard out.
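As a rough illustration of the onset cue, the sketch below groups frequency components
into events by their onset times. The function name group_by_onset, the
(frequency, onset) tuples and the 30 ms window are all assumptions made for the
example, not values from the lecture.

```python
def group_by_onset(components, window_s=0.03):
    """Group (freq_hz, onset_s) pairs into events by onset time.

    window_s is a hypothetical tolerance: a component joins the current
    event if it starts within window_s of that event's first onset.
    """
    events = []
    for freq, onset in sorted(components, key=lambda c: c[1]):
        if events and onset - events[-1]["onset"] <= window_s:
            events[-1]["freqs"].append(freq)  # fuses with the current event
        else:
            events.append({"onset": onset, "freqs": [freq]})  # starts a new event
    return [e["freqs"] for e in events]


# Three harmonics of 200 Hz, first with common onsets, then staggered by 0.5 s.
synchronous = [(200.0, 0.0), (400.0, 0.0), (600.0, 0.0)]
staggered = [(200.0, 0.0), (400.0, 0.5), (600.0, 1.0)]

print(group_by_onset(synchronous))  # one fused event containing all three components
print(group_by_onset(staggered))    # three separate single-component events
```

The synchronous case yields a single event whose pitch and timbre would depend on all
three components, while the staggered case yields three events heard in sequence.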
Amplitude modulation coherence
This sound example demonstrates that when the amplitude modulation applied to
individual harmonics is incoherent across harmonics (i.e., each harmonic has its
own amplitude modulation pattern that is independent of all the other
modulation patterns), an impression of source multiplicity is generated. This shows
that modulation incoherence generates multiplicity, whereas modulation coherence would generate oneness.
Similarly, another study shows that onset asynchrony improves the segregation of vowel sounds despite a difference in amplitude.
Harmonicity
Harmonicity is one of the strongest cues for perceptual fusion. When all frequency
components have a common period, a single fused sound entity is perceived. If one
of them is mistuned from the harmonic relation sufficiently, it stands out as a
separate source and has its own pitch. In this case, multiple entities are heard, one
complex, the other pure.
This experiment by Moore and colleagues tested the mistuned harmonic hypothesis.
They asked people to adjust a complex tone’s pitch to that of another complex tone
that had one harmonic mistuned by varying amounts. As the harmonic was mistuned
(x-axis), it ‘pulled’ the pitch (y-axis) of the complex in the same direction up to a
certain point. After that it had less and less of an effect on the global pitch. The
decrease in effect began at about 3% and was almost gone by 8%. Beyond 8%
mistuning, the component was heard independently and had very little effect on the
pitch of the complex tone. This result suggests that there is a window of tolerance on
the harmonic template that affects the global pitch, as well as the segregation of
frequency components on the basis of harmonicity.
At the level of multiple harmonic series, as one would have with two voices singing
two vowels, for example, a difference in pitch between the two voices helps to
segregate them and to identify them independently. As before, when the two vowels
have different levels (10 dB difference), it is easier to identify the louder one when
they are synchronous and have the same fundamental frequency. When they are still
synchronous, but their fundamental frequencies differ by about 6% (a semitone), it
becomes easier to identify the softer vowel.
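The two figures quoted above can be checked with a little arithmetic. This is only a
sketch: it assumes an equal-tempered semitone (a frequency ratio of 2^(1/12)) and
reads the 10 dB difference as a level difference, converted here to an amplitude ratio.

```python
# Equal-tempered semitone expressed as a percentage difference in frequency.
semitone_ratio = 2 ** (1 / 12)
percent_difference = (semitone_ratio - 1) * 100

# A 10 dB level difference expressed as a ratio of amplitudes (20 * log10 convention).
level_difference_db = 10
amplitude_ratio = 10 ** (level_difference_db / 20)

print(round(percent_difference, 2))  # 5.95, i.e. "about 6%"
print(round(amplitude_ratio, 2))     # 3.16
```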
One way to explain the effect of harmonicity is by way of a harmonic template with a
certain window of frequency tolerance. A harmonic series would maximally stimulate
a template tuned to its fundamental frequency. Two harmonic series would
stimulate two separate templates. A harmonic series with a mistuned harmonic
would also stimulate two templates. Three harmonic templates tuned to different
fundamental frequencies are shown. Notice that the ‘holes’ are not infinitely thin and
allow a component to fall within the template even when it is slightly mistuned.
The output of each template is the sum of the holes that have harmonics in them. A
given harmonic sound would partially stimulate several templates, but the one that is
maximally stimulated corresponds to the one with the sound’s fundamental
frequency.
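The template idea can be sketched in a few lines of code. This is a minimal
illustration, not the model from the lecture: the function names, the six holes per
template, the candidate fundamentals and the 3% tolerance are all assumptions chosen
for the example (the 3% figure only echoes the point at which the pitch-pulling
effect was said to start declining).

```python
def template_output(components_hz, f0_hz, tolerance=0.03, n_holes=6):
    """Count the holes of a template tuned to f0_hz that contain a component.

    Each hole is centred on a harmonic n * f0_hz and is tolerance * (n * f0_hz)
    wide on either side, so holes are not infinitely thin.
    """
    filled = 0
    for n in range(1, n_holes + 1):
        centre = n * f0_hz
        if any(abs(c - centre) <= tolerance * centre for c in components_hz):
            filled += 1
    return filled


def best_template(components_hz, candidate_f0s_hz, tolerance=0.03):
    """Return the candidate fundamental whose template is maximally stimulated."""
    return max(candidate_f0s_hz,
               key=lambda f0: template_output(components_hz, f0, tolerance))


def mistune(components_hz, index, percent):
    """Return a copy of the components with one of them shifted up by percent."""
    shifted = list(components_hz)
    shifted[index] = shifted[index] * (1 + percent / 100)
    return shifted


harmonic = [200.0 * n for n in range(1, 7)]      # six harmonics of 200 Hz
candidates = [100.0, 150.0, 200.0, 250.0]        # hypothetical template tunings
slightly_mistuned = mistune(harmonic, 3, 2)      # 4th harmonic raised by 2%
strongly_mistuned = mistune(harmonic, 3, 8)      # 4th harmonic raised by 8%

for f0 in candidates:
    print(f0, template_output(harmonic, f0))     # the 200 Hz template scores highest
print(best_template(harmonic, candidates))
print(template_output(slightly_mistuned, 200.0)) # still inside its hole
print(template_output(strongly_mistuned, 200.0)) # falls outside the hole
```

With these assumed settings, the 200 Hz template fills all six holes, the other
templates are only partially stimulated, and a harmonic mistuned by 8% drops out of
the 200 Hz template, which is the situation in which it would be heard as a separate
source.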
Frequency modulation coherence
This spectrum shows the simultaneous presentation of three vowels. It is difficult to
extract the spectral envelopes of the individual vowels when everything is perfectly steady.
When one or more vowels are modulated in frequency (vibrato), they form an
independent auditory image from which the spectral envelope can be extracted,
giving rise to the timbral attribute that corresponds to the vowel identity.
This spectrum shows the spectral envelope that can be extracted from the vowel /a/
with vibrato, giving rise to the timbral attribute that corresponds to an /a/.
This example from Archipelago (1982-3) by Roger Reynolds shows a schematic
diagram of the spectrogram of an oboe sound in which the vibrato on the even
harmonics changes half-way through the sound, making them segregate. The sound
splits into a clarinet-like sound whose timbre and pitch are computed from the odd
harmonics and a soprano-like sound whose timbre and pitch (at the octave) are
computed from the even harmonics. Why does it have a pitch an octave higher?
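One way to approach the octave question is to look at the harmonic numbers in each
group. The sketch below assumes a hypothetical oboe fundamental of 220 Hz and ten
harmonics (neither is given in the notes), and computes the fundamental implied by a
subset of harmonics as the original fundamental times the greatest common divisor of
the harmonic numbers.

```python
from functools import reduce
from math import gcd


def implied_fundamental(harmonic_numbers, f0_hz):
    """Fundamental of the series formed by a subset of the harmonics of f0_hz."""
    return f0_hz * reduce(gcd, harmonic_numbers)


f0 = 220.0                      # hypothetical oboe fundamental
odd = [1, 3, 5, 7, 9]           # harmonics left in the clarinet-like sound
even = [2, 4, 6, 8, 10]         # harmonics that segregate with the new vibrato

print(implied_fundamental(odd, f0))   # 220.0, the original pitch
print(implied_fundamental(even, f0))  # 440.0, one octave higher
```

The even harmonics 2f, 4f, 6f, ... are exactly the complete harmonic series of 2f, so
once they segregate they behave like a tone with a fundamental an octave above the
original.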
Multiple cues for perceptual fusion
In this sound example, a harmonic series is formed with the French horn at the
fundamental, celesta at the octave (2nd harmonic), piccolo at the octave plus a fifth
(3rd harmonic), celesta at the double octave (4th harmonic), and piccolo at the double
octave plus a major third (5th harmonic). They start and stop together and move in
perfectly parallel motion, keeping the 1:2:3:4:5 ratios intact.
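The interval names attached to each harmonic can be checked by converting the ratios
1:2:3:4:5 to cents above the fundamental (1200 cents per octave). The function name
is an assumption made for this sketch.

```python
import math


def cents_above_fundamental(harmonic_number):
    """Interval between a harmonic and the fundamental, in cents."""
    return 1200 * math.log2(harmonic_number)


instruments = ["French horn", "celesta", "piccolo", "celesta", "piccolo"]
for n, instrument in enumerate(instruments, start=1):
    print(n, instrument, round(cents_above_fundamental(n)))
```

The 2nd and 4th harmonics land on exact octaves (1200 and 2400 cents). The 3rd and 5th
come out at about 1902 and 2786 cents, close to but not identical with the
equal-tempered octave plus a fifth (1900) and double octave plus a major third (2800).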