Lecture 5 Flashcards
Pitch concepts 2?
Pitch salience: strength of the pitch sensation
Octave equivalence
Pitch height: highness or lowness of a pitch
Pitch chroma: pitch class
Pitch concepts 1?
Spectral pitch, harmonic of a complex tone which can be heard out.
Virtual pitch, residue pitch
Perceived “fundamental” pitch that is not necessarily present in the sound. In other words, the pitch is heard even though the corresponding spectral component is in fact missing.
Spectral and temporal cues of a pitch
Pitch is perceived on what type of scale?
Logarithmic, like amplitude. These graphs show that musical pitch (along the y axis) is related to the logarithm of the frequency (along the x axis). A chromatic scale is derived by dividing the frequency range between two notes that are an octave apart into twelve equal steps. The graph on the left represents a chromatic scale divided into 12 linear steps (i.e., each ascending step represents an increase of 21.8 Hz), while the graph on the right represents a chromatic scale divided into 12 logarithmic steps (i.e., each ascending step represents a frequency ratio of 1.059:1). This means that constant frequency ratios at different places along this dimension result in constant perceived differences. Musicians call these constant differences pitch intervals. So an interval of an octave is always a ratio of 2:1, a fifth is always about a ratio of 3:2, an equal-tempered semitone is always a ratio of 1.059:1, etc. When you listen to diatonic and chromatic scales on a linear frequency scale, they sound very strange, partly because the size of the intervals seems to get smaller and smaller.
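The contrast between linear and logarithmic steps can be checked numerically. The sketch below is only an illustration: it assumes the octave runs upward from C4, taken here as 261.63 Hz (a value not given in the notes), and the function names are hypothetical.

```python
def linear_scale(f_low, steps=12):
    """Divide the octave above f_low into equal steps in Hz."""
    step_hz = (2 * f_low - f_low) / steps
    return [f_low + k * step_hz for k in range(steps + 1)]


def log_scale(f_low, steps=12):
    """Divide the octave above f_low into equal frequency ratios."""
    ratio = 2 ** (1 / steps)
    return [f_low * ratio ** k for k in range(steps + 1)]


C4 = 261.63  # assumed reference frequency in Hz

linear = linear_scale(C4)
logarithmic = log_scale(C4)

# Size of one step in each scale.
linear_step_hz = linear[1] - linear[0]
semitone_ratio = logarithmic[1] / logarithmic[0]

# On the linear scale the ratio between neighbours shrinks as frequency rises,
# which is why the intervals seem to get smaller and smaller.
first_linear_ratio = linear[1] / linear[0]
last_linear_ratio = linear[12] / linear[11]

print(f"linear step: {linear_step_hz:.1f} Hz")
print(f"semitone ratio: {semitone_ratio:.3f}:1")
print(f"linear ratios: {first_linear_ratio:.3f} down to {last_linear_ratio:.3f}")
```

Both scales start and end on the same two notes; only the spacing in between differs.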
Effect of tone duration on pitch salience
It takes a certain amount of time (i.e., a certain number of cycles of a periodic sound) to get a clear sense of its frequency, upon which pitch depends. Plots (a) to (d) show
what happens to the frequency spectrum of a 2000 Hz pure tone when only 1 (a), 10 (b), 100 (c) or 1000 (d) cycles are presented. The shortest one sounds like a click and
the longer it gets, the clearer the pitch becomes. Note that the clicks of a single cycle have different timbres because the frequency position of the peak of the broad
spectrum still depends on the period. We call the clearness of the pitch perception
the pitch salience.
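The broadening of the spectrum for short tones can be estimated with a standard result that the notes do not state explicitly: a sinusoid gated on abruptly for T seconds has a main spectral lobe whose width between nulls is 2/T Hz. The sketch below assumes rectangular gating, and the function name is hypothetical. For a burst of only one cycle, the estimate is a rough guide at best.

```python
def main_lobe_width_hz(frequency_hz, n_cycles):
    """Null-to-null width of the main spectral lobe of a gated sinusoid.

    Assumes abrupt (rectangular) onset and offset.
    """
    duration_s = n_cycles / frequency_hz
    return 2.0 / duration_s


for n_cycles in (1, 10, 100, 1000):
    width = main_lobe_width_hz(2000.0, n_cycles)
    print(f"{n_cycles:>4} cycles of 2000 Hz: main lobe about {width:g} Hz wide")
```

Each tenfold increase in duration narrows the lobe tenfold, which is consistent with the pitch becoming clearer as the tone gets longer.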
Spectral pitches and the missing fundamental phenomenon.
Play out harmonics to hear spectral pitches
These diagrams show the frequency spectra of a piano tone played at three different
dynamic markings (ppp, mp, ff from bottom to top). Note that the relative level of the harmonics changes considerably and, most importantly, that in the softest note, the
bottom three harmonics are completely absent. However, the pitch of the note
doesn’t change with dynamics. This is called the “missing fundamental”
phenomenon. If pitch were determined by the presence of the fundamental frequency, there should be a difference in pitch, which there isn’t, so something else must be
responsible for the pitch. In the sound example, the lower harmonics are
progressively removed. The pitch stays the same even though the timbre changes. For some people, the pitch of the lowest harmonic sometimes stands out. This is a
spectral pitch, i.e., the pitch of a single frequency component heard separately.
Masking spectral virtual pitch, what does this not prove
Missing fundamental is not hidden by noise.
This diagram shows a way to determine which harmonics are responsible for the
perceived pitch. In the upper panel, harmonics 5-11 of a 200 Hz (missing)
fundamental are present and the pitch heard is equivalent to that of a tone at 200 Hz. The pitch is indicated in red. Due to the creation of difference tones (1200-1000=200
Hz, 1400-1200=200 Hz, etc.), which result from nonlinearities in the cochlea, it may
be that although the fundamental frequency is absent, it gets generated in the inner
ear anyway. If that were so, we could mask it with a low-pass noise, as shown in the
middle panel. If the pitch disappears in this case, we conclude that the difference
tones are responsible for the pitch of the missing fundamental. However, if the pitch
is created some other way by the information carried by the higher harmonics, then
masking them with a high-pass noise should make the pitch disappear, as in the lower panel. Indeed the pitch disappears with the high-pass noise and not with the low-pass noise.
One candidate for carrying the pitch is the global periodicity of the
amplitude envelope created in higher auditory channels due to the interactions
among the higher harmonics, a periodicity that would be equal to the difference
period, which is indeed equal to the period of the missing fundamental.
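The arithmetic behind the difference tones and the difference period can be spelled out in a few lines. This is a minimal sketch using the harmonics 5-11 of 200 Hz from the example; the function name is hypothetical.

```python
def adjacent_differences(frequencies_hz):
    """Frequency differences between neighbouring components."""
    ordered = sorted(frequencies_hz)
    return [high - low for low, high in zip(ordered, ordered[1:])]


# Harmonics 5 to 11 of a 200 Hz fundamental that is itself absent.
harmonics = [n * 200 for n in range(5, 12)]
differences = adjacent_differences(harmonics)

# Period of the envelope fluctuation implied by the common difference.
envelope_period_ms = 1000 / differences[0]

print(harmonics)           # components actually present
print(differences)         # every neighbouring pair differs by the same amount
print(envelope_period_ms)  # difference period in milliseconds
```

Every neighbouring pair differs by 200 Hz, so the difference period is 5 ms, the period of the missing fundamental.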
Virtual pitch shift of inharmonic tones
With a small number of partials, separated by the same frequency difference (e.g. F-200, F, F+200 Hz),
a periodically fluctuating amplitude envelope is created with a
period equal to one over the difference (1/200 s in the case cited). If the partials are
harmonic (e.g. 800, 1000, 1200 Hz) a clear pitch equivalent to 200 Hz is heard and is easily extracted from the amplitude envelope period (t in panel a). What happens if
the partials have a constant difference, but are not all related to a fundamental
frequency that is equal to this difference? For example, 850, 1050, and 1250 Hz have a frequency difference of 200 Hz, but are the 17th, 21st, and 25th harmonics of 50 Hz. While the amplitude envelope still has the same period, the fine structure of the
waveform inside the envelope is different between adjacent “packets”. This sound
gives rise to two different pitches, neither of which is 200 Hz. One is 210 Hz and the
other is 180 Hz. These can actually be predicted by the times between local peaks in the waveform (t1 and t2 in panel b).
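The difference between the two three-component examples can be made explicit by comparing the component spacing with the largest frequency of which all components are harmonics. The sketch below assumes integer frequencies in Hz so that a greatest common divisor can be used; the function names are hypothetical.

```python
from functools import reduce
from math import gcd


def common_fundamental_hz(partials_hz):
    """Largest frequency of which every (integer) partial is a harmonic."""
    return reduce(gcd, partials_hz)


def describe(partials_hz):
    """Summarise a set of equally spaced partials."""
    f0 = common_fundamental_hz(partials_hz)
    spacing = partials_hz[1] - partials_hz[0]
    return {
        "fundamental_hz": f0,
        "harmonic_numbers": [p // f0 for p in partials_hz],
        "envelope_period_ms": 1000 / spacing,
    }


harmonic_case = describe([800, 1000, 1200])
shifted_case = describe([850, 1050, 1250])

print(harmonic_case)
print(shifted_case)
```

Both cases share the 5 ms envelope period, but only in the first does the spacing equal the common fundamental.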
If one does a systematic experiment on this kind of sound, you can see that,
depending on context, any given 3-component sound with frequency separations of
200 Hz between the partials can have as many as three or four different pitches,
EVEN for purely harmonic sounds. For example, a combination of components at
1600, 1800 and 2000 Hz (the position along the x axis labeled 1800) can give pitches at about 165 Hz, 175 Hz, 200 Hz (the normal case), or 225 Hz. These are obtained by starting at the harmonic case (centre frequency equal to a multiple of 200 Hz) and
then progressively increasing or decreasing the centre frequency, keeping the
frequency difference constant. This approach leads listeners to hear these other
pitches that, in the harmonic case, are not usually heard. So the sequential context
can lead you to hear different pitches in the same complex sound.
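One way to get a feel for where these alternative pitches come from is a first-order approximation that is not stated in these notes and is assumed here for illustration only: treat the centre component as the n-th harmonic of the pitch, for several neighbouring integers n. The function name and the choice of divisors are hypothetical.

```python
def candidate_pitches_hz(centre_hz, divisors):
    """Candidate pitches obtained by dividing the centre frequency by integers.

    This is a rough first-order approximation, not a full pitch model.
    """
    return [centre_hz / n for n in divisors]


# Components at 1600, 1800 and 2000 Hz: centre frequency 1800 Hz.
candidates = candidate_pitches_hz(1800, divisors=(8, 9, 10, 11))

for n, pitch in zip((8, 9, 10, 11), candidates):
    print(f"centre heard as harmonic {n}: about {pitch:.1f} Hz")
```

The approximation gives values near 225, 200, 180 and 164 Hz, which are close to, but not identical with, the values quoted above.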
Repetition pitch (pulse pairs)
Another temporal feature that can give rise to a pitch percept is a repetition. If two
pulses or clicks are separated by a time t, a pitch is heard at 1/t Hz. As the pulses get closer, the pitch gets higher and vice versa. This pitch is sometimes referred to as
repetition pitch.
A more general case of repetition pitch occurs when a broadband signal such as white noise is delayed, attenuated and then added to (or subtracted from) the original
undelayed, unattenuated signal. This kind of signal is called comb-filtered noise. In
this way, there is a repetition of the whole waveform in a continuous manner. The
resulting spectrum is shown in the middle panels. Cos+, A=0 is the result when the
signals are delayed by T sec and added without attenuation. Cos+, A=10 dB occurs
when the signals are added, but the delayed one is attenuated by 10 dB compared to
the original. Cos-, A=0 occurs when the delayed signal is subtracted without
attenuation. Note that the added signals have peaks with regular spacing like a
harmonic sound. If T=1 millisecond, the peaks are at 0, 1000, 2000, etc. Hz. It gives a
noisy signal with a pitched quality to it. This is basically what a flanger does
electronically. The solid line in the lower right panel shows that the matched pitch is
equivalent to 1/T. The subtracted version is shifted by half of 1/T. This is like a noisy
version of the pitch shift of harmonic sounds we looked at previously, and it also gives multiple pitches, as is shown by the dashed lines in the lower right panel. The pitch
part of Lab 2 will use this kind of stimulus.
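The positions of the spectral peaks follow from the gain of a delay-and-add filter, whose magnitude at frequency f is |1 + g·exp(-j2πfT)|, where g is the gain of the delayed copy. The sketch below evaluates that standard expression for the three conditions described; the function name and its parameters are hypothetical.

```python
import math


def comb_gain(frequency_hz, delay_s, attenuation_db=0.0, subtract=False):
    """Magnitude response of original plus (or minus) a delayed, attenuated copy."""
    g = 10 ** (-attenuation_db / 20)  # linear gain of the delayed copy
    if subtract:
        g = -g
    phase = 2 * math.pi * frequency_hz * delay_s
    # |1 + g*exp(-j*phase)|^2 = 1 + g^2 + 2*g*cos(phase); clamp rounding error.
    return math.sqrt(max(0.0, 1 + g * g + 2 * g * math.cos(phase)))


T = 0.001  # delay of 1 millisecond

for f in (0, 500, 1000, 1500, 2000):
    added = comb_gain(f, T)
    attenuated = comb_gain(f, T, attenuation_db=10)
    subtracted = comb_gain(f, T, subtract=True)
    print(f"{f:>4} Hz  Cos+ A=0: {added:.2f}  Cos+ A=10 dB: {attenuated:.2f}  Cos-: {subtracted:.2f}")
```

With T = 1 ms, the added version peaks at multiples of 1000 Hz and the subtracted version peaks halfway between them; attenuating the delayed copy makes the peaks and troughs shallower.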
Pitch of amplitude-modulated broadband noise
Pitch can also be extracted from amplitude modulated noise, at least for some
modulation frequencies. The graph shows the percentage of correct identification of musical intervals between two sounds with pitches created by amplitude modulation of white noise. So a sound modulated at 220 Hz (A3) followed by a sound modulated at 440 Hz (A4) should be identified as an octave. The modulation frequency of the
higher sound is shown along the x axis. The three curves are the results for three
different listeners. Notice that modulations in the range of about 50 to 400 Hz are
fairly well perceived as pitch intervals, although the upper limit varies a great deal
between listeners.
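The interval implied by a pair of modulation frequencies can be computed from their ratio, using the logarithmic pitch scale. This is a minimal sketch; the function name is hypothetical and the 330 Hz example is added for illustration.

```python
import math


def interval_semitones(f_low_hz, f_high_hz):
    """Size of the interval between two frequencies in equal-tempered semitones."""
    return 12 * math.log2(f_high_hz / f_low_hz)


octave = interval_semitones(220, 440)  # A3 to A4
fifth = interval_semitones(220, 330)   # ratio 3:2, illustrative example

print(f"220 -> 440 Hz: {octave:.2f} semitones")
print(f"220 -> 330 Hz: {fifth:.2f} semitones")
```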
Model of pitch calculation in the auditory system
This diagram shows a model of pitch perception. Very similar versions were
developed by Leon van Noorden in the Netherlands and by Brian Moore in the UK. It
presumes that an autocorrelation is calculated on the output of each auditory filter.
An autocorrelation shows all of the periods present in a signal. So in (a) you find
peaks at 1.67, 3.33, 5, etc. milliseconds (corresponding to 600, 300, 200, … Hz - all
submultiples of 600 Hz). For a complex tone with the third, fourth and fifth harmonics of 200 Hz, the autocorrelograms of the channels responding maximally to those
partials are shown. The model presumes that all of these are summed, giving rise to
the graph in (d). Since the period of 5 ms (200 Hz) is present in all three, it has the
highest peak and corresponds to the pitch (and not surprisingly to the fundamental
frequency).
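The summing step of the model can be sketched in plain Python. This is a simplified illustration, not the published model: each auditory channel is assumed to contain exactly one sinusoidal partial, there is no filtering or neural transduction, and the sample rate, window length and lag range are arbitrary choices. All names are hypothetical.

```python
import math


def autocorrelation(signal, lag, n_terms):
    """Sum of products of the signal with a copy of itself delayed by lag samples."""
    return sum(signal[n] * signal[n + lag] for n in range(n_terms))


def summary_autocorrelation_pitch(partials_hz, sample_rate=48000, n_terms=480,
                                  min_lag_s=0.001, max_lag_s=0.006):
    """Pitch estimate from the lag with the largest summed autocorrelation."""
    min_lag = round(min_lag_s * sample_rate)
    max_lag = round(max_lag_s * sample_rate)
    # One idealised channel per partial (assumption: perfectly resolved partials).
    channels = [
        [math.sin(2 * math.pi * f * n / sample_rate) for n in range(n_terms + max_lag)]
        for f in partials_hz
    ]

    def summed(lag):
        return sum(autocorrelation(channel, lag, n_terms) for channel in channels)

    # Very short lags are skipped because every autocorrelation peaks at lag zero.
    best_lag = max(range(min_lag, max_lag + 1), key=summed)
    return sample_rate / best_lag


pitch_hz = summary_autocorrelation_pitch([600, 800, 1000])
print(f"estimated pitch: {pitch_hz:.1f} Hz")
```

The lag of 5 ms is the only one in the searched range at which all three channels peak together, so the summed function is largest there.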
Octave equivalence and pitch similarity
If you ask people to rate the similarity of tones with different pitches, similarity tends to fall off as they get farther apart, up to a point at which it increases again and hits a
peak at the octave. So two tones separated by an octave (c to c’) seem more similar
than two tones separated by a tritone (half an octave on a log frequency scale, c to
f#), even though the tritone is a smaller interval. This high similarity of the same pitch across octaves has given rise to them having similar or identical names in musical
scales of most cultures. This identity across octaves is called pitch chroma. The simple difference between pitches from high to low is called pitch height. Some theorists
think that pitch height is coded along a tonotopic dimension, whereas pitch chroma is based on temporal periodicity. Indeed, at high frequencies above 4000 Hz you can
perceive if pitches are higher or lower due to spectral cues involving the activity
distributed across the basilar membrane; however, it is no longer possible to
determine the exact interval, which depends on chroma perception, because the
temporal information is no longer reliable at very high frequencies (the nerve fibres
can’t fire with reliable temporal structure at those rates).
Multidimensional model of pitch perception
Roger Shepard has represented the relation between pitch height and chroma in a
multidimensional model of pitch, where pitch perception winds up a helical spiral on
the surface of a cylinder. Chroma is represented by the position relative to the circular base and the height is the vertical displacement. In this spatial representation, notice
that the distance between two pitches related by an octave is smaller than that
between two pitches related by the tritone (augmented fourth).
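The geometry can be checked with a small calculation. The sketch below places pitches on a helix and measures straight-line distances; the radius and the rise per octave are assumed values chosen for illustration (the notes do not give any), and the function names are hypothetical.

```python
import math


def helix_point(semitones, radius=1.0, rise_per_octave=1.0):
    """Coordinates of a pitch on the helix: chroma as angle, height as z."""
    angle = 2 * math.pi * (semitones % 12) / 12
    return (
        radius * math.cos(angle),
        radius * math.sin(angle),
        rise_per_octave * semitones / 12,
    )


def helix_distance(a_semitones, b_semitones, **shape):
    """Straight-line distance between two pitches on the helix."""
    return math.dist(helix_point(a_semitones, **shape),
                     helix_point(b_semitones, **shape))


octave = helix_distance(0, 12)   # same chroma, one turn higher
tritone = helix_distance(0, 6)   # opposite side of the chroma circle

print(f"octave distance:  {octave:.2f}")
print(f"tritone distance: {tritone:.2f}")
```

With these assumed proportions the octave is the shorter distance. A helix stretched much more steeply relative to its radius would reverse the relation, so the result depends on the shape parameters.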
Pitch circularity with constant spectral envelope
Shepard’s 3D representation of pitch suggests that if you can neutralize the height
dimension (related to the tonotopic representation) and preserve the chroma circle,
you could create a completely circular pitch percept. This would be similar to Escher’s eternally ascending or descending staircase. To do this, he created a bell-shaped
spectral envelope on a log frequency scale, and then used a set of frequency
components that were spaced at octave intervals (like harmonics 1, 2, 4, 8, 16, 32, 64, 128, 256 and 512), thus all having the same chroma. If you move all the frequencies
up or down by an octave, you have exactly the same signal. So if you make a scale or
a glissando, you eventually come back to where you started even though you are
always moving up or down.
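The construction can be sketched as a list of component frequencies and amplitudes. This is an illustration under stated assumptions: the lowest frequency, the number of components, and the Gaussian shape and width of the envelope are all chosen here and are not taken from the notes. The function name is hypothetical.

```python
import math


def shepard_components(step_semitones, f_min=20.0, n_components=10,
                       width_octaves=2.0):
    """(frequency, amplitude) pairs for one tone of a circular scale.

    Components are one octave apart under a fixed bell-shaped envelope
    on a log-frequency axis (assumed Gaussian here).
    """
    # Wrapping the step within the octave is what makes the scale circular:
    # a component that leaves the top of the envelope is replaced by one
    # entering at the bottom.
    shift = (step_semitones % 12) / 12
    centre_octaves = n_components / 2
    components = []
    for k in range(n_components):
        octaves_above_min = k + shift
        frequency = f_min * 2 ** octaves_above_min
        distance = octaves_above_min - centre_octaves
        amplitude = math.exp(-0.5 * (distance / width_octaves) ** 2)
        components.append((frequency, amplitude))
    return components


start = shepard_components(0)
one_semitone_up = shepard_components(1)
one_octave_up = shepard_components(12)

print(start == one_octave_up)  # identical component lists
```

Stepping up twelve semitones yields exactly the component list of the starting tone, while each individual step raises every component by one semitone.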