phonation measurement 1 Flashcards
what's the difference between jitter and shimmer?
- jitter: stability of F0 across time (closely related to pitch)
- shimmer: stability of intensity across time (closely related to loudness)
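Worked example (not from the original cards): the cards do not give a formula, so this sketch assumes one common "local" formulation, the mean absolute difference between consecutive cycles expressed as a percentage of the mean. The function name `local_perturbation` and all the measurements are made up for illustration.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference, as a % of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value

# Hypothetical cycle-by-cycle measurements (made-up numbers).
periods_ms = [5.0, 5.1, 4.9, 5.0]      # duration of each glottal cycle
amplitudes = [0.80, 0.82, 0.78, 0.80]  # peak amplitude of each cycle

jitter_percent = local_perturbation(periods_ms)   # period (F0) instability
shimmer_percent = local_perturbation(amplitudes)  # amplitude instability
```

The same calculation serves both measures; only the input changes (cycle durations for jitter, cycle amplitudes for shimmer).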
what's the difference between HNR and VOT?
- HNR (harmonics-to-noise ratio): overall spectral characteristics of sound
- VOT (voice onset time): timing of vocal fold activity relative to other speech events
what are 3 problems with evaluating F0 based on pitch?
- pitch judgements are influenced by top-down info
- pitch judgements are non-linear
- pitch is not a pure index of F0
T or F: humans are more sensitive to pitch variations at high frequencies than low frequencies
false – more sensitive to pitch variations at low frequencies
how many semitones are in an octave?
12
what is the relationship between going up an octave and frequency?
each octave up = doubling of frequency (in other words: each octave down = halving of frequency)
the amount of F0 change (in Hz) that makes up the same pitch interval is different at lower vs higher frequencies. the ear is more sensitive at the ____ end.
lower
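Worked example (not from the original cards): a small sketch of the octave and semitone arithmetic, assuming equal-tempered semitones (12 equal steps per doubling of frequency). The function names are illustrative only.

```python
import math

def semitones_between(f_low_hz, f_high_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12 * math.log2(f_high_hz / f_low_hz)

def shift_by_semitones(f_hz, semitones):
    """Frequency reached by moving `semitones` up (or down, if negative)."""
    return f_hz * 2 ** (semitones / 12)

# One octave up doubles the frequency: 100 Hz to 200 Hz is 12 semitones.
octave_in_semitones = semitones_between(100, 200)

# The same one-semitone step covers very different Hz ranges.
step_at_100 = shift_by_semitones(100, 1) - 100     # roughly 6 Hz
step_at_1000 = shift_by_semitones(1000, 1) - 1000  # roughly 59 Hz
```

Because the scale is logarithmic, a few Hz is a large share of a semitone at 100 Hz but a small share at 1000 Hz.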
what is peak-picking? (3)
- Old analog method that detects peaks in signal above a threshold
- Creates a pulse for each peak
- Counts the pulses per second to get F0
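Worked example (not from the original cards): a digital sketch of the three steps above, tried on a clean 100 Hz sine. The original method was analog, so this is only an approximation of the idea. The threshold, sample rate, and function name are assumptions.

```python
import math

def peak_pick_f0(samples, sample_rate, threshold):
    """Count local peaks above `threshold` and return pulses per second."""
    pulses = 0
    for i in range(1, len(samples) - 1):
        is_local_peak = samples[i - 1] < samples[i] >= samples[i + 1]
        if is_local_peak and samples[i] > threshold:
            pulses += 1  # one pulse per detected peak
    duration_s = len(samples) / sample_rate
    return pulses / duration_s

SAMPLE_RATE = 8000  # samples per second (assumed)
# One second of a pure 100 Hz tone: one peak per cycle.
tone = [math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
f0_estimate = peak_pick_f0(tone, SAMPLE_RATE, threshold=0.5)  # close to 100 Hz
```

This works on a pure tone because each cycle has exactly one peak above the threshold, which real speech does not guarantee.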
Explain what counting zero-crossings is (2)
- Counts every time signal crosses 0 (x-axis)
- Each crossing marks a half cycle, so double the half-cycle duration to get the period (equivalently, halve the crossing count to get cycles per second)
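Worked example (not from the original cards): a minimal sketch of zero-crossing counting on a clean 100 Hz sine. A full cycle crosses zero twice, so the crossing count is halved before dividing by the duration. The sample rate and function name are assumptions.

```python
import math

def zero_crossing_f0(samples, sample_rate):
    """Estimate F0 as (zero-crossings / 2) per second."""
    crossings = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev < 0) != (cur < 0):  # sign changed between samples
            crossings += 1
    duration_s = len(samples) / sample_rate
    cycles = crossings / 2  # two crossings per full cycle
    return cycles / duration_s

SAMPLE_RATE = 8000  # samples per second (assumed)
# One second of a pure 100 Hz tone with an arbitrary starting phase.
tone = [
    math.sin(2 * math.pi * 100 * n / SAMPLE_RATE + 0.5)
    for n in range(SAMPLE_RATE)
]
f0_estimate = zero_crossing_f0(tone, SAMPLE_RATE)  # close to 100 Hz
```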
Why are peak-picking and zero-crossings not ideal for speech samples? Are there any solutions?
- Sawtooth signal = highly irregular – difficult to set a threshold, so the data are inaccurate
- Note: some improvement is possible via low-pass filtering
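Worked example (not from the original cards): a rough sketch of why low-pass filtering helps. A moving average stands in for the low-pass filter, and the test signal (a 100 Hz tone plus a 2000 Hz ripple) is made up. Real speech and real filters are more complicated.

```python
import math

def zero_crossing_f0(samples, sample_rate):
    """Estimate F0 as (zero-crossings / 2) per second."""
    crossings = sum(
        1 for prev, cur in zip(samples, samples[1:])
        if (prev < 0) != (cur < 0)
    )
    return (crossings / 2) / (len(samples) / sample_rate)

def moving_average(samples, width):
    """Crude low-pass filter: mean of each run of `width` samples."""
    return [
        sum(samples[i - width + 1 : i + 1]) / width
        for i in range(width - 1, len(samples))
    ]

SAMPLE_RATE = 8000  # samples per second (assumed)
# Made-up signal: 100 Hz "voice" plus a 2000 Hz ripple.
signal = [
    math.sin(2 * math.pi * 100 * n / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 2000 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]

raw_f0 = zero_crossing_f0(signal, SAMPLE_RATE)  # far too high
smoothed_f0 = zero_crossing_f0(moving_average(signal, 4), SAMPLE_RATE)  # near 100 Hz
```

The ripple adds extra zero-crossings around every true crossing, so the raw estimate overshoots. Averaging over 4 samples (one full ripple period at this sample rate) removes the ripple and leaves the 100 Hz component.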
What is accelerometry? What are its pros and cons?
- Accelerometry: uses accelerometers which are sensitive to body vibrations and can detect vocal fold movements
- Pros: cheap, easy, and yields clean signal
- Cons: have to build it yourself
How can you get F0 from a wide band spectrogram?
Count the vertical striations within a time frame: F0 = striations / time frame (convert time to seconds)
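Worked example (not from the original cards): the striation count and window length below are made-up numbers, and the function name is illustrative. The point is the millisecond-to-second conversion.

```python
def f0_from_striations(striation_count, window_ms):
    """F0 in Hz = striations per second (window given in milliseconds)."""
    return striation_count * 1000.0 / window_ms

# Hypothetical reading: 12 vertical striations counted in a 100 ms window.
f0_hz = f0_from_striations(12, 100)  # 12 / 0.1 s = 120 Hz
```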
How can you get F0 from a narrow band spectrogram?
Look at the first harmonic (the lowest horizontal striation); its frequency is F0
T or F: vowel based F0 is different from speaking F0
True
Why do we need to use a big chunk of passage sample (i.e., 14 seconds) to measure pitch? (2)
- Because vowels vs consonants change pitch… e.g., pitch is higher for /i/ than /a/
- Accuracy of ±3 Hz requires ~14 seconds