Lecture 4 - Time Domain Flashcards

Question 1

Q

What are the three broad categories of speech sounds?

Answer

A

voiced
unvoiced
silence

Question 2

Q

Give the short, medium, and long analysis interval lengths in milliseconds, along with their sources of uncertainty and applications.

Answer

A

Short intervals: 5-20 ms
- Uncertainty due to the small amount of data, varying pitch, and varying amplitude.
Medium intervals: 20-100 ms
- Uncertainty due to changes in sound quality, transitions between sounds, and rapid transients in speech.
Long intervals: 100-500 ms
- Uncertainty due to the large number of sound changes within the interval. Used in cases such as estimating the audio quality of a Google Hangout call.

Question 3

Q

What’s the typical window overlap in speech analysis?

Answer

A

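50%. A Hamming window tapers toward its edges, so samples near the frame boundaries are heavily attenuated. With 50% overlap, those samples fall near the center of the neighboring frame, so the rest of the values are still captured.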
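A minimal sketch of the idea above, not taken from the lecture: it frames a signal with a Hamming window at 50% overlap and checks how much total weight each sample receives. The 8 kHz sample rate, the 20 ms frame length, the periodic form of the Hamming window (denominator N rather than N-1), and all function names are assumptions made for illustration.

```python
import math

def hamming_periodic(n_len):
    # Periodic Hamming window (assumed form: denominator n_len, not n_len - 1).
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def frame_signal(x, frame_len, hop):
    # Split x into windowed frames that start every `hop` samples.
    window = hamming_periodic(frame_len)
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frames.append([x[start + n] * window[n] for n in range(frame_len)])
    return frames

def overlap_weight(total_len, frame_len, hop):
    # Sum of the window values that land on each sample position.
    window = hamming_periodic(frame_len)
    weight = [0.0] * total_len
    for start in range(0, total_len - frame_len + 1, hop):
        for n in range(frame_len):
            weight[start + n] += window[n]
    return weight

FRAME_LEN = 160        # 20 ms at an assumed 8 kHz sample rate
HOP = FRAME_LEN // 2   # 50% overlap

frames = frame_signal([1.0] * 800, FRAME_LEN, HOP)
weight = overlap_weight(800, FRAME_LEN, HOP)
interior = weight[HOP:800 - HOP]   # samples covered by two frames
print(len(frames), min(interior), max(interior))
```

Under these assumptions, every interior sample receives the same total weight, because the taper of one frame is compensated by the neighboring frame.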

Question 4

Q

Give the difference between a rectangular and a Hamming window for speech signals.

Answer

A

Rectangular window: simple to implement, but in the frequency domain it has a narrow main lobe and large side lobes.
Hamming window: the main-lobe bandwidth is about twice that of the rectangular window, and the attenuation outside the passband is much greater.
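The comparison above can be checked numerically. The following is a rough sketch, assuming a 64-sample window, the periodic form of the Hamming window, and a brute-force evaluation of the spectrum on a fine frequency grid. The helper names are hypothetical.

```python
import cmath
import math

def rectangular(n_len):
    return [1.0] * n_len

def hamming_periodic(n_len):
    # Periodic Hamming window (assumed form: denominator n_len).
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / n_len) for n in range(n_len)]

def dtft_magnitude(window, num_points):
    # Magnitude of the window's spectrum at num_points frequencies in [0, pi).
    mags = []
    for k in range(num_points):
        omega = math.pi * k / num_points
        total = sum(w * cmath.exp(-1j * omega * n) for n, w in enumerate(window))
        mags.append(abs(total))
    return mags

def lobe_stats(window, num_points=2048):
    mags = dtft_magnitude(window, num_points)
    # The first local minimum after DC marks the edge of the main lobe.
    k = 1
    while k < num_points - 1 and not (mags[k] < mags[k - 1] and mags[k] <= mags[k + 1]):
        k += 1
    main_lobe_width = 2 * math.pi * k / num_points   # null-to-null, in rad/sample
    # Largest response beyond the main lobe, relative to the DC gain, in dB.
    side_lobe_db = 20 * math.log10(max(mags[k:]) / mags[0])
    return main_lobe_width, side_lobe_db

rect_width, rect_side_db = lobe_stats(rectangular(64))
hamm_width, hamm_side_db = lobe_stats(hamming_periodic(64))
print("main-lobe width ratio:", hamm_width / rect_width)
print("peak side lobe (dB):", rect_side_db, hamm_side_db)
```

The ratio of the two widths shows the bandwidth cost of the Hamming window, and the two dB values show what that cost buys in side-lobe attenuation.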

Question 5

Q

What is the downside of zero-crossing analysis?

Answer

A

When there is a DC offset, the signal is shifted up or down relative to zero. This can reduce the number of zero crossings (ZC) detected in the signal.
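A small sketch of this effect, under assumed values (8 kHz sample rate, a 200 Hz tone of amplitude 0.3, a DC offset of 0.5). Subtracting the mean is shown as one common remedy; the notes do not say which remedy the lecture recommends.

```python
import math

def zero_crossings(x):
    # Count sign changes between consecutive samples.
    return sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))

def remove_dc(x):
    # Subtract the mean so the signal is centered on zero again.
    mean = sum(x) / len(x)
    return [v - mean for v in x]

FS = 8000  # assumed sample rate in Hz
tone = [0.3 * math.sin(2 * math.pi * 200 * n / FS + 0.1) for n in range(800)]
shifted = [v + 0.5 for v in tone]   # the offset is larger than the amplitude

clean_count = zero_crossings(tone)
shifted_count = zero_crossings(shifted)
restored_count = zero_crossings(remove_dc(shifted))
print(clean_count, shifted_count, restored_count)
```

Here the offset exceeds the tone's amplitude, so the shifted signal never changes sign and the crossing count collapses until the offset is removed.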

Question 6

Q

Around what frequency is the energy most concentrated for voiced and unvoiced speech?

Answer

A

Voiced speech: 700 Hz
Unvoiced speech: 2.5 kHz

Question 7

Q

What is the frequency threshold between voiced and unvoiced speech?

Answer

A

voiced < 1.5 kHz
unvoiced > 1.5 kHz
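One way to apply this threshold is sketched below, assuming the frequency is estimated from the zero-crossing count (a sinusoid crosses zero twice per period). The 8 kHz sample rate, the pure tones standing in for speech frames, and the function names are illustrative assumptions. Real unvoiced speech is noise-like rather than a single tone.

```python
import math

def zero_crossings(x):
    # Count sign changes between consecutive samples.
    return sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))

def zc_frequency_hz(x, fs):
    # Two crossings per period, so frequency is about crossings / (2 * duration).
    return zero_crossings(x) * fs / (2 * len(x))

def classify(x, fs, threshold_hz=1500.0):
    # Threshold from the flashcard above: voiced below 1.5 kHz, unvoiced above.
    return "voiced" if zc_frequency_hz(x, fs) < threshold_hz else "unvoiced"

FS = 8000  # assumed sample rate in Hz
low_tone = [math.sin(2 * math.pi * 200 * n / FS + 0.1) for n in range(800)]
high_tone = [math.sin(2 * math.pi * 3000 * n / FS + 0.1) for n in range(800)]

low_label = classify(low_tone, FS)
high_label = classify(high_tone, FS)
print(zc_frequency_hz(low_tone, FS), low_label)
print(zc_frequency_hz(high_tone, FS), high_label)
```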

Question 8

Q

What is autocorrelation?

Answer

A

Autocorrelation measures the relationship between a time series and a lagged version of itself over successive time intervals.

Example: track the temperature of a city every day. If today’s temperature is similar to yesterday’s, and yesterday’s is similar to that of the day before, we say that the temperature data is autocorrelated.
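The definition can be illustrated with a short sketch that correlates a signal with lagged copies of itself. The test signal (a 200 Hz tone at an assumed 8 kHz sample rate), the lag search range, and the function names are assumptions for illustration only.

```python
import math

def autocorrelation(x, lag):
    # Sum of products between the signal and a copy of itself delayed by `lag`.
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

def normalized_autocorrelation(x, lag):
    # Scale by the zero-lag value so that lag 0 gives exactly 1.0.
    return autocorrelation(x, lag) / autocorrelation(x, 0)

FS = 8000  # assumed sample rate in Hz
tone = [math.sin(2 * math.pi * 200 * n / FS + 0.1) for n in range(800)]

# Search an assumed lag range of 20..100 samples (80 Hz to 400 Hz at 8 kHz).
lags = range(20, 101)
best_lag = max(lags, key=lambda k: autocorrelation(tone, k))
period_hz = FS / best_lag
print(best_lag, period_hz, normalized_autocorrelation(tone, best_lag))
```

For a periodic signal, the autocorrelation peaks again at a lag of one period, which is why it is also a natural building block for estimating pitch.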

Question 9

Q

Why is pitch tracking useful?

Answer

A

Pitch helps distinguish speakers or emotions.
It can be used in tone analysis, for example in tonal languages like Mandarin.

Question 10

Q

What are the challenges in pitch tracking?

Answer

A

Noise
Multiple fundamental frequencies
Deciding whether anyone is speaking at all

Question 11

Q

Name some reliable algorithms for pitch tracking.