Chapter 3.1-3.2 Flashcards
What is a digital representation of speech?
Digital representations of speech include waveforms, spectrograms, and frequency spectra.
- representations of speech signals on a computer are digital (discrete) because computers cannot store an infinite amount of data; they represent information as a series of digits
digital representations of speech vs. analog representations
Digital representations of speech are discrete signals captured via evenly spaced samples in time.
- sound digitization involves the assignment of values to amplitudes of a series of speech samples
- digital representations may appear analog (continuous), however keep in mind that this is often the result of interpolation between data points
Analog representations of speech, by contrast, store the continuous speech signal itself.
What is an analog representation of speech?
Analog representations of speech are continuous. They are representations that do not convert the sound into a discrete series of samples.
- e.g. records/vinyls: sound waves are imprinted onto the record directly
- e.g. tape recorders: store electric analogs of the acoustic signal
What is A/D conversion?
A/D conversion is short for analog-to-digital conversion.
- another word for digitization
- conversion of the continuous sound wave into a discrete signal
- A/D conversion involves quantization, which is essentially limiting the number of possible amplitudes for a sound sample
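The steps in the card above can be sketched in a few lines of Python. This is an illustrative sketch, not any real audio library's API; `digitize` and its assumption that amplitudes lie in [-1, 1] are invented for the example.

```python
# Minimal sketch of A/D conversion: sample a continuous signal at evenly
# spaced times, then quantize each amplitude to one of 2**bits levels.
import math

def digitize(signal, duration_s, sample_rate, bits):
    """Sample `signal` (a function of time in seconds) and quantize it."""
    levels = 2 ** bits
    samples = []
    for n in range(int(duration_s * sample_rate)):
        t = n / sample_rate                       # evenly spaced sample times
        amp = signal(t)                           # assumed to lie in [-1, 1]
        q = round((amp + 1) / 2 * (levels - 1))   # map amplitude onto integer levels
        samples.append(q)
    return samples

# Digitize one cycle of a 1 Hz sine at 8 samples/second, 3-bit resolution:
wave = digitize(lambda t: math.sin(2 * math.pi * t), 1.0, 8, 3)
```

Each sample ends up as an integer between 0 and 7 (the 2^3 = 8 values a 3-bit quantization allows), which is exactly the “limiting the number of possible amplitudes” described above.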
sampling rate
the rate at which samples of sound are captured
- the # of times per second that we measure the continuous wave
- samples must be evenly spaced
- the sampling interval is at what time interval (T) points are sampled from the signal
- sampling rates are stated as the reciprocal of the sampling interval: 1/sampling interval –> X samples per second
—e.g. “100 samples per second OR 100 Hz”
—here “Hz” counts samples per second, not cycles of the wave
- The standard is that your sampling rate should be AT LEAST twice the desired maximum frequency of the signal
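The reciprocal relation from the card (sampling rate = 1 / sampling interval) as a tiny sketch; the function name is just for illustration.

```python
# sampling rate = 1 / sampling interval
def sampling_rate(interval_s):
    """Samples per second, given the time between samples in seconds."""
    return 1 / interval_s

print(sampling_rate(0.01))  # a 10 ms interval gives 100.0 samples/second, i.e. 100 Hz
```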
quantization rate
the resolution with which sample amplitudes are recorded
- also called sampling resolution or amplitude resolution
—each sample’s amplitude is assigned one of a fixed set of values
—quoted as a power of 2, in which the exponent is the number of “bits”
—e.g. “8-bit” = 2^8 = 256 possible values
What is D/A conversion?
D/A conversion is short for digital-to-analog conversion.
- another word for playback
Nyquist frequency
The Nyquist frequency is the highest frequency component that can be captured with a given sampling rate
- in other words, the highest frequency that you can capture in a sound sample, given a certain rate that you’re sampling at
- the Nyquist frequency is always half of the sampling rate
- You should always sample at twice the desired maximum frequency in order to sufficiently capture the sound signal
—We sample at around 40 kHz because we can hear up to about 20 kHz, so we want to sufficiently capture all audible frequencies in a sample.
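The two relations on this card (Nyquist frequency = sampling rate / 2, and minimum sampling rate = 2 × maximum frequency) as a sketch; the helper names are invented for illustration.

```python
# The Nyquist relation, in both directions.
def nyquist_frequency(sample_rate_hz):
    """Highest frequency capturable at a given sampling rate."""
    return sample_rate_hz / 2

def min_sample_rate(max_freq_hz):
    """Minimum sampling rate needed to capture frequencies up to max_freq_hz."""
    return 2 * max_freq_hz

print(nyquist_frequency(44_100))  # CD audio can capture up to 22,050 Hz
print(min_sample_rate(20_000))    # hearing up to 20 kHz needs at least 40,000 Hz
```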
quantization rate
the number of bits needed to encode a certain number of (amplitude) values
—”bits” translate to powers of 2: e.g. “8-bit” = 2^8 = 256 values
—the amplitude range is divided into as many values as the quantization rate allows
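The bits-to-values arithmetic from this card, as a one-liner (the function name is illustrative):

```python
# number of representable amplitude values for a given bit depth
def amplitude_values(bits):
    return 2 ** bits

print(amplitude_values(8))   # 256
print(amplitude_values(16))  # 65536 values — the amplitude resolution of CD audio
```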
What is clipping and why does it occur?
Clipping occurs when the peak amplitude of the sound being sampled exceeds the maximum amplitude that the quantization can represent.
—the waveform’s peaks are flattened at that maximum value
—it’s called “clipping” because the data above the limit is lost (“clipped”) due to the limitations of the quantization
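A sketch of the effect, assuming amplitudes are simple floats and the maximum representable amplitude is passed in (both choices are just for illustration):

```python
# Amplitudes beyond the representable range are flattened ("clipped")
# to the maximum value the quantization allows.
def clip(samples, max_amp):
    return [max(-max_amp, min(max_amp, s)) for s in samples]

# A wave whose peaks (±1.5) exceed the allowed maximum (1.0) loses its tops:
print(clip([0.0, 0.9, 1.5, -1.5], 1.0))  # [0.0, 0.9, 1.0, -1.0]
```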
What effect does the sampling rate have on a sound sample? What effect does the quantization rate have on a sound sample?
- The higher the sampling rate, the better the time resolution
—higher sampling rate = more samples per second (a shorter sampling interval)
- The higher the quantization rate, the better the amplitude resolution
—higher quantization rate = more amplitude values
What happens when the sampling rate is too low?
When the sampling rate is too low, aliasing occurs. This is when the original sound being sampled gets distorted and a false or erroneous sample is collected.
- Aliasing occurs because the continuous signal contains frequency components that are higher than 1/2 of the sampling rate
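Aliasing can be shown numerically. In this sketch (the rates are chosen only for illustration), a 900 Hz sine sampled at 1000 Hz (Nyquist = 500 Hz) produces exactly the same samples as a −100 Hz sine, so the high tone is mistaken for a false low one:

```python
# Sampling above the Nyquist frequency: two different tones, identical samples.
import math

rate = 1000                    # samples per second; Nyquist frequency is 500 Hz
true_freq = 900                # above Nyquist: cannot be captured faithfully
alias_freq = true_freq - rate  # -100 Hz: the erroneous frequency we "hear"

for n in range(10):
    t = n / rate
    a = math.sin(2 * math.pi * true_freq * t)
    b = math.sin(2 * math.pi * alias_freq * t)
    assert abs(a - b) < 1e-9   # indistinguishable sample-by-sample
```

The samples agree because the two frequencies differ by exactly the sampling rate, so at every sample time the sines are a whole number of cycles apart.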
Why do we typically sample sound at 44,100 Hz (or 44.1 kHz), for CDs particularly? Why do we sample sound at 16-bits?
Some humans can hear up to 20 kHz, and according to the Nyquist theorem, we should sample at least twice that frequency.
- 16-bits (quantization rate) is enough to cover the dynamic range of music
lossy vs. loss-less data compression
lossy data compression removes information from the sample, which means that the original waveform of the sound cannot be restored from the compressed data.
loss-less data compression retains all of the information in the original sample, which means that the original waveform can be restored exactly.
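The loss-less case can be demonstrated with Python’s standard-library `zlib` (a loss-less compressor); the byte string here is just placeholder data:

```python
# Loss-less compression round-trip: the exact original bytes come back.
# (A lossy codec such as MP3 would not pass the equality check below.)
import zlib

original = b"speech sample bytes " * 100
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

assert restored == original              # loss-less: fully restored
assert len(compressed) < len(original)   # and smaller, thanks to redundancy
```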