Yusuke Flashcards

1
Q

What are the four sub-disciplines included in communication acoustics?

A
  • Electro-acoustics and audio/speech signal processing
  • Speech science and linguistics
  • Human auditory perception
  • Psychology of hearing (psychoacoustics)
2
Q

What two electrical components does a sound signal consist of?

A
  • DC (constant value; 0 Hz)
  • AC (sine waves of various frequencies)

3
Q

When we analyse a sound signal, what is important to know?

A

It is important to know the amount of each frequency component in the signal

4
Q

What is the amount of power usually measured by?

A

It is usually measured by the RMS (root mean square) of the amplitude
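
As a minimal sketch of the RMS calculation, the snippet below computes the RMS of one cycle of a sine wave. The peak amplitude (2.0) and the number of samples are illustrative assumptions, not values from the cards.

```python
import math

def rms(samples):
    """Root mean square: square each sample, average, then take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# One full cycle of a sine wave with peak amplitude 2.0 (illustrative values).
N = 1000
sine = [2.0 * math.sin(2 * math.pi * n / N) for n in range(N)]
sine_rms = rms(sine)  # for a sine wave this is peak / sqrt(2)
```

For a sine wave the RMS is the peak amplitude divided by sqrt(2), which is why RMS is a convenient single number for signal power.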

5
Q

What are the two forms of a sound signal? And what are their domains?

A
  • Waveform (i.e. time domain)
  • Spectrum (i.e. frequency domain)

6
Q

What are the two types of spectra in a sound signal? And what do they consist of?

A
  • Amplitude spectrum (shows the amplitude of the sine wave at each frequency)
  • Phase spectrum (shows the phase of the sine wave at each frequency, between -pi and pi)
7
Q

What is the mathematical tool that derives the spectrum from the waveform, and vice versa?

A
  • Fourier analysis
8
Q

What is the spectrum of a signal defined as mathematically? And what do we look at?

A

Mathematically, the spectrum is defined for frequencies from -inf to +inf. In practice we look only at positive frequencies (f > 0 Hz).

9
Q

What are the frequency ranges for infrasound, audio sound and ultra sound?

A

Infra: < 20 Hz
Audio: 20 Hz - 20 kHz
Ultra: > 20 kHz

10
Q

What frequency range can humans hear?

A

20Hz - 20kHz

11
Q

What do we use an oscilloscope for?

A

To visualise the sound's waveform

12
Q

What does digital mean in terms of signal processing? In which two ways is the signal discretised, and what are these processes called? Draw a diagram to aid.

A

Digital signal = analogue signal discretised in time (sampling) and in value (quantisation). Diagram: an analogue signal converted to digital, with lines marking the sampling period and the quantisation width.

13
Q

What is sampling a process of?

A

Converting a signal from continuous-time to discrete-time domain

14
Q

What is the Shannon Sampling Theorem?

A

A continuous-time signal containing no frequencies higher than fmax can be reconstructed perfectly from its discrete-time samples if the samples are taken at fs > 2fmax

15
Q

What is the Nyquist rate?

A

Minimum sampling frequency to avoid aliasing (2fmax)

16
Q

What is aliasing?

A

It is when spectral components (i.e. frequencies) above fs/2 are folded back into the range below fs/2, where they appear as false lower frequencies
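
The folding can be sketched numerically. The helper below is a hypothetical illustration (the function name and the 8 kHz sampling frequency are assumptions) that returns the apparent frequency of a real tone after sampling.

```python
def alias_frequency(f, fs):
    """Apparent frequency (Hz) of a real tone at f Hz after sampling at fs Hz."""
    f_mod = f % fs                  # sampling cannot distinguish f from f + k*fs
    return min(f_mod, fs - f_mod)   # fold into the range 0 .. fs/2

fs = 8000                            # illustrative sampling frequency
below = alias_frequency(3000, fs)    # below fs/2: unchanged
above = alias_frequency(5000, fs)    # above fs/2: folded back to 3000 Hz
```

A 5 kHz tone sampled at 8 kHz is therefore indistinguishable from a 3 kHz tone, which is why components above fs/2 must be removed before sampling.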

17
Q

What is an anti-aliasing filter for?

A

It filters out the spectral components above fs/2 before sampling. Often a low-pass filter

18
Q

What is over-sampling?

A

Sampling at above the Nyquist rate

19
Q

Why do we oversample instead of sampling at the Nyquist rate?

A

Because of the limited roll-off (steepness) of the anti-aliasing filter

20
Q

What is quantisation, and what is assigned to each quantisation level to obtain a completely digital signal?

A

Discretisation of a signal's value (amplitude). A binary code is assigned to each quantisation level.

21
Q

What is pulse code modulation (PCM)? What two things does it apply?

A

It is a technique which expresses a signal's values by a set of binary codes. It applies sampling and quantisation

22
Q

What is quantisation noise? What determines resolution? What is the tradeoff?

A

Approximating the signal level by discretised values causes quantisation noise. The number of quantisation levels determines the resolution. More levels give better resolution but need more bits per sample, and therefore more memory.
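
The trade-off can be illustrated with a small uniform quantiser. This is a sketch under stated assumptions: the input range [-1, 1), the test tone and the helper names are all hypothetical choices made for illustration.

```python
import math

def quantise(x, bits):
    """Map x (assumed to lie in [-1, 1)) onto one of 2**bits uniform levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = math.floor(x / step)                       # integer code for the interval
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return (index + 0.5) * step                        # mid-point of that interval

def snr_db(bits, n=10000):
    """Signal-to-quantisation-noise ratio (dB) for a near full-scale test tone."""
    signal = [0.99 * math.sin(2 * math.pi * 7 * k / n) for k in range(n)]
    noise = [s - quantise(s, bits) for s in signal]
    p_signal = sum(s * s for s in signal)
    p_noise = sum(e * e for e in noise)
    return 10 * math.log10(p_signal / p_noise)

snr_8 = snr_db(8)     # coarser resolution, more quantisation noise
snr_16 = snr_db(16)   # finer resolution, twice the bits per sample
```

Each extra bit doubles the number of levels and improves the ratio by roughly 6 dB, at the cost of one more bit of storage per sample.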

23
Q

What is the minimum sampling frequency for most acoustical applications and why?

A

Since the audible frequency range is 20 Hz - 20 kHz, fs must exceed 2 x 20 kHz = 40 kHz. In practice fs >= 44.1 kHz is used for most acoustical applications

24
Q

What is the minimum sufficient resolution to cover dynamic range of human auditory system?

A

16 bit (about 96 dB of dynamic range)

25
Q

What is the tradeoff for sampling frequency?

A

A higher sampling frequency reduces the risk of aliasing, but it requires more memory

26
Q

What does a spectral analysis entail? What is the first thing that must be calculated and how is it calculated?

A

It refers to a process that analyses the properties of a sound by looking at its spectrum.
The first step is to calculate the power spectral density (PSD) of the signal from its waveform.
This is achieved by Fourier analysis.

27
Q

What is the relationship between the power spectral density and the magnitude of the spectrum?

A

The PSD of the signal is the squared magnitude of the spectrum
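
A minimal sketch of this relationship, using a naive discrete Fourier transform so that no external library is needed. The sampling frequency, length and tone frequency are illustrative assumptions, and the PSD here is left unnormalised.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform (an FFT gives the same result faster)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

fs = 800   # sampling frequency in Hz (illustrative)
N = 64     # transform length
f0 = 100   # tone frequency; falls exactly on bin k = f0 * N / fs = 8
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(N)]

X = dft(x)
psd = [abs(Xk) ** 2 for Xk in X]                       # squared magnitude of the spectrum
peak_bin = max(range(N // 2), key=lambda k: psd[k])    # search positive frequencies only
peak_freq = peak_bin * fs / N                          # bin spacing is fs / N
```

The peak of the squared magnitude lands on the bin that corresponds to the tone frequency.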

28
Q

What is another name for the fast Fourier transform? What is it for and how does it differ from the Fourier transform?

A
  • AKA discrete Fourier transform (strictly, the FFT is a fast algorithm for computing the DFT)
  • It implements Fourier analysis on a digital processor
  • The Fourier transform assumes a signal of infinite length; to use the FFT, the signal must be truncated (i.e. cut to a finite length)
29
Q

What is the FFT length?

A
  • Length of the truncated signal N
30
Q

What is windowing and what is it applied to?

A
  • A window function has non-zero values only for n=0,1,…,N-1
  • Truncated = windowed
  • It is applied to a signal to truncate it, limiting its length to N samples
31
Q

What are three commonly used windows? And draw the diagrams.

A
  • Rectangular
  • Hann (Hanning)
  • Hamming (similar to Hann but with coefficients 0.54 and 0.46, so it does not reach zero at the edges)
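
The three windows can be sketched with their common textbook definitions. The symmetric form (dividing by N-1) and the coefficient values are assumptions taken from standard references rather than from the cards.

```python
import math

def rectangular(N):
    """All ones: plain truncation to N samples."""
    return [1.0] * N

def hann(N):
    """Raised cosine that reaches zero at both ends."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def hamming(N):
    """Raised cosine on a small pedestal: ends sit at about 0.08, not zero."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

w_hann = hann(9)
w_hamming = hamming(9)
```

Both cosine windows peak at 1 in the middle; they differ only in how far the edges are pulled down.
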
32
Q

What is the short-time fourier transform used for? And how is it implemented?

A
  • The STFT analyses non-stationary signals
  • It divides the signal into short frames using windowing, then applies the FFT to each frame

33
Q

What are non-stationary signals?

A

A signal whose statistical properties (incl. PSD) vary over time.

34
Q

What is a spectrogram? And what transform is used to calculate it?

A

It is a representation of the time-varying PSD of a signal. The short-time Fourier transform is used.

35
Q

What determines frequency resolution in spectral analyses? Draw the PSD for a low N and high N pure tone wave

A
  • The FFT length
  • Low N diagram: broad bell-shaped peak centred on f0
  • High N diagram: sharper, narrower peak at f0
36
Q

What is the tradeoff when a spectrogram is calculated?

A
  • Between time and frequency resolution:
    A shorter window improves time resolution, but it also shortens the truncated signal (smaller N), which reduces frequency resolution
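
The trade-off can be put into numbers. This sketch assumes the simplest definitions (frame duration N/fs for time resolution, bin spacing fs/N for frequency resolution) and an illustrative 16 kHz sampling frequency.

```python
def resolutions(fs, window_length):
    """Return (time resolution in s, frequency resolution in Hz) for one frame."""
    time_res = window_length / fs    # duration covered by one frame
    freq_res = fs / window_length    # spacing between FFT bins
    return time_res, freq_res

short_t, short_f = resolutions(16000, 256)    # short window
long_t, long_f = resolutions(16000, 1024)     # long window
```

The product of the two values is always 1 under these definitions, so improving one necessarily worsens the other.
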
37
Q

What is the trade-off between spectral distortion and frequency resolution?

A
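  • The shape of the window function (i.e. rectangular, Hann, Hamming) affects both the frequency resolution and the spectral distortion (leakage).
    The rectangular window gives the best frequency resolution (narrowest main lobe) but the most spectral distortion; Hann and Hamming windows reduce the distortion at the cost of poorer frequency resolution.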
38
Q

Which window types are most commonly used for audio?

A
  • Hamming and Hann
39
Q

What is a frequency band?

A
  • Refers to a range of frequencies in the spectral representation of a signal
40
Q

What are the seven audio frequency subbands?

A
  • Sub-bass
  • Bass
  • Low midrange
  • Midrange
  • Upper midrange
  • Presence
  • Brilliance
41
Q

What is an octave band?

A

A band is said to be an octave in width when the upper band frequency is twice the lower band frequency

42
Q

What is a 1/3 octave band?

A

A frequency band whose upper band frequency is the lower band frequency times 2^(1/3)
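
Both band definitions reduce to one ratio, sketched below. The function name and the starting frequencies are illustrative assumptions.

```python
def band_edges(f_lower, fraction=1):
    """Return (lower, upper) edge in Hz of a 1/fraction-octave band starting at f_lower."""
    return f_lower, f_lower * 2 ** (1 / fraction)

octave = band_edges(500)        # upper edge is twice the lower edge
third = band_edges(500, 3)      # upper edge is 2^(1/3) times the lower edge

# Three consecutive 1/3 octave bands span exactly one octave.
f = 100.0
for _ in range(3):
    f = band_edges(f, 3)[1]
```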

43
Q

What is the characteristic of a linear system?

A

Superposition: if inputs x1 and x2 produce outputs y1 and y2, then the input a1.x1 + a2.x2 produces the output a1.y1 + a2.y2

44
Q

What is the characteristic of a time invariant system?

A

Delaying the input and then applying the system gives the same output as applying the system and then delaying the output: if x[n] gives y[n], then x[n-k] gives y[n-k]

45
Q

Why do we focus on Linear Time-Invariant systems?

A
  • Simplification of mathematical analysis
  • Greater insight into and understanding of the system's behaviour

46
Q

What is the word for two cascaded LTI systems that produce the same result regardless of their order?

A

They are said to be commutative

47
Q

What is the output of an LTI system when the input is a unit impulse?

A

Impulse response

48
Q

What is the mathematical word that combines an input signal with the system’s impulse response?

A

Convolution
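
A minimal sketch of discrete convolution, with a made-up three-sample impulse response. Feeding in a unit impulse returns the impulse response itself.

```python
def convolve(x, h):
    """Discrete convolution y[n] = sum over k of h[k] * x[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

h = [1.0, 0.5, 0.25]                        # made-up impulse response
impulse_out = convolve([1.0, 0.0, 0.0], h)  # unit impulse in, impulse response out
signal_out = convolve([1.0, 2.0], h)        # each input sample launches a scaled copy of h
```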

49
Q

What is a RIR?

A

The room impulse response describes the properties of the acoustic transmission channel in a room from the sound source to the position where the sound is observed.

50
Q

What is a free field?

A

Acoustic environment where no reflections are observed (ie anechoic)

51
Q

If an arbitrary sound is emitted from a source, how can the observation at the microphone’s position be simulated?

A

By convolving the source signal with the RIR

52
Q

What is frequency response? And how is it calculated?

A
  • It shows how the system responds (in amplitude and phase) to a sinusoid of a particular frequency
  • Typically complex valued
  • Calculated by applying Fourier analysis to the impulse response of the system
53
Q

How is the output spectrum of a system calculated?

A

Multiplication of the input spectrum and the frequency response
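
This can be checked numerically: multiplying spectra gives the same result as convolving in the time domain and then transforming. The sketch uses a naive DFT and made-up signals; zero-padding both sequences to the full convolution length is an assumption needed for the two routes to match.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def convolve(x, h):
    """Discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for k, hk in enumerate(h):
            y[n + k] += xn * hk
    return y

def zero_pad(s, length):
    return s + [0.0] * (length - len(s))

x = [1.0, 2.0, 0.0, -1.0]      # input signal (made-up values)
h = [0.5, 0.5]                 # impulse response of a 2-point moving average
L = len(x) + len(h) - 1        # length of the full convolution

X = dft(zero_pad(x, L))        # input spectrum
H = dft(zero_pad(h, L))        # frequency response (complex valued)
Y_freq = [Xk * Hk for Xk, Hk in zip(X, H)]    # multiply in the frequency domain
Y_time = dft(convolve(x, h))                  # convolve in time, then transform
max_error = max(abs(a - b) for a, b in zip(Y_freq, Y_time))
```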

54
Q

What is the spectrum of a unit impulse?

A

Identical values for all frequencies

55
Q

What does a filter do? What do most common filters do?

A

It removes an unwanted component or feature from a signal. Most commonly used to remove certain frequencies and/or reduce noise.

56
Q

What are the four properties of a filter?

A
  • Type - can be read by the shape of the amplitude response
  • Passband - range of frequencies where the signal is passed through
  • Stopband - range of frequencies where the signal is suppressed
  • cut-off frequency - edge of the passband where there is a -3dB drop from passband gain
57
Q

What are the four filter types? Draw the diagrams of each

A
  • Low pass
  • High pass
  • Bandstop (stops frequencies within a given range)
  • Bandpass

For the diagrams, remember non-vertical roll off

58
Q

What are the two types of filters?

A

Finite impulse response (FIR) and infinite impulse response (IIR)

59
Q

What is the equation for an FIR filter? And therefore what is the output affected by?

A
  • y[n]=b0.x[n]+b1.x[n-1]+b2.x[n-2]+…+bM.x[n-M]
    where x[n] is the input signal, y[n] is the output signal, and bm (m=0,…,M) are the filter coefficients.
  • The output is affected by the current input and the past input samples.
60
Q

What are two of the simplest designs of a FIR filter?

A
  • Moving Average which can be used for noise reduction, although it is very primitive. Plays the role of a low pass filter.
  • Difference filter: calculates weighted difference between samples. Plays the role of a high pass filter; used for detecting pulses and edges in a signal.
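
Both designs can be sketched with one generic FIR routine. The 3-point average, the first-difference coefficients and the test signal are illustrative assumptions; samples before n=0 are taken as zero.

```python
def fir_filter(b, x):
    """y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M], with x[n] = 0 for n < 0."""
    return [sum(b[m] * x[n - m] for m in range(len(b)) if n - m >= 0)
            for n in range(len(x))]

moving_average = [1 / 3, 1 / 3, 1 / 3]   # equal weights: smooths (low pass)
difference = [1.0, -1.0]                 # current minus previous sample (high pass)

step = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]    # test signal with one edge
smoothed = fir_filter(moving_average, step)   # the edge becomes a gradual ramp
edges = fir_filter(difference, step)          # non-zero only where the edge occurs
```
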
61
Q

What is the equation of an IIR filter? And therefore what is the output affected by?

A

y[n]=b0.x[n]+b1.x[n-1]+b2.x[n-2]+…+bM.x[n-M]-a1.y[n-1]-a2.y[n-2]-…-aN.y[n-N]

The output of an IIR filter is affected by the current input sample, the past M input samples and the past N output samples.
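
A minimal sketch of the difference equation above, using a one-pole filter with made-up coefficients. Because of the feedback term the impulse response never reaches exactly zero, and a feedback coefficient with magnitude above 1 makes it grow without bound.

```python
def iir_filter(b, a, x):
    """b = [b0, ..., bM] feed-forward and a = [a1, ..., aN] feedback coefficients."""
    y = []
    for n in range(len(x)):
        acc = sum(b[m] * x[n - m] for m in range(len(b)) if n - m >= 0)
        acc -= sum(a[k - 1] * y[n - k] for k in range(1, len(a) + 1) if n - k >= 0)
        y.append(acc)
    return y

impulse = [1.0] + [0.0] * 7
stable = iir_filter([1.0], [-0.5], impulse)     # y[n] = x[n] + 0.5*y[n-1]: decays
unstable = iir_filter([1.0], [-2.0], impulse)   # y[n] = x[n] + 2*y[n-1]: grows
```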

62
Q

What are the 4 pros and 4 cons of FIR and IIR filters?

A

FIR:

  • Finite length impulse response
  • Linear phase is realisable
  • Always stable
  • High required order for steep cut-off characteristics therefore computationally more expensive.

IIR:

  • Infinite length impulse response
  • linear phase is only an approximation
  • Could be unstable
  • Low required order for steep cut-off characteristics therefore computationally less expensive
63
Q

What is the requirement for an FIR filter to be stable?

A

Filter coefficients take finite values

64
Q

Draw a diagram of a low pass filter for an FIR and an IIR filter

A

IIR normal looking low pass filter with relatively steep cutoff
FIR gradual with humps

65
Q

What is a filterbank and what is its purpose? What can it be used for?

A
  • It is a set of band-pass filters that is fed the same input
  • It separates a broadband signal into multiple time-domain narrowband signals
  • It can be used as another technique for spectral analysis.
66
Q

What are octave and 1/3 octave filter banks? What are they used for?

A
  • Their band-pass filters have passbands corresponding to each octave or each 1/3 octave band
  • They are used to measure the acoustical properties in different frequency bands
67
Q

What is auditory masking?

A

It is the phenomenon where soft sounds become inaudible in the presence of louder sounds

68
Q

What are the two types of auditory masking? Explain.

A

Spectral masking:

  • Occurs when a sound with certain spectral content makes the detection of another sound with different spectral content more difficult
  • Caused by the change of the threshold of hearing by the existence of the other sound

Temporal masking:

  • A sound also masks in time, affecting the audibility of a preceding (backward) or following (forward) sound
  • Forward has a more significant effect than backward.
69
Q

What are the two types of spectral masking? and draw their diagrams. Comment on increasing the level of the types.

A
  • White noise masker - raises the threshold of audibility across the whole frequency band
  • Narrowband noise masker raises the threshold just around the centre frequency of the band
  • Increasing the masker level raises the threshold further. This applies to narrowband noise as well as to high-pass and low-pass filtered noise.
70
Q

What is the threshold of hearing affected by for forward temporal masking ?

A
  • The level of the masker
  • The temporal length of the masker

71
Q

What is the difference between forward and backward masking?

A

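In forward masking the masker comes before the target sound, so it masks sounds that follow it. In backward masking the masker comes after the target sound, so it masks sounds that precede it.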

72
Q

Draw graphs for forward temporal masking for target sound versus delay time for increasing masker level (white noise)? Explain what happens when the time between masker and target decreases

A
  • Curve with a negative gradient that flattens out to an asymptote.
    Increasing the masker level shifts the left part of the curve up, but all curves approach the same asymptote.
  • The closer the masker and target are in time, the higher the level of the target test tone must be.
73
Q

What is the most common way to reduce the bit-rate (i.e. data size) using a perceptual masking model? And how does it work?

A
  • The most commonly used spectral masking-based audio codec is MPEG-1 Layer-3 (i.e. MP3).
  • The technique varies quantisation levels (i.e. bitrate) to shape the spectrum of the quantisation noise so that it follows the masking curve created by the target signal.
74
Q

What are three examples of masking noise

A
  • Pink
  • Babble
  • Speech-like
  • White
  • Brown
75
Q

How is sound characterised objectively (4 points) and what do these correspond to subjectively?

A
  • Frequency -> Pitch
  • Amplitude -> loudness
  • Amplitude spectrum -> timbre
  • Temporal length -> duration
76
Q

What happens to loudness when the sound has a broadband amplitude spectrum?

A

It increases

77
Q

What is the definition of a phon and a sone and what is their relationship with dB?

A
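  • Phon: a sound has a loudness level of N phon if it is judged as loud as a 1 kHz tone at N dB SPL, so at 1 kHz phons follow dB SPL linearly
  • Sone: 1 sone is the loudness of a 1 kHz tone at 40 dB SPL (40 phon); sones follow a log-scale progression, doubling for every 10 phon increase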

78
Q

How is loudness affected when the target is masked by noise? Draw a diagram of loudness versus SPL with an increasing masker level.

A
  • The higher the noise level, the lower the target's loudness
  • Diagram: positive gradient. Increasing the masking noise level shifts the curve to the right

79
Q

What is the mel scale? Draw it on a diagram against frequency with a y=x line for reference

A
  • It represents the average human perception of pitch against frequency of sound.
  • It suggests that below 500 Hz doubling the frequency (i.e. 1 octave) is perceived as approximately twice as high, but above 500 Hz it is perceived as less than twice as high.

Diagram:

  • 1kHz is anchor point = 1000mel
  • Mel scale starts higher than y=x
  • intersects at anchor point
  • drops below y=x thereafter
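
The shape described above can be reproduced with one widely used approximation of the mel scale. The formula below is an assumption brought in for illustration; the cards do not state which formula the course uses.

```python
import math

def hz_to_mel(f_hz):
    """One common approximation of the mel scale (an assumption, not from the cards)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

anchor = hz_to_mel(1000.0)   # close to 1000 mel at the 1 kHz anchor point
low = hz_to_mel(500.0)       # above the y=x line below the anchor
high = hz_to_mel(4000.0)     # below the y=x line above the anchor
```
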
80
Q

How is pitch defined in musical notes? What is the equal temperament?

A
  • Logarithmic frequency scale
  • Equal temperament divides an octave logarithmically into 12 semitones, so the frequency ratio between adjacent notes is 2^(1/12)
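
As a short sketch, note frequencies follow from repeated multiplication by 2^(1/12). The 440 Hz reference for A4 is an assumed tuning standard, not something stated in the cards.

```python
def note_frequency(semitones_from_a4, a4=440.0):
    """Frequency in Hz of the note a given number of semitones above (or below) A4."""
    return a4 * 2 ** (semitones_from_a4 / 12)

a5 = note_frequency(12)       # 12 semitones up is one octave: double the frequency
a3 = note_frequency(-12)      # 12 semitones down: half the frequency
a_sharp4 = note_frequency(1)  # one semitone up: ratio 2^(1/12)
```
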
81
Q

What is speech intelligibility? And what 4 factors does it depend on?

A

It is a property referring to how well the meaning of a spoken message is transmitted to a listener

It depends on:

  • the ability of the speaker to produce an acoustically and linguistically clear message
  • how well the transmission channel is able to transmit the message
  • how well the listener is able to receive and analyse the message
  • environmental noise
82
Q

What is spatial hearing?

A

It refers to humans' ability to localise a sound source using two ears

83
Q

How can localisation be described in terms of the source perception?

A

Localisation can be described by the perception of the source’s direction, distance and spatial extent

84
Q

What is binaural hearing? And what are the two cues?

A

Binaural hearing is listening to sounds where information due to interaural differences exists and is taken into account.

Binaural cues are derived from the difference in the signals between the two ears.
Two main acoustic cues:
- Interaural time difference - due to the difference of distance for a sound to propagate from the source to the ears
- Interaural level difference - due to the scattering (shadowing, reflection) of sound waves by the head and torso.

85
Q

Draw the diagram for perceived lateralization (i.e. direction L to R) against interaural delay (ITD) ranging about 0, with left ear receiving sound earlier at x=0

A

Diagonal s shape;

flat, y=x, then flat again

86
Q

What does relying just on ILD/ITD cause? What additional cues are used to solve this ambiguity?

A

It causes the ambiguity known as the cone of confusion.
Additional cues:
- Spectral cues (using spectral change of sound caused by the shape of head including pinna)
- Dynamic cues (inducing changes of ITD/ILD by rotating the head to provide additional cues of source direction)

87
Q

What is the precedence effect?

A

It is an assisting mechanism of spatial hearing that suppresses the effect of reflections from walls, ceiling and floor in enclosed spaces, which always arrive after the direct sound. I.e. the perceived direction is that of the first-arriving sound.

88
Q

What are the two types of time-weighted sound level measurement?

A
  • Fast
  • Slow

89
Q

Why is using a unit impulse for impulse response measurement less common? What is more common and why?

A
  • Because of the effect of noise. Since the energy of a unit impulse is confined to a single time instant, the energy of the signal cannot be increased. Therefore, the signal to noise ratio is low.
  • A swept sine signal (a stretched unit impulse) is more common because its energy is spread across time, which makes the signal to noise ratio higher.
90
Q

What are the two types of swept sine signals?

A
  • Linear swept sine
  • Log swept sine

91
Q

What is pseudo random noise and what is an example type?

A
  • Pseudo random noise is a binary sequence generated with a deterministic algorithm that exhibits statistical behaviour similar to a truly random sequence.
  • Maximum length sequence (MLS) is a type of pseudo random noise. As its power spectrum shows a quasi-uniform distribution, MLS is also used as the input for impulse response measurement.
92
Q

If the input to an object is already fixed, what is a way to estimate the impulse response and what is a way to estimate the frequency response?

A
  • Adaptive filter: a specific class of FIR filter that adapts its own coefficients to minimise a targeted value. It can be used to estimate the impulse response
  • Cross spectrum: measures the similarity between two signals. It can be used to estimate the frequency response of an unknown object
93
Q

What are the 3 components in a RIR of a reverberant room?

A
  • Direct sound
  • Early reflections
  • (late) reverberation
94
Q

What is reverberation time?

A

Time it takes for the sound level to decay by 60 dB

95
Q

What are the two ways to determine reverberation time? Which is used more commonly?

A
  • Sabine's formula
  • Energy decay curve

Sabine's formula applies only to ideal rooms. In practice, RT is determined from the measured RIR.

96
Q

What is the equation of EDC(t)?

A

EDC(t) = integral between t and inf of h(tau)^2 . d(tau)

97
Q

What are the three ways to define T60 RT? And how are they calculated?

A

Early decay time (EDT; between 0 and -10 dB)
T20 (between -5 and -25 dB)
T30 (between -5 and -35 dB)

Each is calculated from the decay slope within its range, extrapolated to the time for a 60 dB decay
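
The procedure can be sketched on a synthetic impulse response. Everything numeric here is an assumption for illustration: the exponentially decaying h with a true reverberation time of 0.5 s, the 8 kHz sampling frequency, and the use of a simple two-point slope instead of a fitted regression line.

```python
import math

def energy_decay_curve_db(h):
    """Backward integration: EDC[n] = sum of h[k]^2 for k >= n, in dB relative to EDC[0]."""
    edc = []
    total = 0.0
    for sample in reversed(h):
        total += sample * sample
        edc.append(total)
    edc.reverse()
    return [10 * math.log10(e / edc[0]) for e in edc]

def reverberation_time(edc_db, fs, start_db=-5.0, end_db=-25.0):
    """Decay slope between start_db and end_db, extrapolated to 60 dB (T20 by default)."""
    n_start = next(n for n, v in enumerate(edc_db) if v <= start_db)
    n_end = next(n for n, v in enumerate(edc_db) if v <= end_db)
    slope = (edc_db[n_end] - edc_db[n_start]) / ((n_end - n_start) / fs)  # dB per second
    return -60.0 / slope

fs = 8000
true_rt = 0.5                              # seconds (assumed for the synthetic RIR)
decay = 3 * math.log(10) / true_rt         # makes h(t)^2 drop by 60 dB in true_rt
h = [math.exp(-decay * n / fs) for n in range(2 * fs)]

edc_db = energy_decay_curve_db(h)
t20 = reverberation_time(edc_db, fs)                  # -5 to -25 dB
t30 = reverberation_time(edc_db, fs, end_db=-35.0)    # -5 to -35 dB
```

For a purely exponential decay T20 and T30 agree; in a measured RIR they can differ, which is why the evaluation range is reported with the result.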

98
Q

How does RT vary with position in highly reverberant rooms vs rooms with absorbents

A

RT does not vary significantly with position in highly reverberant rooms, whereas it changes a lot depending on position in rooms with absorbents

99
Q

What is the DRR? And what is its equation?

A

Direct-to-reverberation ratio (DRR) is another metric associated with reverberation in a room.

It is given by the ratio between the energy of direct sound and reverberation.

DRR = 10log10(integral between 0 and tau of h(t)^2.dt / integral between tau and inf of h(t)^2.dt)

100
Q

How does DRR relate to human hearing? Draw a diagram of DRR versus source distance to support your answer.

A
  • Human hearing estimates distance of sound source using DRR
  • One-to-one relationship between DRR and source distance
  • Shape: curved L asymptote
101
Q

What are the two ways (methods) to evaluate intelligibility

A
Subjective methods (listening by humans)
Objective methods (quantitative metrics)
102
Q

What is the speech transmission index? And what is it based on?

A

It is an objective metric that estimates the intelligibility of a transmission channel.
Based on modulation transfer function (MTF)

103
Q

What is the equation for the modulation transfer function?

A

m_k(w) = |integral (0 to inf) of h_k(t)^2.exp(-jwt).dt| / integral (0 to inf) of h_k(t)^2.dt

where h_k(t) is the impulse response filtered by the k-th octave band
and w is the modulation frequency (14 modulation frequencies are used)
104
Q

What is the equation of the apparent SNR from modulation transfer function? What is the other metric called?

A

SNR_k(w) = 10log10(m_k(w)/(1-m_k(w)))

  • The other metric is the transmission index K_k(w)
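
A small sketch of the two quantities. The apparent SNR follows the equation above; the mapping to a transmission index (clip to +/-15 dB, then scale linearly to 0..1) is an assumption based on the usual STI procedure and is not given in the cards.

```python
import math

def apparent_snr_db(m):
    """Apparent SNR in dB from a modulation transfer value 0 < m < 1."""
    return 10.0 * math.log10(m / (1.0 - m))

def transmission_index(snr_db, limit_db=15.0):
    """Clip the SNR to +/- limit_db and map it linearly onto 0..1 (assumed mapping)."""
    clipped = max(-limit_db, min(limit_db, snr_db))
    return (clipped + limit_db) / (2.0 * limit_db)

snr_half = apparent_snr_db(0.5)           # m = 0.5 corresponds to 0 dB
ti_half = transmission_index(snr_half)    # 0 dB maps to the middle of the scale
```
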
105
Q

How is the STI calculated?

A

Sum of W_k.M_k
where W_k are given constants
M_k is averaged K_k

106
Q

Draw a graph of increasing STI versus decreasing SNR on bottom x axis and increasing T60 on top x axis

A

Flat, diagonal down, flat

107
Q

Draw a graph of intelligibility versus STI for spelled alphabet and a rhyme test

A

alphabet steep upwards then curls off at 100%

Rhyme gradual upward and curl at 100%

108
Q

What are the two objective metrics for speech intelligibility?

A

Clarity

Definition

109
Q

What is the clarity metric and how is it defined?

A

It measures the energy ratio between the early and late responses (same as DRR but with specific values for tau)

C50 (with tau = 50 ms)

= 10log10(int (0 to 50ms) h(t)^2.dt / int (50ms to inf) h(t)^2.dt)

110
Q

What is the definition metric and how is it defined? How can it be used to calculate C50?

A

Definition measures the ratio of the energy in the first 50 ms of the impulse response to the energy of the entire impulse response

D50 = int (0 to 50ms) h(t)^2.dt / int (0 to inf) h(t)^2.dt (multiply by 100 for %)

C50 can be derived from D50 (expressed as a fraction):

C50 = 10log10(D50/(1-D50))
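
The link between the two metrics can be checked numerically. The synthetic exponentially decaying impulse response, its decay constant and the 8 kHz sampling frequency are assumptions made for illustration.

```python
import math

def clarity_and_definition(h, fs, split_ms=50.0):
    """Return (clarity in dB, definition as a fraction) with the split at split_ms."""
    split = int(round(fs * split_ms / 1000.0))
    early = sum(s * s for s in h[:split])     # energy before the split
    late = sum(s * s for s in h[split:])      # energy after the split
    definition = early / (early + late)
    clarity_db = 10.0 * math.log10(early / late)
    return clarity_db, definition

fs = 8000
h = [math.exp(-30.0 * n / fs) for n in range(fs)]    # 1 s of synthetic decay
c50, d50 = clarity_and_definition(h, fs)
c50_from_d50 = 10.0 * math.log10(d50 / (1.0 - d50))  # the relation stated above
```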

111
Q

How is binaurality realised?

A

It is realised when the RIRs measured at the positions of the ears carry binaural cues such as interaural time difference (ITD) and interaural level difference (ILD)

112
Q

What is the head related IR (HRIR) and head related transfer function (HRTF)? How can binaural hearing be reproduced?

A

HRIR represents the effects of the torso, head, and the external ear acoustics

HRTF is the frequency response from the HRIR

If the HRIR is available, binaural hearing can be easily reproduced by convolving the HRIR with an arbitrary sound source.

113
Q

What physics principle do drivers use?

A

Electro-dynamic principle (F=BIL). A current is applied to a wire in a magnetic field, which produces a force on the wire that moves a cone-shaped diaphragm

114
Q

A driver is not good enough for low frequencies. How do we remedy this? What are the two types?

A

Driver is attached to an enclosure that resonates at low frequencies

Two types:
  • Closed box (no holes other than that for the driver)
  • Bass-reflex (tube for low frequency sound to propagate)

115
Q

What do 2-way and 3-way loudspeakers consist of?

A

2-way:
Tweeter and woofer

3-way:
tweeter, mid-range and woofer

116
Q

Why are power amplifiers needed in loudspeakers?

A

They are needed to drive the driver of a loudspeaker (a voltage must be applied to the coil to produce a current)

117
Q

What are the 5 terminologies used to describe the specs of a loudspeaker

A
  • frequency response
  • frequency range
  • power handling (how much power it can handle before damage)
  • dimension
  • weight
118
Q

What is a microphone

A

It is a transducer that transforms a sound wave (pressure) propagating in air into an electrical signal (voltage)

119
Q

What are the three types of microphones?

A

Dynamic
Condenser
MEMS (Micro-Electro-Mechanical System)

120
Q

How does a dynamic microphone work?

A

A diaphragm with a wound coil vibrates in a magnetic field, which induces a voltage across the coil that is proportional to the velocity of the diaphragm

121
Q

How does a condenser microphone work?

A

It uses the properties of condensers (i.e. capacitors)

  • Sound pressure variations make one electrode of the condenser move, while the other is kept fixed.
  • If an electric charge is present between the electrodes, the change in distance between them will change the voltage between them too.
122
Q

How does a condenser, electret type microphone work?

A

It has electric charge trapped between the electrodes, eliminating the need for a power supply.

123
Q

What is a MEMS microphone and how does it work?

A
  • Micro-Electro-Mechanical System
  • A pressure-sensitive diaphragm is etched onto a silicon wafer and is accompanied by an integrated preamplifier
  • Digital MEMS have inbuilt A/D converter on the same chip.
  • very small (suitable for mobile phones etc)
124
Q

What are the pros, cons and applications for dynamic microphones?

A

Pros:

  • No power supply needed
  • Robust
  • Cheap

Cons:

  • Low sensitivity
  • Low quality
  • Can be large

Applications:

  • Outdoor use
  • Public address system/concert
125
Q

What are the pros, cons and applications for condenser microphones?

A

Pros:

  • High sensitivity
  • high quality

Cons:

  • power supply needed (except electret type)
  • Vulnerable to vibration/humidity
  • Expensive

Applications:

  • Indoor use
  • Computer/electronic devices
126
Q

What are the pros, cons and applications for MEMS microphones?

A

Pros:

  • Very small
  • Robust

Cons:

  • Expensive

Applications:

  • Mobile devices
  • Microphone arrays
127
Q

What are the 7 specifications used to describe a microphone?

A
  • Sensitivity
  • Dynamic range
  • frequency response
  • Directivity
  • SNR
  • Input impedance
  • Maximum SPL
128
Q

What are the three ways to categorise microphones

A
  • principle
  • directivity
  • application and purposes
129
Q

What is microphone directivity?

A

Directivity specifies a microphone’s ability to spatially discriminate sound arriving from different angles

130
Q

What are the four most common polar patterns

A
  • Omni-directional
  • Uni-directional (cardioid)
  • Bi-directional
  • Shotgun
131
Q

Draw the polar patterns for omni-directional, uni-directional (cardioid), bi-directional and shotgun

A

Omni: circle with constant radius
Uni: Looks like a bum
Bi: looks like an 8
Shotgun: looks like a vertical sword with the blade thicker than the handle and the horizontal handle finger slicer protectors thinner than the handle.

132
Q

What are 6 types of microphones (for the different purposes)? (Not talking about condenser etc.) Also, what are these types for?

A
  • Lapel (typically omni or uni (cardioid), worn by a speaker)
  • Boundary (typically omni, for meetings)
  • Wireless
  • Hydrophone (for underwater)
  • Stereo (two unidirectional to capture sound from left or right direction)
  • Bone conduction
133
Q

What are 3 types of microphone connectors

A

XLR
1/4in phone
3.5mm/2.5mm

134
Q

What are the two reasons why we need to measure mics?

A
  • To know the actual characteristics
  • To measure other objects (mic responses are included in the measurements of other objects)

135
Q

Where do we measure mics/loudspeakers?

A

anechoic chamber

136
Q

How is the freq response of a microphone calculated after compensation?

A

FR of mic after compensation (in dB) = raw measurement - FR of loudspeaker measured by calibrated mic

137
Q

How is directivity pattern measured?

A
  • Playing a sinewave while the device is rotated at a constant speed.
  • Turntable is used.
  • Directivity pattern is measured at different frequencies by changing the frequency of the sine wave.
  • Typically 1/3 octave band frequencies used.
138
Q

What are microphone arrays and their application?

A

A microphone array is a set of two or more microphones that can detect/utilise spatial information about sound sources

  • used to estimate direction/location of sound sources
139
Q

How do microphone arrays work? What is TDOA?

A

The sound wave arrives at each microphone at a different time, causing a time difference of arrival (TDOA)

TDOA is a function of the angle of the sound source WRT the array.

140
Q

What is beamforming?

A

Signal processing technique that forms a directivity pattern

Consists of FIR filters connected to each microphone and an adder

Directivity can be electronically varied by changing the property of the filters.

141
Q

What are three techniques of beamforming?

A
  • Delay-sum beamformer
  • Steered beamformer method
  • Cross correlation method
142
Q

What is the simplest technique used to implement beamforming?

A

Simplest technique is Delay-sum (FIR filters designed to delay the signals so that the target signal is aligned across the microphones)

143
Q

What is the steered beamformer method? Draw a diagram of sound energy versus angle

A

The steered beamformer method estimates source direction by measuring sound energy arriving from different angles

Diagram looks sharply normally distributed about angle theta.

144
Q

What is the cross-correlation method for beamforming?

A

It calculates the time difference of arrival (TDOA) by comparing the measurements between two microphones
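
A minimal sketch of the idea for two microphones. The signals, the 48 kHz sampling frequency, the 0.1 m spacing and the speed of sound (343 m/s) are illustrative assumptions, and the angle conversion assumes a far-field source.

```python
import math

def estimate_delay(x1, x2, max_lag):
    """Delay of x2 relative to x1 in samples, taken at the cross-correlation peak."""
    def xcorr(lag):
        return sum(x1[n] * x2[n + lag]
                   for n in range(len(x1)) if 0 <= n + lag < len(x2))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def tdoa_to_angle_deg(tdoa_s, spacing_m, c=343.0):
    """Far-field arrival angle (degrees from broadside) for two microphones."""
    return math.degrees(math.asin(c * tdoa_s / spacing_m))

fs = 48000
mic1 = [0.0, 1.0, 3.0, -2.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, -2.0, 1.0, 0.0, 0.0]   # mic1 delayed by 3 samples

lag = estimate_delay(mic1, mic2, max_lag=5)          # TDOA in samples
angle = tdoa_to_angle_deg(lag / fs, spacing_m=0.1)   # TDOA converted to a direction
```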

145
Q

What is an acoustic camera?

A

Imaging device used to locate sound sources and characterise them using a microphone array with beamforming