Audio Analysis and Assessment Flashcards

1
Q

Audio represents…

A

sound pressure changes over time

2
Q

Sound is converted into what, by a transducer?

A

Voltage

3
Q

Audio can be … or …

A

Continuous or Discrete

4
Q

Continuous signal represents…

A

Real world sound pressure variations

5
Q

An example of continuous signal equipment is…

A

Analogue Equipment

6
Q

Discrete signal represents…

A

sound as a series of ones and zeros

7
Q

ADC stands for…

A

Analogue to Digital Converter

8
Q

Sampling frequency is found on what axis?

A

X-axis

9
Q

Sampling frequency is…

A

Number of samples taken per second

10
Q

Sampling frequency is measured in…

A

Hertz

11
Q

Amplitude quantisation is found on what axis?

A

Y-axis

12
Q

Amplitude quantisation is a…

A

Binary Encoding Scheme

13
Q

In amplitude quantisation, the number of bits dictates…

A

The number of levels we can represent

14
Q

What is the amplitude quantisation bit equation?

A

2^n
n = the number of bits
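
As a quick worked example of the 2^n rule, the sketch below prints the number of levels for a few common bit depths. The function name is illustrative and not taken from the cards.

```python
def quantisation_levels(bits: int) -> int:
    """Number of amplitude levels an n-bit encoder can represent (2^n)."""
    return 2 ** bits


for bits in (8, 16, 24):
    print(bits, "bits ->", quantisation_levels(bits), "levels")
```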

15
Q

What are three examples of digital audio files?

A

WAV, AIFF, AU

16
Q

In digital audio files, the Y-axis could represent? (4)

A
  1. Normalised (-1 - 1)
  2. Sample Value
  3. dB
  4. Percentage
17
Q

In digital audio files, the X-axis could represent? (2)

A
  1. Time
  2. Samples
18
Q

What does PCM stand for?

A

Pulse Code Modulation

19
Q

Aspects of the PCM encoding process directly affect…

A

Signal quality

20
Q

Digital data can only represent…

A

A finite set of values

21
Q

Digital data’s finite set of values are set by…

A

The number of bits

22
Q

What is digital value?

A

The nearest approximation of the analogue signal

23
Q

Approximation introduces what to each sample?

A

Quantisation error

24
Q

What is quantisation error?

A

It is the difference between the analogue input signal and the quantised level assigned by the encoder

25
Q

Quantisation =

A

Approximation

26
Q

When is the maximum quantisation error reached?

A

At a half step
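
To see why the error peaks at a half step, here is a minimal sketch of a uniform quantiser. It assumes samples normalised to the range -1 to 1 and a step size of 2 / 2^bits, and it ignores clipping at full scale; both function names are hypothetical.

```python
def quantise(x: float, bits: int) -> float:
    """Round a sample to the nearest multiple of the quantisation step.

    Assumes a normalised range of -1 to 1, so the step is 2 / 2^bits.
    Clipping at full scale is ignored to keep the sketch short.
    """
    step = 2.0 / (2 ** bits)
    return round(x / step) * step


def quantisation_error(x: float, bits: int) -> float:
    """Difference between the input sample and its quantised level."""
    return x - quantise(x, bits)


# With 3 bits the step is 0.25, so an input halfway between two levels
# (0.125) is as far from a level as it can get: half a step.
print(quantisation_error(0.125, 3))
```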

27
Q

What does quantisation error create?

A

Quantisation Noise

28
Q

What does SNR stand for?

A

Signal to Noise Ratio

29
Q

As SNR increases, noise…

A

Decreases

30
Q

As SNR decreases, the distance between signal and noise…

A

Decreases

31
Q

What does SQNR stand for?

A

Signal to Quantisation Noise Ratio

32
Q

What two factors does SQNR have?

A
  1. Number of bits encoding audio
  2. Input signal amplitude
33
Q

What is the SQNR equation?

A

SQNR = (6.02*B)+1.76dB, where B = number of bits (16, 24, etc.)
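
A short sketch of the SQNR equation for a couple of bit depths. The function name is illustrative; the constants come straight from the card.

```python
def sqnr_db(bits: int) -> float:
    """Theoretical SQNR in dB for a full-scale input: 6.02 * B + 1.76."""
    return 6.02 * bits + 1.76


for bits in (16, 24):
    print(bits, "bits ->", round(sqnr_db(bits), 2), "dB")
```

Each extra bit adds roughly 6 dB, which is where the "6dB per bit" rule of thumb comes from.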

34
Q

What two conditions of the input signal make quantisation noise similar to white noise?

A

When signal has large amplitudes
or
When signal has wide bandwidth

35
Q

What two problems occur when input signal has a low amplitude?

A
  1. Relative magnitude of distortion increases (SQNR decreases)
  2. Quantisation noise is correlated with the input signal
36
Q

What's the difference between quantisation distortion and white noise?

A

Distortion is more annoying due to its unpredictability

37
Q

What are the two ways to reduce quantisation noise?

A
  1. Increasing bit depth
  2. Dither
38
Q

How does increasing bit depth reduce quantisation noise?

A

Each additional bit increases SQNR by 6dB (halving QN)

39
Q

Increasing bit depth causes what issue?

A

Increasing bit depth increases processing burden

40
Q

What is dithering?

A

Adding noise to the signal before quantisation to reduce the audible effect of quantisation error

41
Q

As well as reducing the audible effects of quantisation error, dithering does what at low amplitudes?

A

Randomises quantisation error

42
Q

Why does dither work even though quantisation error can still be audible?

A

Noise is easier to listen to than distortion so dither helps make audio less annoying

43
Q

Most audio we hear is… (hint - digital files, streaming)

A

Compressed

44
Q

Noise is created when quantisation depth is manipulated by…

A

Compression

45
Q

Nyquist frequency is…

A

Half of sampling rate

46
Q

Signals sampled at discrete intervals have…

A

An upper limit to frequencies

47
Q

When above Nyquist frequency, there is a period between…

A

Samples to reproduce the input signal correctly

48
Q

What is aliasing?

A

When frequencies greater than Nyquist frequency appear as lower frequencies within the spectrum

49
Q

What happens when sampling at twice the highest frequency in the spectrum?

A

A correct representation of all frequency spectrum

50
Q

Aliasing can be looked at from both…

A

A time and frequency domain perspective

51
Q

Aliasing can be avoided by having at least how many samples per cycle of waveform?

A

Two

52
Q

When does aliasing occur? (2)

A
  1. When sample rate is too low
  2. When a signal above half the sampling frequency is observed by the system
53
Q

Aliasing introduces what to audio?

A

Unwanted frequencies

54
Q

What is the aliasing equation?

A

Af = Fs - F

Fs = sampling frequency
F = input frequency
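
The card's formula Af = Fs - F holds when the input lies between the Nyquist frequency and Fs. The sketch below folds any input frequency back into the 0 to Nyquist range, which reduces to the card's formula in that case; the general folding step and the function name are assumptions beyond the card.

```python
def alias_frequency(f: float, fs: float) -> float:
    """Frequency at which an input of f Hz appears after sampling at fs Hz."""
    folded = f % fs  # the sampled spectrum repeats every fs
    # Anything above Nyquist (fs / 2) mirrors back down: Af = Fs - F.
    return fs - folded if folded > fs / 2 else folded


print(alias_frequency(30000, 44100))  # above Nyquist, so it aliases
print(alias_frequency(1000, 44100))   # below Nyquist, so it is unchanged
```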

55
Q

Aliasing affects what frequencies?

A

All frequencies above Nyquist frequency

56
Q

Sampling process is called…

A

Pulse Code Modulation

57
Q

What occurs around the carrier frequency?

A

Sidebands

58
Q

Sidebands occur around carrier if bands aren't…

A

Limited

59
Q

Sidebands make output spectrum…

A

Complex

60
Q

What is the sideband equation?

A

(n * Fc) +/- Fm
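
A small sketch of the sideband equation, treating the sampling frequency as the carrier (Fc) and a single audio tone as the modulator (Fm). The function name and the example values are illustrative.

```python
def sidebands(fc: float, fm: float, n_max: int) -> list:
    """Lower and upper sideband pairs (n * Fc - Fm, n * Fc + Fm) for n = 1..n_max."""
    return [(n * fc - fm, n * fc + fm) for n in range(1, n_max + 1)]


# A 1 kHz tone sampled at 44.1 kHz produces sidebands around 44.1 kHz, 88.2 kHz, ...
for lower, upper in sidebands(44100, 1000, 2):
    print(lower, upper)
```

If Fm were above half of Fc, the lower sideband of the first pair would fall below the Nyquist frequency, which is the aliasing case described in the earlier cards.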

61
Q

In terms of sidebands, what component of audio is the carrier and what component is the modulator?

A

Audio signal = Modulator
Sampling frequency = Carrier

62
Q

The input signal spectrum forms sidebands around…

A

Integer multiples of the sampling frequency

63
Q

When do sidebands move closer together (overlap)?

A

When sampling frequency is less than twice the highest frequency

64
Q

When do sidebands increase in width (overlap)?

A

When audio signal is greater than Nyquist frequency

65
Q

Anti-aliasing filters remove…

A

Frequencies above Nyquist frequency

66
Q
  1. abs() function is used for measuring…
  2. Why?
A
  1. Peak on bipolar waves
  2. abs() ignores negative values
67
Q
  1. How do we measure dB?
  2. If amplitude decreases by half, what is the change in dB?
A
  1. 20 * log10(a/b)
  2. -6dB
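
The decibel formula can be checked with a couple of lines. This sketch assumes a base-10 logarithm of an amplitude ratio; the function name is illustrative.

```python
import math


def amplitude_ratio_db(a: float, b: float) -> float:
    """Level difference in dB between amplitude a and reference amplitude b."""
    return 20 * math.log10(a / b)


print(round(amplitude_ratio_db(0.5, 1.0), 2))  # halving the amplitude
print(round(amplitude_ratio_db(2.0, 1.0), 2))  # doubling the amplitude
```

Halving gives about -6.02 dB, which the card rounds to -6dB.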
68
Q

What is the dB change for every bit increased?

A

6dB

69
Q

What does RMS stand for?

A

Root Mean Square

70
Q

What does RMS represent?

A

Distribution of sample values

71
Q

What info does RMS give us?

A

Average energy/power

72
Q

RMS can be affected by…

A

Compression

73
Q

What is the crest equation?

A

Crest = 20log(peak amplitude / RMS)
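
Peak, RMS and crest can be tied together in one sketch. It assumes a list of normalised samples and a base-10 logarithm; the helper names are illustrative.

```python
import math


def peak(samples):
    """Peak of a bipolar signal: abs() folds negative samples onto positive ones."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root mean square: square each sample, take the mean, then the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def crest_factor_db(samples):
    """Crest = 20 * log10(peak amplitude / RMS)."""
    return 20 * math.log10(peak(samples) / rms(samples))


sine = [math.sin(2 * math.pi * n / 1000) for n in range(1000)]
square = [1.0, -1.0] * 500
print(round(crest_factor_db(sine), 2))    # a sine has a crest of about 3 dB
print(round(crest_factor_db(square), 2))  # a square wave has peak equal to RMS
```

Heavily compressed audio behaves more like the square wave: RMS rises towards the peak, so the crest value shrinks.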

74
Q

The ratio between peak amplitude and RMS is called…

A

Crest

75
Q

Crest controls…

A

Relationship between average energy and peak values

76
Q

What are the equations for frequency and period?

A

f = 1 / T
T = 1 / f

77
Q

Why do audio signals change dynamically over time?

A

Because amplitude and frequency change

78
Q

What is based on frequency, amplitude and time parameters?

A

Human hearing response

79
Q
  1. Distinguishing separate frequencies throughout audible frequency range isn’t…
  2. What is the term for the above?
A
  1. Constant
  2. Discrimination
80
Q

As well as distinguishing separate frequencies throughout frequency range not being constant, what else is not constant?

A

Sensitivity

81
Q

Amplitude response has a…

A

Very large dynamic range

82
Q

What is the threshold of feeling in dB?

A

120dB

83
Q

Give an example of a non-linear graph.

A

Fletcher Munson curve

84
Q

The Fletcher Munson curve shows…

A

Non-linear sensitivity over frequency

85
Q

As frequency increases, resolution…

A

Decreases

86
Q

Humans find it harder to discriminate … frequencies.

A

Higher

87
Q

Log scales and constant Q reflect…

A

Human perception of frequency/pitch

88
Q

What is constant Q?

A

Relation of bandwidth

89
Q

As band centre frequency increases, frequency…

A

Increases

90
Q

As bandwidth increases, frequency…

A

Increases

91
Q
  1. What is the equation for Q?
  2. What is heavy cool about this?
A
  1. Q = centre frequency / bandwidth
  2. Q will always remain constant
92
Q

What two things are crucial to audio processing operations?

A
  1. Frequency
  2. Amplitude
93
Q

What does audio frequency analysis do?

A

Extract frequency from signal

94
Q

Audio frequency analysis describes…

A

Frequency and amplitude over time

95
Q

What is the most common approach to extract frequency information?

A

Fourier Analysis

96
Q

Our boy, Fourier, stated - ‘Any periodic function may be represented as…

A

An infinite series of harmonically related sinusoids’

97
Q

In terms of Fourier, an input signal is a combination of…

A

Harmonically related sinusoids

98
Q

Why do we want good frequency resolution?

A

To see down to the individual frequencies

99
Q

Why do we want good time resolution?

A

To see down to a few milliseconds

100
Q

We can think of Fourier analysis frequency resolution as…

A

A series of frequency bands or filters

101
Q
  1. In Fourier analysis frequency resolution, bands are…
  2. Unlike…
A
  1. Spaced linearly
  2. Human hearing system
102
Q

Analysis bins refer to…

A

Bands

103
Q

Frequency resolution is determined by…

A

The number of samples of the input signal

104
Q

Close spaced frequencies separate when…

A

Filters narrow

105
Q

To increase accuracy, we can increase what three things?

A
  1. Transform
  2. Samples
  3. Frequency Resolution
106
Q

What is the bin bandwidth equation?

A

Bin bandwidth = Fs / length of transform (in samples)

107
Q

What is the bin centre frequency equation?

A

Bin centre frequency = n * bin bandwidth

108
Q

What is the length of transform equation?

A

Length of transform = Fs * t (seconds)

109
Q

What is the window duration equation?

A

Window duration = number of samples * sample period

110
Q

What is the sample period equation?

A

Sample period = 1 / Fs
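
The bin bandwidth, bin centre frequency, transform length, window duration and sample period equations fit together as below. This is a sketch with illustrative function names, using a 1024-sample transform at 44.1 kHz as the example.

```python
def sample_period(fs: float) -> float:
    """Sample period = 1 / Fs (seconds)."""
    return 1.0 / fs


def bin_bandwidth(fs: float, n: int) -> float:
    """Bin bandwidth = Fs / length of transform in samples (Hz)."""
    return fs / n


def bin_centre_frequency(k: int, fs: float, n: int) -> float:
    """Bin centre frequency = bin index * bin bandwidth (Hz)."""
    return k * bin_bandwidth(fs, n)


def window_duration(n: int, fs: float) -> float:
    """Window duration = number of samples * sample period (seconds)."""
    return n * sample_period(fs)


def transform_length(fs: float, seconds: float) -> int:
    """Length of transform = Fs * t, rounded to a whole number of samples."""
    return round(fs * seconds)


fs, n = 44100, 1024
print(bin_bandwidth(fs, n))            # roughly 43 Hz per bin
print(bin_centre_frequency(10, fs, n))  # centre of bin 10
print(window_duration(n, fs))           # roughly 23 ms
```

Doubling n halves the bin bandwidth but doubles the window duration, which is the frequency and time resolution trade-off in numbers.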

111
Q

What problem arises with frequency and time resolution?

A
  1. Good frequency resolution results in bad time resolution
  2. Good time resolution results in bad frequency resolution
112
Q

If we analyse a whole track (3 mins), would we have good frequency or good time resolution?

A

Good frequency resolution

113
Q

If we analyse a short segment (0.1 seconds), would we have good frequency or good time resolution?

A

Good time resolution

114
Q

Does time resolution or frequency resolution have a smaller computational expense?

A

Time resolution

115
Q

Fourier analysis is … on the computer

A

Strenuous

116
Q

What method is faster than Fourier analysis?

A

Fast Fourier Transform (FFT)

117
Q

FFT requires transform length to be…

A

A power of two (256, 1024, 2048 samples)

118
Q

FFT requires what to be a power of two?

A

Transform length

119
Q

A window size that is a power of two will result in…

A

Faster processing

120
Q

What is windowing?

A

A series of short analytical snippets throughout duration of signal

121
Q

Windowing describes…

A

The evolution of frequency over time

122
Q

Windowing still has a problem. What is it?

A

Frequency and time resolution trade off

123
Q

Time resolution can be increased by overlapping…

A

Windows

124
Q

What do spectrograms plot?

A

Analytical window over time

125
Q

What do a spectrogram's X and Y axes show?

A

X = Time
Y = Frequency

126
Q

What does colour on a spectrogram represent?

A

Magnitude (Amplitude)

127
Q

What problems does frequency analysis have? (4)

A
  1. Results are estimates
  2. Computationally expensive
  3. Windowing can confuse frequency readings
  4. Doesn’t reflect human hearing
128
Q

In terms of windowing, instead of reading signal spectrum, we get…

A

A combination of signal and window spectrum

129
Q

What is ‘SpEcTrAl LeAkAgE’?

A

Unwanted Artefacts

130
Q

Spurious Components are referred to as…

A

Side lobes

131
Q

How can we reduce unwanted artefacts?

A

Use different window shapes

132
Q

FFT has a good frequency response at…

A

Low frequencies

133
Q

As window decreases, frequency resolution…

A

Decreases

134
Q

FFT has good time resolution…

A

Throughout whole spectrum

135
Q

As window decreases, time resolution…

A

Increases

136
Q

Time and frequency resolution trade off can be resolved by…

A

Using adaptive window sizes

137
Q

In terms of multi-resolution analysis, smaller windows would be used for…

A

Higher frequencies

138
Q

In terms of multi-resolution analysis, we aim to have good frequency resolution at…

A

Lower frequencies

139
Q

In terms of multi-resolution analysis, we aim to have good time resolution at…

A

Higher frequencies

140
Q

In terms of multi-resolution analysis, window size varies with…

A

Frequency

141
Q

What are the benefits of multi-resolution analysis? (2)

A
  1. Resolves trade off
  2. Can increase time and/or frequency resolution where it matters
142
Q

Two key parameters of PCM are…

A
  1. Sample rate
  2. Bit depth
143
Q

What is the formula for data per second using values from the following - 1 second of stereo PCM audio at 44.1kHz, 16 bit?

A

44,100 * 2 (bytes) * 2 (stereo) = 176.4kBps

144
Q

What is the formula for bits per second using 176.4kB?

A

176.4kB * 8 = 1.4Mbps
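
The two data-rate calculations can be reproduced as follows. This is a sketch with illustrative function names, assuming the bit depth is a whole number of bytes.

```python
def pcm_bytes_per_second(fs: int, bit_depth: int, channels: int) -> int:
    """Samples per second * bytes per sample * number of channels."""
    return fs * (bit_depth // 8) * channels


def pcm_bits_per_second(fs: int, bit_depth: int, channels: int) -> int:
    """Same rate expressed in bits: multiply the byte rate by 8."""
    return pcm_bytes_per_second(fs, bit_depth, channels) * 8


print(pcm_bytes_per_second(44100, 16, 2))  # 176400 bytes per second
print(pcm_bits_per_second(44100, 16, 2))   # 1411200 bits per second
```

The second figure is 1.4112 Mbps, which the card rounds to 1.4Mbps.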

145
Q

What does perceptual audio aim to do?

A

Reduce data required to represent audio

146
Q

What do we call the process of cochlea hairs responding to strongest stimuli and ignoring anything weaker?

A

Masking

147
Q

A stimulus temporarily raises…

A

Threshold of hearing

148
Q

What are critical bands? (The Beatles aren’t one of them)

A

Areas influenced by the temporary change in threshold of hearing

149
Q

Critical bands are … at lower frequencies.

A

Narrower

150
Q

What pattern appears across hearing range?

A

Constant Q pattern

151
Q

What is the CB bandwidth equation?

A

CB bandwidth = 94 + ( 71 * f^(3/2) )

f = kHz
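
A sketch of the critical band equation, reading the exponent as 3/2 and taking f in kHz with the result in Hz. The function name is illustrative and the result is only as accurate as the card's approximation.

```python
def critical_bandwidth_hz(f_khz: float) -> float:
    """Approximate critical bandwidth in Hz: 94 + 71 * f^(3/2), with f in kHz."""
    return 94 + 71 * f_khz ** 1.5


for f_khz in (0.25, 1.0, 4.0):
    print(f_khz, "kHz ->", round(critical_bandwidth_hz(f_khz), 1), "Hz")
```

The output widens as frequency rises, matching the card that says critical bands are narrower at lower frequencies.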

152
Q

CB bandwidth is not … at frequencies.

A

Fixed

153
Q

CB depends on what two components of stimuli?

A

Intensity and frequency

154
Q

What does critical band response aid? (5)

A
  1. Frequency discrimination
  2. Perceived loudness
  3. Dissonance/Consonance
  4. Clarity of speech
  5. Masking
155
Q

Scales representing spectral energy in … … help measure human perception.

A

Critical bands

156
Q

Two common scales of CB response are…

A
  1. Bark
  2. Mel
157
Q

What does Bark scale aim to measure?

A

Loudness

158
Q

One critical band has the bandwidth of how many barks?

A

One

159
Q

What does Mel scale aim to measure?

A

Perceived pitch

160
Q

What do Bark and Mel scales help us to establish?

A

Sounds both audible and inaudible in signal

161
Q

Bark and Mel scales underpin…

A

Masking

162
Q

Where does masking occur in terms of frequency?

A

Specific range in frequency around tone (critical band)

163
Q

Masking means that frequency in same range might be…

A

Inaudible

164
Q

In terms of masking, what is the ‘Masker’?

A

Louder tone

165
Q

In terms of masking, what is the ‘Maskee’?

A

Quieter tone

166
Q

Masking is better as frequency…

A

Increases

167
Q

As masker amplitude increases, masking curve becomes…

A

Broader

168
Q

In terms of masking, temporary threshold increase…

A

Holds over given time

169
Q

Masking threshold increase lasts longer when… (4)

A
  1. Masker is louder
  2. Masker and maskee are closer in frequency
  3. Masker has lower frequency than maskee
  4. Time between tones is shorter
170
Q

What is backwards masking?

A

Sounds can be masked by tone which occurs after maskee

171
Q

Backwards masking suggests that humans hear in…

A

Time frames

172
Q

Backwards masking only occurs when both tones are in…

A

The same time block

173
Q

What is the bits per sample equation?

A

bits per sample = bit rate / Fs
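
As a worked example of this equation, the sketch below uses a 128 kbps stream at 44.1 kHz. The bit rate is an illustrative value and the function name is hypothetical.

```python
def bits_per_sample(bit_rate: float, fs: float) -> float:
    """Average bits available per sample: bit rate / Fs."""
    return bit_rate / fs


print(round(bits_per_sample(128000, 44100), 2))   # compressed stream
print(round(bits_per_sample(1411200, 44100), 2))  # stereo 16-bit PCM for comparison
```

At 128 kbps there are fewer than three bits per sample on average, which is why the codec has to allocate bits selectively.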

174
Q

What is the key mechanism in perceptual codec?

A

Bit allocation

175
Q

What is data reduction?

A

Dynamically altering number of bits used to represent signal to make less computationally demanding

176
Q

As bits decrease, noise…

A

Increases

177
Q

In adaptive allocation, loud tones get what to represent them?

A

More bits

178
Q

In adaptive allocation, what aren’t encoded?

A

Inaudible tones

179
Q

In adaptive allocation, what happens to quantisation error noise?
How?

A
  1. It's masked
  2. By keeping under the threshold
180
Q

What does compressed audio look like to a computer?

A

Instructions on how to reconstruct the waveform

181
Q

Input frames are split into how many segments for signals with transients?
How many samples does each segment frame have?

A
  1. Three
  2. 384 samples
182
Q

Input frames are split into how many segments for static signals?
How many samples does each segment frame have?

A

Trick question
1. One frame
2. 1152 samples

183
Q

As frame size decreases, noise…

A

Decreases

184
Q

Is encoding process perfect?

A

No

185
Q

In smaller frames, what issue occurs around transients?

A

Noise

186
Q

What happens when noise occurs before transient?

A

Transient is smeared resulting in loss of definition

187
Q

Input signal is split into sub-bands. How many, and what do their bandwidths have in common?

A
  1. 32
  2. Equal bandwidth
188
Q

Sub bands result in…

A

32 separate band-limited time domain signals (really rolls off the tongue)

189
Q

Do sub bands increase data?

A

Doesn’t increase data due to ‘polyphase sub-band filter’

190
Q

What effect does the ‘polyphase sub-band filter’ have?

A

Down sampling effect

191
Q

What does the ‘polyphase sub-band filter’ do? (2)

A
  1. Reduces Fs
  2. While splitting signal in sub bands
192
Q

What sample frames are subject to frequency analysis?

A

All frames

193
Q

What does frequency analysis do to sub band content?

A

Converts content into frequency domain data

194
Q

What does MDCT stand for?

A

Modified Discrete Cosine Transform

195
Q

How much data does MDCT need to reproduce data compared to FFT?

A

Half the data

196
Q

In the frequency domain, the masking model level is calculated for…

A

Each sub-band

197
Q

What does SMR stand for?

A

Signal to mask ratio

198
Q

In the masking model, frequency domain data can be used to give us what ratio?

A

Signal to mask ratio

199
Q

What are the stages of the masking model? (5)

A
  1. Masking level calculated for each sub-band
  2. Calculation for SMR
  3. Bit allocation to sub-bands
  4. No. of bits assigned to sub-bands dependent on SMR
  5. Bit depth varies across sub-bands due to content
200
Q

What is the encoding process order? (6)

A
  1. Frames
  2. Sub-bands
  3. Down sampling
  4. MDCT
  5. Masking and bit allocation
  6. Huffman coding
200
Q

What is Huffman coding?

A

Statistical compression for further data reduction
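
To make "statistical compression" concrete, here is a minimal Huffman sketch that builds codes from symbol counts, so frequent symbols get shorter codes. It is a generic textbook construction, not the table-based scheme an actual audio codec uses, and it assumes a non-empty input; the function name is illustrative.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Map each symbol to a bit string; more frequent symbols get shorter codes."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit to be written out.
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie_breaker, {symbol: code_so_far}).
    # The tie breaker stops Python from ever comparing the dictionaries.
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, low = heapq.heappop(heap)   # two least frequent groups
        n2, _, high = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


codes = huffman_codes("aaaabbc")
print(codes)
print(sum(len(codes[s]) for s in "aaaabbc"), "bits instead of", 7 * 8)
```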

200
Q

What does Huffman coding represent?

A

Frequently repeated sequences of data using shorter codes, e.g. 11010101 is stored as 01

201
Q

What does compressed audio contain? (5)

A
  1. Instructions for decoder
  2. Samples in MDCT domain at reduced bit depth
  3. Bit allocation data
  4. Scale factor for each sub-band
  5. encoded using Huffman coding
202
Q

What processes do frequency transforms have?

A

Inverse equivalent processes

203
Q

What does the decoder apply to produce a time domain signal?

A

An inverse MDCT

204
Q

What is simpler, decoder or encoder?

A

Decoder

205
Q

Compression artefacts are…

A

Complex

206
Q

What happens to sub-band data when decoded?

A

Data is combined

207
Q

What two ways do compression artefacts vary?

A
  1. Vary systematically with audio input
  2. Vary according to encoding
208
Q

Artefacts increase as bit rate…

A

Decreases

209
Q

What is technical quality?

A

Our understanding of good audio quality

210
Q

Audio engineers might want to assess the output of what? (3)

A
  1. Compression algorithms
  2. Hardware systems
  3. Network codecs
211
Q

What is subjective audio quality assessment?

A

Listening tests taken by panel of listeners

212
Q

What is objective audio quality assessment?

A

Analysis of audio signals - based on observational phenomena

213
Q

What are the pros of subjective audio quality assessment? (1)

A
  1. Most accurate results
214
Q

What are the cons of subjective audio quality assessment? (4)

A
  1. Expensive
  2. Time consuming
  3. Subjective
  4. Complex planning
215
Q

What are the pros of objective audio quality assessment? (4)

A
  1. Lower cost
  2. Lower complexity
  3. Consistent (no listeners)
  4. Less time required
216
Q

What are the cons of objective audio quality assessment? (1)

A

1. It is an estimation of human response

217
Q

In audio quality assessment, what two things are compared?

A

Original and processed signal

218
Q

In terms of audio quality assessment, less change in signals means…

A

Better quality

219
Q

Why aren’t time domain comparisons helpful?

A

We aren’t sensitive to phase changes

220
Q

Comparing SNR, segmental SNR and total harmonic distortion don’t resemble…

A

Human hearing response to these parameters

221
Q

What does LSD stand for?

A

Log-squared spectral distance

222
Q

What does LSD produce?

A

Large values for low power areas on spectrum

223
Q

What's the negative of LSD? (apart from the comedown)

A

It is too sensitive for spectral changes which are inaudible

224
Q

What are meaningful differences?

A

Spectral features which characterise signals

225
Q

What are formant peaks?

A

Cluster of energy around certain points in frequency

226
Q

Formant can help us differentiate what two things?

A
  1. Speech
  2. Musical instruments
227
Q

The human hearing range is sensitive to…

A

Formant peaks

228
Q

What is the minimum change in frequency (%) at which humans can hear a difference in pitch?

A

3-5%

229
Q

Humans can hear a difference when bandwidth shifts …-…%

A

20-40%

230
Q

What does SKL stand for?

A

Symmetrical Kullback-Leibler Distance

231
Q

What sort of coding does SKL use for a smooth formant based spectrum?

A

Linear prediction coding

232
Q

SKL uses linear prediction coding to achieve a…

A

smooth formant based spectrum

233
Q

What does SKL assume?

A

Formant changes will be perceivable

234
Q

SKL emphasises differences in what two parameters?

A
  1. High magnitudes
  2. Low frequencies
235
Q

SKL is less sensitive to…

A

High frequency shifts

236
Q

What does MFCC stand for?

A

Mel Frequency Cepstral Coefficients

237
Q

MFCC is a subjective spectrum which reflects…

A

How we hear sounds

238
Q

What does MFCC use to reflect how we hear sounds?

A

Psychoacoustical phenomena

239
Q

Cepstrum is equal to…

A

Inverse FFT of the log FFT of a signal (duh)

240
Q

What is inverse FFT of the log FFT of a signal equal to?

A

Cepstrum

241
Q

What does cepstrum emphasise?

A

Pitch content

242
Q

What does MFCC combine? (2)

A
  1. Cepstrum
  2. Mel
243
Q

Changes in MFCC are…

A

Perceivable

244
Q

What is the auditory transform stage?

A

MFCC

245
Q

What is the gear meshing equation?

A

Fm = nt * Frg

nt = no. of teeth
Frg = speed of gear
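
A one-line worked example of the gear meshing equation. It assumes the gear speed is given in revolutions per second so the result is in Hz; the function name and the example gear are illustrative.

```python
def gear_mesh_frequency(teeth: int, rotation_hz: float) -> float:
    """Fm = nt * Frg: number of teeth times the gear's rotation speed."""
    return teeth * rotation_hz


# A hypothetical 20-tooth gear turning 25 times per second.
print(gear_mesh_frequency(20, 25.0))
```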

246
Q

Periodograms help emphasise…

A

Pitch

247
Q

What are the three stages of the PSD process?

A
  1. Compare signal with itself (autocorrelation)
  2. Take FFT of results
  3. Peaks will be produced at frequency of periodic elements
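
The three stages above can be sketched in plain Python. This is a small illustration under stated assumptions: a circular autocorrelation is used for the "compare with itself" step, and a direct DFT stands in for the FFT to avoid external libraries. All function names are hypothetical.

```python
import cmath
import math


def circular_autocorrelation(x):
    """Stage 1: compare the signal with shifted copies of itself."""
    n = len(x)
    return [sum(x[i] * x[(i + lag) % n] for i in range(n)) for lag in range(n)]


def dft_magnitudes(x):
    """Stage 2: magnitude spectrum for bins 0..n/2 (direct DFT, not a fast one)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_bin(x):
    """Stage 3: the largest peak marks the frequency of the periodic element."""
    mags = dft_magnitudes(circular_autocorrelation(x))
    return max(range(1, len(mags)), key=mags.__getitem__)


# 64 samples containing exactly 8 cycles of a sine wave.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(64)]
print(dominant_bin(signal))  # the peak lands in bin 8
```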
248
Q

What is acoustic ecology?

A

Environmental sound

249
Q

What does NVH stand for?

A

Noise, Vibration, Harshness

250
Q

What is an example of active sound design?

A

The Ford Mustang mics up the engine and gives the user the option to switch between sports and comfort modes (changing the volume of the ‘engine’)

251
Q

Product sound impacts… (3)

A
  1. Perceived quality
  2. Purchase
  3. Design and manufacture
252
Q

What is cross modal perception?
An example?

A
  1. When perception is affected by two or more senses
  2. Louder = more powerful
253
Q

Perception is measured by… (3)

A
  1. Loudness
  2. Roughness
  3. Sharpness
254
Q

What does loudness measure and what is its unit?

A

Measure of energy across critical bands (Sone)

255
Q

What does roughness measure and what is its unit?

A

Rapid amplitude fluctuations by interacting sounds (Asper)

256
Q

What does sharpness measure and what is its unit?

A

Weighting/shape of spectrum (Acum)

257
Q

What two parameters are in response of critical bands?

A
  1. Roughness
  2. Sharpness
258
Q

As frequency energy increases, sharpness…

A

Increases

259
Q

Where does sharpness occur?

A

In one critical band with concentrated high frequency energy

260
Q

What is the term for low frequency sharpness?

A

Booming

261
Q

What does CSA stand for?

A

Category Scaling of Annoyance

262
Q

What is CSA used for?

A

Measuring annoyance of sound

263
Q

What is the CSA equation?

A

CSA = 8.07 + ( 0.563 * N5 ) + ( 3.022 * S50 ) + ( 2.175 * R )

N = Loudness
S = Sharpness
R = Roughness
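
The CSA equation is a weighted sum, sketched below. The coefficients are copied from the card; the function name and the input values are made up purely to show the arithmetic.

```python
def csa(n5: float, s50: float, r: float) -> float:
    """CSA = 8.07 + 0.563 * N5 + 3.022 * S50 + 2.175 * R."""
    return 8.07 + 0.563 * n5 + 3.022 * s50 + 2.175 * r


# Hypothetical loudness, sharpness and roughness values.
print(round(csa(10.0, 1.0, 1.0), 3))
```

Per unit, sharpness carries the largest weight, so a small rise in sharpness moves the score more than the same rise in loudness.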

264
Q

What does MIR stand for?

A

Music Information Retrieval

265
Q

What is an example of tech that uses MIR?

A

Melodyne

266
Q

What is the issue with query by humming?

A

Variation of time and pitch in humming might not be recognised

267
Q

What is the solution to the ‘query by humming’ issue?

A

Parsons code

268
Q

What is parsons code?

A

Encodes note changes rather than exact notes, so that the system recognises C, C#, D as tonic, up, up
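
A minimal sketch of Parsons code, which keeps only the melodic contour: `*` for the first note, then `U` (up), `D` (down) or `R` (repeat) for each following note. It assumes pitches are given as MIDI note numbers; the function name is illustrative.

```python
def parsons_code(pitches):
    """Reduce a melody to its contour so exact pitch and timing no longer matter."""
    if not pitches:
        return ""
    code = ["*"]  # the first note has nothing to be compared with
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            code.append("U")
        elif current < previous:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)


print(parsons_code([60, 61, 62]))          # C, C#, D
print(parsons_code([60, 60, 67, 67, 65]))  # repeats and a fall
```

Because only direction is stored, a hummed query that is flat, sharp or out of time can still produce the same code.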

269
Q

What is ‘query by example’? Give an example of tech that uses it.

A
  1. Looks for closest match by extracting compact and descriptive set of acoustic features
  2. Shazam
270
Q

What are the challenges of ‘query by example’? (3)

A
  1. Database has millions of files so data must be compact
  2. Fingerprints must be robust enough to ignore noise
  3. Process must be efficient
271
Q

What do constellation maps do? (2)

A
  1. Finds local maxima (peaks)
  2. Encodes peaks as time and frequency coordinates
272
Q

What would you use if peaks overlap in constellation maps?

A

Use hashing process

273
Q

What does the hashing process do? (2)

A
  1. Helps identify spectral features unique to music track
  2. Speeds up process
274
Q

What three forms of analysis can be used for classification?

A
  1. Audio
  2. Metadata
  3. Symbolic Data
275
Q

In terms of classification, what are the two benefits of using audio data?

A
  1. Easy to get ahold of
  2. Can extract timbre and acoustics easily
276
Q

In terms of classification, what is the con of using audio data?

A

Hard to precisely identify some features

277
Q

In terms of classification, what is the benefit of using symbolic data?

A

More detailed

278
Q

In terms of classification, what are the cons of using symbolic data? (2)

A
  1. No acoustic/timbre data
  2. Difficult to represent whole song in MIDI
279
Q

What information can be gathered from spectrograms? (4)

A
  1. Timbre
  2. Frequency
  3. Intensity
  4. Rhythmic features
280
Q

What are the two approaches of audio content analysis?

A
  1. Spectrogram
  2. Frame-based approach
281
Q

Spectral shape gives us what four parameters? (4)

A
  1. Brightness
  2. Centroid
  3. Flatness
  4. Skewness
282
Q

What is spectral flux?

A

Change of spectra over time

283
Q

What would you use to identify chords in audio? (2)

A
  1. Spectrogram
  2. Pitch histograms
284
Q

How would you identify chords in audio?

A

Calculate average energy for each note across spectrum

285
Q

What does ‘classify by content’ do?

A

Classifies high level content using low level parameters

286
Q

What is the ‘classify by content’ process? ( 4)

A
  1. Get audio
  2. Group
  3. Find ground truths
  4. Classify using ground truths
287
Q

When gathering audio for classification, what should the audio be?

A

Typical to the genre

288
Q

What are three methods of pattern machine learning?

A
  1. KNN
  2. GMM
  3. SVM
289
Q

What does KNN stand for?

A

K Nearest Neighbour

290
Q

What is the KNN equation?

A

KNN distance = square root of the sum of ( A - B ) ^2 over all features

291
Q

In terms of KNN, if K = 3, how many smallest distance tracks would you choose?

A

Three
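
A small KNN sketch with K = 3: measure the Euclidean distance from the query to every labelled track, keep the three nearest, and take a majority vote. The feature vectors, genre labels and function names are all made up for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Square root of the summed squared differences between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, labelled, k=3):
    """Label the query by majority vote among its k nearest labelled neighbours."""
    nearest = sorted(labelled, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature descriptions of five tracks.
tracks = [
    ((0.10, 0.20), "folk"),
    ((0.20, 0.10), "folk"),
    ((0.15, 0.15), "folk"),
    ((0.90, 0.80), "metal"),
    ((0.80, 0.90), "metal"),
]
print(knn_classify((0.2, 0.2), tracks, k=3))
```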

292
Q

In terms of KNN, as K increases, neighbours should…

A

Increase

293
Q

In terms of KNN, fewer neighbours can produce…

A

Clearer boundaries

294
Q

In terms of KNN, the more neighbours there are…

A

The better the class represents

295
Q

What is the content based problem?

A

Acoustical properties aren’t taken into account so there might be similarities in acoustics rather than music

296
Q

What is the content based problem called?

A

Glass Ceiling