Audio Analysis and Assessment Flashcards

Question

Quantisation =

Answer 1

Approximation

Answer 2

At a half step

Answer 3

Quantisation Noise

Answer 4

Signal to Noise Ratio

Answer 5

Signal to Quantisation Noise Ratio

Answer 6

1. Number of bits encoding audio 2. Input signal amplitude

Answer 7

SQNR = (6.02*B)+1.76dB, where B = number of bits (16, 24, etc.)
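
As a quick check of the formula above, here is a minimal Python sketch. The function name `sqnr_db` is made up for illustration, and the result is the theoretical figure for a full-scale sine wave, not a measurement of any real converter.

```python
def sqnr_db(bits: int) -> float:
    """Theoretical SQNR in dB for a full-scale sine wave at the given bit depth."""
    return 6.02 * bits + 1.76


# Compare a few common bit depths; each extra bit adds about 6 dB.
for bits in (8, 16, 24):
    print(f"{bits}-bit: {sqnr_db(bits):.2f} dB")
```

Going from 16 to 24 bits adds 8 * 6.02 dB, which is where the "6 dB per bit" rule of thumb comes from.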

Answer 8

When the signal has large amplitudes or when the signal has wide bandwidth

Answer 9

1. Relative magnitude of distortion increases (SQNR decreases) 2. Quantisation noise is correlated with the input signal

Answer 10

Distortion is more annoying due to its unpredictability

Answer 11

1. Increasing bit depth 2. Dither

Answer 12

Each additional bit increases SQNR by 6dB (halving QN)

Answer 13

Increasing bit depth increases processing burden

Answer 14

Adding noise to the signal before quantisation to reduce the audible effect of quantisation error

Answer 15

Randomises quantisation error

Answer 16

Noise is easier to listen to than distortion so dither helps make audio less annoying
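
The dither cards above can be illustrated with a small sketch. It assumes triangular (TPDF) dither spanning plus or minus one quantisation step, which is one common choice and not necessarily the one the course uses. The function names and the test level of 0.3 are hypothetical.

```python
import math
import random


def quantise(x: float, step: float = 1.0) -> float:
    """Round x to the nearest multiple of step (halves round up)."""
    return step * math.floor(x / step + 0.5)


def quantise_with_dither(x: float, rng: random.Random, step: float = 1.0) -> float:
    """Add triangular (TPDF) noise of +/- one step, then quantise."""
    dither = (rng.random() - 0.5 + rng.random() - 0.5) * step
    return quantise(x + dither, step)


rng = random.Random(0)          # seeded so the run is repeatable
x = 0.3                         # a level that falls between two steps
plain = quantise(x)             # always the same error: input-dependent distortion
dithered = [quantise_with_dither(x, rng) for _ in range(20000)]
mean_dithered = sum(dithered) / len(dithered)
print(plain, round(mean_dithered, 2))
```

Without dither the value is always rounded the same way, so the error follows the input. With dither the individual outputs are noisy, but their average sits close to the true level, so the error behaves like noise rather than distortion.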

Answer 17

Compressed

Answer 18

Compression

Answer 19

Half of sampling rate

Answer 20

An upper limit to frequencies

Answer 21

Samples to reproduce the input signal correctly

Answer 22

When frequencies greater than Nyquist frequency appear as lower frequencies within the spectrum

Answer 23

A correct representation of all frequency spectrum

Answer 24

A time and frequency domain perspective

Answer 25

1. When sample rate is too low 2. When a signal above half the sampling frequency is observed by the system

Answer 26

Unwanted frequencies

Answer 27

Af = Fs - F, where Fs = sampling frequency and F = input frequency
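
A small sketch of the alias formula follows. Af = Fs - F covers inputs between the Nyquist frequency and Fs. The folding step for inputs above Fs is an added generalisation, and `alias_frequency` is a made-up name.

```python
def alias_frequency(f: float, fs: float) -> float:
    """Frequency at which an input at f appears after sampling at fs."""
    f_folded = f % fs                 # the sampled spectrum repeats every fs
    nyquist = fs / 2
    # Above Nyquist the component folds back down: Af = Fs - F
    return fs - f_folded if f_folded > nyquist else f_folded


print(alias_frequency(30000, 44100))  # a 30 kHz input sampled at 44.1 kHz
print(alias_frequency(1000, 44100))   # below Nyquist, so it is unchanged
```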

Answer 28

All frequencies above Nyquist frequency

Answer 29

Pulse Code Modulation

Answer 30

(n * Fc) +/- Fm

Answer 31

Audio signal = Modulator, Sampling frequency = Carrier

Answer 32

Integer multiples of the sampling frequency
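
Treating sampling as modulation, the image positions (n * Fc) +/- Fm can be listed with a short sketch. The function name and the 1 kHz example tone are assumptions for illustration only.

```python
def image_frequencies(fm: float, fc: float, n_max: int = 3) -> list[tuple[float, float]]:
    """Lower and upper images (n * fc - fm, n * fc + fm) for n = 1..n_max."""
    return [(n * fc - fm, n * fc + fm) for n in range(1, n_max + 1)]


# A 1 kHz tone (modulator) sampled at 44.1 kHz (carrier)
for lower, upper in image_frequencies(1000, 44100, n_max=2):
    print(lower, upper)
```

If Fm rises above Fc / 2, the lower image (Fc - Fm) drops below the Nyquist frequency and lands in the audio band, which is aliasing.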

Answer 33

When sampling frequency is less than twice the highest frequency

Answer 34

When audio signal is greater than Nyquist frequency

Answer 35

Frequencies above Nyquist frequency

Answer 36

1. Peak on bipolar waves 2. abs() ignores negative values

Answer 37

1. 20log(a/b) 2. -6dB

Answer 38

Root Mean Square

Answer 39

Distribution of sample values

Answer 40

Average energy/power

Answer 41

Compression

Answer 42

Crest = 20log(peak amplitude / RMS)
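
Peak, RMS and crest factor can be checked on a test signal with a short sketch. The helper names are hypothetical, and the sine wave is just a convenient input because its crest factor is known (about 3 dB).

```python
import math


def rms(samples):
    """Root mean square: square each sample, average, take the square root."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def peak(samples):
    """Peak level; abs() is used so negative excursions are not ignored."""
    return max(abs(s) for s in samples)


def crest_factor_db(samples):
    """Crest = 20 * log10(peak amplitude / RMS)."""
    return 20 * math.log10(peak(samples) / rms(samples))


n = 1000
sine = [math.sin(2 * math.pi * 10 * i / n) for i in range(n)]  # 10 whole cycles
print(round(crest_factor_db(sine), 2))
```

A heavily compressed signal has an RMS closer to its peak, so its crest factor is lower than that of an uncompressed one.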

Answer 43

Relationship between average energy and peak values

Answer 44

f = 1 / T, T = 1 / f

Answer 45

Because amplitude and frequency change

Answer 46

Human hearing response

Answer 47

1. Constant 2. Discrimination

Answer 48

Sensitivity

Answer 49

Very large dynamic range

Answer 50

Fletcher Munson curve

Answer 51

Non-linear sensitivity over frequency

Answer 52

Human perception of frequency/pitch

Answer 53

Relation of bandwidth

Answer 54

1. Q = centre frequency / bandwidth 2. Q will always remain constant

Answer 55

1. Frequency 2. Amplitude

Answer 56

Extract frequency from signal

Answer 57

Frequency and amplitude over time

Answer 58

Fourier Analysis

Answer 59

An infinite series of harmonically related sinusoids

Answer 60

Harmonically related sinusoids

Answer 61

To see down to the individual frequencies

Answer 62

To see down to a few milliseconds

Answer 63

A series of frequency bands or filters

Answer 64

1. Spaced linearly 2. Human hearing system

Answer 65

The number of samples of the input signal

Answer 66

Filters narrow

Answer 67

1. Transform 2. Samples 3. Frequency Resolution

Answer 68

Bin bandwidth = Fs / length of transform (in samples)

Answer 69

Bin centre frequency = n * bin bandwidth

Answer 70

Length of transform = Fs * t (seconds)

Answer 71

Window duration = number of samples * sample period

Answer 72

Sample period = 1 / Fs
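
The transform formulas on the last few cards fit together as in the sketch below. The function names are made up, and 44.1 kHz with a 1024-sample transform is only an example setting.

```python
def sample_period(fs: float) -> float:
    """Sample period = 1 / Fs."""
    return 1 / fs


def bin_bandwidth(fs: float, n_samples: int) -> float:
    """Bin bandwidth = Fs / length of transform (in samples)."""
    return fs / n_samples


def bin_centre_frequency(n: int, fs: float, n_samples: int) -> float:
    """Bin centre frequency = n * bin bandwidth."""
    return n * bin_bandwidth(fs, n_samples)


def transform_length(fs: float, seconds: float) -> int:
    """Length of transform = Fs * t (seconds)."""
    return round(fs * seconds)


def window_duration(n_samples: int, fs: float) -> float:
    """Window duration = number of samples * sample period."""
    return n_samples * sample_period(fs)


fs = 44100
for n_samples in (256, 1024, 4096):
    print(n_samples, round(bin_bandwidth(fs, n_samples), 2),
          round(window_duration(n_samples, fs) * 1000, 2), "ms")
```

The loop makes the trade-off visible: a longer transform gives narrower bins (better frequency resolution) but a longer window (worse time resolution).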

Answer 73

1. Good frequency resolution results in bad time resolution 2. Good time resolution results in bad frequency resolution

Answer 74

Good frequency resolution

Answer 75

Good time resolution

Answer 76

Time resolution

Answer 77

Fast Fourier Transform (FFT)

Answer 78

A power of two (256, 1024, 2048 samples)

Answer 79

Transform length

Answer 80

Faster processing

Answer 81

A series of short analytical snippets throughout duration of signal

Answer 82

The evolution of frequency over time

Answer 83

Frequency and time resolution trade off

Answer 84

Analytical window over time

Answer 85

X = Time, Y = Frequency

Answer 86

Magnitude (Amplitude)

Answer 87

1. Results are estimates 2. Computationally expensive 3. Windowing can confuse frequency readings 4. Doesn't reflect human hearing

Answer 88

A combination of signal and window spectrum

Answer 89

Unwanted Artefacts

Answer 90

Side lobes

Answer 91

Use different window shapes

Answer 92

Low frequencies

Answer 93

Throughout whole spectrum

Answer 94

Using adaptive window sizes

Answer 95

Higher frequencies

Answer 96

Lower frequencies

Answer 97

Higher frequencies

Answer 98

1. Resolves trade off 2. Can increase time and/or frequency resolution where it matters

Answer 99

1. Sample rate 2. Bit depth

Answer 100

44,100 * 2 (bytes) * 2 (stereo) = 176.4kBps

Answer 101

176.4kBps * 8 = 1411.2kbps (about 1.4Mbps)
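
The CD data-rate arithmetic from the last two cards, as a sketch. `pcm_data_rate` is a hypothetical helper name.

```python
def pcm_data_rate(fs: int, bytes_per_sample: int, channels: int) -> int:
    """Uncompressed PCM data rate in bytes per second."""
    return fs * bytes_per_sample * channels


bytes_per_second = pcm_data_rate(44100, 2, 2)   # 16-bit stereo at 44.1 kHz
bits_per_second = bytes_per_second * 8
print(bytes_per_second / 1000, "kBps")
print(bits_per_second / 1_000_000, "Mbps")
```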

Answer 102

Reduce data required to represent audio

Answer 103

Threshold of hearing

Answer 104

Areas influenced by the temporary change in threshold of hearing

Answer 105

Constant Q pattern

Answer 106

CB bandwidth = 94 + ( 71 * f^(3/2) ), where f is in kHz
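
A sketch of the critical-band approximation above, assuming the exponent is 3/2 applied to f and that the result is in Hz. The function name is made up.

```python
def critical_bandwidth_hz(f_khz: float) -> float:
    """Approximate critical bandwidth in Hz for a centre frequency given in kHz."""
    return 94 + 71 * f_khz ** 1.5


for f_khz in (0.5, 1.0, 4.0):
    print(f_khz, "kHz ->", round(critical_bandwidth_hz(f_khz), 1), "Hz")
```

The bands get wider as frequency rises, which matches the non-linear frequency resolution of hearing described on the neighbouring cards.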

Answer 107

Intensity and frequency

Answer 108

1. Frequency discrimination 2. Perceived loudness 3. Dissonance/Consonance 4. Clarity of speech 5. Masking

Answer 109

Critical bands

Answer 110

1. Bark 2. Mel

Answer 111

Perceived pitch

Answer 112

Sounds both audible and inaudible in signal

Answer 113

Specific range in frequency around tone (critical band)

Answer 114

Louder tone

Answer 115

Quieter tone

Answer 116

Holds over given time

Answer 117

1. Masker is louder 2. Masker and maskee are closer in frequency 3. Masker has lower frequency than maskee 4. Time between tones is shorter

Answer 118

Sounds can be masked by tone which occurs after maskee

Answer 119

Time frames

Answer 120

The same time block

Answer 121

bits per sample = bit rate / Fs
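
As a worked example of this formula, the sketch below assumes a 128 kbps stream at 44.1 kHz. Both numbers are illustrative and the function name is hypothetical.

```python
def bits_per_sample(bit_rate: float, fs: float) -> float:
    """Average bits available per sample = bit rate / Fs."""
    return bit_rate / fs


available = bits_per_sample(128_000, 44_100)
print(round(available, 2), "bits per sample on average")
```

Compared with 16 bits per sample for uncompressed PCM, this small budget is why the encoder has to decide where bits are spent.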

Answer 122

Bit allocation

Answer 123

Dynamically altering number of bits used to represent signal to make less computationally demanding

Answer 124

Inaudible tones

Answer 125

1. It's masked 2. By keeping under the threshold

Answer 126

Instructions on how to reconstruct the waveform

Answer 127

1. Three 2. 384 samples

Answer 128

Trick question 1. One frame 2. 1152 samples

Answer 129

Transient is smeared resulting in loss of definition

Answer 130

1. 32 2. Equal bandwidth

Answer 131

32 separate band-limited time domain signals (really rolls off the tongue)

Answer 132

Doesn't increase data due to 'polyphase sub-band filter'

Answer 133

Down sampling effect

Answer 134

1. Reduces Fs 2. While splitting signal in sub bands

Answer 135

All frames

Answer 136

Converts content into frequency domain data

Answer 137

Modified Discrete Cosine Transform

Answer 138

Half the data

Answer 139

Each sub-band

Answer 140

Signal to mask ratio

Answer 141

Signal to mask ratio

Answer 142

1. Masking level calculated for each sub-band 2. Calculation for SMR 3. Bit allocation to sub-bands 4. No. of bits assigned to sub-bands dependent on SMR 5. Bit depth varies across sub-bands due to content

Answer 143

1. Frames 2. Sub-bands 3. Down sampling 4. MDCT 5. Masking and bit allocation 6. Huffman coding

Answer 144

Statistical compression for further data reduction

Answer 145

Frequently repeated sequences of data are stored using shorter codes, e.g. 11010101 is stored as 01
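
To show the idea of giving the most frequent symbols the shortest codes, here is a minimal Huffman coder sketch. It is a textbook construction, not the code tables MP3 actually uses, and the input string is made up.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol in data to its Huffman bit string."""
    counts = Counter(data)
    if len(counts) == 1:                      # degenerate case: one symbol only
        return {next(iter(counts)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code so far})
    heap = [(count, i, {symbol: ""})
            for i, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes_a.items()}
        merged.update({s: "1" + c for s, c in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_id, merged))
        next_id += 1
    return heap[0][2]


message = "aaaaaaabbbcd"
codes = huffman_codes(message)
encoded = "".join(codes[s] for s in message)
print(codes)
print(len(encoded), "bits instead of", 8 * len(message))
```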

Answer 146

1. Instructions for decoder 2. Samples in MDCT domain at reduced bit depth 3. Bit allocation data 4. Scale factor for each sub-band 5. encoded using Huffman coding

Answer 147

Inverse equivalent processes

Answer 148

An inverse MDCT

Answer 149

Data is combined

Answer 150

1. Vary systematically with audio input 2. Vary according to encoding

Answer 151

Our understanding of good audio quality

Answer 152

1. Compression algorithms 2. Hardware systems 3. Network codecs

Answer 153

Listening tests taken by panel of listeners

Answer 154

Analysis of audio signals - based on observational phenomena

Answer 155

1. Most accurate results

Answer 156

1. Expensive 2. Time consuming 3. Subjective 4. Complex planning

Answer 157

1. Lower cost 2. Lower complexity 3. Consistent (no listeners) 4. Less time required

Answer 158

1. It is an estimate of human response

Answer 159

Original and processed signal

Answer 160

Better quality

Answer 161

We aren't sensitive to phase changes

Answer 162

Human hearing response to these parameters

Answer 163

Log-squared spectral distance

Answer 164

Large values for low power areas on spectrum

Answer 165

It is too sensitive to spectral changes which are inaudible

Answer 166

Spectral features which characterise signals

Answer 167

Cluster of energy around certain points in frequency

Answer 168

1. Speech 2. Musical instruments

Answer 169

Formant peaks

Answer 170

Symmetrical Kullback-Leibler Distance

Answer 171

Linear predictive coding

Answer 172

smooth formant based spectrum

Answer 173

Formant changes will be perceivable

Answer 174

1. High magnitudes 2. Low frequencies

Answer 175

High frequency shifts

Answer 176

Mel Frequency Cepstral Coefficients

Answer 177

How we hear sounds

Answer 178

Psychoacoustical phenomena

Answer 179

Inverse FFT of the log magnitude of the FFT of a signal (duh)
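
The cepstrum definition can be sketched with a naive DFT so that no external library is needed. A real implementation would use an FFT routine. The function names and the tiny test signal are assumptions for illustration.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (fine for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def real_cepstrum(x, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of x."""
    # floor avoids log(0) for bins with no energy
    log_magnitude = [math.log(max(abs(v), floor)) for v in dft(x)]
    return [v.real for v in dft(log_magnitude, inverse=True)]


# A scaled impulse has a flat spectrum, so only the first cepstral value is non-zero
cepstrum = real_cepstrum([2.0, 0.0, 0.0, 0.0])
print([round(c, 4) for c in cepstrum])
```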

Answer 180

Pitch content

Answer 181

1. Cepstrum 2. Mel

Answer 182

Perceivable

Answer 183

Fm = nt * Frg, where nt = no. of teeth and Frg = speed of gear

Answer 184

1. Compare signal with itself 2. Take FFT of results 3. Peaks will be produced at frequency of periodic elements
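
The three steps above can be sketched as follows. The sketch assumes "compare with itself" means autocorrelation and uses a circular version plus a naive DFT to stay self-contained. All names and the 5-cycle test tone are made up.

```python
import cmath
import math


def circular_autocorrelation(x):
    """Step 1: compare the signal with shifted copies of itself."""
    n = len(x)
    return [sum(x[i] * x[(i + lag) % n] for i in range(n)) for lag in range(n)]


def dominant_bin(x):
    """Steps 2 and 3: take the DFT and return the bin (1..N/2) with the largest peak."""
    n = len(x)
    mags = [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]
    return max(range(1, len(mags)), key=lambda k: mags[k])


n, fs = 64, 64.0
tone = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]  # 5 cycles in the window
peak_bin = dominant_bin(circular_autocorrelation(tone))
peak_frequency = peak_bin * fs / n
print(peak_bin, peak_frequency)
```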

Answer 185

Environmental sound

Answer 186

Noise, Vibration, Harshness

Answer 187

Ford Mustang mics up the engine and gives the user the option to switch between sports and comfort mode (changing volume of 'engine')

Answer 188

1. Perceived quality 2. Purchase 3. Design and manufacture

Answer 189

1. When perception is affected by two or more senses 2. Louder = more powerful

Answer 190

1. Loudness 2. Roughness 3. Sharpness

Answer 191

Measure of energy across critical bands (Sone)

Answer 192

Rapid amplitude fluctuations by interacting sounds (Asper)

Answer 193

Weighting/shape of spectrum (Acum)

Answer 194

1. Roughness 2. Sharpness

Answer 195

In one critical band with concentrated high frequency energy

Answer 196

Category Scaling of Annoyance

Answer 197

Measuring annoyance of sound

Answer 198

CSA = 8.07 + ( 0.563 * N5 ) + ( 3.022 * S50 ) + ( 2.175 * R ), where N = Loudness, S = Sharpness, R = Roughness
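
The CSA formula is a weighted sum, sketched below. The card does not say what the subscripts on N5 and S50 mean, so the argument names simply mirror the formula, and the input values are invented for illustration.

```python
def csa(n5: float, s50: float, r: float) -> float:
    """Category Scaling of Annoyance from loudness (N5), sharpness (S50), roughness (R)."""
    return 8.07 + 0.563 * n5 + 3.022 * s50 + 2.175 * r


# Hypothetical metric values, not measurements
print(round(csa(n5=10.0, s50=1.0, r=1.0), 3))
```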

Answer 199

Music Information Retrieval

Answer 200

Variation of time and pitch in humming might not be recognised

Answer 201

Parsons code

Answer 202

Codes note changes so that the system recognises C, C#, D as tonic, up, up (sorry if that one's confusing)
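
A sketch of the contour idea: only the direction of each pitch change is kept. It uses the usual Parsons notation (`*` for the first note, then U, D, R for up, down, repeat) and assumes the melody is given as MIDI note numbers.

```python
def parsons_code(pitches):
    """Encode a melody as '*' followed by U (up), D (down) or R (repeat) per step."""
    if not pitches:
        return ""
    code = ["*"]
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            code.append("U")
        elif current < previous:
            code.append("D")
        else:
            code.append("R")
    return "".join(code)


print(parsons_code([60, 61, 62]))   # C, C#, D as MIDI notes
```

Because exact intervals and timing are thrown away, a slightly out-of-tune hum still produces the same code as the original melody.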

Answer 203

1. Looks for closest match by extracting compact and descriptive set of acoustic features 2. Shazam

Answer 204

1. Database has millions of files so data must be compact 2. Fingerprints must be robust enough to ignore noise 3. Process must be efficient

Answer 205

1. Finds local maxima (peaks) 2. Encodes peaks as time and frequency coordinates

Answer 206

Use hashing process
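
The peak-picking and hashing steps might look like the sketch below. Pairing peaks into (f1, f2, time gap) keys is an assumption about how the hashing is done, in the spirit of published Shazam-style descriptions. The tiny spectrogram and all names are made up.

```python
def find_peaks(spectrogram):
    """Return (time_index, freq_index) for cells larger than all their neighbours."""
    peaks = []
    n_t, n_f = len(spectrogram), len(spectrogram[0])
    for t in range(n_t):
        for f in range(n_f):
            value = spectrogram[t][f]
            neighbours = [spectrogram[tt][ff]
                          for tt in range(max(0, t - 1), min(n_t, t + 2))
                          for ff in range(max(0, f - 1), min(n_f, f + 2))
                          if (tt, ff) != (t, f)]
            if all(value > v for v in neighbours):
                peaks.append((t, f))
    return peaks


def pair_hashes(peaks, max_dt=3):
    """Build ((f1, f2, time_gap), t1) entries from pairs of nearby peaks."""
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:]:
            if 0 < t2 - t1 <= max_dt:
                hashes.append(((f1, f2, t2 - t1), t1))
    return hashes


# Rows are time frames, columns are frequency bins (toy magnitudes)
spectrogram = [
    [0, 1, 0, 0],
    [1, 9, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 7],
]
peaks = find_peaks(spectrogram)
print(peaks)
print(pair_hashes(peaks))
```

Storing only the peak coordinates keeps the fingerprint compact, and the hash keys can be looked up directly in a database index instead of comparing whole spectrograms.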

Answer 207

1. Helps identify spectral features unique to music track 2. Speeds up process

Answer 208

1. Audio 2. Metadata 3. Symbolic Data

Answer 209

1. Easy to get hold of 2. Can extract timbre and acoustics easily

Answer 210

Hard to precisely identify some features

Answer 211

More detailed

Answer 212

1. No acoustic/timbre data 2. Difficult to represent whole song in MIDI

Answer 213

1. Timbre 2. Frequency 3. Intensity 4. Rhythmic features

Answer 214

1. Spectrogram 2. Frame-based approach

Answer 215

1. Brightness 2. Centroid 3. Flatness 4. Skewness
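
Two of these spectral features can be sketched using their commonly quoted definitions: centroid as the magnitude-weighted mean frequency, and flatness as the geometric mean over the arithmetic mean of the power spectrum. The course may define them slightly differently, and the function names are hypothetical.

```python
import math


def spectral_centroid(freqs, magnitudes):
    """Magnitude-weighted mean frequency ('centre of mass' of the spectrum)."""
    return sum(f * m for f, m in zip(freqs, magnitudes)) / sum(magnitudes)


def spectral_flatness(power):
    """Geometric mean / arithmetic mean of the power spectrum (values must be > 0)."""
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    return geometric / arithmetic


print(spectral_centroid([0, 100, 200], [0, 1, 0]))
print(round(spectral_flatness([2, 2, 2, 2]), 3))      # flat, noise-like spectrum
print(round(spectral_flatness([100, 1, 1, 1]), 3))    # peaky, tone-like spectrum
```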

Answer 216

Change of spectra over time

Answer 217

1. Spectrogram 2. Pitch histograms

Answer 218

Calculate average energy for each note across spectrum

Answer 219

Classifies high level content using low level parameters

Answer 220

1. Get audio 2. Group 3. Find ground truths 4. Classify using ground truths

Answer 221

Typical to the genre

Answer 222

1. KNN 2. GMM 3. SVM

Answer 223

K Nearest Neighbour

Answer 224

KNN distance = square root of the sum of ( A - B ) ^2 over all features (Euclidean distance)
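
A minimal K Nearest Neighbour sketch using this Euclidean distance follows. The two-dimensional feature vectors and the genre labels are invented for illustration. Real systems would use many more features.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Square root of the summed squared differences between feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, training, k=3):
    """Label the query by majority vote among its k nearest training examples."""
    nearest = sorted(training, key=lambda item: euclidean_distance(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical (feature_vector, label) ground truths
training = [
    ([0.10, 0.20], "rock"),
    ([0.20, 0.10], "rock"),
    ([0.15, 0.15], "rock"),
    ([0.90, 0.80], "jazz"),
    ([0.80, 0.90], "jazz"),
]
print(knn_classify([0.0, 0.0], training))
print(knn_classify([1.0, 1.0], training))
```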

Answer 225

Clearer boundaries

Answer 226

The better the class represents

Answer 227

Acoustical properties aren't taken into account so there might be similarities in acoustics rather than music

Answer 228

Glass Ceiling

(298 cards)