B2P3 - Digitisation and lossy compression Flashcards
a.c. coefficients
weights the basis functions at every position other than (0, 0) of a DCT table, the results evaluating to the amount of each spatial-frequency in the macroblock;
Adaptive Multi-Rate (AMR)
a speech coding standard that adapts its bit rate to channel conditions;
advanced audio coding (AAC)
MPEG-2/4 streamingFormat, with better quality than MP3 at the same bitRate;
aliasing
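distortion where frequencies above samplingRate/2 are misread as lower frequencies; occurs where samplingRate < 2*f (f = highest frequency in the signal);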
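A minimal sketch of aliasing, assuming a pure sine tone and illustrative values (7 Hz tone, 10 Hz samplingRate) that are not from the cards. It suggests that the under-sampled tone yields the same samples as a tone at 7 - 10 = -3 Hz.

```python
import math

def sample(freq_hz, rate_hz, n):
    """Sample sin(2*pi*f*t) at n instants spaced 1/rate_hz apart."""
    return [math.sin(2 * math.pi * freq_hz * k / rate_hz) for k in range(n)]

rate = 10.0                      # samplingRate in Hz (illustrative value)
high = sample(7.0, rate, 20)     # 7 Hz tone: 10 < 2*7, so it aliases
alias = sample(-3.0, rate, 20)   # 7 - 10 = -3 Hz, i.e. an inverted 3 Hz tone

# Largest gap between the two sample sequences (floating-point noise only).
max_gap = max(abs(a - b) for a, b in zip(high, alias))
print(max_gap)
```

Once sampled, the two tones cannot be told apart, which is why the higher one has to be filtered out before the ADC.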
analogue-to-digital converter (ADC)
takes an analogue signal, samples it, n-bit-quantises it, and returns a digital signal;
basis function
reps a specific hori-vert spatial-frequency, appearing as an entry in a DCT table;
B-frames
‘bi-directional prediction’ frames inserted between I-frames and P-frames filling in missing frames in the group of pictures (GOP) format;
bit rate
number of bits passing a given point in the network per unit time;
code-excited linear prediction (CELP)
= LPC - U/V_switch + codebook;
coding efficiency
minimising bitRate while holding videoQuality constant, or vice versa;
comfort noise (CN)
synthetic background noise for silent periods in voiceComms;
d.c. coefficient
weights the basis function at (0, 0) of a DCT table, the result evaluating to the average pixel value (brightness) of the whole macroblock;
dequantising
dequantise(n-bit_reps) = reconstructedSampleValues (an approximation of the analogue original);
difference error
pixel-subtracting one video frame from another;
differential pulse-code modulation (DPCM)
= PCM + predictionOfDataSamples (used in JPEG);
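A minimal sketch of the DPCM idea, assuming the simplest possible predictor (the previous sample) and no quantising of the differences, so this version is lossless. The function names and sample values are hypothetical.

```python
def dpcm_encode(samples):
    """Emit each sample minus the prediction (here: the previous sample)."""
    prev = 0
    diffs = []
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    prev = 0
    out = []
    for d in diffs:
        prev += d
        out.append(prev)
    return out

samples = [100, 102, 105, 105, 103, 104]
diffs = dpcm_encode(samples)
print(diffs)                 # [100, 2, 3, 0, -2, 1]
print(dpcm_decode(diffs))    # [100, 102, 105, 105, 103, 104]
```

After the first value the differences are small numbers, so they need fewer bits than the raw samples.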
discontinuous transmission (DTX)
where a mobile device's transmitter is switched off while there is no voiceInput;
discrete cosine transformation (DCT)
DCT(spatialRep) = frequencyRep (used in J/MPEG);
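A minimal sketch of a 2-D DCT on one 8x8 block, assuming the orthonormal DCT-II scaling (real codecs use fast algorithms and may scale differently). It is run on a hypothetical flat grey block, where only the (0, 0) entry should be non-zero.

```python
import math

N = 8  # block size

def alpha(u):
    """Orthonormal scale factor for DCT-II."""
    return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

def dct2(block):
    """Naive 2-D DCT-II of an N x N block (O(N^4), for illustration only)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

flat = [[128] * N for _ in range(N)]   # uniform block, every pixel 128
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))          # d.c. term: N * mean = 8 * 128
```

With this scaling the (0, 0) coefficient is 8 times the mean pixel value, and every other coefficient of a flat block is zero up to rounding.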
discrete wavelet transform (DWT)
DWT(spatialRep) = frequencyRep (used in JPEG2000; better performance at low bitRates);
distortion
= reconstructedSignal - originalSignal;
encoding
encode(symbols) = bitStrings;
formants
the vocal tract’s spectral peaks characterising ‘voiced’ speechSegments;
frequency masking
where high-amplitude sounds cover up low-amplitude sounds at neighbouring frequencies;
group of pictures (GOP)
concatenating I-, P-, and B-frames for videoCompression (used in MPEG-1/2/4);
H.264 / advanced video coding (AVC)
an MPEG-4 videoCodingStreaming tech;
HE-AAC version 2
a tech optimised for low-bitRate audioStreaming apps including digi-radioBroadcasting;
I-frames
‘intra-frames’ are real reps of the world, coded without reference to any other frame (i.e. they are neither fillers nor predictions);
inter-frame compression
stateful video compression;
intra-frame compression
stateless video compression;
inverse DCT
DCT_inv(frequencyRep) = spatialRep;
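A minimal 1-D sketch of the round trip, assuming orthonormal DCT-II scaling so that the inverse uses the same scale factors. The sample row is hypothetical.

```python
import math

def scale(k, n):
    """Orthonormal scale factor shared by the forward and inverse transforms."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

def dct_1d(x):
    """Forward DCT-II: spatial samples -> frequency coefficients."""
    n = len(x)
    return [scale(k, n) * sum(x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                              for i in range(n))
            for k in range(n)]

def idct_1d(c):
    """Inverse transform: frequency coefficients -> spatial samples."""
    n = len(c)
    return [sum(scale(k, n) * c[k] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                for k in range(n))
            for i in range(n)]

row = [52, 55, 61, 66, 70, 61, 64, 73]
restored = idct_1d(dct_1d(row))
print([round(v) for v in restored])
```

Without quantising in between, the inverse recovers the original row up to rounding; the loss in J/MPEG comes from quantising the coefficients, not from the transform itself.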
Joint Photographic Experts Group (JPEG)
a lossy compression standard for images using DCT;
linear predictive coding (LPC)
a sourceCoding method using parameters designed for speechSignals;
low frequency effect (LFE / subwoofer)
where the channel delivers bass-only information (<120 Hz);
macroblock
the block of pixels a frame is divided into for coding; the DCT() function takes 8x8-pixel blocks in both JPEG and MPEG, while MPEG groups these into 16x16-pixel macroblocks for motion prediction;
mask-to-noise ratio (MNR)
determines bits per sub-band in MP3 using psychoacoustic model;
motion-JPEG (M-JPEG)
stateless video compression;
motion prediction
frame_n.macroblocks = frame_n-1.macroblocks + motionVectors;
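A minimal sketch of building a predicted frame from the previous frame plus motionVectors, assuming 2x2 blocks for brevity and vectors that stay inside the frame. The frame contents, the `predict` helper and the vector layout (block top-left mapped to a source offset) are hypothetical.

```python
B = 2  # block size for this toy example

def predict(prev, vectors):
    """Copy each block from the previous frame, displaced by its motion vector.

    vectors maps a block's top-left (row, col) to the (d_row, d_col) offset
    of its source block in the previous frame.
    """
    h, w = len(prev), len(prev[0])
    out = [[0] * w for _ in range(h)]
    for (by, bx), (dy, dx) in vectors.items():
        for y in range(by, by + B):
            for x in range(bx, bx + B):
                out[y][x] = prev[y + dy][x + dx]
    return out

prev = [[9, 9, 0, 0],
        [9, 9, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]

# The bright 2x2 object moves two pixels to the right between frames.
vectors = {(0, 0): (0, 2), (0, 2): (0, -2), (2, 0): (0, 0), (2, 2): (0, 0)}
predicted = predict(prev, vectors)
for row in predicted:
    print(row)
```

Only the vectors (and any remaining prediction error) need to be sent, rather than the pixels themselves.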
motion vectors (MVs)
used in MPEG video coding to predict camera and object motion;
MPEG audio layer 3 (MP3)
a compressed audio file format;
noise masking
= frequency masking, temporal masking, and perceptual masking;
Ogg Vorbis (OV)
an open-source lossy audio compression format;
perceptual noise substitution (PNS)
where real noise is replaced with randomly generated noise;
perceptual redundancy
where sounds / images are irrelevant as humans do not perceive them;
P-frames
‘predicted frames’ do not rep the world directly and instead are an educated guess based on the previous I-frame or P-frame;
prediction (residual) error
= P-frame(t)* - I-frame(t);
*based on I-frame(t-1);
pulse code modulation (PCM)
PCM(analogueSignal) = encoding ( quantising ( sampling ( analogueSignal ) ) ) = digitalSignal;
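A minimal sketch of the three PCM stages, assuming the analogueSignal is a function of time with values in [-1, 1] and a uniform quantiser. The `pcm` helper and the 1 Hz test tone are hypothetical.

```python
import math

def pcm(signal, duration_s, rate_hz, n_bits):
    """Sample, quantise and encode a signal whose values lie in [-1, 1]."""
    levels = 2 ** n_bits
    codes = []
    for k in range(int(duration_s * rate_hz)):
        v = signal(k / rate_hz)                                  # sampling
        level = min(int((v + 1.0) / 2.0 * levels), levels - 1)   # quantising
        codes.append(format(level, f"0{n_bits}b"))               # encoding
    return codes

# 1 Hz sine, sampled 8 times per second for one second at 3 bits per sample.
codes = pcm(lambda t: math.sin(2 * math.pi * t), 1.0, 8, 3)
print(codes)
```

Each sample becomes one n-bit string, so the bitRate here is samplingRate * n_bits = 24 bits per second.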
quality of service (QoS)
QoS(network) = minThroughput / maxLatency / lowestBER;
quantisation noise (quantisation error)
= analogueSignal - digitalSignal;
quantisation (quantising)
n-bit-quantise(sample(analogueSignal)) = digitalSignal;
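A minimal sketch of n-bit quantising and dequantising, assuming a uniform quantiser over [-1, 1] that reconstructs each value at the mid-point of its interval. The function names and sample values are hypothetical.

```python
def quantise(v, n_bits, lo=-1.0, hi=1.0):
    """Map a value in [lo, hi] to one of 2**n_bits integer levels."""
    levels = 2 ** n_bits
    step = (hi - lo) / levels
    return min(int((v - lo) / step), levels - 1)

def dequantise(index, n_bits, lo=-1.0, hi=1.0):
    """Map a level back to the mid-point of its interval."""
    step = (hi - lo) / 2 ** n_bits
    return lo + (index + 0.5) * step

n_bits = 4
samples = [-1.0, -0.33, 0.0, 0.41, 0.99]
errors = [v - dequantise(quantise(v, n_bits), n_bits) for v in samples]
print(errors)
```

With 4 bits the step is 2/16 = 0.125, so the quantisation error never exceeds half a step (0.0625); each extra bit halves that bound.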
rate-distortion (RD) curves
bitsPerSymbol plotted against distortionError;
region-of-interest (ROI) coding
where different areas of an image are coded at different bitRates (used in JPEG2000);
resolution
bitsPerSymbol;
sampling
measuring an analogueSignal at regular intervals;
sampling theorem
IF (samplingRate >= 2*f) THEN (signal can be reconstructed), where f is the highest frequency in the signal;
scalability (subsets)
where video/audio compression has subsets of varying bitsPerSymbol;
statistical redundancy
where repeating patterns are identified and repped with a symbol;
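A minimal sketch of removing statistical redundancy, using run-length encoding as one simple stand-in (the cards do not name a specific method). The helper names and input string are hypothetical.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of a repeated symbol with a (symbol, count) pair."""
    return [(sym, len(list(run))) for sym, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)

pairs = rle_encode("aaaabbbcca")
print(pairs)               # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
print(rle_decode(pairs))   # aaaabbbcca
```

Nothing is lost here: unlike perceptual redundancy, removing statistical redundancy is fully reversible.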
temporal masking
where a high-amplitude sound covers up a subsequent low-amplitude sound;
thresholding
setting a constant such that any values lower than this can be ignored;
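A minimal sketch of thresholding, assuming it is applied to the magnitude of transform coefficients and that ignored values are set to zero. The coefficient list and the constant 5 are hypothetical.

```python
def threshold(values, t):
    """Keep values whose magnitude is at least t; zero out the rest."""
    return [v if abs(v) >= t else 0 for v in values]

coeffs = [1024, -31, 12, 3, -2, 0, 1, -1]
kept = threshold(coeffs, 5)
print(kept)   # [1024, -31, 12, 0, 0, 0, 0, 0]
```

The resulting run of zeros is cheap to encode, at the cost of some distortion.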
unvoiced sounds
where there are no formants; can be repped by Gaussian noise;
vocoder
portmanteau of ‘voice’ and ‘encoder’; compresses speech;
voiced sounds
where there are formants;