Rate Distortion Theory Flashcards
What is the aim of rate distortion theory?
Minimize the data rate (bit-rate) subject to a distortion constraint (quality level).
Or, minimize the distortion (i.e., maximize the quality) subject to a data rate constraint (maximum bit-rate/bandwidth)
Distortion
The distortion is a measure of the dissimilarity between the original signal X and its reconstruction Y. Typically the distortion only depends upon the difference between the input signal and its reconstruction.
The most common distortion measure is the squared error
d(X,Y) = 1/N \sum_{i=1}^n (X_i - Y_i)^2
This is not always consistent with what humans perceive as quality, so it is important to use a distortion measure that targets the application.
Distortion function
A distortion function or distortion measure is a mapping
d: X × ˆX → R^+
from the set of source alphabet-reproduction alphabet pairs into the set of non-negative real numbers. The distortion d(x, ˆx) is a measure of the cost of representing the symbol x by the symbol ˆx.
Hamming distortion: d(x, ˆx) =
0 if x = ˆx
1 if x ≠ ˆx
Squared error distortion: d(x, ˆx) =
(x - ˆx)^2
Distortion: d(x^n, ˆx^n) =
1/n \sum_{i=1}^n d(x_i, ˆx_i)
Rate distortion code
A (2^{nR} , n)-rate distortion code consists of an encoding function,
f_n : X^n → {1, 2, … , 2^{nR} }
and a decoding (reproduction) function,
gn : {1, 2, … , 2^{nR}} → ˆX^n
The distortion associated with the (2^{nR}, n) code is defined as
D = Ed(X^n , g_n [f_n (X^n) ] ),
The set of n-tuples g_n(1), g_n(2), … , g_n(2^{nR}), denoted by ˆX_n(1), … ,ˆX_n(2^{nR}), constitutes the codebook, and f_n^{−1} (1), … , f_n^{-1} (2^{nR}) are the associated assignment regions.
Achieveable rate distortion pair
A rate distortion pair (R, D) is said to be achievable if there exists a sequence of (2^{nR}, n)-rate distortion codes (f_n , g_n) with lim_{n→∞} E d(X^n , g_n(f_n(X^n) ) ) ≤ D.
Rate distortion region
The rate distortion region for a source is the closure of the set of achievable rate distortion pairs (R, D).
Rate distortion function
The rate distortion function R(D) is the infimum of rates R such that (R, D) is in the rate distortion region of the source for a given distortion D. We cannot achieve a distortion of less than D if we describe X at a rate less than R(D).
It is a nonincreasing convex function of D.
Distortion rate function
The distortion rate function D(R) is the infimum of all distortions D such that (R, D) is in the rate distortion region of the source for a given rate R.
Information rate distortion function
The information rate distortion function R^I (D) for a source X with distortion measure d(x, ˆx) is defined as
R^I (D) = min_{ p( ˆx|x): \sum_{(x, ˆx)} p(x) * p( ˆx|x) * d(x, ˆx) ≤ D} I(X; ˆX),
where the minimization is over all conditional distributions p( ˆx|x) for which the joint distribution p(x, ˆx) = p(x) * p( ˆx|x) satisfies the expected distortion constraint.
The information rate distortion function is equal to the rate distortion function,
R(D) = R^I (D)
Distortion typical
Let p(x, ˆx) be a joint probability distribution on X × ˆX and let d(x, ˆx) be a distortion measure on X × ˆX. For any ε > 0, a pair of sequences (x^n, ˆx^n ) is said to be distortion ε-typical or simply distortion typical if
|-1/n log[ p(x^n)] − H (X)|< ε
|-1/n log [p(ˆx^n)] − H ( ˆX) < ε
|-1/n log [p(x^n, ˆx^n)] − H(X, ˆX) < ε
|d(x^n, ˆx^n) − Ed(X, ˆX)| < ε.
The set of distortion typical sequences is called the distortion typical set and is denoted A_{d, ε}^n.
It is a subset of the jointly typical set.
Practical source coding
For lossless coding we would either design an entropy coder (Huffman coder, Arithmetic coder, etc.) for low dimensional sources, or rely upon AEP principles for high dimensional sources. Similar, for lossy source coding, we would use quantization followed by entropy coding for low dimensional sources.
Rate distortion function of Bernoulli with Hamming distortion
R(D) = H(p) - H(D)
Rate distortion function of Gaussian source with squared-error distortion
R(D) = 1/2 log([σ^2] / D)