Typicality Flashcards
Chebyshev’s inequality
Let X be a random variable with finite expected value μ and finite non-zero variance σ^2. Then for any real number k > 0
Pr(|X − μ| ≥ k) ≤ σ^2 / k^2
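A quick Monte Carlo sanity check of the bound can look like the following sketch. The choice of an Exponential(1) variable (μ = 1, σ² = 1) and all names are illustrative, not from the card.

```python
# Hedged Monte Carlo check of Chebyshev's bound, using an Exponential(1)
# variable, which has mean 1 and variance 1; names are illustrative.
import random

random.seed(0)
mu, var = 1.0, 1.0            # mean and variance of Exp(1)
k = 2.0                       # deviation threshold
trials = 100_000

samples = [random.expovariate(1.0) for _ in range(trials)]
empirical = sum(abs(x - mu) >= k for x in samples) / trials
bound = var / k**2            # Chebyshev: Pr(|X - mu| >= k) <= sigma^2 / k^2
print(empirical, "<=", bound)
```

The empirical tail probability (about e⁻³ ≈ 0.05 here) sits well under the Chebyshev bound of 0.25, illustrating that the inequality is valid but often loose.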
Weak law of large numbers
Let X_1, … , X_n be i.i.d. random variables with expected value μ. The weak law of large numbers states that the sample average converges in probability to the expected value: for any ε > 0,
lim_{n→∞} Pr( |(X_1 + … + X_n)/n − μ| > ε ) = 0.
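The convergence can be illustrated numerically. The sketch below (fair-coin flips, with assumed experiment counts) estimates Pr(|sample mean − μ| > ε) for two values of n and shows it shrinking.

```python
# Illustrative check of the weak law for fair-coin flips: the probability
# that the sample mean deviates from mu = 0.5 by more than eps shrinks with n.
import random

random.seed(1)
mu, eps = 0.5, 0.05
results = {}
for n in (100, 10_000):
    deviations = 0
    for _ in range(200):                       # 200 independent experiments
        mean = sum(random.random() < 0.5 for _ in range(n)) / n
        deviations += abs(mean - mu) > eps
    results[n] = deviations / 200              # estimate of Pr(|mean - mu| > eps)
print(results)
```

For n = 100 the deviation probability is still substantial, while for n = 10,000 it is essentially zero.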
Weak asymptotic equipartition property
Let X^n = (X_1, … , X_n) be a sequence of i.i.d. random variables with X_i ∼ p(x). Then, in probability,
lim_{n→∞} −1/n log p(X^n) = H(X).
Alternatively, for any ε > 0, and n sufficiently large
Pr( | -1/n log(p(X^n)) - H(X) | ≤ ε) > 1-ε.
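The AEP can be seen numerically for a simple source. The sketch below uses an assumed Bernoulli(0.3) source and shows the sample entropy −1/n log₂ p(X^n) approaching H(X).

```python
# Sketch of the AEP for a Bernoulli(0.3) source: the sample entropy
# -1/n log2 p(X^n) approaches H(X) as n grows.
import math
import random

random.seed(2)
p = 0.3
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # true entropy, ~0.881 bits

def sample_entropy(n):
    xs = [random.random() < p for _ in range(n)]
    log_p = sum(math.log2(p) if x else math.log2(1 - p) for x in xs)
    return -log_p / n

for n in (10, 100, 10_000):
    print(n, sample_entropy(n), H)
```

By the weak law applied to the i.i.d. terms −log p(X_i), the average concentrates around its mean H(X).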
Weak typicality for discrete sources
It is the set where the sample entropy is close to the true entropy. It is the smallest set (to first order in the exponent) that contains most of the probability.
The set of ε-typical sequences A_ε^n with respect to p(x) is the set of sequences x^n = (x_1, …, x_n) such that, for ε > 0,
|-1/n log(p(x^n)) - H(X)| ≤ ε.
Alternatively
H(X) - ε ≤ -1/n log(p(x^n)) ≤ H(X) + ε.
Properties of weak typicality for discrete sources
If x^n ∈ A_ε^n,
2^{-n(H(X) + ε)} ≤ p(x^n) ≤ 2^{-n(H(X) - ε)}.
For n sufficiently large,
Pr( X^n ∈ A_ε^n) > 1-ε.
For n sufficiently large,
(1-ε) * 2^{n(H(X) - ε)} ≤ |A_ε^n| ≤ 2^{n(H(X) + ε)}.
In general, the most likely sequence is not weakly typical, since its empirical entropy is not close to the true entropy; it therefore lies outside the set that carries almost all of the probability.
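For a small n the three properties can be checked exhaustively. The sketch below enumerates all 2^n binary sequences of an assumed Bernoulli(0.2) source; the parameters n = 12 and ε = 0.3 are chosen (illustratively) to be large enough that the probability bound already holds.

```python
# Exhaustive check of the weak-typicality properties for a Bernoulli(0.2)
# source with small n; an illustrative sketch, not a proof.
import itertools
import math

p, eps, n = 0.2, 0.3, 12
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)    # ~0.722 bits

typical_prob, typical_count = 0.0, 0
for xs in itertools.product([0, 1], repeat=n):
    ones = sum(xs)
    prob = p**ones * (1 - p)**(n - ones)
    if abs(-math.log2(prob) / n - H) <= eps:          # sample entropy near H
        typical_prob += prob
        typical_count += 1

print(typical_prob, typical_count)
# the three properties, checked numerically:
assert typical_prob > 1 - eps
assert typical_count <= 2 ** (n * (H + eps))
assert typical_count >= (1 - eps) * 2 ** (n * (H - eps))
```

Here the typical set contains only the sequences with 1 to 4 ones (793 of the 4096 sequences) yet already carries about 86% of the probability.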
Strong typicality
Denote the strongly typical set by A_ε^{*n}
A sequence x^n ∈ X^n is said to be ε-strongly typical with respect to a distribution p(x) on X if:
- For all a ∈ X with p(a) > 0, we have
|1/n N(a|x^n) − p(a)| < ε / |X|.
- For all a ∈ X with p(a) = 0, N(a|x^n) = 0.
N(a|x^n) is the number of occurrences of the symbol a in the sequence x^n.
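The definition translates directly into a frequency check. The function and example sequences below are illustrative, assuming a small alphabet {a, b, c} with an assumed pmf.

```python
# Sketch: checking epsilon-strong typicality of a sequence by comparing
# empirical symbol frequencies N(a|x^n)/n with p(a); names are illustrative.
from collections import Counter

def is_strongly_typical(xs, p, eps):
    """True if xs is eps-strongly typical w.r.t. the pmf p (dict a -> p(a))."""
    counts = Counter(xs)
    n = len(xs)
    # symbols outside the support of p must not occur at all
    if any(a not in p or p[a] == 0 for a in counts):
        return False
    return all(abs(counts.get(a, 0) / n - pa) < eps / len(p)
               for a, pa in p.items() if pa > 0)

p = {'a': 0.5, 'b': 0.25, 'c': 0.25}
print(is_strongly_typical('aabacbacab', p, 0.5))   # frequencies close to p
print(is_strongly_typical('cccccccccc', p, 0.5))   # all 'c': far from p
```

The first sequence has empirical frequencies (0.5, 0.3, 0.2), each within ε/|X| of p; the second concentrates all mass on 'c' and fails.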
Joint strong typicality
A pair of sequences (x^n, y^n) ∈ X^n × Y^n is said to be ε-strongly typical with respect to a distribution p(x, y) on X × Y if:
- For all (a, b) ∈ X × Y with p(a, b) > 0, we have
|1/n N(a, b|x^n, y^n) − p(a, b)| < ε / (|X| |Y|).
- For all (a, b) ∈ X × Y with p(a, b) = 0, N(a, b|x^n, y^n) = 0.
N(a, b|x^n, y^n) is the number of occurrences of the pair (a, b) in the pair of sequences (x^n, y^n).
Joint strong typicality properties
If (x^n, y^n) ∈ A_ε^{*n},
2^{-n(H(X,Y) + λ)} ≤ p(x^n, y^n) ≤ 2^{-n(H(X,Y) - λ)}, where λ → 0 as ε → 0.
For n sufficiently large,
Pr( (X^n, Y^n) ∈ A_ε^{*n}) > 1-ε.
For n sufficiently large,
(1-ε) * 2^{n(H(X,Y) - λ)} ≤ |A_ε^{*n}| ≤ 2^{n(H(X,Y) + λ)}.
There are therefore roughly 2^{n * H(X,Y)} jointly strongly typical pairs.
Joint typicality
The set A_ε^n of jointly typical sequences {(x^n, y^n )} with respect to the distribution p(x, y) is the set of n-sequences with empirical entropies ε-close to the true entropies:
A_ε^n = { (x^n, y^n) ∈ X^n × Y^n :
|-1/n log(p(x^n)) - H(X)| < ε,
|-1/n log(p(y^n)) - H(Y)| < ε,
|-1/n log(p(x^n, y^n)) - H(X,Y)| < ε },
where
p(x^n, y^n) = ∏_{i=1}^n p(x_i, y_i).
Properties of joint typicality
Pr((X^n, Y^n) ∈ A_ε^n) → 1 as n → ∞.
|A_ε^n| ≤ 2^{n( H(X,Y ) + ε)}
We can consider about 2^{n * I (X;Y)} randomly chosen pairs of individually typical sequences before we are likely to come across a jointly typical pair.
Jointly typical pairs of independent sequences
If (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n) [i.e., X̃^n and Ỹ^n are independent with the same marginals as p(x^n, y^n)], then
Pr((X̃^n, Ỹ^n) ∈ A_ε^n) ≤ 2^{-n(I(X;Y) - 3ε)}.
Also, for sufficiently large n,
Pr((X̃^n, Ỹ^n) ∈ A_ε^n) ≥ (1 − ε) * 2^{-n(I(X;Y) + 3ε)}.
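The exponential rarity of jointly typical independent pairs can be estimated by simulation. The joint pmf below is an assumed example (not from the cards), chosen so both marginals are uniform, which makes the marginal typicality conditions hold automatically.

```python
# Illustrative estimate of how rarely independently drawn sequences land in
# the jointly typical set, compared with the 2^{-n I(X;Y)} scaling.
import math
import random

random.seed(3)
# an assumed example joint pmf on {0,1} x {0,1} (not taken from the text)
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
Hxy = -sum(v * math.log2(v) for v in p.values())   # ~1.72 bits
I = 1.0 + 1.0 - Hxy        # both marginals are uniform, so H(X) = H(Y) = 1

n, eps, trials = 20, 0.1, 20_000
hits = 0
for _ in range(trials):
    xs = [random.randint(0, 1) for _ in range(n)]
    ys = [random.randint(0, 1) for _ in range(n)]  # drawn independently of xs
    # marginal sample entropies equal 1 bit exactly (uniform marginals),
    # so only the joint-entropy condition can fail
    joint = -sum(math.log2(p[(x, y)]) for x, y in zip(xs, ys)) / n
    hits += abs(joint - Hxy) < eps

print(hits / trials, 2 ** (-n * I))   # empirical rate vs 2^{-n I(X;Y)}
```

Even at this small n the empirical hit rate is already below one percent, on the same order as 2^{-n I(X;Y)}.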
Weak typicality for continuous sources
Let X_1, … , X_n be a sequence of random variables drawn i.i.d. according to the density f(x). Then, in probability,
−1/n log f(X^n) → E[−log f(X)] = h(X).
For ε > 0 and any n, we define the typical set A_ε^n with respect to f(x) as follows:
A_ε^n = { (x_1, …, x_n) ∈ R^n : |-1/n log(f(x^n)) -h(X)| ≤ ε},
where f(x^n) = ∏f(x_i).
Properties of weak typicality for continuous sources
Pr(A_ε^n) > 1-ε for n sufficiently large
Vol(A_ε^n) ≤ 2^{n(h(X) + ε)} for all n
Vol(A_ε^n) ≥ (1-ε) * 2^{n(h(X) - ε)} for n sufficiently large
The set A_ε^n is the smallest volume set with probability ≥ 1 − ε, to first order in the exponent.
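For a continuous source the same convergence can be sketched with a Gaussian density, whose differential entropy has the closed form h(X) = ½ log₂(2πeσ²). The σ = 2 choice and function names below are illustrative.

```python
# Sketch for a Gaussian source: -1/n log2 f(X^n) approaches the
# differential entropy h(X) = 0.5 * log2(2 * pi * e * sigma^2).
import math
import random

random.seed(4)
sigma = 2.0
h = 0.5 * math.log2(2 * math.pi * math.e * sigma**2)   # ~3.05 bits for sigma=2

def sample_diff_entropy(n):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    # log2 f(x) = -0.5*log2(2*pi*sigma^2) - (x^2 / (2*sigma^2)) * log2(e)
    log_f = sum(-0.5 * math.log2(2 * math.pi * sigma**2)
                - x**2 / (2 * sigma**2) * math.log2(math.e) for x in xs)
    return -log_f / n

for n in (10, 10_000):
    print(n, sample_diff_entropy(n), h)
```

As in the discrete case, the weak law applied to −log f(X_i) drives the average toward its mean h(X).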
Empirical entropy
The empirical entropy of any typical sequence is close to the true entropy H(X).
-1/n log(p(x^n)) = -1/n Σ_{i=1}^n log(p(x_i)).
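The formula above is a one-liner in code; the pmf and sequence below are illustrative assumptions.

```python
# Computing the empirical (sample) entropy of a sequence under a pmf p,
# as in the formula above; a minimal sketch with illustrative names.
import math

def empirical_entropy(xs, p):
    """-1/n * sum_i log2 p(x_i), in bits per symbol."""
    return -sum(math.log2(p[x]) for x in xs) / len(xs)

p = {'a': 0.5, 'b': 0.25, 'c': 0.25}
print(empirical_entropy('aabc', p))    # (1 + 1 + 2 + 2) / 4 = 1.5 bits
```

Here 1.5 bits also equals H(X) for this pmf, so the sequence 'aabc' is typical for any ε > 0.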