Typicality Flashcards

1
Q

Chebyshev’s inequality

A

Let X be a random variable with finite expected value μ and finite non-zero variance σ^2. Then for any real number k > 0

Pr(|X − μ| ≥ k) ≤ σ^2 / k^2
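A quick numerical sketch (not part of the card): the bound can be checked by simulation. The standard normal distribution, k = 2, and the trial count are illustrative choices.

```python
import random

random.seed(0)
mu, sigma = 0.0, 1.0            # standard normal: E[X] = 0, Var(X) = 1
k = 2.0
trials = 100_000

samples = [random.gauss(mu, sigma) for _ in range(trials)]
empirical = sum(abs(x - mu) >= k for x in samples) / trials
bound = sigma ** 2 / k ** 2     # Chebyshev: Pr(|X - mu| >= k) <= sigma^2/k^2

print(f"empirical tail {empirical:.4f} <= bound {bound:.4f}")
```

For a normal distribution the empirical tail is far below the bound; Chebyshev is loose because it assumes nothing beyond a finite variance.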

2
Q

Weak law of large numbers

A

Let X_1, …, X_n be i.i.d. random variables with finite expected value μ. The weak law of large numbers states that the sample average X̄_n = (1/n) ∑_{i=1}^n X_i converges in probability to μ: for any ε > 0,
Pr(|X̄_n − μ| > ε) → 0 as n → ∞.
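A minimal sketch of the statement, assuming a Bernoulli(p) source; p and the sample sizes are illustrative choices.

```python
import random

random.seed(1)
p = 0.3   # Bernoulli success probability (illustrative)

def sample_mean(n):
    # average of n i.i.d. Bernoulli(p) draws
    return sum(random.random() < p for _ in range(n)) / n

for n in (100, 10_000, 1_000_000):
    print(n, sample_mean(n))    # deviations from p shrink as n grows
```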

3
Q

Weak asymptotic equipartition property

A

Let X^n = X_1, …, X_n be a sequence of i.i.d. random variables with X_i ∼ p(x). Then, in probability,

lim_{n→∞} −1/n log p(X^n) = H(X).

Alternatively, for any ε > 0 and n sufficiently large,
Pr( | -1/n log(p(X^n)) - H(X) | ≤ ε) > 1-ε.
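The convergence can be observed numerically (a sketch, assuming a Bernoulli(p) source; p and the sample sizes are illustrative):

```python
import random
import math

random.seed(2)
p = 0.25
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # H(X) ≈ 0.811 bits

def sample_entropy(n):
    # -1/n log2 p(X^n) for n i.i.d. Bernoulli(p) draws
    xs = [random.random() < p for _ in range(n)]
    return -sum(math.log2(p if x else 1 - p) for x in xs) / n

for n in (100, 10_000, 1_000_000):
    print(n, sample_entropy(n), "-> H =", H)
```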

4
Q

Weak typicality for discrete sources

A

It is the set where the sample entropy is close to the true entropy. It is the smallest set (to first order in the exponent) that contains most of the probability.

The ε-typical set A_ε^n with respect to p(x) is the set of sequences x^n = (x_1, …, x_n) such that, for ε > 0,
|-1/n log(p(x^n)) - H(X)| ≤ ε.
Alternatively
H(X) - ε ≤ -1/n log(p(x^n)) ≤ H(X) + ε.
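A small membership test for this definition (a sketch; the Bernoulli parameter p, ε, and the example sequences are illustrative choices):

```python
import math

p, eps = 0.25, 0.1
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # H(X) ≈ 0.811 bits

def is_weakly_typical(xn):
    # sample entropy -1/n log2 p(x^n) for a binary sequence under Bernoulli(p)
    logp = sum(math.log2(p if x == 1 else 1 - p) for x in xn)
    return abs(-logp / len(xn) - H) <= eps

print(is_weakly_typical([1] * 5 + [0] * 15))  # 25% ones: sample entropy = H
print(is_weakly_typical([0] * 20))            # most likely sequence, not typical
```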

5
Q

Properties of weak typicality for discrete sources

A

If x^n ∈ A_ε^n,
2^{-n(H(X) + ε)} ≤ p(x^n) ≤ 2^{-n(H(X) - ε)}.

For n sufficiently large,
Pr( X^n ∈ A_ε^n) > 1-ε.

For n sufficiently large,
(1-ε) * 2^{n(H(X) - ε)} ≤ |A_ε^n| ≤ 2^{n(H(X) + ε)}.

In general, the most likely sequence is not weakly typical, since its empirical entropy is not close to the true entropy. For example, for a Bernoulli(p) source with p < 1/2, the all-zeros sequence is the single most likely sequence, yet it lies outside the typical set.
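For a Bernoulli(p) source these properties can be checked exactly: p(x^n) depends only on the number of ones k, so the typical set can be counted with binomial coefficients. p, n, and ε below are illustrative choices.

```python
import math

p, n, eps = 0.25, 500, 0.1
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # H(X) ≈ 0.811 bits

size = 0      # |A_eps^n|
prob = 0.0    # Pr(X^n in A_eps^n)
for k in range(n + 1):                      # k = number of ones in x^n
    sample_H = -(k * math.log2(p) + (n - k) * math.log2(1 - p)) / n
    if abs(sample_H - H) <= eps:
        c = math.comb(n, k)                 # sequences with exactly k ones
        size += c
        prob += c * (p ** k) * ((1 - p) ** (n - k))

print(f"|A| ≈ 2^{math.log2(size):.1f}, bounds 2^{n*(H-eps):.1f} .. 2^{n*(H+eps):.1f}")
print(f"Pr(A) = {prob:.4f}  (> 1 - eps = {1 - eps})")
```

The typical set holds almost all of the probability while containing only about 2^{nH} of the 2^n possible sequences.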

6
Q

Strong typicality

A

Denote the strongly typical set by A_ε^{*n}.
A sequence x^n ∈ X^n is said to be ε-strongly typical with respect to a distribution p(x) on X if:
1. For all a ∈ X with p(a) > 0, we have
|1/n N(a|x^n) − p(a)| < ε / |X|.
2. For all a ∈ X with p(a) = 0, N(a|x^n) = 0.
N(a|x^n) is the number of occurrences of the symbol a in the sequence x^n.
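A direct membership test for this definition (a sketch; the alphabet, pmf, ε, and example sequences are illustrative choices):

```python
from collections import Counter

alphabet = ["a", "b", "c"]
p = {"a": 0.5, "b": 0.25, "c": 0.25}
eps = 0.3

def is_strongly_typical(xn):
    counts = Counter(xn)                     # N(a | x^n) for each symbol a
    n = len(xn)
    for a in alphabet:
        if p[a] == 0 and counts[a] > 0:      # condition 2
            return False
        if p[a] > 0 and abs(counts[a] / n - p[a]) >= eps / len(alphabet):
            return False                     # condition 1 violated
    return True

print(is_strongly_typical(list("aabcaabc")))  # frequencies match p exactly
print(is_strongly_typical(list("aaaaaaaa")))  # all-a sequence is not typical
```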
7
Q

Joint strong typicality

A

A pair of sequences (x^n, y^n) ∈ X^n × Y^n is said to be ε-strongly typical with respect to a distribution p(x, y) on X × Y if:

1. For all (a, b) ∈ X × Y with p(a, b) > 0, we have
|1/n N(a, b|x^n, y^n) − p(a, b)| < ε / (|X| |Y|).
2. For all (a, b) ∈ X × Y with p(a, b) = 0, N(a, b|x^n, y^n) = 0.
N(a, b|x^n, y^n) is the number of occurrences of the pair (a, b) in the pair of sequences (x^n, y^n).
8
Q

Joint strong typicality properties

A

If (x^n, y^n) ∈ A_ε^{*n},
2^{-n(H(X,Y) + λ)} ≤ p(x^n, y^n) ≤ 2^{-n(H(X,Y) - λ)},
where λ → 0 as ε → 0.

For n sufficiently large,
Pr((X^n, Y^n) ∈ A_ε^{*n}) > 1-ε.

For n sufficiently large,
(1-ε) * 2^{n(H(X,Y) - λ)} ≤ |A_ε^{*n}| ≤ 2^{n(H(X,Y) + λ)}.

There are roughly 2^{n * H(X,Y)} joint strongly typical pairs.

9
Q

Joint typicality

A

The set A_ε^n of jointly typical sequences {(x^n, y^n )} with respect to the distribution p(x, y) is the set of n-sequences with empirical entropies ε-close to the true entropies:
A_ε^n = { (x^n, y^n) ∈ X^n × Y^n :
|-1/n log p(x^n) - H(X)| < ε,
|-1/n log p(y^n) - H(Y)| < ε,
|-1/n log p(x^n, y^n) - H(X,Y)| < ε }

where
p(x^n, y^n) = ∏_{i=1}^n p(x_i, y_i).
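A membership check for this definition (a sketch; the joint pmf, ε, and the example pair are illustrative choices, with the marginals derived from the joint):

```python
import math

pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}
px = {x: sum(v for (a, _), v in pxy.items() if a == x) for x in (0, 1)}
py = {y: sum(v for (_, b), v in pxy.items() if b == y) for y in (0, 1)}
eps = 0.2

def H(d):
    return -sum(v * math.log2(v) for v in d.values() if v > 0)

def jointly_typical(xn, yn):
    n = len(xn)
    # the three sample entropies of the definition
    sx = -sum(math.log2(px[x]) for x in xn) / n
    sy = -sum(math.log2(py[y]) for y in yn) / n
    sxy = -sum(math.log2(pxy[(x, y)]) for x, y in zip(xn, yn)) / n
    return (abs(sx - H(px)) < eps and abs(sy - H(py)) < eps
            and abs(sxy - H(pxy)) < eps)

# a pair whose empirical pair frequencies match p(x, y) exactly:
xn = [0, 0, 0, 0, 0, 0, 1, 1]
yn = [0, 0, 0, 0, 1, 1, 0, 1]
print(jointly_typical(xn, yn))
```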

10
Q

Properties of joint typicality

A

Pr((X^n, Y^n) ∈ A_ε^n) → 1 as n → ∞.
|A_ε^n| ≤ 2^{n( H(X,Y ) + ε)}

We can consider about 2^{n * I (X;Y)} randomly chosen pairs of individually typical sequences before we are likely to come across a jointly typical pair.

11
Q

Conditional typicality

A

If (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n) [i.e., X̃^n and Ỹ^n are independent with the same marginals as p(x^n, y^n)], then

Pr((X̃^n, Ỹ^n) ∈ A_ε^n) ≤ 2^{-n(I(X;Y) - 3ε)}.

Also, for sufficiently large n,
Pr((X̃^n, Ỹ^n) ∈ A_ε^n) ≥ (1 − ε) * 2^{-n(I(X;Y) + 3ε)}.
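For a small binary alphabet the left-hand probability can be computed exactly by summing over joint types (the counts of each symbol pair), since the sample entropies depend only on the type. The joint pmf, n, and ε below are illustrative choices; with uniform marginals, each independent sequence pair has probability 2^{-2n}.

```python
import math
from itertools import product

pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}   # correlated pair
n, eps = 50, 0.05

def H(d):
    return -sum(v * math.log2(v) for v in d.values() if v > 0)

Hx = Hy = 1.0                  # both marginals are uniform on {0, 1}
Hxy = H(pxy)
I = Hx + Hy - Hxy              # mutual information I(X;Y)

prob = 0.0
for n00, n01, n10 in product(range(n + 1), repeat=3):
    n11 = n - n00 - n01 - n10
    if n11 < 0:
        continue
    counts = {(0, 0): n00, (0, 1): n01, (1, 0): n10, (1, 1): n11}
    # joint sample entropy is determined by the joint type; the marginal
    # sample entropies are exactly 1 bit because the marginals are uniform
    sxy = -sum(c * math.log2(pxy[k]) for k, c in counts.items()) / n
    if abs(sxy - Hxy) < eps:
        m = (math.factorial(n) // (math.factorial(n00) * math.factorial(n01)
             * math.factorial(n10) * math.factorial(n11)))
        prob += m * 0.5 ** (2 * n)   # each sequence pair has prob 2^(-2n)

print(f"Pr = {prob:.3e}  vs  upper bound 2^(-n(I-3eps)) = {2**(-n*(I-3*eps)):.3e}")
```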

12
Q

Weak typicality for continuous sources

A

Let X_1, …, X_n be a sequence of random variables drawn i.i.d. according to the density f(x). Then, in probability,
−1/n log f(X^n) → E[−log f(X)] = h(X).

For ε > 0 and any n, we define the typical set A_ε^n with respect to f(x) as follows:
A_ε^n = { (x_1, …, x_n) ∈ R^n : |-1/n log(f(x^n)) - h(X)| ≤ ε},
where f(x^n) = ∏ f(x_i).
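A numerical sketch for a Gaussian source, for which h(X) = 1/2 log(2πe σ²) in closed form; σ and the sample sizes are illustrative choices.

```python
import random
import math

random.seed(3)
sigma = 2.0
h = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)   # h(X) ≈ 3.05 bits

def sample_entropy(n):
    xs = [random.gauss(0.0, sigma) for _ in range(n)]
    # log2 f(x) for the N(0, sigma^2) density
    logf = sum(-0.5 * math.log2(2 * math.pi * sigma ** 2)
               - (x ** 2 / (2 * sigma ** 2)) * math.log2(math.e) for x in xs)
    return -logf / n

for n in (100, 100_000):
    print(n, sample_entropy(n), "-> h =", h)
```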

13
Q

Properties of weak typicality for continuous sources

A

Pr(A_ε^n) > 1-ε for n sufficiently large
Vol(A_ε^n) ≤ 2^{n(h(X) + ε)} for all n
Vol(A_ε^n) ≥ (1-ε) * 2^{n(h(X) - ε)} for n sufficiently large
The set A_ε^n is the smallest volume set with probability ≥ 1 − ε, to first order in the exponent.

14
Q

Empirical entropy

A

The empirical entropy of any typical sequence is close to the true entropy H(X).
-1/n log(p(x^n)) = -1/n ∑_{i=1}^n log(p(x_i)).
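The formula is a one-liner in code (a sketch; the pmf and sequence are illustrative choices):

```python
import math

p = {"a": 0.5, "b": 0.25, "c": 0.25}   # illustrative pmf

def empirical_entropy(xn):
    # -1/n sum_i log2 p(x_i)
    return -sum(math.log2(p[x]) for x in xn) / len(xn)

print(empirical_entropy(list("aabcaabc")))  # equals H(X) = 1.5 here
```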
