Exam Prep Flashcards
Explain what capacity implies in practice for an error correction code
The capacity is the highest rate at which an error correction code can transmit information while the probability of decoding error can still be made arbitrarily small (for long enough block lengths)
Capacity
C = max_p(x) I(X;Y), maximised over all input pmfs p(x)
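A numerical sketch (Python/numpy; the binary symmetric channel with crossover probability 0.1 is an illustrative choice, not from the cards) showing capacity as the maximum of I(X;Y) over the input pmf:

import numpy as np

def mutual_information(p_x, p_y_given_x):
    # I(X;Y) = H(Y) - H(Y|X), computed from the input pmf p(x) and channel p(y|x)
    p_xy = p_x[:, None] * p_y_given_x          # joint pmf p(x,y)
    p_y = p_xy.sum(axis=0)
    h_y = -np.sum(p_y[p_y > 0] * np.log2(p_y[p_y > 0]))
    mask = p_xy > 0
    h_y_given_x = -np.sum(p_xy[mask] * np.log2(p_y_given_x[mask]))
    return h_y - h_y_given_x

eps = 0.1                                      # illustrative crossover probability
bsc = np.array([[1 - eps, eps],                # p(y|x) for a binary symmetric channel
                [eps, 1 - eps]])

# Capacity = max over input pmfs p(x) of I(X;Y); a brute-force 1-D search suffices here
qs = np.linspace(0.001, 0.999, 999)
capacity = max(mutual_information(np.array([q, 1 - q]), bsc) for q in qs)
print(capacity)                                # approx 0.531 = 1 - H(0.1) bits per channel use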
Conditional Entropy Inequality
“Information Can’t hurt”
H(X | Y) <= H(X)
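A quick numerical check (a sketch in Python/numpy; the joint pmf below is an arbitrary illustrative example):

import numpy as np

p_xy = np.array([[0.3, 0.1],          # illustrative joint pmf p(x,y), rows = x, cols = y
                 [0.2, 0.4]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

h_x = -np.sum(p_x * np.log2(p_x))                              # H(X)
h_x_given_y = -np.sum(p_xy * np.log2(p_xy / p_y[None, :]))     # H(X|Y)
print(h_x_given_y, h_x)
print(h_x_given_y <= h_x)             # True: conditioning cannot increase entropy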
Chain Rule of Entropy
H(X, Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
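A numerical check of the chain rule (a sketch in Python/numpy; the joint pmf is an arbitrary example):

import numpy as np

p_xy = np.array([[0.25, 0.25],        # illustrative joint pmf p(x,y), rows = x, cols = y
                 [0.10, 0.40]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

def H(p):                             # entropy of a pmf, in bits
    p = np.asarray(p, float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

h_y_given_x = -np.sum(p_xy * np.log2(p_xy / p_x[:, None]))   # H(Y|X) = -SUM p(x,y) log p(y|x)
h_x_given_y = -np.sum(p_xy * np.log2(p_xy / p_y[None, :]))   # H(X|Y)
print(np.isclose(H(p_xy), H(p_x) + h_y_given_x))             # True
print(np.isclose(H(p_xy), H(p_y) + h_x_given_y))             # True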
Conditional Entropy
H(Y|X) = SUM_x p(x) H(Y|X=x)
= - SUM_x SUM_y p(x) p(y|x) log p(y|x)
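A sketch (Python/numpy; p(x) and p(y|x) are illustrative) checking that the weighted-average form and the double-sum form agree:

import numpy as np

p_x = np.array([0.6, 0.4])                     # illustrative p(x)
p_y_given_x = np.array([[0.9, 0.1],            # illustrative p(y|x), one row per x
                        [0.5, 0.5]])

# Weighted-average form: H(Y|X) = SUM_x p(x) H(Y|X=x)
h_rows = -np.sum(p_y_given_x * np.log2(p_y_given_x), axis=1)
h1 = np.sum(p_x * h_rows)

# Double-sum form: -SUM_x SUM_y p(x) p(y|x) log p(y|x)
h2 = -np.sum(p_x[:, None] * p_y_given_x * np.log2(p_y_given_x))
print(np.isclose(h1, h2))                      # True: the two forms agree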
Chain Rule of Conditional Entropy
H(X, Y|Z) = H(Y|Z) + H(X|Y, Z) = H(X|Z) + H(Y|X, Z)
Joint Entropy
H(X, Y) = - SUM_xy p(x,y) log p(x,y)
Relative Entropy (Kullback-Leibler Distance)
D(p||r) = SUM_x p(x) log p(x)/r(x)
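A small sketch (Python/numpy; the two pmfs are illustrative) computing D(p||r) and showing it is not symmetric:

import numpy as np

def kl(p, r):
    # D(p||r) = SUM_x p(x) log(p(x)/r(x)); assumes r(x) > 0 wherever p(x) > 0
    p, r = np.asarray(p, float), np.asarray(r, float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / r[mask]))

p = [0.5, 0.5]               # illustrative pmfs
r = [0.9, 0.1]
print(kl(p, r), kl(r, p))    # both nonnegative, and in general D(p||r) != D(r||p)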
Bayes Theorem
P(B|A) = P(A|B) × P(B)/ P(A)
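A worked example (a sketch in Python; the prior and likelihood values are made up, in the style of a diagnostic-test question):

# Bayes: P(B|A) = P(A|B) P(B) / P(A)
p_B = 0.01             # prior P(B), e.g. prevalence of a condition (illustrative)
p_A_given_B = 0.95     # likelihood P(A|B), e.g. test sensitivity (illustrative)
p_A_given_notB = 0.05  # P(A | not B), false positive rate (illustrative)

p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)   # P(A) via total probability (see that card)
p_B_given_A = p_A_given_B * p_B / p_A
print(p_B_given_A)     # approx 0.16: a positive test still leaves B fairly unlikely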
Conditional Probability
P(B|A) = P(A ∩ B) / P(A)
Total Probability
if {A1,A2, . . . ,Ak } is a partition of Ω,
then for any B ⊂ Ω
P(B) = SUM(i=1 -> k) P(B|A_i)P(A_i)
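A minimal numerical sketch (plain Python; the partition and conditional probabilities are illustrative):

# Total probability: P(B) = SUM_i P(B|A_i) P(A_i) over a partition {A_1, ..., A_k}
p_A = [0.2, 0.5, 0.3]            # illustrative P(A_i), summing to 1
p_B_given_A = [0.1, 0.6, 0.4]    # illustrative P(B|A_i)
p_B = sum(pb * pa for pb, pa in zip(p_B_given_A, p_A))
print(p_B)                       # 0.02 + 0.30 + 0.12 = 0.44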
Mutual Information
I(X;Y) = H(Y) − H(Y|X) = H(X) − H(X|Y)
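A sketch (Python/numpy; the joint pmf is illustrative) computing I(X;Y) as H(Y) − H(Y|X) and checking it against the relative-entropy form D(p(x,y) || p(x)p(y)):

import numpy as np

p_xy = np.array([[0.4, 0.1],      # illustrative joint pmf p(x,y), rows = x, cols = y
                 [0.1, 0.4]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

h_y = -np.sum(p_y * np.log2(p_y))
h_y_given_x = -np.sum(p_xy * np.log2(p_xy / p_x[:, None]))    # H(Y|X)
i_xy = h_y - h_y_given_x                                      # I(X;Y) = H(Y) - H(Y|X)

i_alt = np.sum(p_xy * np.log2(p_xy / np.outer(p_x, p_y)))     # D(p(x,y) || p(x)p(y))
print(i_xy, np.isclose(i_xy, i_alt))                          # approx 0.278 bits, True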
Entropy
H(X) = - SUM_x p(x) log p(x)
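A minimal sketch (Python/numpy; the coin probabilities are illustrative) of the entropy formula:

import numpy as np

def entropy(p):
    # H(X) = -SUM_x p(x) log p(x), in bits; terms with p(x) = 0 contribute 0
    p = np.asarray(p, float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(entropy([0.5, 0.5]))     # 1.0 bit (fair coin)
print(entropy([0.9, 0.1]))     # approx 0.469 bits (a biased coin is more predictable)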
Two key properties of Entropy (value)
Always nonnegative: H(X) >= 0
Depends only on the probabilities in the pmf of the RV, not on the values the RV takes
Binomial pmf X ∼ Bi(n, q)
X ∼ Bi(n, q)
p(k) = (nCk) (q^k)(1-q)^(n-k) for k = 0,1,2,…,n
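A short sketch (plain Python; n = 10 and q = 0.3 are illustrative) of the binomial pmf:

from math import comb

def binomial_pmf(k, n, q):
    # p(k) = C(n,k) q^k (1-q)^(n-k)
    return comb(n, k) * q**k * (1 - q)**(n - k)

n, q = 10, 0.3                               # illustrative parameters
pmf = [binomial_pmf(k, n, q) for k in range(n + 1)]
print(sum(pmf))                              # 1.0 (up to floating point), as a pmf must
print(binomial_pmf(3, n, q))                 # approx 0.267, the most likely count here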
Joint Distribution
Pr(X = x, Y = y) = p(x, y)
Conditional Distribution
p(y|x) = p(x, y)/ p(x)
Marginalisation
p(x) = SUM_y p(x,y)
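A sketch (Python/numpy; the joint pmf is illustrative) showing marginalisation and the conditional distribution from one joint table:

import numpy as np

p_xy = np.array([[0.10, 0.30],     # illustrative joint pmf p(x,y), rows = x, cols = y
                 [0.20, 0.40]])

p_x = p_xy.sum(axis=1)             # marginalisation: p(x) = SUM_y p(x,y)
p_y = p_xy.sum(axis=0)             # p(y) = SUM_x p(x,y)
p_y_given_x = p_xy / p_x[:, None]  # conditional: p(y|x) = p(x,y) / p(x)

print(p_x, p_y)                    # [0.4 0.6] [0.3 0.7]
print(p_y_given_x.sum(axis=1))     # each row sums to 1, as a conditional pmf must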