Lecture 2: Shannon Codes Flashcards
Easy way to choose code lengths to meet Kraft inequality?
If every probability is a negative integer power of two, i.e. pk = 2^(-lk), give symbol k a codeword of length lk. For example, a symbol with probability 2^(-2) = 1/4 gets a codeword of length 2.
The code is complete if the Kraft sum Sum(2^(-lk)) equals 1, which for these probabilities is the same as all the probabilities adding up to 1 (see the sketch below).
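A minimal sketch (the dyadic distribution is an assumed example, not from the lecture): assign lk = -log2(pk) and check that the Kraft sum equals 1.

```python
import math

# Assumed example of a dyadic distribution (every p_k is a negative integer power of 2)
probs = [1/2, 1/4, 1/8, 1/8]

# l_k = -log2(p_k), which is an integer for dyadic probabilities
lengths = [int(round(-math.log2(p))) for p in probs]

kraft_sum = sum(2 ** -l for l in lengths)
print(lengths)     # [1, 2, 3, 3]
print(kraft_sum)   # 1.0 -> Kraft inequality met with equality, so the code is complete
```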
How to deal with probabilities that are not negative integer powers of two (2^(-n))? What does this mean for compression?
Ceiling round: length lk = Ceiling(log2(1/pk)).
Compression is easiest (most efficient) when the probabilities are 1/2, 1/4, etc., i.e. when the log2(1/pk) values are integers.
If the probabilities are not of this nice form, the ceiling rounding wastes up to one bit per symbol, so compression gets worse (see the sketch below).
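A minimal sketch (the distribution is an assumed example): compute Shannon code lengths for a non-dyadic distribution.

```python
import math

# Assumed example distribution with non-dyadic probabilities
probs = [0.5, 0.3, 0.2]

# Shannon code lengths: l_k = ceil(log2(1/p_k))
lengths = [math.ceil(math.log2(1 / p)) for p in probs]

print(lengths)                        # [1, 2, 3]
print(sum(2 ** -l for l in lengths))  # 0.875 <= 1, so the Kraft inequality is satisfied
```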
What is a Shannon code?
A prefix code where each codeword length is computed as lk = Ceiling(log2(1/pk)), the ceiling of the log of 1 over the probability of the symbol occurring.
What are the compression bounds for a Shannon code?
H(X) <= L(C,X) < H(X) + 1 (the expected length is within 1 bit of the Shannon entropy).
What is the expected length of a code?
L(C,X) = Sum_k(pk * lk) = Sum_k(pk * Ceiling(log2(1/pk)))
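A minimal sketch (same assumed example distribution as above): compute L(C,X) and H(X) and check the bound.

```python
import math

probs = [0.5, 0.3, 0.2]   # assumed example distribution

H = sum(p * math.log2(1 / p) for p in probs)             # entropy H(X)
L = sum(p * math.ceil(math.log2(1 / p)) for p in probs)  # expected length L(C,X)

print(f"H(X)   = {H:.3f} bits")   # ~1.485
print(f"L(C,X) = {L:.3f} bits")   # 1.700
print(H <= L < H + 1)             # True: within one bit of the entropy
```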
How do you prove compression bounds of Shannon code?
Since lk = Ceiling(log2(1/pk)), we have log2(1/pk) <= lk < log2(1/pk) + 1. The left inequality gives 2^(-lk) <= pk, so Sum(2^(-lk)) <= 1 and a prefix code with these lengths exists (Kraft inequality). Multiplying all parts by pk and summing over k gives H(X) <= L(C,X) < H(X) + 1.
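The same argument written out (standard textbook derivation, using the definitions above):

```latex
\begin{align*}
l_k = \left\lceil \log_2 \tfrac{1}{p_k} \right\rceil
  &\;\Rightarrow\; \log_2 \tfrac{1}{p_k} \le l_k < \log_2 \tfrac{1}{p_k} + 1 \\
\sum_k p_k \log_2 \tfrac{1}{p_k} \le \sum_k p_k l_k < \sum_k p_k \log_2 \tfrac{1}{p_k} + \sum_k p_k
  &\;\Rightarrow\; H(X) \le L(C,X) < H(X) + 1
\end{align*}
```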
Is the Shannon code optimal?
No. If the symbols have non-integer Shannon information contents (log2(1/pk) not an integer), the ceiling rounding wastes bits and compression optimality is lost.
What is the source coding theorem?
For a source X with entropy H(X): no uniquely decodable code can have an expected length per symbol below H(X), and codes exist (e.g. Shannon codes applied to blocks of symbols) whose expected length per symbol is arbitrarily close to H(X). In other words, H(X) is the fundamental limit for lossless compression.
How can we reduce the upper bound compression limit on Shannon codes?
By encoding blocks of symbols instead of single symbols, the extra bit of overhead (beyond the Shannon entropy) is spread across many letters, so the per-symbol length approaches the limit H(X).
For blocks of n i.i.d. input symbols, H(X) <= (1/n) * L(C, X^n) < H(X) + 1/n, where L(C, X^n) is the expected length of a codeword for the whole block (see the sketch below).
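A minimal sketch (assumed i.i.d. source distribution): Shannon-code blocks of n symbols and watch the per-symbol length approach H(X).

```python
import math
from itertools import product

probs = [0.5, 0.3, 0.2]   # assumed per-symbol distribution, symbols drawn i.i.d.
H = sum(p * math.log2(1 / p) for p in probs)

for n in (1, 2, 4, 8):
    # Probability of each length-n block is the product of its symbol probabilities
    block_probs = [math.prod(block) for block in product(probs, repeat=n)]
    L_block = sum(p * math.ceil(math.log2(1 / p)) for p in block_probs)
    print(f"n={n}: per-symbol length {L_block / n:.3f} bits  (H(X) = {H:.3f})")
```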
What to do if the probability distribution was wrong (was a guess)?
Compute L(C,X) twice: once with the code lengths from the guessed distribution Q and once with the correct distribution P. The difference between them (ignoring the ceiling) is the relative entropy D(p || q) = Sum_x P(x) * log2(P(x)/Q(x)) (see the sketch below).
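A minimal sketch (both distributions are assumed examples): the penalty for coding with the wrong distribution is D(p||q).

```python
import math

p = [0.5, 0.3, 0.2]     # assumed true distribution
q = [0.25, 0.25, 0.5]   # assumed guessed distribution used to design the code

H_p = sum(pi * math.log2(1 / pi) for pi in p)                 # H(p)
D_pq = sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q))   # D(p||q)
L_q = sum(pi * math.log2(1 / qi) for pi, qi in zip(p, q))     # un-rounded lengths from q

print(f"H(p)     = {H_p:.3f} bits")
print(f"D(p||q)  = {D_pq:.3f} bits")
print(f"L with q = {L_q:.3f} bits")   # equals H(p) + D(p||q)
```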
How do you prove that the expected length of a code designed for Q, used on data from P, is H(p) + D(p||q)?
Using the (un-rounded) code lengths log2(1/q(x)), the expected length under the true distribution p is Sum_x p(x) * log2(1/q(x)). Write 1/q(x) = (1/p(x)) * (p(x)/q(x)) and split the logarithm: the sum becomes Sum_x p(x) * log2(1/p(x)) + Sum_x p(x) * log2(p(x)/q(x)) = H(p) + D(p||q).
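The same derivation written out (using the definitions above):

```latex
\begin{align*}
\sum_x p(x) \log_2 \frac{1}{q(x)}
  &= \sum_x p(x) \log_2 \!\left( \frac{1}{p(x)} \cdot \frac{p(x)}{q(x)} \right) \\
  &= \sum_x p(x) \log_2 \frac{1}{p(x)} + \sum_x p(x) \log_2 \frac{p(x)}{q(x)} \\
  &= H(p) + D(p \,\|\, q)
\end{align*}
```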