Information theory Flashcards
What is the relationship between the probability of an event and the amount of information it contains?
A certain event contains zero information.
An event with zero probability would contain infinite information; the less probable an event, the more information it carries.
I_A = log_2(1/P_A) = -log_2(P_A) bits
I_A is the information in event A, P_A is its probability.
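A minimal Python sketch of this formula (the probabilities are illustrative):

```python
import math

def self_information(p):
    """Information in bits of an event with probability p: I = -log_2(p)."""
    return -math.log2(p)

for p in (1.0, 0.5, 0.25, 0.01):
    print(f"P = {p}: I = {self_information(p):.2f} bits")
# P = 1.0 gives 0 bits (a certain event carries no information);
# as p tends to 0, I tends to infinity.
```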
How many bits do you need for 4, 8 equiprobable messages?
4 - requires 4 distinct binary pulse patterns, i.e. log_2(4) = 2 bits
8 - requires 8 distinct binary pulse patterns, i.e. log_2(8) = 3 bits
In general, n equiprobable messages need log_2(n) bits.
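The same rule in a quick Python check (the message counts are illustrative):

```python
import math

for n in (4, 8):
    print(n, "equiprobable messages ->", int(math.log2(n)), "bits")
# 4 -> 2 bits, 8 -> 3 bits
```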
How can you calculate the information contained in a message of symbols with different, independent probabilities?
I_i = log_2(1/P_i) bits
For a message X = ABC, sum log_2(1/P_i) over the symbols A, B and C.
What is the entropy?
Entropy H is the average information per symbol. H = sum(i=1 to n) of P_i*log_2(1/P_i) bits per symbol
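A short Python sketch of the entropy formula (the probability lists are illustrative):

```python
import math

def entropy(probs):
    """Average information H = sum of P_i*log_2(1/P_i), in bits per symbol."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.5, 0.25, 0.25]))   # 1.5 bits/symbol
print(entropy([0.25] * 4))          # 2.0 bits/symbol (maximum for 4 symbols)
```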
What is the rate of information transmission?
R = rH bits per second
r is the symbol rate in symbols per second
H is the entropy (average information per symbol)
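A one-line worked example (the symbol rate and entropy values are assumed):

```python
r = 1000   # symbol rate in symbols per second (assumed value)
H = 1.5    # entropy in bits per symbol (e.g. from the entropy sketch above)
print(r * H, "bits per second")   # R = rH = 1500.0
```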
What is Shannon’s theorem?
It states that R <= C, i.e. the rate of information transmission must be at most the channel capacity for reliable transmission.
C is the maximum rate at which information can be transmitted through the channel without error.
What is the Hartley-Shannon theorem?
C = B*log_2(1 + S/N) bits per second
C = channel capacity
B = channel bandwidth
S/N = mean square signal-to-noise ratio, as a linear power ratio, not in dB. To convert from decibels use S/N = 10^(dB/10), since dB = 10*log_10(S/N).
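A sketch of the capacity calculation, including the dB conversion (the 3.1 kHz bandwidth and 30 dB SNR are illustrative telephone-channel values):

```python
import math

def channel_capacity(bandwidth_hz, snr_db):
    """Hartley-Shannon: C = B*log_2(1 + S/N), with S/N as a linear ratio."""
    snr_linear = 10 ** (snr_db / 10)   # undo dB = 10*log_10(S/N)
    return bandwidth_hz * math.log2(1 + snr_linear)

print(round(channel_capacity(3100, 30)), "bits per second")   # ~30898
```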
What is the maximum error rate for a binary system?
50%. At an error rate of 0.5 the output is independent of the input, so no information gets through; an error rate above 0.5 is equivalent to one below 0.5 with the output bits inverted.
How does an extended source work?
If you divide a message of individual symbols into blocks of n symbols, you can treat the blocks as a new source with an alphabet of K^n symbols, where K is the number of symbols in the original alphabet. For independent symbols, the entropy of the extended source is H_new = nH.
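A quick numerical check of H_new = nH, assuming an illustrative two-symbol source (K = 2, n = 2):

```python
import math
from itertools import product

def entropy(probs):
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

src = {"A": 0.75, "B": 0.25}             # original source, K = 2
H = entropy(src.values())

n = 2                                    # block length -> K^n = 4 block symbols
blocks = {"".join(b): math.prod(src[s] for s in b)
          for b in product(src, repeat=n)}
print(len(blocks))                       # 4
print(entropy(blocks.values()), n * H)   # both ~1.623: H_new = nH
```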
How can data be source encoded?
Source encoding means representing information efficiently. This can be done by assigning short codewords to frequent symbols and longer codewords to rare ones.
What are the two requirements of source encoders?
The codewords are in binary form
The source code is uniquely decodable.
How can we calculate the coding efficiency of a source encoder?
η = L_min/L_avg
L_min is the smallest possible average codeword length (equal to the source entropy H, by the source coding theorem); L_avg is the actual average codeword length.
L_avg is the sum over all symbols of (probability times the number of bits in that symbol's codeword).
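A worked example with illustrative probabilities and codeword lengths:

```python
import math

probs   = [0.5, 0.25, 0.125, 0.125]   # illustrative symbol probabilities
lengths = [1, 2, 3, 3]                # e.g. codewords 0, 10, 110, 111

H     = sum(p * math.log2(1 / p) for p in probs)   # L_min = H for a binary code
L_avg = sum(p * l for p, l in zip(probs, lengths))
print(H, L_avg, H / L_avg)   # 1.75 1.75 1.0 -> efficiency = 100%
```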
What is a prefix code?
A code in which no codeword is the prefix of any other codeword. It is always uniquely decodable: decoding follows a decision tree bit by bit as data arrives, so the end of each codeword is recognisable as soon as its last bit is received. For a prefix code matched to the source statistics, H <= L_avg < H + 1.
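A minimal decoding sketch; the prefix code table {0, 10, 110, 111} is illustrative:

```python
code = {"0": "A", "10": "B", "110": "C", "111": "D"}   # no codeword prefixes another

def decode(bits):
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in code:              # end of a codeword is immediately recognisable
            out.append(code[buf])
            buf = ""
    return "".join(out)

print(decode("0101100"))   # -> "ABCA"
```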
What is Huffman coding?
It is a procedure for creating prefix codes that assigns longer codewords to symbols carrying more information (i.e. less probable symbols). It does, however, require statistical knowledge of the source.
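A compact construction sketch in Python (the probabilities are illustrative; this is one of several equivalent ways to build the code):

```python
import heapq

def huffman(probs):
    """Build a binary Huffman code from {symbol: probability}."""
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)    # merge the two least probable nodes
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

print(huffman({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}))
# {'A': '0', 'B': '10', 'C': '110', 'D': '111'} - rarer symbols, longer codewords
```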
What is Lempel-Ziv coding?
It removes the need to know the statistics of the data being encoded in advance.
Uses fixed-length codes to represent a variable number of source symbols.
Only shows a real advantage on long data strings.
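The card doesn't name a variant; this is a minimal LZ78-style parsing sketch (the input string is illustrative) showing how repeated phrases are replaced by dictionary indices:

```python
def lz78_parse(data):
    """LZ78-style parsing: emit (dictionary index, next symbol) pairs.
    Real implementations pack the indices into fixed-length codes."""
    dictionary = {"": 0}
    phrase, out = "", []
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                          # extend the longest known phrase
        else:
            out.append((dictionary[phrase], ch))  # known phrase index + new symbol
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:                                    # flush any leftover phrase
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out

print(lz78_parse("ABAABABAABAB"))
# [(0, 'A'), (0, 'B'), (1, 'A'), (2, 'A'), (4, 'A'), (4, 'B')]
```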