Information Theory 1 Flashcards
What is the fundamental problem of communication introduced by Shannon? (in the context of analogue telephone communication)
According to Shannon, what was irrelevant to the engineering problem in communication?
-reproducing at one point, either exactly or approximately, a message selected at another point
-the semantic aspects of communication (the meaning of the words), because the semantics of a message vary between individuals
What is surprise?
a measure of the information carried by a specific outcome (e.g. heads/tails)
If biased coin, 80% heads, what is the surprise and information like for both outcomes?
heads: low surprise, low information (the more probable outcome)
tails: high surprise, high information (the less probable outcome)
What is the surprise and information like for the outcomes of a coin with heads on both sides?
heads is the only possible outcome
zero surprise, zero information
(no information is transferred)
What is the equation to calculate information/surprise using probability?
I(outcome) = -log2(P(outcome))
If the probability of a biased coins for heads is P(heads)=0.8, what is the surprise I(heads)= ?
What are the units for this ^?
I(heads) = -log2(0.8) ≈ 0.32
bits
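A minimal sketch of this calculation in Python (standard library only; the function name surprise is my own):

```python
import math

def surprise(p: float) -> float:
    """Surprise (information) in bits of an outcome with probability p."""
    return -math.log2(p)

# Biased coin: P(heads) = 0.8, P(tails) = 0.2
print(surprise(0.8))  # ~0.32 bits: likely outcome, low surprise
print(surprise(0.2))  # ~2.32 bits: unlikely outcome, high surprise
```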
What is entropy?
Units?
-entropy is the average surprise, i.e. the probability-weighted average of the information transferred
bits
How do you calculate entropy?
H(C) = Σ P(outcome) × I(outcome): sum the probability times the surprise over every outcome
e.g.
H(C) = 0.8 · 0.32 + 0.2 · 2.32 ≈ 0.72
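A minimal sketch of the same entropy calculation in Python (standard library only; the helper name entropy is my own), weighting each outcome's surprise by its probability:

```python
import math

def entropy(probs: list[float]) -> float:
    """Entropy in bits: the probability-weighted average surprise."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# Biased coin from above: P(heads) = 0.8, P(tails) = 0.2
print(entropy([0.8, 0.2]))  # ~0.72 bits
```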
What does the curve look like on a graph with probability on the x axis and the entropy on the y axis?
Why?
a hill-shaped, inverted-U curve (similar to a parabola), peaking at 1 bit when P = 0.5
because entropy peaks when the two outcomes are equally likely: a fair coin is maximally random and hardest to predict, while a biased coin is more predictable, so its average surprise (entropy) is lower; at P = 0 or P = 1 the outcome is certain and entropy is zero
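A small sketch (Python, standard library) that evaluates the binary entropy at a few probabilities to show the hill shape:

```python
import math

def binary_entropy(p: float) -> float:
    """Entropy in bits of a coin with P(heads) = p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(f"P(heads) = {p:.1f}  ->  H = {binary_entropy(p):.3f} bits")
# H peaks at 1.000 bit when p = 0.5 and falls to 0 at p = 0 and p = 1
```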
What are the units of surprise and entropy?
-bits
To calculate surprise or entropy in Information Theory, why do we use log base 2?
because the outcomes considered here are binary (only two possible options), and log base 2 measures information in bits: one fair binary choice carries exactly 1 bit
Generally as entropy increases, what happens to surprise?
average surprise increases, because the system is more random/uncertain and less predictable
What are second-order Markov character models of the English language?
What is Zipf’s Law?
-models that predict the next letter from the letters before it (second order: the previous two); e.g. the probability of any letter following q is essentially zero, apart from u
-in every language, some words are used far more frequently than others: a word's frequency is roughly inversely proportional to its frequency rank (the most common word occurs about twice as often as the second most common, and so on; see the sketch below)
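A minimal sketch (Python, standard library) of checking Zipf's law on a corpus; the filename corpus.txt is a hypothetical placeholder for any large plain-text file:

```python
from collections import Counter

# Hypothetical corpus file; any large plain-text file will do
with open("corpus.txt", encoding="utf-8") as f:
    words = f.read().lower().split()

counts = Counter(words)

# Zipf's law predicts frequency ∝ 1 / rank, so rank * frequency
# should stay roughly constant across the top-ranked words
for rank, (word, freq) in enumerate(counts.most_common(10), start=1):
    print(f"rank {rank:2d}: {word!r} x{freq}  (rank*freq = {rank * freq})")
```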
What are LLMs?
large language models (e.g. ChatGPT): models with billions of parameters, trained on essentially all available written text, that generate convincing text
What does Information Theory allow us to do?
quantify the amount of information transmitted in a channel