Information Theory 1 Flashcards

1
Q

What is the fundamental problem of communication introduced by Shannon? (communication via analogue phones)
According to Shannon, what was irrelevant to the engineering problem in communication?

A

-that you have to reproduce at one point, either exactly or approximately, a message selected at another point
-the semantic aspects of communication (the meaning of the words), because the semantics of a message vary between individuals

2
Q

What is surprise?

A

a measure of information for a specific outcome (eg heads/tails)

3
Q

For a biased coin with 80% heads, what are the surprise and information like for each outcome?
heads
tails

A

heads: low surprise, low information
tails: high surprise, high information

4
Q

What is the surprise and information like for the outcomes of a coin with heads on both sides?

A

only heads as an outcome
zero surprise, zero information
(no information transfer)

5
Q

What is the equation to calculate information/surprise using probability?

A

I = -log2(probability)
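A minimal Python sketch of this formula (the function name is just for illustration):

```python
import math

def surprise(p):
    """Surprise (self-information) of an outcome with probability p, in bits."""
    return -math.log2(p)
```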

6
Q

If the probability of heads for a biased coin is P(heads) = 0.8, what is the surprise I(heads)?
What are the units for this?

A

I(heads) = -log2(0.8) = 0.32
bits
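Checking the arithmetic with a quick, self-contained Python snippet (illustrative only):

```python
import math

print(round(-math.log2(0.8), 2))  # I(heads) = 0.32 bits (low surprise)
print(round(-math.log2(0.2), 2))  # I(tails) = 2.32 bits (high surprise)
```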

7
Q

What is entropy?
Units?

A

-entropy is the average surprise, i.e. the weighted average of the information transferred per outcome
bits

8
Q

How do you calculate entropy?

A

H(C) = Σ P(outcome) · I(outcome), i.e. the sum over all outcomes of the probability times the surprise

eg.
H(C) = 0.8 · 0.32 + 0.2 · 2.32 = 0.72
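A short Python sketch reproducing this weighted sum for the 80/20 coin (variable names are illustrative):

```python
import math

probs = [0.8, 0.2]  # P(heads), P(tails) for the biased coin
entropy = sum(p * -math.log2(p) for p in probs)  # weighted average surprise
print(round(entropy, 2))  # 0.72 bits
```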

9
Q

What does the curve look like on a graph with probability on the x axis and the entropy on the y axis?
Why?

A

a hill-shaped (inverted-U) curve that peaks when the probability is 0.5
because entropy peaks when the two outcomes are equally likely: the coin is then maximally random and least predictable, whereas a biased coin is more predictable, so its entropy is lower (falling to zero at probability 0 or 1)
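A small sketch tabulating the entropy of a coin for a few values of P(heads), using the same formula as above, shows the peak at 0.5 (illustrative code):

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin with P(heads) = p."""
    if p in (0.0, 1.0):
        return 0.0  # certain outcome: no surprise, no information
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.2, 0.5, 0.8, 1.0):
    print(p, round(binary_entropy(p), 2))
# prints 0.0, 0.72, 1.0, 0.72, 0.0 -> hill shape peaking at p = 0.5
```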

10
Q

What is the units of surprise and entropy ?

A

-bits

11
Q

To calculate surprise or entropy in Information Theory, why do we use log base 2?

A

because information is measured in bits (binary digits): with log base 2, a fair choice between two options carries exactly 1 bit
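A quick illustrative check of why base 2 gives whole bits for fair binary choices:

```python
import math

print(-math.log2(1 / 2))  # 1.0 -> one fair two-way choice carries 1 bit
print(-math.log2(1 / 8))  # 3.0 -> one of 8 equally likely outcomes carries 3 bits
```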

12
Q

Generally as entropy increases, what happens to surprise?

A

the average surprise increases, as the system is more random/uncertain and less predictable

13
Q

What are second-order Markov character models of the English language?
What is Zipf’s Law?

A

-models that predict the next character from the preceding characters, capturing which letter combinations occur in English (eg. the probability of the letter after 'q' is essentially zero for everything except 'u')
-in each language, word frequencies follow a characteristic pattern: a few words are used very often while most words are used rarely
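As a purely illustrative sketch of both ideas (the sample sentence is made up, not from the lecture), counting letter pairs and word frequencies in Python:

```python
from collections import Counter

text = "the queen quietly queued for the quiz"

# Letter-pair (digram) counts: 'q' is essentially always followed by 'u'
pairs = Counter(zip(text, text[1:]))
print(pairs[("q", "u")])  # 4

# Word-frequency counts: a few words dominate, most appear once (Zipf-like)
words = Counter(text.split())
print(words.most_common())
```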

14
Q

What are LLMs?

A

large language models, such as ChatGPT: these use billions of parameters, trained on essentially all available written text, to generate convincing text

15
Q

What does Information Theory allow us to do?

A

quantify the amount of information transmitted in a channel
