Week 6 Flashcards
What is a Probability Distribution?
It is a function that maps each outcome in some universe U to a value in the range 0 to 1 inclusive. The probabilities of all outcomes sum to 1.
What is a Uniform Distribution?
All elements have equal likelihood of being sampled. The probability of any one item being sampled is 1/|U|, i.e., 1 out of the total number of elements.
What is a Point Distribution?
You always get a specific x’ when sampling, so the probability of x’ is 1, and the probability of anything other than x’ is 0.
An example is if you had a deck of cards made up of only kings. The probability that a king is sampled is 1, and not a king is 0.
What type of distribution is this?
The probability that event A happens is equal to the number of elements in A divided by the number of elements in U.
Uniform Distribution
Why is the probability of an event A happening equal to the number of elements in A divided by the number of elements in U?
Because each element is sampled with probability 1/|U|, and A contains |A| elements, so Pr[A] = |A| * (1/|U|) = |A|/|U|.
Say again we are sampling a 4-bit binary string. Event A is if we sample a string with the LSB being 1. We also say that each string is equally likely to be sampled.
What is the probability of event A happening?
We know that 8 of the 16 strings have an LSB of 1, so we get 8/16 = ½.
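The counting argument above can be checked with a quick sketch that enumerates the whole universe (the variable names here are illustrative, not from the course):

```python
# Enumerate all 4-bit binary strings and count those whose least
# significant bit is 1 (event A), under a uniform distribution.
strings = [format(i, "04b") for i in range(16)]  # the universe U, |U| = 16
event_a = [s for s in strings if s[-1] == "1"]   # strings with LSB = 1

pr_a = len(event_a) / len(strings)
print(len(event_a), pr_a)  # 8 strings, probability 8/16 = 0.5
```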
What is Union Bound?
A formula that upper-bounds the probability that at least one of two events A1 or A2 will happen: Pr[A1 ∪ A2] ≤ Pr[A1] + Pr[A2].
EXAMPLE:
Assume that you want to see if a password is going to be cracked. There are two ways that this could happen. Event A1 is the probability that the password is weak, and event A2 is the probability that the cryptography is broken.
If either event occurs, there is an issue. The events may also be related: if the cryptography is broken, A1 may become more likely.
So, we get an upper bound by adding the two probabilities together. If we can drive that upper bound down to a reasonably small probability, then we know that whatever actual, more complex attack someone attempts is bounded by a good number.
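A small worked example, using a hypothetical card-drawing scenario (not from the flashcards) where the two events overlap, so the exact union probability is strictly below the union bound:

```python
from fractions import Fraction

# Sample one card uniformly from a standard 52-card deck.
# A1 = "card is a heart" (13/52), A2 = "card is a face card" (12/52).
# Three face cards are hearts, so the events overlap.
pr_a1 = Fraction(13, 52)
pr_a2 = Fraction(12, 52)
pr_both = Fraction(3, 52)            # Pr[A1 ∩ A2]
pr_union = pr_a1 + pr_a2 - pr_both   # inclusion-exclusion: exact union

assert pr_union <= pr_a1 + pr_a2     # the union bound always holds
print(pr_union, pr_a1 + pr_a2)       # 11/26 vs 25/52
```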
When do we know if two events A1 and A2 are disjoint?
The probability of their intersection is 0.
What is a Random Variable?
A random variable is a function X that maps elements from some space U to some set V. Random variables serve as a way to model things that are based on chance or non-determinism.
EXAMPLE:
Say you are writing a program to encrypt something with a random number. The random number can be described using a random variable.
What are Uniform Random Variables?
Random variables that are sampled from the uniform distribution. We put a little "r" on the sampling arrow (x ←r U) to indicate this.
These can be viewed as the identity function: X(y) = y for all y in U, and the probability that X = y is 1/|U|. It will always be 1/|U|, for every y.
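As a minimal sketch of uniform sampling (the universe of 4-bit strings here is just an illustrative choice), each element comes out with probability 1/|U|:

```python
import secrets

# Hypothetical universe: all 4-bit binary strings, |U| = 16.
U = [format(i, "04b") for i in range(16)]

# Uniform random variable over U: secrets.choice picks each element
# with probability 1/|U| = 1/16.
x = secrets.choice(U)
print(x)
```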
What does the formula for a Deterministic Algorithm look like?
y := A(m) — the output is fully determined by the input.
How about a Randomized Algorithm?
y ← A(m; r), where r is sampled uniformly at random, so the output y is a random variable.
What does it mean to say that two events A and B are independent?
It means that they do not affect each other. If A and B are independent, then:
Pr[A intersect B] = Pr[A] * Pr[B]
Essentially, it’s the probability that both A and B happen. This is NOT the same as disjoint.
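The product rule, and the difference from disjointness, can be checked on a toy universe of two fair coin flips (a hypothetical example, not from the flashcards):

```python
from fractions import Fraction

# Universe: two fair coin flips, 4 equally likely outcomes.
U = [(a, b) for a in "HT" for b in "HT"]

def pr(event):
    # Probability of an event (a predicate over U) under the uniform distribution.
    return Fraction(sum(1 for u in U if event(u)), len(U))

A = lambda u: u[0] == "H"        # first flip is heads
B = lambda u: u[1] == "H"        # second flip is heads
both = lambda u: A(u) and B(u)   # intersection A ∩ B

# Independent: Pr[A ∩ B] = Pr[A] * Pr[B] = 1/4.
# Not disjoint: the intersection has probability 1/4, not 0.
assert pr(both) == pr(A) * pr(B)
print(pr(both), pr(A) * pr(B))
```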
What is Entropy?
The entropy of a variable is the amount of information it contains.
Entropy is a lower limit on storage. This is different from the raw encoding size, like the number of bits we use to store the variable; that is an upper limit.
Consider a coin flip that is biased. Say we know the flip will always be heads. You don’t need any bits to store this information: if it’s always heads, you already know the outcome. HOWEVER, with a fair coin, you need to transmit a bit of information to let someone know the result, since there is only a 50% chance of guessing the outcome correctly.
For rolling a fair die, we need to transmit more bits of information (about log2(6) ≈ 2.58 on average) because there is a probability of 1/6 for each outcome.
What is Shannon’s Entropy?
The absolute minimum amount of storage/transmission needed to capture the information in some variable.
Why do we care about entropy in security?
It is a good measure of how random a variable is. High entropy means more randomness, which means a password is harder to guess.
A coin flip has low entropy, it’s easy to guess.
For a die roll, guessing is comparatively harder (higher entropy).
Why is entropy of passwords not as high as it could be?
People typically don’t pick completely random passwords.
What happens when passwords are simpler, more standard ones, like “password”?
The intuition is that if a variable is more likely to take a certain value, it is easier to guess without looking at it, so the password variable has less information in it and takes fewer bits to transmit.
This means there is less entropy, and it’s not a great password.
What is the specific formula for Shannon’s Entropy Equation?
Say we have H(X) where X is some random variable, and we sum over all the possible outcomes x in the universe. For each x, we take the probability that outcome x occurs times log2(1/Pr[x]):
H(X) = sum over x of Pr[x] * log2(1/Pr[x])
The log2(1/Pr[x]) term is the amount of information (in bits) in each case, and each term is weighted by Pr[x], how often that case occurs.
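The formula translates directly into a short function; the distributions passed in below (fair coin, fair die, point distribution) are the examples discussed on these cards:

```python
import math

def shannon_entropy(probs):
    # H(X) = sum over x of Pr[x] * log2(1 / Pr[x]).
    # Zero-probability outcomes are skipped: they contribute no information.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # fair coin: 1.0 bit
print(shannon_entropy([1/6] * 6))    # fair die: log2(6), about 2.585 bits
print(shannon_entropy([1.0]))        # point distribution: 0.0 bits
```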
In Shannon’s Entropy Equation, why do we care about the reciprocal of Pr[x], i.e., 1/Pr[x]?
The information is the reciprocal of the probability because of the following. Consider the example below:
Imagine we flip a fair coin and get heads or tails. If the result is heads, we have eliminated the uncertainty between two equally likely outcomes (heads or tails). In the case of our coin flip, the probability of x is ½, and the reciprocal of that is 2, the number of outcomes we ruled out.
Example of using Shannon’s Entropy Equation.
Assume that a fair coin is flipped and has a uniform distribution over the universe of heads and tails.
We have some random variable X, and the probability of X being heads or tails is ½ each.
So, with this information, the Shannon entropy calculation is:
H(X) = ½ * log2(2) + ½ * log2(2) = ½ + ½ = 1
We sum both events, and the outcome is 1 bit, which is expected: since this is a uniform distribution, we can’t make a fair coin flip any easier to guess. It will always be 50/50.
What would happen if we considered a universe with a biased coin?
We have the same universe of heads and tails, but…
Pr[x=H] = 1/2 + e
where e is some value, and…
Pr[x=T] = 1/2 - e
Then we just plug these probabilities into Shannon’s equation:
H(X) = (½ + e) * log2(1/(½ + e)) + (½ − e) * log2(1/(½ − e))
For any e > 0, this comes out below 1 bit, so the biased coin has less entropy than the fair one.
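Plugging in a concrete bias (e = 0.25 is just an illustrative choice) shows the entropy dropping below the fair coin’s 1 bit:

```python
import math

def shannon_entropy(probs):
    # H(X) = sum over x of Pr[x] * log2(1 / Pr[x])
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

e = 0.25  # hypothetical bias; any 0 < e < 1/2 works
h = shannon_entropy([0.5 + e, 0.5 - e])
print(h)  # about 0.811 bits, less than the fair coin's 1 bit

# As e grows, the coin becomes more predictable and entropy falls;
# at e = 1/2 it is a point distribution with 0 bits of entropy.
assert h < 1.0
```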