Lec 2 | Uncertainty Flashcards
It can be represented as a set of possible events, each with a likelihood, or probability, of happening.
Uncertainty
Every possible situation can be thought of as a world, represented by which lowercase Greek letter?
omega ω
How do we represent the probability of a certain world?
we write P(ω)
Axioms in Probability
every value representing probability must range between 0 and 1
0 ≤ P(ω) ≤ 1
Axioms in Probability
________ is an impossible event, like rolling a standard die and getting a 7.
Zero or 0
Axioms in Probability
________ is an event that is certain to happen, like rolling a standard die and getting a value less than 10.
One or 1
Axioms in Probability
The probabilities of every possible event, when summed together, are equal to ?
1
the degree of belief in a proposition in the absence of any other evidence
Unconditional Probability
The degree of belief in a proposition given some evidence that has already been revealed.
Conditional Probability
AI can use partial information to make educated guesses about the future. To use this information, which affects the probability that the event occurs in the future, we rely on?
Conditional Probability
How do we express conditional probability?
P(a | b)
What does P(a | b) mean?
“the probability of event a occurring given that we know event b to have occurred” or “the probability of a given b.”
What formula do we use to compute the conditional probability of a given b?
P(a | b) = P(a ∧ b) / P(b)
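As a sketch, the formula can be checked by brute force over equally likely worlds; the two-dice events below are made up for illustration.

```python
from fractions import Fraction

# All 36 equally likely worlds for rolling two fair dice.
worlds = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

def p(event):
    """Probability of an event, given as a predicate over worlds."""
    return Fraction(sum(1 for w in worlds if event(w)), len(worlds))

# Hypothetical events: a = "first die shows 6", b = "the dice sum to 8".
a = lambda w: w[0] == 6
b = lambda w: w[0] + w[1] == 8

p_a_given_b = p(lambda w: a(w) and b(w)) / p(b)
print(p_a_given_b)  # 1/5: of the five worlds summing to 8, only (6, 2) starts with a 6
```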
It is a variable in probability theory with a domain of possible values that it can take on
Random Variable
It is the knowledge that the occurrence of one event does not affect the probability of the other event.
Independence
How do we define independence?
Independence can be defined mathematically: events a and b are independent if and only if the probability of a and b is equal to the probability of a times the probability of b: P(a ∧ b) = P(a)P(b)
It is commonly used in probability theory to compute conditional probability
Bayes’ Rule
Bayes’ rule says that the probability of b given a is equal to the probability of a given b, times the probability of b, divided by the probability of a.
P(b | a) = P(a | b) P(b) / P(a)
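Worked with made-up numbers: Bayes’ rule lets us infer P(rain | clouds) from the often easier-to-measure P(clouds | rain).

```python
# All three input probabilities below are hypothetical, for illustration only.
p_rain = 0.1
p_clouds = 0.4
p_clouds_given_rain = 0.8

# Bayes' rule: P(rain | clouds) = P(clouds | rain) * P(rain) / P(clouds).
p_rain_given_clouds = p_clouds_given_rain * p_rain / p_clouds
print(p_rain_given_clouds)  # ≈ 0.2
```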
It is the likelihood of multiple events all occurring
Joint Probability
Probability Rules
This stems from the fact that the sum of the probabilities of all the possible worlds is 1, and the complementary literals a and ¬a include all the possible worlds.
Negation: P(¬a) = 1 - P(a).
Probability Rules
This can be interpreted in the following way: the worlds in which a or b is true are equal to all the worlds where a is true, plus the worlds where b is true. However, some worlds are counted twice (the worlds where both a and b are true). To get rid of this overlap, we subtract the worlds where both a and b are true once (since they were counted twice).
Inclusion-Exclusion: P(a ∨ b) = P(a) + P(b) - P(a ∧ b).
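For example, drawing one card from a standard 52-card deck, worked with exact fractions:

```python
from fractions import Fraction

# P(spade ∨ two) by inclusion-exclusion.
p_spade = Fraction(13, 52)
p_two = Fraction(4, 52)
p_spade_and_two = Fraction(1, 52)  # only the two of spades is counted twice

p_spade_or_two = p_spade + p_two - p_spade_and_two
print(p_spade_or_two, float(p_spade_or_two))  # 4/13 ≈ 0.308
```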
Probability Rules
The idea here is that b and ¬b are disjoint events. That is, the probability of b and ¬b occurring at the same time is 0. We also know the probabilities of b and ¬b sum up to 1. Thus, when a happens, b can either happen or not. When we take the probability of both a and b happening in addition to the probability of a and ¬b, we end up with simply the probability of a.
Marginalization: P(a) = P(a, b) + P(a, ¬b).
Probability Rules
How is Marginalization expressed for random variables?
P(X = xᵢ) = ∑ⱼ P(X = xᵢ, Y = yⱼ)
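A minimal sketch of this sum, marginalizing a made-up joint distribution P(X, Y) stored as a dict:

```python
# Made-up joint distribution over X (weather) and Y (umbrella use).
joint = {
    ("rain", "umbrella"): 0.3,
    ("rain", "no umbrella"): 0.1,
    ("no rain", "umbrella"): 0.1,
    ("no rain", "no umbrella"): 0.5,
}

# P(X = "rain") = sum over all values y of Y of P(X = "rain", Y = y).
p_rain = sum(pr for (x, y), pr in joint.items() if x == "rain")
print(p_rain)  # ≈ 0.4
```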
Probability Rules
This is a similar idea to marginalization. The probability of event a occurring is equal to the probability of a given b times the probability of b, plus the probability of a given ¬b times the probability of ¬b.
Conditioning: P(a) = P(a | b)P(b) + P(a | ¬b)P(¬b).
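The same split in code, with hypothetical numbers chosen only to make the arithmetic visible:

```python
# Conditioning: recover P(a) from the two conditional cases (made-up values).
p_b = 0.4
p_a_given_b = 0.75
p_a_given_not_b = 0.25

p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
print(p_a)  # 0.75*0.4 + 0.25*0.6 ≈ 0.45
```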
It is a data structure that represents the dependencies among random variables.
Bayesian network
What are the properties of Bayesian networks?
- They are directed graphs.
- Each node on the graph represents a random variable.
- An arrow from X to Y represents that X is a parent of Y. That is, the probability distribution of Y depends on the value of X.
- Each node X has probability distribution P(X | Parents(X)).
What are the properties of Inference?
Query X, Evidence Variables E, Hidden Variables Y, and the goal
Inference
The variable for which we want to compute the probability distribution.
Query X
Inference
one or more variables that have been observed for event e
Evidence Variables E
Inference
variables that aren’t the query and also haven’t been observed.
Hidden variables Y
Inference
What is the goal?
calculate P(X | e)
It is a process of finding the probability distribution of variable X given observed evidence e and some hidden variables Y.
Inference by Enumeration
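A minimal sketch of enumeration, assuming a tiny made-up network Rain → Umbrella with query X = Rain and evidence Umbrella = "yes"; with no hidden variables left, we just enumerate values of Rain and normalize.

```python
# Made-up CPTs for a two-node Bayesian network Rain -> Umbrella.
p_rain = {"yes": 0.2, "no": 0.8}
p_umbrella_given_rain = {
    "yes": {"yes": 0.9, "no": 0.1},
    "no":  {"yes": 0.2, "no": 0.8},
}

# Enumerate each value of the query variable, score it against the evidence...
unnormalized = {r: p_rain[r] * p_umbrella_given_rain[r]["yes"] for r in p_rain}
# ...and normalize so the posterior sums to 1.
total = sum(unnormalized.values())
posterior = {r: v / total for r, v in unnormalized.items()}
print(posterior)  # Rain = "yes" ends up at 0.18 / 0.34 ≈ 0.53
```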
It is one technique of approximate inference
Sampling
What are the steps of Likelihood Weighting?
- Start by fixing the values for evidence variables.
- Sample the non-evidence variables using conditional probabilities in the Bayesian network.
- Weight each sample by its likelihood: the probability of all the evidence occurring.
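The steps above can be sketched on a made-up Rain → Umbrella model, with the evidence Umbrella = "yes" held fixed rather than sampled:

```python
import random

# Hypothetical model: P(Rain = yes) and P(Umbrella = yes | Rain).
p_rain_yes = 0.2
p_umbrella_yes_given_rain = {"yes": 0.9, "no": 0.2}

random.seed(0)
weights = {"yes": 0.0, "no": 0.0}
for _ in range(20_000):
    # Step 1: evidence is fixed. Step 2: sample the non-evidence variable Rain.
    rain = "yes" if random.random() < p_rain_yes else "no"
    # Step 3: weight the sample by the likelihood of the evidence given Rain.
    weights[rain] += p_umbrella_yes_given_rain[rain]

estimate = weights["yes"] / sum(weights.values())
print(estimate)  # should land near the exact answer 0.18 / 0.34 ≈ 0.53
```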
It is an assumption that the current state depends on only a finite fixed number of previous states.
Markov Assumption
What do you need to construct a Markov Chain?
Transition Model
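A sketch of sampling a Markov chain from a transition model; the weather states and probabilities below are made up.

```python
import random

# Made-up transition model: P(next state | current state).
transition = {
    "sun":  {"sun": 0.8, "rain": 0.2},
    "rain": {"sun": 0.3, "rain": 0.7},
}

def sample_chain(start, length):
    """Sample `length` states; each state depends only on the previous one."""
    states = [start]
    while len(states) < length:
        probs = transition[states[-1]]
        states.append(random.choices(list(probs), weights=list(probs.values()))[0])
    return states

random.seed(1)
print(sample_chain("sun", 10))
```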
It is a type of a Markov model for a system with hidden states that generate some observed event.
Hidden Markov Model
Sometimes, the AI has some measurement of the world but no access to the precise state of the world. In these cases, the state of the world is called the ________________ and whatever data the AI has access to are the ____________________.
Hidden state and observations
Give an example of a hidden state and its observation.
- For a robot exploring uncharted territory, the hidden state is its position, and the observation is the data recorded by the robot’s sensors.
- In speech recognition, the hidden state is the words that were spoken, and the observation is the audio waveforms.
- When measuring user engagement on websites, the hidden state is how engaged the user is, and the observation is the website or app analytics.
- Our AI wants to infer the weather (the hidden state), but it only has access to an indoor camera that records how many people brought umbrellas (the observation) with them.
What is another term for sensor model?
emission model
The assumption that the evidence variable depends only on the corresponding state.
Sensor Markov Assumption
What are the multiple tasks that can be achieved based on hidden Markov models?
Filtering, Prediction, Smoothing, Most Likely Explanation
Hidden Markov Model Tasks:
given observations from start until now, calculate the probability distribution for the current state.
Filtering
Hidden Markov Model Tasks:
given observations from start until now, calculate the probability distribution for a future state.
Prediction
Hidden Markov Model Tasks:
given observations from start until now, calculate the probability distribution for a past state.
Smoothing
Hidden Markov Model Tasks:
given observations from start until now, calculate most likely sequence of events.
Most Likely Explanation
Can a hidden Markov model be represented using a Markov chain?
Yes.
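One filtering step can be sketched as follows (the forward-algorithm update), using a made-up HMM where the weather is hidden and umbrella sightings are the observations:

```python
# Hypothetical transition model (hidden weather) and sensor/emission model.
transition = {"sun": {"sun": 0.8, "rain": 0.2},
              "rain": {"sun": 0.3, "rain": 0.7}}
emission = {"sun": {"umbrella": 0.2, "no umbrella": 0.8},
            "rain": {"umbrella": 0.9, "no umbrella": 0.1}}

def filter_step(belief, observation):
    """Update P(state | observations so far) with one new observation."""
    # Predict the next hidden state from the transition model...
    predicted = {s: sum(belief[prev] * transition[prev][s] for prev in belief)
                 for s in belief}
    # ...then weight each state by how well it explains the observation, and normalize.
    unnorm = {s: emission[s][observation] * predicted[s] for s in belief}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

belief = filter_step({"sun": 0.5, "rain": 0.5}, "umbrella")
print(belief)  # seeing an umbrella shifts belief strongly toward rain
```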
From CS50 quiz
Consider a standard 52-card deck of cards with 13 card values (Ace, King, Queen, Jack, and 2-10) in each of the four suits (clubs, diamonds, hearts, spades). If a card is drawn at random, what is the probability that it is a spade or a two?
* About 0.019
* About 0.077
* About 0.17
* About 0.25
* About 0.308
* About 0.327
* About 0.5
* None of the above
Note that “or” in this question refers to inclusive, not exclusive, or.
About 0.308
From CS50 quiz
Imagine flipping two fair coins, where each coin has a Heads side and a Tails side, with Heads coming up 50% of the time and Tails coming up 50% of the time. What is the probability that, after flipping those two coins, one of them lands heads and the other lands tails?
0.5 = 1/2
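The answer can be checked by enumerating the four equally likely outcomes:

```python
from itertools import product

# HH, HT, TH, TT are the four equally likely outcomes of two fair flips.
outcomes = list(product("HT", repeat=2))
mixed = [o for o in outcomes if set(o) == {"H", "T"}]  # HT and TH
print(len(mixed) / len(outcomes))  # 0.5
```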
From CS50 quiz
Two factories — Factory A and Factory B — design batteries to be used in mobile phones. Factory A produces 60% of all batteries, and Factory B produces the other 40%. 2% of Factory A’s batteries have defects, and 4% of Factory B’s batteries have defects. What is the probability that a battery is both made by Factory A and defective?
* 0.008
* 0.012
* 0.02
* 0.024
* 0.028
* 0.06
* 0.12
* 0.2
* 0.429
* 0.6
* None of the above
0.012
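The arithmetic behind this answer is the product rule, P(A ∧ defective) = P(defective | A) · P(A):

```python
p_factory_a = 0.6        # P(battery made by Factory A)
p_defect_given_a = 0.02  # P(defective | Factory A)

p_a_and_defect = p_factory_a * p_defect_given_a
print(p_a_and_defect)  # ≈ 0.012
```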