Chapter 2- Probabilistic Models Flashcards
what is a random variable?
a numeric quantity whose values map to the possible outcomes of an experiment
what is a sample space, or alphabet?
the set of all possible outcomes of the experiment, i.e. every value X can take
what is an event?
any subset of values that X can take
what is the first rule of probability theory?
the probabilities of all outcomes in the sample space add up to 1
what is the difference between estimated and true probabilities?
an estimated probability is computed from a sample (it is a sample estimate); the true probability describes the whole population
what is another name for the true probability?
a population parameter
what is joint probability?
the probability that both events occur together (AND), written p(x,y)
from conditional probabilities, p(x,y) = ?
p(x|y)p(y) and vice versa, p(y|x)p(x)
what is the rule for independent events?
p(x,y) = p(x)p(y)
how is bayes theorem derived?
p(x,y) = p(x|y)p(y)
and vice versa p(x,y) = p(y|x)p(x)
equate these and divide by p(y): p(x|y) = p(y|x)p(x) / p(y)
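A minimal Python sketch of this derivation, assuming a small made-up joint distribution (the numbers and the helper names are illustrative, not from the flashcards). It checks that applying Bayes theorem gives the same p(x|y) as reading it directly off the joint table.

```python
from fractions import Fraction as F

# Hypothetical joint distribution p(x, y) over x in {0, 1} and y in {0, 1}.
joint = {(0, 0): F(3, 10), (0, 1): F(1, 10), (1, 0): F(2, 10), (1, 1): F(4, 10)}

def p_x(x):
    # Marginal p(x): sum the joint over every y.
    return sum(p for (xi, _), p in joint.items() if xi == x)

def p_y(y):
    # Marginal p(y): sum the joint over every x.
    return sum(p for (_, yi), p in joint.items() if yi == y)

def p_x_given_y(x, y):
    # Conditional read directly from the joint: p(x|y) = p(x,y) / p(y).
    return joint[(x, y)] / p_y(y)

def p_y_given_x(y, x):
    return joint[(x, y)] / p_x(x)

# Bayes theorem: p(x|y) = p(y|x) p(x) / p(y)
bayes = p_y_given_x(1, 1) * p_x(1) / p_y(1)
direct = p_x_given_y(1, 1)
print(bayes, direct)
```

Exact fractions are used here so the two sides can be compared with `==` rather than a floating-point tolerance.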
what is the name for summing a joint probability over the other variable?
marginalisation
what is the formula for the marginal probability, p(X=x)?
p(X=x) = sum over each y of p(x,y) = sum over each y of p(x|y)p(y)
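As a small worked example of the sum over y on this card, the sketch below assumes two hypothetical classes with made-up priors p(y) and conditionals p(x=1|y).

```python
# Hypothetical prior p(y) and conditional p(x=1 | y) for two classes.
p_y = {"spam": 0.4, "ham": 0.6}
p_x1_given_y = {"spam": 0.5, "ham": 0.1}

# p(x=1) = sum over each y of p(x=1 | y) p(y)
p_x1 = sum(p_x1_given_y[y] * p_y[y] for y in p_y)
print(p_x1)  # about 0.26 (0.5*0.4 + 0.1*0.6)
```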
what is the conditional independence assumption? (in words)
the features are conditionally independent of each other, given the class value
what is the conditional independence assumption p(x1,x2,x3|y) = ?
p(x1|y)p(x2|y)p(x3|y)
what is the conditional independence assumption p(X|y) = ?
multiply for each x: p(x|y)
what is the naive bayes model?
a classifier that applies Bayes theorem under the conditional independence assumption: p(y|X) is proportional to p(y) multiplied by p(x|y) for each x
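A minimal sketch of how the conditional independence assumption is used for classification. The class names, priors and per-feature likelihoods are made-up assumptions, and `posterior` is a hypothetical helper, not a library function.

```python
from math import prod

# Hypothetical class priors p(y) and per-feature likelihoods p(x_i = 1 | y).
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": [0.8, 0.6, 0.1],
    "ham": [0.1, 0.3, 0.5],
}

def posterior(x):
    """Return p(y | x) for a list x of binary (0/1) feature values."""
    scores = {}
    for y in prior:
        # Conditional independence: p(x | y) is the product of p(x_i | y).
        per_feature = [q if xi == 1 else 1 - q for xi, q in zip(x, likelihood[y])]
        scores[y] = prior[y] * prod(per_feature)
    total = sum(scores.values())  # this is p(x), by marginalising over y
    return {y: s / total for y, s in scores.items()}

post = posterior([1, 1, 0])
print(post)
```

Dividing by `total` is the Bayes theorem denominator; for picking the most likely class alone, the unnormalised scores would be enough.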
a bayesian network is an example of?
a directed acyclic graph
when might over/underfitting occur in a bayesian network?
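the more links the network has, the more complicated the probability distribution and the more data is needed to estimate it; too many links for the data available risks overfitting, too few risks underfitting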
what is one of the great advantages of bayesian networks?
you can very naturally encode human knowledge about a given problem
what is the chain rule for probability: P(A n B) = ?, and P(A1 n A2 n A3 n A4) = ?
P(A n B) = P(B | A) P(A)
or for more
P(A1 n A2 n A3 n A4) = P(A4 | A1 n A2 n A3) P(A1 n A2 n A3)
= P(A4 | A3 n A2 n A1) P(A3 | A2 n A1) P(A2 | A1) P(A1)
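A worked example of the four-event chain rule, assuming a scenario that is not in the flashcards: drawing four cards without replacement from a standard 52-card deck, with A_i meaning "the i-th card is an ace".

```python
from fractions import Fraction as F
from math import comb

# Assumed scenario: 4 draws without replacement, 4 aces in a 52-card deck.
p_a1 = F(4, 52)                 # P(A1)
p_a2_given_a1 = F(3, 51)        # P(A2 | A1)
p_a3_given_a1_a2 = F(2, 50)     # P(A3 | A2 n A1)
p_a4_given_a1_a2_a3 = F(1, 49)  # P(A4 | A3 n A2 n A1)

# Chain rule: multiply the conditionals back down to P(A1).
p_all_aces = p_a4_given_a1_a2_a3 * p_a3_given_a1_a2 * p_a2_given_a1 * p_a1
print(p_all_aces)

# Cross-check: only one of the C(52, 4) possible four-card hands is all aces.
print(F(1, comb(52, 4)))
```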
what does the chain rule tell us how to compute? Events A and B
The chain rule tells us how to compute the probability of event A happening AND event B also happening, e.g. A first and then B afterwards.
what is a random experiment?
a random experiment is any experiment in which the outcome is uncertain (not known or determined in advance).
what is P(A or B) if
a) they are disjoint
b) they are not disjoint (they can occur together)
a) P(A) + P(B)
b) P(A) + P(B) - P(A and B)
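Both cases can be checked by counting outcomes. The sketch below assumes one roll of a fair six-sided die; the event names are illustrative choices.

```python
from fractions import Fraction as F

outcomes = set(range(1, 7))  # one roll of a fair six-sided die

def prob(event):
    # Equally likely outcomes: probability = favourable / total.
    return F(len(event & outcomes), len(outcomes))

low = {1, 2}
high = {5, 6}
even = {2, 4, 6}
greater_than_3 = {4, 5, 6}

# a) disjoint events: P(A or B) = P(A) + P(B)
p_disjoint = prob(low) + prob(high)

# b) overlapping events: subtract the overlap so it is not counted twice
p_overlap = prob(even) + prob(greater_than_3) - prob(even & greater_than_3)

print(p_disjoint, prob(low | high))
print(p_overlap, prob(even | greater_than_3))
```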
what is a probability mass function?
discrete random variables take on a finite or countably infinite number of values.
Each value is associated with a probability of it occurring.
The collection of these probabilities is the probability mass function
give the bernoulli distribution
P(X = 0) = 1 - p, P(X = 1) = p
give the binomial distribution, P(X=k) =
P(X = k) = (nCk)(p^k)(1-p)^(n-k)
when would we use the binomial distribution?
if an experiment with success probability p is repeated n times independently, the probability that we will see k successes is given by this probability mass function
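The binomial formula can be written directly with the standard library. The function name `binomial_pmf` and the coin-flip numbers are assumptions for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k): probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed example: exactly 3 heads in 10 flips of a fair coin.
p_three_heads = binomial_pmf(3, 10, 0.5)
print(p_three_heads)

# The pmf sums to 1 over k = 0..n, and n = 1 recovers the Bernoulli case.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))
print(total, binomial_pmf(1, 1, 0.3))
```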
give the geometric distribution
P(X=x) = (1-p)^(x-1) p
when would we use the geometric distribution
it is useful for modelling the first occurrence of an outcome after repeated identical trials
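A short sketch of the geometric pmf, reading the exponent as (x - 1) so that x counts the trial on which the first success occurs. The die example and the name `geometric_pmf` are assumptions.

```python
from fractions import Fraction as F

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success on trial x."""
    return (1 - p) ** (x - 1) * p

# Assumed example: the first six appears on the third roll of a fair die.
p_first_six_on_third = geometric_pmf(3, F(1, 6))
print(p_first_six_on_third)  # (5/6)^2 * (1/6)
```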
give the poisson distribution
P(X=x) = (lambda^x e^(-lambda)) / x!
when would we use the poisson distribution
to model the number of events in a fixed interval when events occur at a known average rate, e.g. 5 times a year
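A minimal sketch of the Poisson pmf using the rate from this card (5 events a year). The function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for an average rate of lam events per interval."""
    return lam**x * exp(-lam) / factorial(x)

# With a rate of 5 per year: probability of exactly 5 events in a year.
p_exactly_five = poisson_pmf(5, 5)
print(p_exactly_five)

# Probability of no events at all in a year is e^(-5).
p_none = poisson_pmf(0, 5)
print(p_none)
```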
if a discrete r.v. X has a pmf f(X) what is the expected value E[g(x)]
E[g(X)] = sum over each i of g(xi) f(xi)
if a discrete r.v. X has a pmf f(X) what is the variance V[g(x)]
V[g(X)] = E[(g(X) - E[g(X)])^2]
= E[g(X)^2] - (E[g(X)])^2
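Both forms of the variance can be checked against each other on a small pmf. The sketch assumes a fair six-sided die and takes g to be the identity; `expectation` and `variance` are hypothetical helper names.

```python
from fractions import Fraction as F

pmf = {x: F(1, 6) for x in range(1, 7)}  # assumed pmf: a fair die

def expectation(g, pmf):
    # E[g(X)] = sum over each value of g(x) f(x)
    return sum(g(x) * p for x, p in pmf.items())

def variance(g, pmf):
    # Definition: E[(g(X) - E[g(X)])^2]
    mean = expectation(g, pmf)
    return expectation(lambda x: (g(x) - mean) ** 2, pmf)

def g(x):
    return x  # identity, so these are E[X] and V[X]

mean = expectation(g, pmf)
var_definition = variance(g, pmf)
var_shortcut = expectation(lambda x: g(x) ** 2, pmf) - mean**2
print(mean, var_definition, var_shortcut)
```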
properties of Expectations
E[aX + b] =
aE[X] + b
properties of variance:
V[aX+b] =
a^2V[X]
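The two properties can be verified numerically. This sketch again assumes a fair die as the pmf, with arbitrary constants a = 3 and b = 2.

```python
from fractions import Fraction as F

pmf = {x: F(1, 6) for x in range(1, 7)}  # assumed pmf: a fair die

def E(g):
    return sum(g(x) * p for x, p in pmf.items())

def V(g):
    m = E(g)
    return E(lambda x: (g(x) - m) ** 2)

a, b = 3, 2  # arbitrary constants chosen for the check

lhs_mean = E(lambda x: a * x + b)
rhs_mean = a * E(lambda x: x) + b

lhs_var = V(lambda x: a * x + b)
rhs_var = a**2 * V(lambda x: x)  # b drops out: shifting does not change spread

print(lhs_mean, rhs_mean)
print(lhs_var, rhs_var)
```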
what are the two schools of probability?
frequency based (frequentist)
belief based (Bayesian)
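Describe the gambler's fallacy and how it demonstrates i.i.d.
The gambler's fallacy is the belief that after a run of one outcome, the opposite outcome is due.
A sequence of outcomes of spins of a fair or unfair roulette wheel is i.i.d. (independent and identically distributed).
Independent: the outcome of the previous spin doesn't affect the next spin.
Identically distributed: the distribution of probabilities is the same each time, i.e. 50/50 between red and black if the wheel is not biased.
So, even if we’ve had 20 reds, there is still the same chance of black vs red on the next go.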
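A small simulation sketch of this point, assuming the simplified 50/50 red/black wheel used on this card. It compares the overall frequency of red with the frequency of red immediately after three reds in a row; the streak length, seed and number of spins are arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
# True = red, False = black; each spin is drawn independently with p = 0.5.
spins = [rng.random() < 0.5 for _ in range(200_000)]

streak = 3
# Outcomes that immediately follow a run of `streak` reds.
after_streak = [
    spins[i] for i in range(streak, len(spins)) if all(spins[i - streak:i])
]

p_red_overall = sum(spins) / len(spins)
p_red_after_streak = sum(after_streak) / len(after_streak)
print(p_red_overall, p_red_after_streak)
```

Both frequencies should come out close to 0.5, which is what independence predicts: a streak of reds does not make black any more likely.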
Let events A and B be persons A and B getting home in time for dinner, and let the third event C be that a snow storm hit the city. If they live in different areas, are A and B conditionally independent given C?
Yes. That is, the knowledge that A is late does not tell you whether B will be late.
If they lived in the same neighbourhood they would not be.
if two events are independent, are they also conditionally independent?
not necessarily, it depends on the third event.
rolling a die twice gives two independent events. If the third event C is that the two rolls sum to 7, the rolls are independent but not conditionally independent given C, because once the sum is known, one roll determines the other
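The dice example can be checked by enumerating all 36 outcomes. The specific events chosen here (first roll is 1, second roll is 6) are assumptions for illustration, and `prob` is a hypothetical helper.

```python
from fractions import Fraction as F
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # all 36 equally likely outcomes

def prob(event, given=lambda r: True):
    # P(event | given), computed by counting outcomes.
    conditioned = [r for r in rolls if given(r)]
    return F(sum(1 for r in conditioned if event(r)), len(conditioned))

def A(r):
    return r[0] == 1  # first roll is 1

def B(r):
    return r[1] == 6  # second roll is 6

def both(r):
    return A(r) and B(r)

def C(r):
    return r[0] + r[1] == 7  # the two rolls sum to 7

# Independence: P(A n B) == P(A) P(B)
independent = prob(both) == prob(A) * prob(B)

# Conditional independence given C: P(A n B | C) == P(A | C) P(B | C)
cond_independent = prob(both, C) == prob(A, C) * prob(B, C)

print(independent, cond_independent)
print(prob(both, C), prob(A, C) * prob(B, C))
```

Given C, the joint probability is 1/6 while the product of the conditionals is 1/36, so the equality fails.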