Probability Flashcards
What is a sample space and how do you write it?
The set of all possible outcomes, e.g.
throwing two dice: Ω = {(i, j) : 1 ≤ i, j ≤ 6}
tossing a coin: Ω = {H, T}
What is a subset of Ω (sample space) called?
An event
When are two events disjoint?
A ∩ B = ∅
When they cannot both occur
What is Stirling’s formula for the approximation of n!?
n! ∼ √(2π) n^(n+1/2) e^(−n)
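A quick numerical check (a minimal Python sketch, not part of the original cards): the ratio of Stirling's approximation to n! tends to 1 as n grows.

```python
import math

# Compare n! with Stirling's approximation sqrt(2*pi) * n^(n + 1/2) * e^(-n).
for n in (1, 5, 10, 50):
    exact = math.factorial(n)
    approx = math.sqrt(2 * math.pi) * n ** (n + 0.5) * math.exp(-n)
    print(n, exact, f"ratio = {approx / exact:.6f}")  # ratio -> 1
```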
What is the formula for the number of arrangements of n objects, with repeats?
E.g.,
a₁, …, a₁, a₂,…, a₂,…aₖ, …, aₖ
where a₁ is repeated m₁ times etc.
n!/(m₁!m₂!…mₖ!)
What is the multinomial coefficient?
The coefficient of a₁ᵐ¹…aₖᵐᵏ
in (a₁ + … + aₖ)ⁿ where m₁ + … + mₖ = n
nC(m₁, m₂, …, mₖ) = n!/(m₁!m₂!…mₖ!)
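A minimal Python sketch of this count; the `multinomial` helper name and the MISSISSIPPI example are assumptions for illustration.

```python
from math import factorial

# n! / (m1! m2! ... mk!): arrangements of n objects with repeats,
# equivalently the multinomial coefficient nC(m1, ..., mk).
def multinomial(ms):
    result = factorial(sum(ms))
    for m in ms:
        result //= factorial(m)
    return result

# Arrangements of "MISSISSIPPI": 11 letters repeated 1, 4, 4, 2 times.
print(multinomial([1, 4, 4, 2]))  # 34650
```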
How many distinct non-negative integer-valued solutions of the equation
x₁ + x₂ + · · · + xₘ = n
are there?
(n+m-1)Cn
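A brute-force cross-check of the formula (a sketch with assumed small values m = 3, n = 5):

```python
from itertools import product
from math import comb

# Count non-negative integer solutions of x1 + ... + xm = n two ways:
# direct enumeration versus the stars-and-bars formula C(n + m - 1, n).
m, n = 3, 5
brute = sum(1 for xs in product(range(n + 1), repeat=m) if sum(xs) == n)
print(brute, comb(n + m - 1, n))  # both 21
```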
What is Vandermonde’s identity?
For k, m, n ≥ 0
(m+n)Ck = Σⱼ₌₀ᵏ (mCj)(nC(k-j))
with the convention that mCj = 0 for j > m
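A numeric spot-check of the identity (the values m = 5, n = 7, k = 4 are assumed); Python's math.comb already returns 0 when the lower index exceeds the upper, matching the convention above.

```python
from math import comb

# Vandermonde: C(m + n, k) == sum over j of C(m, j) * C(n, k - j).
m, n, k = 5, 7, 4
lhs = comb(m + n, k)
rhs = sum(comb(m, j) * comb(n, k - j) for j in range(k + 1))
print(lhs, rhs)  # both 495
```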
Prove Vandermonde’s identity
Suppose we choose a committee consisting of k people from a group of m men and n women.
There are (m+n)Ck ways of doing this, which is the left-hand side.
Now the number of men in the committee is some j ∈ {0, 1, …, k}, and then it contains k − j women.
The number of ways of choosing the j men is mCj, and for each such choice there are nC(k-j) choices for the women who make up the rest of the committee.
So there are mCj · nC(k-j) committees with exactly j men, and summing over j we get that the total number of committees is given by the right-hand side.
A probability space is a triple (Ω, F, P)
(Fancy F and P).
What do these symbols mean?
- Ω is the sample space
- F is a collection of subsets of Ω, called events, satisfying axioms F1–F3
- P is a probability measure, which is a function P : F → [0, 1] satisfying axioms P1–P3
What is the probability of the union of two disjoint events?
e.g., P(A ∪ B)
P(A ∪ B) = P(A) + P(B)
What are the axioms on F (a collection of subsets of Ω)?
F1: ∅ ∈ F.
F2: If A ∈ F, then also Aᶜ ∈ F.
F3: If {Aᵢ, i ∈ I} is a finite or countably infinite collection of members of F, then ∪ᵢ∈I Aᵢ ∈ F
What are the axioms of P, where P is a function from F to R?
P1: For all A ∈ F, P(A) ≥ 0.
P2: P(Ω) = 1.
P3: If {Aᵢ, i ∈ I} is a finite or countably infinite collection of members of F, and Aᵢ ∩ Aⱼ = ∅ for i ≠ j, then P(∪ᵢ∈I Aᵢ) = Σᵢ∈I P(Aᵢ)
When Ω is finite or countably infinite, what do we usually take F to be?
We normally take F to be the set of all subsets of Ω (the power set of Ω)
Suppose that (Ω, F, P) is a probability space and that A, B ∈ F. If A ⊆ B, then P(A) ≤ …?
If A ⊆ B then P(A) ≤ P(B)
Prove that P (A’) = 1 − P (A) using the probability axioms
Since A ∪ A’ = Ω and A ∩ A’ = ∅, by P3, P (Ω) = P (A) + P (A’). By P2, P (Ω) = 1 and so P(A) + P (A’) = 1, which entails the required result
Prove that if A ⊆ B then P(A) ≤ P(B) using the probability axioms
Since A ⊆ B, we have B = A ∪ (B ∩ A’). Since B ∩ A’ ⊆ A’, it must be disjoint from A. So by P3, P(B) = P(A) + P(B ∩ A’). Since by P1, P(B ∩ A’) ≥ 0, we thus have P(B) ≥ P(A)
Conditional Probability
What is the probability of A given B?
P(A|B) = P(A ∩ B)/P(B)
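A minimal Python sketch of the definition, on the two-dice sample space from the first card; the events A and B are assumed examples.

```python
from fractions import Fraction

# P(A|B) = P(A ∩ B) / P(B) with A = "sum is 8", B = "first die is even".
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = lambda E: Fraction(len(E), len(omega))  # uniform probability measure
A = {w for w in omega if w[0] + w[1] == 8}
B = {w for w in omega if w[0] % 2 == 0}
print(p(A & B) / p(B))  # (3/36) / (18/36) = 1/6
```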
Let (Ω, F, P) be a probability space and let B ∈ F satisfy P(B) > 0. Define a new function Q : F → R by Q(A) = P(A|B)
Is (Ω, F, Q) a probability space?
Prove your result
Yes.
Q clearly maps F into [0, 1], and Q(A) = P(A ∩ B)/P(B) ≥ 0 gives P1. For P2, Q(Ω) = P(Ω ∩ B)/P(B) = P(B)/P(B) = 1. For P3, if {Aᵢ, i ∈ I} are disjoint members of F then so are {Aᵢ ∩ B, i ∈ I}, so
Q(∪ᵢ∈I Aᵢ) = P((∪ᵢ∈I Aᵢ) ∩ B)/P(B) = P(∪ᵢ∈I (Aᵢ ∩ B))/P(B) = Σᵢ∈I P(Aᵢ ∩ B)/P(B) = Σᵢ∈I Q(Aᵢ)
When are events A and B independent?
Events A and B are independent if P(A ∩ B) = P(A)P(B)
More generally, a family of events A = {Aᵢ : i ∈ I} is independent if…
P(∩ᵢ∈J Aᵢ) = Πᵢ∈J P(Aᵢ)
for all finite subsets J of I
When is a family of events pairwise independent?
A family A of events is pairwise independent if P(Aᵢ ∩ Aⱼ ) = P(Aᵢ)P(Aⱼ ) whenever i ≠ j.
Does Pairwise Independence imply independence?
No! For example, toss two fair coins and let A = {first coin is H}, B = {second coin is H}, C = {both coins show the same face}. These events are pairwise independent, but P(A ∩ B ∩ C) = 1/4 ≠ 1/8 = P(A)P(B)P(C).
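The two-coin counterexample, checked exactly in Python (a sketch):

```python
from fractions import Fraction
from itertools import product

# A = first coin is H, B = second coin is H, C = the two coins agree.
omega = list(product("HT", repeat=2))
p = lambda E: Fraction(len(E), len(omega))
A = {w for w in omega if w[0] == "H"}
B = {w for w in omega if w[1] == "H"}
C = {w for w in omega if w[0] == w[1]}

# Every pair multiplies correctly ...
print(p(A & B) == p(A) * p(B), p(A & C) == p(A) * p(C), p(B & C) == p(B) * p(C))
# ... but the triple does not: 1/4 versus 1/8.
print(p(A & B & C), p(A) * p(B) * p(C))
```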
Given A and B are independent, are A and B’, and A’ and B’ independent?
Yes: both pairs, A and B’, and A’ and B’, are independent
Prove that A and B’ are independent given A and B are independent
We have A = (A ∩ B) ∪ (A ∩ B’), where A ∩ B and A ∩ B’ are disjoint, so using the
independence of A and B, P(A ∩ B’) = P (A) − P(A ∩ B) = P(A) − P(A) P(B) = P (A) (1 − P(B)) = P(A)P(B’)
When is a family of events {B1, B2, . . .} a partition of Ω?
if
- Ω = ∪ᵢ≥₁ Bᵢ (so that at least one Bi must happen), and
- Bᵢ ∩ Bⱼ = ∅ whenever i ≠ j (so that no two can happen together)
What is the law of total probability/partition theorem?
Suppose {B1, B2, . . .} is a partition of Ω by sets from F,
such that P (Bᵢ) > 0 for all i ≥ 1. Then for any A ∈ F
P(A) = Σᵢ≥₁ P(A|Bᵢ)P(Bᵢ)
Prove the partition theorem
P(A) = P(A ∩ (∪ᵢ≥₁ Bᵢ)), since ∪ᵢ≥₁ Bᵢ = Ω
= P(∪ᵢ≥₁ (A ∩ Bᵢ))
= Σᵢ≥₁ P(A ∩ Bᵢ) by axiom P3, since the A ∩ Bᵢ, i ≥ 1, are disjoint
= Σᵢ≥₁ P(A|Bᵢ)P(Bᵢ)
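A small worked example (the two-urn setup is an assumed illustration): pick one of two urns at random (the partition), then draw a ball; the law of total probability gives P(red).

```python
from fractions import Fraction

F = Fraction
# Urn 1 holds 3 red, 1 blue; urn 2 holds 1 red, 3 blue; pick an urn uniformly.
p_B = [F(1, 2), F(1, 2)]          # P(B1), P(B2): which urn was picked
p_A_given_B = [F(3, 4), F(1, 4)]  # P(red | B1), P(red | B2)
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))
print(p_A)  # 1/2
```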
What is Bayes’ Theorem?
Suppose that {B1, B2, . . .} is a partition of Ω by sets from F such that P (Bi) > 0 for all i ≥ 1. Then for any A ∈ F such that P (A) > 0
P(Bₖ|A) = P(A|Bₖ)P(Bₖ) / Σᵢ≥₁ P(A|Bᵢ)P(Bᵢ)
Prove Bayes’ theorem
We have P(Bₖ|A) = P(Bₖ ∩ A)/P(A)
= P(A|Bₖ)P(Bₖ)/P(A)
Now substitute for P(A) using the law of total probability
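A worked example of the theorem in Python; the prevalence and test accuracies are assumed illustrative numbers.

```python
from fractions import Fraction

F = Fraction
# Partition: B1 = diseased, B2 = healthy; A = test is positive.
p_B = [F(1, 100), F(99, 100)]            # 1% prevalence
p_A_given_B = [F(95, 100), F(5, 100)]    # 95% sensitivity, 5% false positives
p_A = sum(pa * pb for pa, pb in zip(p_A_given_B, p_B))  # total probability
print(p_A_given_B[0] * p_B[0] / p_A)     # P(diseased | positive) = 19/118 ≈ 0.161
```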
What is Simpson’s paradox?
It consists of the fact that for events E, F, G, we can have
P(E|F ∩ G) > P(E|F’ ∩ G)
P(E|F ∩ G’) > P(E|F’ ∩ G’)
and yet
P(E|F) < P(E|F’).
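A concrete demonstration in Python, using counts in the style of the well-known kidney-stone study (treated here as assumed illustrative data): E = recovery, F = treatment A, G = small stones.

```python
from fractions import Fraction

# (successes, trials) in each treatment/group cell.
A_small, A_large = (81, 87), (192, 263)   # treatment A (event F)
B_small, B_large = (234, 270), (55, 80)   # treatment B (event F')

rate = lambda cell: Fraction(*cell)
print(rate(A_small) > rate(B_small))      # True: A wins on small stones
print(rate(A_large) > rate(B_large))      # True: A wins on large stones
overall_A = Fraction(81 + 192, 87 + 263)
overall_B = Fraction(234 + 55, 270 + 80)
print(overall_A < overall_B)              # True: yet B wins overall
```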
What is the multiplication rule?
Eg, P(A ∩ B) = …
P(A ∩ B) = P(A|B) P(B) = P(B|A) P(A)
What is the generalisation of the multiplication rule for n events?
P(A₁ ∩ A₂ ∩ … ∩ Aₙ) = P(A₁) P(A₂|A₁) ⋯ P(Aₙ|A₁ ∩ A₂ ∩ … ∩ Aₙ₋₁)
What is the inclusion-exclusion formula?
P(A₁ ∪ A₂ ∪ … ∪ Aₙ) = Σᵢ₌₁ⁿ P(Aᵢ) − …
P(A₁ ∪ A₂ ∪ … ∪ Aₙ) = Σᵢ₌₁ⁿ P(Aᵢ) − Σᵢ<ⱼ P(Aᵢ ∩ Aⱼ) + … + (−1)ⁿ⁺¹ P(A₁ ∩ A₂ ∩ … ∩ Aₙ)
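An exact check of the formula for n = 3, on the two-dice sample space (the three events are assumed examples):

```python
from fractions import Fraction
from itertools import combinations

omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
p = lambda E: Fraction(len(E), len(omega))
events = [
    {w for w in omega if w[0] == 1},         # first die shows 1
    {w for w in omega if w[1] == 1},         # second die shows 1
    {w for w in omega if w[0] + w[1] == 7},  # sum is 7
]
# Alternating sum over all non-empty subfamilies of the events.
incl_excl = sum(
    (-1) ** (r + 1) * p(set.intersection(*combo))
    for r in range(1, len(events) + 1)
    for combo in combinations(events, r)
)
print(p(set().union(*events)), incl_excl)  # both 5/12
```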
What is a discrete random variable?
A discrete random variable X on a probability space (Ω, F, P) is a function X : Ω → R such that
(a) {ω ∈ Ω : X(ω) = x} ∈ F for each x ∈ R,
(b) ImX := {X(ω) : ω ∈ Ω} is a finite or countable subset of R
What is the more common/shorter way of writing P({ω ∈ Ω : X(ω) = x})?
P(X = x)
How is the probability mass function defined?
The probability mass function (p.m.f.) of X is the function pₓ : R → [0, 1] defined by
pₓ(x) = P(X = x)
What is the pmf when x ∉ ImX?
If x ∉ ImX (that is, X(ω) never equals x), then pₓ(x) = P({ω : X(ω) = x}) = P(∅) = 0.
What does Σₓ∈ImX pₓ(x) equal?
Why?
Σₓ∈ImX pₓ(x) = Σₓ∈ImX P({ω : X(ω) = x})
= P(∪ₓ∈ImX {ω : X(ω) = x}) since the events are disjoint
= P(Ω) since every ω ∈ Ω gets mapped somewhere in ImX
= 1
X has the Bernoulli distribution with parameter p (where 0 ≤ p ≤ 1) if…
P(X = 0) = 1 − p, P(X = 1) = p
X has a binomial distribution with parameters n and p (where n
is a positive integer and p ∈ [0, 1]) if…
P(X = k) = nCk p^k (1 − p)^(n−k), k = 0, 1, …, n
If X has the Bernoulli distribution, how do we write this?
X ∼ Ber(p)
If X has the binomial distribution, how do we write this?
X ∼ Bin(n, p)
If X has the geometric distribution, how do we write this?
X ∼ Geom(p)
If X has the Poisson distribution, how do we write this?
X ∼ Po(λ)
X has a geometric distribution with parameter p if…
P(X = k) = p(1 − p)^(k−1), k = 1, 2, …
What can the geometric distribution model?
We can use X to model the number of independent trials needed until we see the first success,
where p is the probability of success on a single trial
If you want to use the geometric distribution to model the number of failures before the first success, which formula do you use?
P(Y = k) = p(1 − p)^k, k = 0, 1, …
X has the Poisson distribution with parameter λ ≥ 0 if…
P(X = k) = λ^k e^(−λ) / k!, k = 0, 1, …
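A sketch evaluating the four pmfs above and checking that each sums to 1 over its support; the parameter values are assumed, and the infinite geometric and Poisson supports are truncated, so those two sums fall just short of 1.

```python
from math import comb, exp, factorial

n, p, lam = 10, 0.3, 2.5
bernoulli = [1 - p, p]
binomial = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
geometric = [p * (1 - p) ** (k - 1) for k in range(1, 200)]
poisson = [lam**k * exp(-lam) / factorial(k) for k in range(200)]

for name, pmf in [("Ber", bernoulli), ("Bin", binomial),
                  ("Geom", geometric), ("Po", poisson)]:
    print(name, sum(pmf))  # each ≈ 1
```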
Define the expectation of X
The expectation (or expected value or mean) of X is E[X] = Σₓ∈ImX x P(X = x), provided that Σₓ∈ImX |x| P(X = x) < ∞
What is the expectation of the Poisson distribution?
λ
What is the expectation of the Geometric distribution?
1/p
What is the expectation of the Binomial distribution?
np
What is the expectation of the Bernoulli distribution?
p
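The four stated means, checked against pmf-weighted sums (a sketch with assumed parameters; the two infinite sums are truncated):

```python
from math import comb, exp, factorial

n, p, lam = 10, 0.3, 2.5
mean = lambda pmf: sum(k * pk for k, pk in pmf)

print(mean([(0, 1 - p), (1, p)]), "vs p =", p)                       # Bernoulli
print(mean([(k, comb(n, k) * p**k * (1 - p) ** (n - k))
            for k in range(n + 1)]), "vs np =", n * p)               # Binomial
print(mean([(k, p * (1 - p) ** (k - 1))
            for k in range(1, 500)]), "vs 1/p =", 1 / p)             # Geometric
print(mean([(k, lam**k * exp(-lam) / factorial(k))
            for k in range(100)]), "vs λ =", lam)                    # Poisson
```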
Let h : R → R
If X is a discrete random variable, is Y = h(X) also a discrete random variable?
Yes
If h : R → R, then
E [h(X)] = ….
E[h(X)] = Σₓ∈ImX h(x) P(X = x)
provided that Σₓ∈ImX |h(x)| P(X = x) < ∞.
Prove the theorem that
E[h(X)] = Σₓ∈ImX h(x) P(X = x)
Let A = {y : y = h(x) for some x ∈ ImX}. Starting from the right-hand side, group the terms of the sum according to the value y = h(x):
Σₓ∈ImX h(x) P(X = x) = Σᵧ∈A Σₓ∈ImX: h(x)=y y P(X = x)
= Σᵧ∈A y Σₓ∈ImX: h(x)=y P(X = x)
= Σᵧ∈A y P(h(X) = y)
= E[h(X)]
Take h(x) = x^k. What is E[X^k] called?
The kth moment of X, when it exists
Let X be a discrete random variable such that E [X] exists.
Describe the expectation when X is non-negative
Prove it
If X is non-negative then E [X] ≥ 0
We have ImX ⊆ [0, ∞), and so
E[X] = Σₓ∈ImX x P(X = x) is a sum whose terms are all non-negative and so must itself be non-negative.
Let X be a discrete random variable such that E [X] exists.
If a, b ∈ R then E [aX + b] = …
Prove it
E[aX + b] = aE[X] + b
Proof: taking h(x) = ax + b in the previous theorem,
E[aX + b] = Σₓ∈ImX (ax + b) P(X = x) = a Σₓ∈ImX x P(X = x) + b Σₓ∈ImX P(X = x) = aE[X] + b
For a discrete random variable X, define the variance
For a discrete random variable X, the variance of X is defined by var (X) = E[(X − E[X])² ] = E[X²] - (E[X])² provided that this quantity exists.
What is variance a measure of?
The variance is a measure of how much the distribution of X is spread out about its mean: the more
the distribution is spread out, the larger the variance.
Is var(X) ≥ 0 always? Why?
Yes
Since (X − E[X])² is a non-negative random variable, E[(X − E[X])²] ≥ 0, so var(X) ≥ 0
How are standard deviation and variance related?
Standard deviation = √var(X); equivalently, (standard deviation)² = var(X)
Suppose that X is a discrete random variable whose variance exists. If a and b
are (finite) fixed real numbers, then the variance of the discrete random variable Y = aX + b is given by…
Prove it
var(Y) = var(aX + b) = a² var(X)
Proof: E[Y] = aE[X] + b, so Y − E[Y] = a(X − E[X]). Hence var(Y) = E[(Y − E[Y])²] = E[a²(X − E[X])²] = a² E[(X − E[X])²] = a² var(X)
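A check of var(aX + b) = a² var(X) in exact arithmetic; the uniform pmf on {1, 2, 3} and the values of a and b are assumed for illustration.

```python
from fractions import Fraction

pmf = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
E = lambda f: sum(f(x) * px for x, px in pmf.items())          # E[f(X)]
var = lambda f: E(lambda x: f(x) ** 2) - E(f) ** 2             # E[Y²] − (E[Y])²

a, b = 5, 7
print(var(lambda x: x))               # var(X) = 2/3
print(var(lambda x: a * x + b))       # 50/3
print(a ** 2 * var(lambda x: x))      # 50/3 = a² var(X)
```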
Suppose that B is an event such that P (B) > 0. Then the conditional distribution of
X given B is…
P(X = x|B) =
P(X = x|B) = P({X = x} ∩ B) / P(B), for x ∈ R
Suppose that B is an event such that P(B) > 0.
The conditional expectation of X given B is…
E[X|B] = Σₓ x P(X = x|B),
whenever the sum converges absolutely
We write pₓ|ᵦ(x) = P(X=x|B)
What is the Partition theorem for expectations?
If {B1, B2, . . .} is a partition of Ω such that
P (Bi) > 0 for all i ≥ 1 then
E[X] = Σᵢ≥₁ E[X|Bᵢ] P(Bᵢ),
whenever E [X] exists.
Prove the Partition theorem for expectations
Write out E[X] and apply the law of total probability to P(X = x):
E[X] = Σₓ x P(X = x) = Σₓ x Σᵢ≥₁ P(X = x|Bᵢ) P(Bᵢ)
= Σᵢ≥₁ P(Bᵢ) Σₓ x P(X = x|Bᵢ) (interchanging the order of summation)
= Σᵢ≥₁ E[X|Bᵢ] P(Bᵢ)
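A check of the theorem on the two-dice space (the choice of X and of the partition is an assumed example):

```python
from fractions import Fraction

# X = sum of two fair dice; partition B1 = "first die ≤ 3", B2 = its complement.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
X = lambda w: w[0] + w[1]
B1 = {w for w in omega if w[0] <= 3}
B2 = set(omega) - B1

cond_exp = lambda B: Fraction(sum(X(w) for w in B), len(B))  # E[X|B], uniform measure
lhs = Fraction(sum(X(w) for w in omega), len(omega))         # E[X]
rhs = sum(cond_exp(B) * Fraction(len(B), len(omega)) for B in (B1, B2))
print(lhs, rhs)  # 7 and 7
```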
Given two random variables X and Y, their joint distribution (or joint probability
mass function) is
pₓ,ᵧ (x, y) =
pₓ,ᵧ (x, y) = P ({X = x} ∩ {Y = y})
= P(X = x, Y = y)
x, y ∈ R