Probability Flashcards
Sample space and Events (Def)
The set of all possible outcomes in an experiment is called a SAMPLE SPACE (denoted S or Ω).
An EVENT is any subset of S
Categories of Sample Spaces
There are 3 categories of Sample Spaces:
- FINITE number of elements
- INFINITE COUNTABLE
- INFINITE UNCOUNTABLE
Independent events (Def)
Two events E1 and E2 are independent if the occurrence of E1 does not affect the probability of E2, and vice versa, i.e. P(E1 ∩ E2) = P(E1) * P(E2)
Multiplication principle
Suppose we have n independent events E_1, E_2, … , E_n. If event E_k has m_k possible outcomes (for k = 1, 2, … , n), then there are
m_1 * m_2 * … * m_n
possible ways for these events to occur jointly
k-Permutations w/o repetition
A way of selecting k objects from a list of n.
- The order of selection matters
- Each object can be selected only once
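Number of such permutations: n! / (n - k)!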
k-Permutations w/ repetition
A way of selecting k objects from a list of n.
- The order of selection matters
- Each object can be selected more than once
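Number of such permutations: n^k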
Combinations w/o repetition
A way of selecting k objects from a list of n.
- The order of selection does NOT matter
- Each object can be selected only once
Aka n-choose-k
This is also the BINOMIAL COEFFICIENT
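Number of such combinations: (n choose k) = n! / ( k! * (n - k)! )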
Combinations w/ repetition
A way of selecting k objects from a list of n.
- The order of selection does NOT matter
- Each object can be selected more than once
Aka the MULTISET COEFFICIENT
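Number of such combinations: ( (n + k - 1) choose k ) = (n + k - 1)! / ( k! * (n - 1)! )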
Combinations w/ repetition (ice cream example)
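E.g. (illustrative numbers): choosing k = 3 scoops from n = 5 flavours, where the order of scoops does not matter and a flavour can repeat:
( (5 + 3 - 1) choose 3 ) = (7 choose 3) = 35 possible combinations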
Definition of Probability
For a given experiment with Sample Space S, PROBABILITY is a real-valued function defined on the events of S:
P: events of S -> [0, 1]
To each event E ⊆ S, the function P assigns a number P(E) ∈ [0, 1]
Axioms of Probability
There are 4 axioms:
(1) For any event E ⊆ S, 0 <= P(E) <= 1
(2) P(S) = 1
(3) For two disjoint events E and F (i.e. E ∩ F = ∅), P(E ∪ F) = P(E) + P(F)
(4) More generally, (3) extends to any sequence of mutually exclusive events E_1, E_2, …: P(E_1 ∪ E_2 ∪ …) = P(E_1) + P(E_2) + …
Conditional probability formula
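P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0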
Mutual independence (Def)
Events A, B and C are MUTUALLY INDEPENDENT if:
- P(A ∩ B ∩ C) = P(A) * P(B) * P(C)
and
- they are pairwise independent: P(A ∩ B) = P(A) * P(B), P(A ∩ C) = P(A) * P(C), P(B ∩ C) = P(B) * P(C)
Law of Total Probability (Formula)
If {E_1, E_2, … , E_k} is a PARTITION of S, then for any event A:
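P(A) = Σ_(i=1)^k P(A|E_i) * P(E_i)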
Bayes’ rule (Formula)
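P(B|A) = P(A|B) * P(B) / P(A), provided P(A) > 0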
Odds (Formula)
Historically, the likelihood of an event B has been expressed as the ratio between the probability of B and the probability of its complement B^c:
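odds(B) = P(B) / P(B^c) = P(B) / (1 - P(B))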
Odds w/ Bayes’ Theorem
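Writing Bayes' rule for both B and B^c and taking the ratio (the P(A) terms cancel):
P(B|A) / P(B^c|A) = [ P(A|B) / P(A|B^c) ] * [ P(B) / P(B^c) ]
i.e. posterior odds = likelihood ratio * prior odds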
Random Variable (Def)
A RANDOM VARIABLE is a function
X: S -> ℝ
For each element s ∈ S, X(s) is a real number in ℝ
Range space / Support (Def)
The RANGE SPACE (or Support) R_X of a random variable X is the set of all possible realisations of X:
R_X = { X(s) : s ∈ S }
Probability Mass Function (Def)
The PMF of a discrete random variable X is a function
f: R_X -> (0, 1]
such that
f(x) = P(X = x) = p_X(x) for each x ∈ R_X
Key properties:
- f(x) > 0 for every x ∈ R_X
- Σ_(x ∈ R_X) f(x) = 1
- P(X ∈ A) = Σ_(x ∈ A) f(x) for any event A ⊆ R_X
Expected value (Formula)
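E[X] = Σ_(x ∈ R_X) x * f(x)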
Variance (Formula)
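Var[X] = E[ (X - E[X])^2 ] = E[X^2] - (E[X])^2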
Cumulative Distribution Function (Formula)
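F(x) = P(X <= x) = Σ_(t ∈ R_X, t <= x) f(t), for x ∈ ℝ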
Bernoulli distribution (Def)
An experiment that can take two values, 1 (success) and 0 (failure), with
P(X=1) = θ, P(X=0) = 1 - θ
Binomial distribution (Def)
The experiment is repeated n times, with each repetition an independent Bernoulli(θ) trial
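X, the number of successes, satisfies X ~ Binom(n, θ), with PMF
P(X = k) = (n choose k) * θ^k * (1 - θ)^(n - k), for k = 0, 1, … , n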
Poisson distribution (Def)
Describes the number of events occurring within a given interval, with rate λ.
It is commonly used to describe count data
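X ~ Poisson(λ), with PMF
P(X = k) = e^(-λ) * λ^k / k!, for k = 0, 1, 2, …
and E[X] = Var[X] = λ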
Joint PMF (Formula)
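f_XY(x, y) = P(X = x, Y = y)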
Joint PMF (Key properties)
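- f_XY(x, y) >= 0 for every (x, y)
- Σ_x Σ_y f_XY(x, y) = 1
- P((X, Y) ∈ A) = Σ_((x, y) ∈ A) f_XY(x, y) for any event A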
Marginal distribution (Def and Formula)
Let X_i be the i-th component of a k-dimensional random vector X. The distribution function F_(X_i)(x) of X_i is called the MARGINAL DISTRIBUTION of X_i.
For a bivariate discrete r.v., its PMF is:
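f_X(x) = Σ_(y ∈ R_Y) f_XY(x, y), and symmetrically for f_Y(y)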
Conditional PMF (Formula)
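f_X|Y(x|y) = P(X = x | Y = y) = f_XY(x, y) / f_Y(y), provided f_Y(y) > 0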
Conditional vs. Marginal dist (Vis)
Covariance (Formula)
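Cov(X, Y) = E[ (X - E[X]) * (Y - E[Y]) ] = E[XY] - E[X] * E[Y]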
Covariance (Key properties)
(1) A positive value indicates a positive LINEAR relationship, and vice versa
(2) Zero indicates there is no LINEAR relationship, i.e. the variables are UNCORRELATED
Note: independence implies Cov(X, Y) = 0, but Cov(X, Y) = 0 does NOT imply independence
Correlation (Def and Formula)
Correlation is a measure of how strong the linear relationship is between two random variables
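ρ(X, Y) = Cov(X, Y) / (σ_X * σ_Y), which always lies in [-1, 1]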
Independence of r.v. (Def and Key properties)
Two r.v.s X and Y are independent if all events relating to X are independent of all events relating to Y.
The following statements are equivalent:
(1) X and Y are independent
(2) The JOINT PMF of X and Y is the product of the MARGINAL PMFs
(3) The CONDITIONAL distribution of X given Y=y does not depend on y, and vice versa
Multinomial distribution (Def)
n independent trials with k possible outcomes for each trial. Each time, the probability of observing the j-th outcome is θ_j. Denote by X_j the number of times we observe the j-th outcome.
X = [ X_1, X_2, … , X_k]
X_1 + X_2 + … + X_k = n
θ_1 + θ_2 + … + θ_k = 1
Multinoulli (n=1) distribution (E and Var)
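For each component X_j:
E[X_j] = θ_j
Var[X_j] = θ_j * (1 - θ_j)
Cov(X_i, X_j) = -θ_i * θ_j for i ≠ j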
Multinomial distribution (Joint PMF)
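f(x_1, … , x_k) = [ n! / (x_1! * … * x_k!) ] * θ_1^(x_1) * … * θ_k^(x_k)
for non-negative integers with x_1 + … + x_k = n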
Multinomial distribution (E and Var)
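For each component X_j:
E[X_j] = n * θ_j
Var[X_j] = n * θ_j * (1 - θ_j)
Cov(X_i, X_j) = -n * θ_i * θ_j for i ≠ j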
Multinoulli (n=1) distribution (Joint PMF)
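f(x_1, … , x_k) = θ_1^(x_1) * … * θ_k^(x_k)
where exactly one x_j equals 1 and all the others equal 0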
Multinoulli (n=1) distribution (Def)
Multinoulli is a multinomial distribution when n=1, i.e. there is only 1 trial, but still k possible outcomes
Multinomial’s relationship to the Binomial dist
The Binomial is a special case of the Multinomial, where k=2 (i.e. only 2 possible outcomes).
If X ~ Binom(n, θ), then the vector
(X, n-X) ~ Mu(n, (θ, 1-θ))
The transformation theorem
Using the joint PMF/PDF it’s possible to find the expected value of any real function g(X, Y) of X and Y.
Let X, Y be a pair of discrete r.v. and g(X, Y) be any real-valued function of X and Y.
Then if it exists, the expected value of g(X, Y) is defined to be:
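E[ g(X, Y) ] = Σ_x Σ_y g(x, y) * f_XY(x, y)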
Probability Density Function (Def)
For any a <= b, the probability P(a <= X <= b) is the area under the PDF f between a and b:
P(a <= X <= b) = ∫_a^b f(x) dx
where f(x) >= 0 and the total area under f is 1
Continuous CDF (Formula)
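F(x) = P(X <= x) = ∫_(-∞)^x f(t) dt, so that f(x) = F'(x) wherever F is differentiable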
Normal distribution (Def)
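X ~ N(μ, σ^2), with mean μ, variance σ^2, and PDF
f(x) = [ 1 / (σ * sqrt(2π)) ] * exp( -(x - μ)^2 / (2σ^2) ), for x ∈ ℝ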
Calculating probabilities for the Normal distribution
You can calculate P(X<=x) in two stages, using the Standard Normal dist
Z ~ N(0, 1)
(1) Transform P(X<=x) into P(Z<=z), standardizing with z = (x - μ) / σ
(2) Use the CDF of Z to calculate probabilities
Uniform distribution (Def)
X has a uniform distribution over the interval [a, b], written
X ~ U(a, b)
If it has PDF and CDF
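PDF: f(x) = 1 / (b - a) for x ∈ [a, b], and 0 otherwise
CDF: F(x) = 0 for x < a; (x - a) / (b - a) for x ∈ [a, b]; 1 for x > b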
Uniform distribution (Var and E)
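E[X] = (a + b) / 2
Var[X] = (b - a)^2 / 12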
Exponential distribution (Def)
X has an exponential dist with parameter λ>0, if it has PDF and CDF
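PDF: f(x) = λ * e^(-λx) for x >= 0, and 0 otherwise
CDF: F(x) = 1 - e^(-λx) for x >= 0, and 0 otherwise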
Exponential dist (Var and E)
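E[X] = 1 / λ
Var[X] = 1 / λ^2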
Joint PDF for bivariate continuous r.v. (PDF and Key properties)
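The joint PDF f_XY of a bivariate continuous r.v. (X, Y) satisfies:
- f_XY(x, y) >= 0 for all (x, y)
- ∫∫ f_XY(x, y) dx dy = 1 over the whole range space R_XY
- P((X, Y) ∈ A) = ∫∫_A f_XY(x, y) dx dy for any event A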
Marginal PDF for bivariate continuous r.v. (Formula)
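f_X(x) = ∫_(R_Y) f_XY(x, y) dy, and symmetrically f_Y(y) = ∫_(R_X) f_XY(x, y) dx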
Marginal PDF for bivariate continuous r.v. (Examples)
E and Var of a sum of r.v.
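E[X + Y] = E[X] + E[Y] (always)
Var[X + Y] = Var[X] + Var[Y] + 2 * Cov(X, Y)
If X and Y are independent: Var[X + Y] = Var[X] + Var[Y]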
Conditional dist of bivariate continuous r.v. (PDF)
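f_X|Y(x|y) = f_XY(x, y) / f_Y(y), provided f_Y(y) > 0 (and vice versa)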
Conditional dist of bivariate continuous r.v. (Expected value)
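E[X | Y = y] = ∫_(R_X) x * f_X|Y(x|y) dx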
Law of Iterated Expectations
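E[X] = E[ E[X | Y] ] (the outer expectation is over Y, the inner over X given Y)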
Conditions for two r.v. to be independent
(1) f_XY(x, y) = f_X(x) * f_Y(y)
(2) The Joint PDF factorizes into:
f_XY(x, y) = C * g(x) * h(y)
With C some constant (the factorization is not unique)
(3) f_X|Y(x|y) = f_X(x) and viceversa
Conditions (1) and (2) require that the joint range space R_XY is the Cartesian product of R_X and R_Y.
If (2) holds, then the Marginal PDFs of X and Y are proportional to g(x) and h(y), respectively
Joint CDF for bivariate r.v. (Formula)
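F_XY(x, y) = P(X <= x, Y <= y)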
Standard MVN - Multivariate Normal distribution (E and Var)
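Z ~ N(0, I):
E[Z] = 0 (the zero vector)
Var[Z] = I (the k×k identity matrix), i.e. the components are independent standard normals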
MVN - Multivariate Normal distribution (E and Var)
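X ~ N(μ, Σ):
E[X] = μ (the k×1 mean vector)
Var[X] = Σ (the k×k covariance matrix)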
MVN (PDF)
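f(x) = (2π)^(-k/2) * det(Σ)^(-1/2) * exp( -(1/2) * (x - μ)' * Σ^(-1) * (x - μ) ), for x ∈ ℝ^k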
Marginal distributions of MVN
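Every marginal of an MVN is Normal: X_i ~ N(μ_i, Σ_ii). More generally, any subvector of X is MVN with the corresponding subvector of μ and submatrix of Σ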
Conditional distributions of MVN (Formula)
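Partition X = (X_1, X_2), with μ and Σ partitioned conformably. Then
X_1 | X_2 = x_2 ~ N( μ_1 + Σ_12 * Σ_22^(-1) * (x_2 - μ_2), Σ_11 - Σ_12 * Σ_22^(-1) * Σ_21 )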
Law of Large numbers (Def)
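A LAW OF LARGE NUMBERS is a proposition giving conditions under which the sample mean X_n-bar converges to the common expected value μ as the sample size n increases: convergence in probability for a WEAK law (WLLN), almost sure convergence for a STRONG law (SLLN)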
Chebyshev’s WLLN
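One standard statement (the exact hypotheses vary by textbook): if {X_n} are uncorrelated with common mean μ and uniformly bounded variance, then X_n-bar → μ in probability (proved via Chebyshev's inequality)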
Kolmogorov’s SLLN
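If {X_n} is an iid sequence with finite expected value E[X_i] = μ, then X_n-bar → μ almost surely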
CLT (Def)
Let {X_n} be a sequence of r.v.s. Let X_n-bar be the sample mean of the first n terms of the sequence.
A CLT is a proposition giving a set of conditions to guarantee the convergence of the sample mean to a NORMAL DIST, as the sample size increases, i.e. sufficient to guarantee that
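sqrt(n) * (X_n-bar - μ) / σ → N(0, 1) in distribution, as n → ∞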
CLT (Steps to use)
The CLT is used as follows:
(1) we observe a sample consisting of n observations X_1, X_2, … , X_n
(2) If n is large enough, then a standard normal distribution is a good approximation of the distribution of
sqrt(n) * (X_n-bar - μ) / σ
(3) Therefore, we pretend that
sqrt(n) * (X_n-bar - μ) / σ
~ N(0, 1)
(4) As a consequence, the distribution of the sample mean is
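X_n-bar ≈ N(μ, σ^2 / n)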
CLT (Equivalent form)
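Equivalently, stated for the sum instead of the mean:
Σ_(i=1)^n X_i ≈ N(n*μ, n*σ^2), i.e. (Σ X_i - n*μ) / (σ * sqrt(n)) → N(0, 1) in distribution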
CLT - Normal approximation of the binomial
Let X_1, X_2, … , X_n be a sequence of iid Bernoulli(θ) r.v.s. We know that:
- E[X_i] = θ and Var[X_i] = θ*(1-θ)
- X = Σ(X_i) ~ Binom(n, θ)
The CLT tells us that:
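X ≈ N( n*θ, n*θ*(1-θ) ) for large n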
CLT - Normal approximation of the Poisson
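Let X_1, X_2, … , X_n be a sequence of iid Poisson(γ) r.v.s. Then X = Σ(X_i) ~ Poisson(n*γ), with E[X] = Var[X] = n*γ. The CLT tells us that X ≈ N(n*γ, n*γ). Equivalently, a Poisson(λ) r.v. with large λ is approximately N(λ, λ).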