Probability & Statistics Flashcards
Definition of a Sample Space
The sample space is the set of all possible outcomes for the experiment. It is denoted by S
Definition of an Event
An event is a subset of the sample space. The event occurs if the actual outcome is an element of this subset.
Definition of a Simple Event
An event is a simple event if it consists of a single element of the sample space S
Meaning of Disjoint events
We say two sets A and B are disjoint if they have no element in common, i.e., A ∩ B = ∅
De Morgan’s laws
(A ∪ B)c = Ac ∩ Bc
(A ∩ B)c = Ac ∪ Bc
Kolmogorov's axioms for probability
a) For every event A we have P(A) >= 0,
b) P(S) = 1,
c) If A1, A2, …, An are n pairwise disjoint events, then
P(A1 ∪ A2 ∪ … ∪ An) = P(A1) + P(A2) + … + P(An)
Complement Rule for Probability
If A is an event then
P(Ac) = 1 - P(A)
Probability of an Empty Set
P(∅) = 0
Probability of an Event Upper Bound
If A is an event then P(A) <= 1
Probability of a Subset
If A and B are events and A ⊆ B then
P(A) <= P(B)
Probability of a Finite Event
The probability of a finite event is the sum of the probabilities of the simple events it contains.
Inclusion-exclusion for two events
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
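For example, for one roll of a fair die, with A = {1, 2, 3, 4} and B = {3, 4, 5}: P(A ∪ B) = 4/6 + 3/6 − 2/6 = 5/6.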
Inclusion-exclusion for three events
P(A∪B∪C) = P(A)+P(B)+P(C)−P(A∩B)−P(A∩C)−P(B∩C)+P(A∩B∩C)
Ordered with replacement (repetition allowed)
n^r
Ordered without replacement (no repetition)
n! / (n−r)!
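For example, the number of ways to award gold, silver and bronze among 8 runners is 8!/(8 − 3)! = 8 × 7 × 6 = 336.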
Conditional Probability
If E1 and E2 are events and P(E1) ≠ 0 then the conditional probability of E2 given E1, usually denoted by P(E2|E1), is
P(E2|E1) = P(E1 ∩ E2) / P(E1)
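For example, for one roll of a fair die, with E1 = "the roll is even" and E2 = "the roll is 6": P(E2|E1) = (1/6)/(1/2) = 1/3.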
Unordered without replacement (no repetition)
nCr = n! / (r!(n−r)!)
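For example, 5C2 = 5!/(2! 3!) = 10.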
Definition of Independence
We say that the events E1 and E2 are (pairwise) independent if
P(E1 ∩ E2) = P(E1)P(E2)
When are three events E1, E2, and E3 called pairwise independent
P(E1 ∩ E2) = P(E1)P(E2),
P(E1 ∩ E3) = P(E1)P(E3),
P(E2 ∩ E3) = P(E2)P(E3).
When are three events E1, E2, and E3 called mutually independent
They are mutually independent if they are pairwise independent and, in addition,
P(E1 ∩ E2 ∩ E3) = P(E1)P(E2)P(E3)
When are two events E1 and E2 said to be conditionally independent given an event E3
P(E1 ∩ E2|E3) = P(E1|E3)P(E2|E3)
Definition of Random Variable
A random variable is a function from S to R
Definition of Discrete Random Variables
A random variable X is discrete if the set of values that X takes
is either finite or countably infinite.
Definition of Probability Mass Functions
The probability mass function (p.m.f.) of a discrete random
variable X is the function which given input x has output P(X = x)
Sum of Probabilities for a Discrete Random Variable
The outputs of the p.m.f. must sum to 1: Σ_x P(X = x) = 1
Definition of Expectation
If X is a discrete random variable which takes values x1, x2, x3, . . ., then the expectation of X (or the expected value of X) is defined by
E(X) = x1P(X = x1) + x2P(X = x2) + x3P(X = x3) + · · · .
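For example, for one roll of a fair die, E(X) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.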
Bound on Expectations of a Random Variable
If every value taken by X lies between m and M, then
m ≤ E(X) ≤ M
Expectation of a function
E( f(X) ) = f(x1)P(X = x1) + f(x2)P(X = x2) + f(x3)P(X = x3) + · · ·
Definition of Moments
The nth moment of the random variable X is the expectation E(X^n)
Definition of Variance
Var(X) = [x1 − E(X)]²P(X = x1) + [x2 − E(X)]²P(X = x2) + [x3 − E(X)]²P(X = x3) + …
Variance formula
Var(X) = E(X^2) − [E(X)]^2
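For example, for one roll of a fair die, E(X²) = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6, so Var(X) = 91/6 − 3.5² = 35/12 ≈ 2.92.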
Linear function of expectation
E(aX + b) = aE(X) + b
Linear function of variance
Var(aX + b) = a^2Var(X)
What is a Bernoulli(p) distribution:
It is the distribution of a random variable X which takes only the values 0 and 1, with P(X = 1) = p and P(X = 0) = 1 − p
Bernoulli distribution Expectation and Variance
E(X) = p, Var(X) = p(1 − p)
What is Binomial distribution:
A discrete random variable X has the Binomial(n, p) distribution, denoted X ∼ Bin(n, p), if its p.m.f. is :
P(X = k) = nCk × p^k × (1 − p)^(n−k) for k = 0, 1, …, n
Binomial distribution Expectation and Variance
E(X) = np, Var(X) = np(1 − p)
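As a quick numerical check, a minimal sketch in Python (assuming scipy is installed; the values n = 10, p = 0.3 are illustrative):

    from scipy.stats import binom

    n, p = 10, 0.3
    # Mean and variance from the distribution object
    print(binom.mean(n, p))    # 3.0  (= n*p)
    print(binom.var(n, p))     # 2.1  (= n*p*(1-p))
    # A single p.m.f. value: P(X = 3) = 10C3 * 0.3^3 * 0.7^7
    print(binom.pmf(3, n, p))  # about 0.2668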
What is Geometric distribution:
A discrete random variable X has the Geometric(p) distribution, denoted X ∼ Geom(p), if its p.m.f. is :
P(X = k) = p(1 − p)^(k−1) for k = 1, 2, 3, …
Geometric distribution Expectation and Variance
E(X) = 1/p
Var(X) = (1 − p)/p²
What is Hypergeometric distribution
The hypergeometric distribution describes the number of successes in l draws, without replacement, from a population of size n containing m successes:
P(X = k) = mCk × (n−m)C(l−k) / nCl
Hypergeometric distribution Expectation and Variance
E(X) = l × (m/n)
Var(X) = l × (m/n) × ((n−m)/n) × ((n−l)/(n−1))
What is Negative binomial distribution
The negative binomial distribution models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified number of successes occurs
P(X = k) = (k+r−1)C(r−1) × p^r × (1 − p)^k for k = 0, 1, 2, …
Negative Binomial distribution Expectation and Variance
E(X) = r(1 − p)/p
Var(X) = r(1 − p)/p²
What is Uniform distribution
The uniform distribution is a probability distribution in which all outcomes are equally likely:
P(X = k) = 1/(n + 1) if m ≤ k ≤ m + n
(discrete) Uniform distribution Expectation and Variance
where n = b − a and m = a when X is uniform on {a, a + 1, …, b}:
E(X) = m + n/2
Var(X) = n(n + 2)/12
What is Poisson distribution
Poisson distribution expresses the probability of a given number of events occurring in a fixed interval of time if these events occur with a known constant mean rate.
P(X = k) = (λ^k / k!) × e^(−λ) for k = 0, 1, 2, …
Poisson distribution Expectation and Variance
E(X) = λ, Var(X) = λ
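A minimal sketch in Python (assuming scipy is installed; λ = 2.5 is an arbitrary illustrative rate):

    from scipy.stats import poisson

    lam = 2.5
    print(poisson.pmf(0, lam))   # P(X = 0) = e^(-2.5), about 0.082
    print(poisson.pmf(3, lam))   # (2.5^3 / 3!) * e^(-2.5), about 0.214
    print(poisson.mean(lam), poisson.var(lam))  # both equal lambda = 2.5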
Cumulative distribution function
The cumulative distribution function (c.d.f.) of a random variable X is the function which given t has output
P(X ≤ t).
Moment Generating function
Let X be a discrete random variable which takes integer values. The moment generating function (m.g.f.) of X is the function which given t has output E(e^(tX))
Definition of a Continuous random variable
We say that a random variable X is a continuous random variable if there exists a
continuous function fX from R to [0, ∞) with the following property:
P(a ≤ X ≤ b) = ∫_a^b fX(t) dt
Expectation and Variance of crv
E(X) = ∫ t fX(t) dt (integrating over all of R)
Var(X) = E(X^2) − (E(X))^2
(continuous) Uniform Expectation and Variance
E(X) = (a+b)/2
Var(X) = (b-a)^2/12
Exponential distribution
It is often used to model the time elapsed between events
fX(t) = λe^(−λt) for t ≥ 0 (and fX(t) = 0 for t < 0)
Exponential distribution Expectation and Variance
E(X) = 1/λ
Var(X) = 1/λ^2
Joint Probability mass function
Let X and Y be two discrete random variables defined on the same sample space and taking values x1, x2, . . . and y1, y2, . . . respectively. The
function
(xk, yl) → P( (X = xk) ∩ (Y = yl) )
is called the joint probability mass function of X and Y
Marginal Probability
P(X = xk) = Σ_l P(X = xk, Y = yl),
and similarly with the roles of X and Y exchanged.
The idea is that if we only care about the probability of X taking a particular value, we need to sum over all possible values of Y
Expectations of 2 Variables
If g(X, Y ) is a real-valued function of the two discrete random variables X and Y then the expectation of g(X, Y ) is obtained as
E( g(X, Y) ) = Σ_k Σ_l g(xk, yl)P(X = xk, Y = yl)
Linearity of Expectation with multiple variables
If X and Y are discrete random variables then
E(X + Y ) = E(X) + E(Y )
Independence for Random Variables
Two discrete random variables X and Y are independent if
the events “X = xk” and “Y = yl” are independent for all possible values xk, yl
Covariance of X, Y
The covariance of X and Y is defined by:
Cov(X, Y) = E[ (X − E(X))(Y − E(Y)) ]
Correlation coefficient of X and Y
ρ(X, Y) = Cov(X, Y) / √( Var(X)Var(Y) )
Formula for Covariance (easy)
Cov(X, Y ) = E(XY ) − E(X)E(Y )
Normal distribution formula
A random variable X ∼ N(µ, σ²) has p.d.f.
fX(x) = (1/(σ√(2π))) e^(−(x − µ)²/(2σ²))
Normalisation
Using the substitution z = (x − µ)/σ one can confirm that the
p.d.f. is normalised
Standardisation
When you standardise a normal distribution, the mean becomes 0 and the standard deviation becomes 1
z-scores
The z-score of a value x is z = (x − µ)/σ, the number of standard deviations x lies from the mean. Let Z ∼ N(0, 1) denote a standard normal random variable.
Quartiles and the Median
For X ∼ N(µ, σ²) the median is Q2 = µ, and the quartiles are Q1 ≈ µ − 0.675σ and Q3 ≈ µ + 0.675σ
Z-Score and Standard Normal Distribution Lemma
If X ∼ N(µ, σ²) then Z = (X − µ)/σ ∼ N(0, 1), so probabilities for X can be computed from the standard normal c.d.f.
Moment Generating Function of a Normal Random Variable
If X ∼ N(µ, σ²) then E(e^(tX)) = e^(µt + σ²t²/2)
Properties of Expectation and Variance for Independent Discrete Random Variables
If X and Y are independent then E(XY) = E(X)E(Y) and Var(X + Y) = Var(X) + Var(Y)
Variance of a Linear Combination of Independent Discrete Random Variables
If X1, X2, …, Xn are independent then Var(a1X1 + a2X2 + … + anXn) = a1²Var(X1) + a2²Var(X2) + … + an²Var(Xn)
Properties of Correlation for Discrete Random Variables
−1 ≤ ρ(X, Y) ≤ 1; if X and Y are independent then ρ(X, Y) = 0, and ρ(X, Y) = ±1 exactly when Y is a linear function of X
Central Limit Theorem
If X1, X2, …, Xn are independent, identically distributed random variables with mean µ and variance σ², then for large n the sum X1 + X2 + … + Xn is approximately N(nµ, nσ²); equivalently, (X̄ − µ)/(σ/√n) is approximately N(0, 1)
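A minimal simulation sketch in Python (assuming numpy is installed), illustrating the theorem for Uniform(0, 1) samples, which have µ = 1/2 and σ² = 1/12:

    import numpy as np

    rng = np.random.default_rng(0)
    # 10,000 sample means, each of n = 30 independent Uniform(0, 1) draws
    means = rng.random((10_000, 30)).mean(axis=1)
    print(means.mean())  # close to 0.5, the population mean
    print(means.std())   # close to sqrt((1/12)/30), about 0.0527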
Sum of Independent Normal Random Variables
If X1, X2, …, Xn are independent normal random variables each with mean µ and variance σ², then the sum X = X1 + X2 + ⋯ + Xn is also a normal random variable, with X ∼ N(nµ, nσ²): the mean of X is nµ and the variance of X is nσ²
What is a survey?
A survey is the collection of data from a sample of the population.
What is an observational study?
In an observational study researchers observe the behaviour of individuals
without trying to influence the outcome of the study
What is an experiment?
In a designed experiment researchers apply some treatment to the units under
investigation and measure the response.
Quantitative data
Continuous variables/data are variables which are given in
terms of real numbers
Discrete variables/data are variables which are given by integers
Qualitative data
Categorical variables/data are
variables which are expressed in terms of categories
Ordinal variables/data are variables which are expressed in
terms of ordered categories
Sample Mean
x̄ = (x1 + x2 + … + xn)/n. The random variable X̄ is called the estimator of the mean, and the computed value x̄ is called the point estimate
Median
For the sorted values x(1) ≤ x(2) ≤ … ≤ x(n):
If n is odd, Q2 = x((n+1)/2)
If n is even, Q2 = average( x(n/2), x(n/2 + 1) )
Population Variance
σ̂² = (1/n) Σ (xi − x̄)². This is a biased estimator
Sample Variance
s² = (1/(n − 1)) Σ (xi − x̄)². This is an unbiased estimator of the population variance
Interquartile Range
IQR = Q3 − Q1
Five-number summary
the minimum mx, the quartiles Q1, Q2 (the median) and Q3, and the maximum Mx
Sample covariance
sxy = (1/(n − 1)) Σ (xi − x̄)(yi − ȳ)
Sample linear correlation coefficient
r = sxy / (sx sy), where sx and sy are the sample standard deviations
What is a statistic?
A statistic is any quantity computed from the values in a sample
Expectation and Variance of Sample Mean
E(X̄) = µ and Var(X̄) = σ²/n, where µ and σ² are the population mean and variance
Sample Proportion
p̂ = (number of individuals in the sample with the property of interest)/n
Unbiased Estimator
An estimator of a given parameter is said to be unbiased if its expected value is equal to the true value of the parameter
Bias in Estimation of Variance
E(σ̂²) = ((n − 1)/n)σ², so σ̂² is biased: it slightly underestimates the variance
Mean Square error
MSE(θ̂) = E[ (θ̂ − θ)² ]: it measures the average squared difference between the estimator and the true parameter value
Confidence interval for the sample mean
x̄ ± z(α/2) × σ/√n, where z(α/2) is the value with P(Z > z(α/2)) = α/2 (for large n the sample standard deviation s may replace σ)
Level of confidence
A fraction 1 − α of such intervals contains the population mean µ. We call 1 − α the level of confidence
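A minimal sketch in Python (assuming numpy and scipy are installed; the data values are made up for illustration, and for a sample this small a t-interval would normally be used, the z-interval is shown to match the formula above):

    import numpy as np
    from scipy.stats import norm

    data = np.array([4.8, 5.1, 5.0, 4.9, 5.3, 4.7, 5.2, 5.0])
    n, xbar = len(data), data.mean()
    s = data.std(ddof=1)             # sample standard deviation (divides by n - 1)
    z = norm.ppf(1 - 0.05 / 2)       # about 1.96 for a 95% interval (alpha = 0.05)
    half = z * s / np.sqrt(n)
    print(xbar - half, xbar + half)  # the 95% confidence interval for mu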
Confidence interval for the sample proportion
p̂ ± z(α/2) × √( p̂(1 − p̂)/n )
What is a hypothesis?
A hypothesis is a statement about a parameter θ of
the pmf (or pdf ) of a random variable X
Null Hypothesis
In hypothesis testing, the central claim, the null hypothesis H0, is a statement θ = θ0 which we intend to find evidence against
Significance level
Assume the null hypothesis H0 is valid. We say a type-I error has occurred if the test procedure for H0 rejects the null hypothesis. The probability of a type-I error occurring is called the significance level α of the
test procedure
Alternative hypothesis
The test procedure tests the null hypothesis H0 against the so called alternative hypothesis H1. The alternative hypothesis specifies under which conditions the null hypothesis should be rejected.
Two-tailed test
If we test H0 : θ = θ0 against H1 : θ ≠ θ0 we need a two-sided test
Right-tailed test
If we test H0 : θ = θ0 against H1 : θ > θ0 we need a right-tailed test
Left-tailed test
If we test H0 : θ = θ0 against H1 : θ < θ0 we need a left-tailed test
Type-I error
A type-I error occurs if the test rejects the null hypothesis H0 even though H0 is valid
Type-II error
A type-II error occurs if the test does not reject H0 even though H0 is false
Definition of the Power
The probability of a type-II error is denoted by β; 1 − β is called the power of the test
Definition of P-value
the P-value of the observation is the probability of observing a sample statistic at least as extreme as the observation, under the assumption that the null hypothesis is true
At significance level α the null hypothesis is rejected if the P-value obeys P < α.
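A minimal sketch in Python of a two-tailed z-test (assuming scipy is installed; the summary numbers are hypothetical):

    from scipy.stats import norm

    # H0: mu = 5.0, known sigma = 0.5, sample of n = 25 with mean 5.2
    xbar, mu0, sigma, n = 5.2, 5.0, 0.5, 25
    z = (xbar - mu0) / (sigma / n ** 0.5)  # observed test statistic: 2.0
    p_value = 2 * norm.sf(abs(z))          # two-tailed P-value, about 0.0455
    print(p_value < 0.05)                  # True: reject H0 at alpha = 0.05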
Interpretation of P-values
The smaller the P-value, the stronger the evidence against the null hypothesis
Implications of Conditional Probability
The multiplication rule
P(E1 ∩ E2) = P(E1)P(E2|E1), provided P(E1) ≠ 0
The partition of events
Events B1, B2, …, Bn form a partition of the sample space S if they are pairwise disjoint and B1 ∪ B2 ∪ … ∪ Bn = S
Law of total probability
If B1, B2, …, Bn is a partition of S then P(A) = Σ_k P(A|Bk)P(Bk)
Total Probability for Conditional Events
P(A|C) = Σ_k P(A|Bk ∩ C)P(Bk|C), for a partition B1, …, Bn of S
Bayes’ Theorem
P(Bk|A) = P(A|Bk)P(Bk) / P(A) = P(A|Bk)P(Bk) / Σ_j P(A|Bj)P(Bj)
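For example, suppose 1% of a population has a disease, the test detects it with probability 0.99, and it gives a false positive with probability 0.05. With B1 = "has the disease" and A = "tests positive": P(B1|A) = (0.99 × 0.01)/(0.99 × 0.01 + 0.05 × 0.99) = 0.0099/0.0594 ≈ 0.17.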
Conditional expectation
E(X|A) = Σ_k xk P(X = xk|A)
Law of total probability of expectations
If B1, B2, …, Bn is a partition of S then E(X) = Σ_k E(X|Bk)P(Bk)