Probability distributions - (3-4) Flashcards
Binomial distribution
In situations characterized by repeated trials where:
* The trials are independent
* We are recording “success” or not “success” in each trial (whether a specific event occurs or not)
* The probability of “success” is the same in each trial, p
* A specified number of trials, n
X = “the number of successes in the n trials”. X ∼ Bin(n, p).
Examples:
* The number of components of the same type functioning after time t.
* The number of people opposed to the EU in a random sample of people.
* The number of patients who get a certain infection in a hospital unit.
* The number of times we get 6 in n throws of a die.
* The number of seeds in a packet of seeds which germinate.
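The binomial probabilities can be computed directly from the standard formula P(X = x) = C(n, x) p^x (1 − p)^(n − x). The sketch below is a minimal illustration; the function name `binomial_pmf` and the numbers in the example (10 throws of a fair die) are assumptions chosen for illustration, not part of the cards.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p): x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative assumption: number of sixes in n = 10 throws of a fair die.
n, p = 10, 1 / 6
prob_two_sixes = binomial_pmf(2, n, p)

# The probabilities over x = 0, ..., n should add up to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(round(prob_two_sixes, 4), round(total, 6))
```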
Multinomial distribution
* The trials are independent
* Each trial has k possible outcomes
* The probabilities, p1, …, pk for each of the k outcomes are the same in all trials
* A specified number of trials, n
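As a rough illustration of these conditions, the probability of observing counts x1, …, xk (summing to n) is n! / (x1! ⋯ xk!) · p1^x1 ⋯ pk^xk. The helper name `multinomial_pmf` and the die example are assumptions made for this sketch.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """P(X1 = counts[0], ..., Xk = counts[k-1]) for n = sum(counts) trials."""
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # stays an exact integer at every step
    return coef * prod(p**c for p, c in zip(probs, counts))


# Illustrative assumption: a fair die thrown 6 times, each face shown once.
each_face_once = multinomial_pmf([1] * 6, [1 / 6] * 6)

# With k = 2 outcomes the multinomial reduces to the binomial distribution.
two_sixes_in_ten = multinomial_pmf([2, 8], [1 / 6, 5 / 6])
```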
Hypergeometric distribution
- Have a total of N units
- k of the N units are “successes” (and then N − k “not successes”)
- We draw n units without replacement
X = “the number of successes among the n drawn units”
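Approximation by the binomial distribution: when N ≥ 20n (the sample is at most 5% of the population), X is approximately Bin(n, p) with p = k/N.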
Examples:
* The number of winning tickets in a hand of lottery tickets.
* The number of spades in a hand of cards.
* The number of defective goods found in a spot test.
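The hypergeometric probabilities follow from counting: P(X = x) = C(k, x) C(N − k, n − x) / C(N, n). The sketch below compares them with the binomial approximation using p = k/N. The function names and both numerical examples are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, k: int, n: int) -> float:
    """P(X = x): x successes among n units drawn without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p), used here as the approximation."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Spades in a 5-card hand: N = 52 < 20n = 100, so the rule of thumb fails.
cards_exact = hypergeom_pmf(2, N=52, k=13, n=5)
cards_approx = binomial_pmf(2, n=5, p=13 / 52)

# Hypothetical large lot: N = 1000 >= 20n = 100, so the rule of thumb holds.
lot_exact = hypergeom_pmf(2, N=1000, k=250, n=5)
lot_approx = binomial_pmf(2, n=5, p=250 / 1000)

print(abs(cards_exact - cards_approx), abs(lot_exact - lot_approx))
```

In this example the approximation error is clearly smaller for the large lot than for the card hand, which is what the N ≥ 20n rule of thumb suggests.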
Negative binomial
- Independent trials
- “Success” or not “success” in each trial
- The probability of “success” is the same in each trial, p
- Repeat the trials until a specified number of successes, k, are observed
X = “the number of trials until success number k”
Examples:
* The number of components that need to be fabricated until we have k components without failures.
* The number of houses a vacuum cleaner seller needs to visit until he has sold k vacuum cleaners.
* The number of people you have to ask before you get k signatures for a petition.
* The number of times you have to throw a die until you get 6 for the kth time.
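With X defined as the number of trials until success number k, the probabilities are P(X = x) = C(x − 1, k − 1) p^k (1 − p)^(x − k) for x = k, k + 1, …. The sketch below is a minimal illustration; the function name and the die example are assumptions. Note that some software counts failures rather than trials, which shifts X by k.

```python
from math import comb


def negbin_pmf(x: int, k: int, p: float) -> float:
    """P(X = x), where X is the trial on which success number k occurs."""
    if x < k:
        return 0.0  # at least k trials are needed for k successes
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)


# Illustrative assumption: throw a fair die until the second 6 appears.
k, p = 2, 1 / 6
prob_on_tenth_throw = negbin_pmf(10, k, p)

# Numerical check of the mean k / p, truncating the infinite sum.
mean_trials = sum(x * negbin_pmf(x, k, p) for x in range(k, 2000))
print(prob_on_tenth_throw, mean_trials)
```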
Geometric distribution
- The trials are independent
- We are recording “success” or not “success” in each trial
- The probability of “success” is the same in each trial, p
- We repeat the trials until the first success
X = “the number of trials until the first success”
Example:
* The number of times an oil drilling company needs to drill at a certain location until the first time they find oil.
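For the geometric distribution, P(X = x) = (1 − p)^(x − 1) p and P(X ≤ x) = 1 − (1 − p)^x for x = 1, 2, …. In the sketch below, the success probability p = 0.2 per drilling attempt is a made-up value for illustration, as are the function names.

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x): first success occurs on trial x (x = 1, 2, ...)."""
    return (1 - p) ** (x - 1) * p


def geometric_cdf(x: int, p: float) -> float:
    """P(X <= x): at least one success within the first x trials."""
    return 1 - (1 - p) ** x


# Hypothetical value: each drilling attempt finds oil with probability 0.2.
p = 0.2
prob_within_five = geometric_cdf(5, p)

# The cdf should agree with the sum of the pmf over x = 1, ..., 5.
pmf_sum = sum(geometric_pmf(x, p) for x in range(1, 6))
```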
Central limit theorem (CLT)
General description
A probability distribution specifies how the outcome of the random variable will be distributed if the experiment is repeated many times. We can think of the pmf/pdf as a histogram of infinitely many repetitions of the experiment.
Different types of probability distributions are suitable for describing different phenomena. The parameters in the probability distribution specify the exact shape and location of the pmf/pdf. Properties of a distribution like expectation and variance will always be functions of the parameters.
When the type of distribution and the parameter values are known, everything of interest can be calculated (expectation, variance, probabilities, etc).
In practice the parameter values will often be unknown, and we then want to find the best possible estimate from the available information. This is the subject of estimation.
For discrete distributions it is often possible to decide which distribution to use from information about the situation studied. For continuous distributions it may be more challenging to decide which distribution best describes the phenomenon. In addition to knowledge about the phenomenon, various plots can be used to judge from observed data whether an assumed distribution is reasonable (e.g. normal probability plots to check normality).
The relationship between the exponential distribution and Poisson processes
There are some particularly important relations between the exponential distribution and Poisson processes, where λ is the rate of the process:
- The time until the first event in a Poisson process has an exponential distribution.
- The time between events in a Poisson process has an exponential distribution.
- The time until event number k occurs in a Poisson process has a gamma distribution with parameters α = k and β = 1/λ.
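These relations can be checked numerically through the identity "event number k has occurred by time t" = "at least k events in [0, t]", where the number of events in [0, t] is Poisson with mean λt. The sketch below uses that identity to evaluate the cdf of the waiting time for integer k; the function names and the values λ = 2, t = 1 are assumptions for illustration.

```python
from math import exp, factorial


def poisson_pmf(i: int, mean: float) -> float:
    """P(N = i) for a Poisson count N with the given mean."""
    return exp(-mean) * mean**i / factorial(i)


def time_to_kth_event_cdf(t: float, k: int, lam: float) -> float:
    """P(S_k <= t) = P(at least k events in [0, t]) for rate lam."""
    return 1.0 - sum(poisson_pmf(i, lam * t) for i in range(k))


lam, t = 2.0, 1.0  # illustrative values, not from the cards

# k = 1: the exponential cdf 1 - exp(-lam * t).
first_event = time_to_kth_event_cdf(t, 1, lam)

# k = 2: the gamma cdf with alpha = 2 and beta = 1 / lam.
second_event = time_to_kth_event_cdf(t, 2, lam)
```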
Approximations
What distributions are memoryless?
Exponential distribution
Geometric distribution
Poisson distribution is not memoryless. It is the distribution of the waiting times in the Poisson process that is memoryless.
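Memorylessness means P(X > s + t | X > s) = P(X > t). The sketch below checks this numerically for the exponential and geometric distributions, using their survival functions P(X > t) = e^(−λt) and P(X > x) = (1 − p)^x. The parameter values and function names are assumptions for illustration.

```python
from math import exp


def exp_survival(t: float, lam: float) -> float:
    """P(X > t) for an exponential waiting time with rate lam."""
    return exp(-lam * t)


def geom_survival(x: int, p: float) -> float:
    """P(X > x): no success in the first x trials."""
    return (1 - p) ** x


lam, p = 0.5, 0.2  # illustrative parameter values
s, t = 3, 4

# Conditional probability of surviving t more, given survival past s.
exp_conditional = exp_survival(s + t, lam) / exp_survival(s, lam)
geom_conditional = geom_survival(s + t, p) / geom_survival(s, p)

# Both should equal the unconditional P(X > t).
print(exp_conditional, exp_survival(t, lam))
print(geom_conditional, geom_survival(t, p))
```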
Command for finding Z probabilities (normal) in R
pnorm(z), where z is the Z-value. This returns P(Z ≤ z) for the standard normal distribution.
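For readers working outside R, the same lower-tail probability can be sketched in Python with the standard library's `statistics.NormalDist`. This is offered as an analogous calculation, not as part of the original cards; the value z = 1.96 is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

z = 1.96  # illustrative Z-value
lower_tail = standard_normal.cdf(z)   # P(Z <= z)
upper_tail = 1 - lower_tail           # P(Z > z)
two_sided = lower_tail - standard_normal.cdf(-z)  # P(-z <= Z <= z)
```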
The time until the first event in a Poisson process has a … distribution.
The time until the second event in a Poisson process, S2, has a … distribution.
Exponential
Gamma
ppois
pbinom
pgamma