Week 15 - Rules of probability, and probability distributions Pt3 Flashcards
What is a random variable?
A random variable is a numerical description of the outcome of an
experiment
What is a discrete random variable?
A discrete random variable may assume either a finite number of
values or an infinite sequence of values
What is a continuous random variable?
A continuous random variable may assume any numerical value in an
interval or collection of intervals.
What are the 4 properties of a binomial experiment?
- The experiment consists of a sequence of n identical trials.
- Two outcomes, success and failure, are possible on each trial.
- The probability of a success, denoted by p, does not change from trial to trial.
- The trials are independent.
What are we interested in when using a binomial distribution?
Our interest is in the number of successes occurring in the n trials.
We let X denote the number of successes occurring in the n trials
What does the binomial probability function look like?
𝑝(𝑥)= 𝑛!/(𝑥!(𝑛−𝑥)!) [𝜋^𝑥 (1−𝜋)^(𝑛−𝑥)]
Where:
p(x) = the probability of x successes in n trials
n = the number of trials
𝜋= the probability of success on any one trial
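The function above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original cards; the name binomial_pmf and the check values n = 5, π = 0.3 are assumptions chosen for the example.

```python
from math import comb

def binomial_pmf(x: int, n: int, pi: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    if not 0 <= x <= n:
        return 0.0
    # comb(n, x) equals n! / (x! (n - x)!), the number of outcomes with x successes
    return comb(n, x) * pi**x * (1 - pi)**(n - x)

# Sanity check: the probabilities over x = 0..n should add up to 1
total = sum(binomial_pmf(x, 5, 0.3) for x in range(6))
print(total)
```

The first factor counts the arrangements and the second gives the probability of any one arrangement, so their product is p(x).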
What does this portion of the binomial probability function mean?
𝑛!/(𝑥!(𝑛−𝑥)!)
number of experimental outcomes providing exactly x successes in n trials
What does this portion of the binomial probability function mean?
𝜋^𝑥 (1−𝜋)^(𝑛−𝑥)
Probability of a particular sequence of trial outcomes
with x successes in n trials
What is the formula for the expected value (mean) of a binomial distribution?
𝐸(𝑋)= 𝜇=𝑛𝜋
What does the expected value in a binomial distribution represent?
The average number of successes in n trials with success probability π.
What is the formula for the variance of a binomial distribution?
𝜎^2= 𝑛𝜋(1−𝜋)
What does the variance in a binomial distribution measure?
The variability in the number of successes.
What is the formula for the standard deviation of a binomial distribution?
𝜎= √(𝑛𝜋(1−𝜋) )
What does the standard deviation in a binomial distribution represent?
The spread of the number of successes around the mean, in the same units as X.
Example
Evian is concerned about a low retention rate for employees. In recent years, management has seen a turnover of 10% of the hourly employees annually.
Thus, for any hourly employee chosen at random, management estimates a probability of 0.1 that the person will not be with the company next year.
Choosing 3 hourly employees at random, what is the probability that exactly 1 of them will leave the company this year?
Using the Binomial Probability Function
𝜋 = the probability of success (an employee leaving) on any one trial = 0.10
n = the number of trials = 3
𝑥 = the number of successes = 1
𝑝(𝑥)= 𝑛!/(𝑥!(𝑛−𝑥)!) [𝜋^𝑥 (1−𝜋)^(𝑛−𝑥)]
𝑝(𝑥)= 3!/(1!(3−1)!) (0.10)^1 (1−0.10)^(3−1) = 0.243
Expected Value = 𝐸(𝑋)= 𝜇=𝑛𝜋 = 3*0.1 = 0.3 employees out of 3
Variance = 𝜎^2= 𝑛𝜋(1−𝜋) = 3(0.1)(0.9) = 0.27
Standard Deviation = 𝜎= √(𝑛𝜋(1−𝜋) ) = 0.52 employees
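As a cross-check on the example, the sketch below builds the whole distribution for n = 3 and π = 0.10 and computes the mean and variance directly from it, rather than from the shortcut formulas. The variable names are illustrative assumptions.

```python
from math import comb, sqrt

n, pi = 3, 0.10  # three employees, each with a 10% chance of leaving

# Full distribution of X = number of employees who leave
pmf = {x: comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)}

# Moments computed from the definition, using the distribution itself
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = sqrt(variance)

print("P(X = 1):", round(pmf[1], 3))
print("mean:", round(mean, 2), "shortcut n*pi:", round(n * pi, 2))
print("variance:", round(variance, 2), "sd:", round(sd, 2))
```

The values agree with nπ and nπ(1−π), which is why the shortcut formulas can be used without listing every outcome.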
Example
A broker has a bonus scheme to encourage profitable trading. Under the rules of the scheme, any trader who drops below his daily target more than three times in a two-week period (10 working days) will forfeit his bonus at the end of the period. If the probability that an employee will be below target on any one day is 0.15, how many bonuses will be lost by 100 traders in a 50-week year? (Assumptions of independence are valid here).
a. 50
b. 100
c. 125
d. 10
P(X>3) = 1 – P(X=0) – P(X=1) – P(X=2) – P(X=3)
= 1 − 𝐶10_0 ∗ 0.15^0 ∗ 0.85^10 − 𝐶10_1 ∗ 0.15^1 ∗ 0.85^9 − 𝐶10_2 ∗ 0.15^2 ∗ 0.85^8 − 𝐶10_3 ∗ 0.15^3 ∗ 0.85^7
P(X>3) = 0.05
A 50-week year contains 25 two-week periods, so 100 traders give 100 × 25 = 2,500 trader-periods to consider, and 2,500 × P(X>3) = 125 lost bonuses (answer c).
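The arithmetic in this example can be checked with a short script. It is a sketch under the stated independence assumption; the variable names are illustrative.

```python
from math import comb

n_days, p_below = 10, 0.15  # working days per period, P(below target on a day)

def pmf(x: int) -> float:
    """P(exactly x below-target days in one two-week period)."""
    return comb(n_days, x) * p_below**x * (1 - p_below) ** (n_days - x)

# A bonus is forfeited when X > 3, i.e. 1 minus P(X <= 3)
p_forfeit = 1 - sum(pmf(x) for x in range(4))

trader_periods = 100 * 25  # 100 traders, 25 two-week periods in 50 weeks
expected_lost = trader_periods * p_forfeit

print(round(p_forfeit, 4), round(expected_lost))
```

The unrounded probability is slightly below 0.05, so the expected count is just under 125 and rounds to the listed answer.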
What type of values can a continuous random variable assume?
Any value x in an interval on the real line or in a collection of intervals.
Can we find the probability that a continuous random variable equals a specific value?
No, the probability that it equals a specific value is 0.
How do we express probabilities for a continuous random variable?
We talk about the probability that it falls within a given interval.
How is the probability defined for a continuous random variable between two values x_1 and x_2?
It is the area under the probability density function (PDF) between x_1 and x_2
What is a probability density function (PDF)?
A function whose area under the curve between two points represents the probability of the variable falling within that interval.
What is a uniform distribution in the context of continuous variables?
A distribution where all intervals of the same length are equally likely; the PDF is flat
What is a normal distribution?
A bell-shaped, symmetric distribution defined by its mean (μ) and standard deviation (σ), with most values near the mean.
In a normal distribution, what does the area under the curve represent?
The probability that the variable falls within a specific interval.
When is a random variable said to be uniformly distributed?
When the probability is proportional to the length of the interval.
What is the uniform probability density function (PDF)?
f(x) = 1/(b-a) for a ≤ x ≤ b
= 0 elsewhere
where: a = smallest value the variable can assume
b = largest value the variable can assume
What does the graph of a uniform distribution look like?
A horizontal line between a and b, representing equal probability across the interval.
What is the expected value (mean) of a uniform distribution?
E(X) = (a+b)/2
(The average of the smallest and largest values.)
What is the variance of a uniform distribution?
Var(X) = (b-a)^2/12
In a uniform distribution, what do a and b represent again?
a: minimum possible value
b: maximum possible value
Example
Stein customers are charged for the amount of salad they take. Sampling suggests that the amount of salad taken is uniformly distributed between 140g and 420g.
Uniform Probability Density Function
f(x) = 1/280 for 140 < X < 420
= 0 elsewhere
where: X = salad plate filling weight
Expected Value of X
E(X) = (a + b)/2
= (140 + 420)/2
= 280
Variance of X
Var(X) = (b - a)^2/12
= (420 – 140)^2/12
= 6533.33
What is the probability that a customer will take between 336 and 420 grams of salad?
P(336 < X < 420) = (1/280)(84) = 0.3
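The salad example can be reproduced with a small sketch. The helper names uniform_pdf and uniform_prob are assumptions made for illustration; the interval probability is simply the overlapping length divided by (b − a).

```python
a, b = 140.0, 420.0  # smallest and largest salad weights in grams

def uniform_pdf(x: float) -> float:
    """Height of the flat density: 1/(b - a) inside [a, b], 0 elsewhere."""
    return 1 / (b - a) if a <= x <= b else 0.0

def uniform_prob(lo: float, hi: float) -> float:
    """P(lo < X < hi): rectangle area = overlapping width times height."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)

mean = (a + b) / 2
variance = (b - a) ** 2 / 12
p_336_420 = uniform_prob(336, 420)

print(mean, round(variance, 2), round(p_336_420, 2))
```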
Why is the normal distribution important?
It’s the most important distribution for describing continuous random variables and is widely used in statistical inference.
What is the general shape of the normal distribution?
A bell-shaped, symmetric curve centered around the mean.
What is the formula for the normal probability density function (PDF)?
f(x) = (1/(σ√(2π))) e^(-(x-μ)^2/(2σ^2))
where:
μ = mean;
σ = standard deviation
π = 3.14159
e = 2.71828
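To see the density formula in action, the sketch below codes it by hand and compares it with Python's built-in statistics.NormalDist. The function name normal_pdf and the test values (μ = 10, σ = 2, x = 11.5) are assumptions chosen only for illustration.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density: (1 / (sigma * sqrt(2*pi))) * e^(-(x - mu)^2 / (2 sigma^2))."""
    return exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

mu, sigma, x = 10.0, 2.0, 11.5  # illustrative values, not from the cards
by_hand = normal_pdf(x, mu, sigma)
built_in = NormalDist(mu, sigma).pdf(x)

print(by_hand, built_in)
```

Note that the density value is a curve height, not a probability; probabilities come from areas under the curve.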
What is the skewness of the normal distribution?
The skewness is zero, meaning the distribution is perfectly symmetric.
Where is the highest point on the normal curve?
At the mean, which is also the median and the mode.
In a normal distribution, how are the mean, median, and mode related?
They are all equal and located at the center of the distribution.
What defines a specific normal distribution within the family of normal distributions?
Its mean (μ) and standard deviation (σ).
Can the mean be any numerical value?
Yes, the mean can be negative, zero, or positive depending on the data values.
What does the standard deviation determine in a normal distribution?
The standard deviation determines the width of the curve. Larger values result in wider, flatter curves, indicating more variability in the data. Smaller values result in narrower, taller curves, indicating less variability.
How are probabilities represented for a normal random variable?
Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1, with 0.5 to the left of the mean and 0.5 to the right.
What percentage of values fall within certain standard deviations of the mean in a normal distribution?
68.26% of values are within +/- 1 standard deviation of the mean.
95.44% of values are within +/- 2 standard deviations of the mean.
99.72% of values are within +/- 3 standard deviations of the mean.
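These percentages can be checked with the standard normal CDF. In the sketch below, which uses Python's statistics.NormalDist, the computed areas come out marginally higher in the last digit than the figures above, which appear to be built from rounded table values.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within +/- {k} sd: {area:.4%}")
```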
What is a standard normal probability distribution?
A random variable that is normally distributed with a mean of 0 and a standard deviation of 1 is said to have a standard normal probability distribution. The letter Z is used to designate the standard normal random variable.
What does the Z value represent in a standard normal distribution?
The Z value represents the number of standard deviations that a value (X) is from the mean. It is used to convert any normal distribution to the standard normal distribution.
Z = (X - μ) / σ
What is the standard normal density function?
f(z) = (1/√(2π)) e^(-z^2/2)
where:
z = (x - μ)/σ
π = 3.14159
e = 2.71828
Example
The store manager is concerned that sales are being lost due to stockouts while waiting for an order.
It has been determined that demand during replenishment lead-time is normally distributed with a mean of 60 litres and a standard deviation of 24 litres.
The manager would like to know the probability of a stockout, P(X > 80 litres).
Solving for the Stockout Probability:
Step 1: Convert x to the standard normal distribution
z = (x - μ)/ σ
= (80 - 60)/24
= 0.83
Step 2: Find the area under the standard normal curve to the left of z = 0.83.
Normal CD
Lower = -10000
Upper = 0.83
σ = 1
μ = 0
P(Z≤0.83) = 0.796730 area
P(Z>0.83) = 1- P(Z≤0.83)
= 1 - 0.7967
= 0.2033 area
Probability of a stockout: P(X > 80) = 0.2033
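The same calculation can be done without a calculator's Normal CD menu. This sketch uses Python's statistics.NormalDist and shows both the answer with z rounded to 0.83, as in the steps above, and the answer with z left unrounded; the variable names are illustrative.

```python
from statistics import NormalDist

demand = NormalDist(mu=60, sigma=24)  # lead-time demand in litres

# Step 1: standardise x = 80
z = (80 - demand.mean) / demand.stdev

# Step 2: right-tail area, using z rounded to two decimals as in the card
p_rounded = 1 - NormalDist().cdf(round(z, 2))

# Same probability without rounding z
p_exact = 1 - demand.cdf(80)

print(round(z, 2), round(p_rounded, 4), round(p_exact, 4))
```

The two answers differ slightly in the third decimal place, purely because of rounding z.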
Example
If the manager of Pep Zone wants the probability of a stockout to be no more than 0.05, what should the reorder point be?
Area = 0.95 left side
Area = 0.05 right side
Step 1: Find the z-value that cuts off an area of 0.05 in the right tail of the standard normal distribution.
Normal CD
Lower = -10000
Upper = 1.6 (trial and error)
σ = 1
μ = 0
P(Z≤1.64) = 0.9495 area
P(Z≤1.65) = 0.9505 area
Step 2: Convert z_0.05 to the corresponding value of x.
x = μ + z_0.05σ
= 60 + 1.645(24)
= 99.48 or 100
A reorder point of 100 litres will place the probability of a stockout during lead-time at (slightly less than) 0.05.
By raising the reorder point from 80 litres to 100 litres on hand, the probability of a stockout decreases from about 0.20 to 0.05.
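The trial-and-error search for z can be replaced by an inverse CDF. The sketch below uses statistics.NormalDist.inv_cdf; the variable names are illustrative.

```python
from statistics import NormalDist

demand = NormalDist(mu=60, sigma=24)  # lead-time demand in litres

# Demand level with 95% of the area to its left (5% stockout risk)
reorder_point = demand.inv_cdf(0.95)

# The z-value that cuts off 0.05 in the right tail
z_05 = NormalDist().inv_cdf(0.95)

# Stockout probability if the reorder point is rounded up to 100 litres
p_stockout_at_100 = 1 - demand.cdf(100)

print(round(z_05, 3), round(reorder_point, 2), round(p_stockout_at_100, 4))
```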
When can the normal distribution be used to approximate binomial probabilities?
The normal distribution can approximate binomial probabilities when:
n>20
nπ ≥ 5
n(1-π) ≥ 5
This is useful because calculating binomial probabilities directly becomes difficult for large n
How do you apply the normal approximation to a binomial distribution?
Set μ = nπ and σ = √(𝑛π(1−π ))
Use a continuity correction by adding and subtracting 0.5 since a continuous distribution is approximating a discrete one.
For example, P(X = 40) is approximated by P(39.5 ≤ X ≤ 40.5).
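A quick comparison shows how close the approximation can be. The card gives only x = 40, so the values n = 100 and π = 0.4 below are assumptions chosen for illustration (they satisfy n > 20, nπ ≥ 5 and n(1−π) ≥ 5).

```python
from math import comb, sqrt
from statistics import NormalDist

n, pi, x = 100, 0.4, 40  # n and pi are assumed values for illustration

# Exact binomial probability P(X = 40)
exact = comb(n, x) * pi**x * (1 - pi) ** (n - x)

# Normal approximation with mu = n*pi and sigma = sqrt(n*pi*(1 - pi))
approx_dist = NormalDist(mu=n * pi, sigma=sqrt(n * pi * (1 - pi)))

# Continuity correction: P(X = 40) becomes P(39.5 <= X <= 40.5)
approx = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(round(exact, 4), round(approx, 4))
```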
Example
Shares in the investment bank Black Plainly (BP) are generally expected to mirror the performance of the market index. BP’s annualised volatility is estimated to be 10%. Suppose you believe that next year the return on the stock market as a whole will be 19.6%. Assuming that BP’s returns are normally distributed, how likely is it that Black Plainly shares will not increase in value?
P(x<= 0) = ?
z = (0-19.6%)/10% = -1.96
P(z<=-1.96) = 2.5%
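This tail probability can be confirmed directly. The sketch treats the expected return of 19.6% as the mean and the 10% volatility as the standard deviation, as the worked answer does.

```python
from statistics import NormalDist

# Annual return on BP shares: mean 19.6%, standard deviation 10%
bp_return = NormalDist(mu=0.196, sigma=0.10)

z = (0 - bp_return.mean) / bp_return.stdev
p_no_increase = bp_return.cdf(0.0)  # P(return <= 0)

print(round(z, 2), round(p_no_increase, 4))
```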
Example
The price of a security is approximately normally distributed. In the last year on about 20% of working days the price was less than or equal to 20. On 25% of working days the price was above 75. Find the mean and standard deviation of the price.
P(X<= 20) = 0.2
P(Z<= (20-µ)/σ) = 0.2
Or, (20-µ)/σ = -0.84   (1)
P(X>75) = 0.25
P(X<=75) = 1-0.25 = 0.75
P(Z<= (75-µ)/σ) = 0.75
Or, (75-µ)/σ = 0.68   (2)
Solve for µ and σ:
20 - µ = -0.84σ
75 - µ = 0.68σ
Subtracting the first equation from the second gives 55 = 1.52σ.
Therefore, σ = 36.2 and µ = 20 + 0.84(36.2) = 50.4
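The two simultaneous equations can also be solved with unrounded z-values. The sketch below uses statistics.NormalDist.inv_cdf for the percentiles; because it does not round z to two decimals, its answers differ slightly from the hand calculation.

```python
from statistics import NormalDist

std_normal = NormalDist()

z_low = std_normal.inv_cdf(0.20)   # z with 20% of the area to its left
z_high = std_normal.inv_cdf(0.75)  # z with 75% of the area to its left

# (20 - mu)/sigma = z_low and (75 - mu)/sigma = z_high
# Subtracting: 75 - 20 = (z_high - z_low) * sigma
sigma = (75 - 20) / (z_high - z_low)
mu = 20 - z_low * sigma

print(round(z_low, 2), round(z_high, 2), round(sigma, 1), round(mu, 1))

# Check that the fitted distribution reproduces both stated percentages
fitted = NormalDist(mu, sigma)
print(round(fitted.cdf(20), 2), round(1 - fitted.cdf(75), 2))
```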
What is VaR?
Estimate of the loss from a given position over a fixed time period that will be equaled or exceeded with a given probability
What two equivalent interpretations does VaR have?
Worst Case Loss: over one day, there is a 95% probability that we will not lose more than $ yy
An unlikely event: on average, in one out of every 20 days, we should expect to incur a loss greater than or equal to a certain amount
What are the 3 key aspects of VaR?
VaR measures the minimum potential loss at the stated probability; the actual loss that could be incurred could be higher.
VaR is associated with a stated degree of probability. Lowering the probability (increasing the confidence level) increases the VaR.
VaR measure is associated with a specific time period. Increasing the time interval will increase the VaR.
Example
We have a position worth $10 million in Microsoft shares
The volatility of Microsoft is 2% per day
The standard deviation of the change in the portfolio in 1 day is $200,000 (2% of $10 million)
Assuming the expected 1-day return is zero, what is the VaR at 99%?
P(x<=T) = 0.01
P(Z<= (T-0)/200,000) = 0.01
(T-0)/200,000 = -2.33
Or, T = -2.33*200,000 = -466,000
99% of the time, the portfolio’s 1-day loss will be no more than $466,000.
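The VaR calculation can be sketched as follows, assuming normally distributed daily changes with zero mean as in the example. The variable names are illustrative; the result is a little below $466,000 because the z-value is not rounded to -2.33.

```python
from statistics import NormalDist

position = 10_000_000  # value of the position in dollars
daily_vol = 0.02       # daily volatility (2%)
confidence = 0.99

# Standard deviation of the 1-day change in portfolio value
sigma_dollars = position * daily_vol

# 1-day change in value, assumed normal with zero expected change
daily_change = NormalDist(mu=0, sigma=sigma_dollars)

# Threshold T with P(change <= T) = 1 - confidence = 0.01
threshold = daily_change.inv_cdf(1 - confidence)
var_99 = -threshold  # VaR is quoted as a positive loss

print(round(sigma_dollars), round(var_99))
```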