Lecture 8-Continuous random variables Flashcards
Continuous random variables
is not COUNTABLE
*A cts random variable can assume any value within an interval
*Because the number of values contained in an interval is infinite, the possible number of values that a cts random variable can assume is also infinite
*We therefore cannot count these values as we do for discrete random variables
probability distributions
links a random variable X with the probability that X assumes a discrete value or a range of values
*This can be presented by a table, function or formula
*Random variables can be discrete or continuous
*Probability distributions are also correspondingly discrete or continuous
*Strictly speaking, when a variable is continuous, Pr(X=x) = 0
*In other words, the probability associated with any single PRECISE value is zero, because a single point has no width and therefore no area under the density curve
*It is only possible to determine a non-zero probability for INTERVALS on the real line, for example, Pr(X ≤ 5), or Pr(-3 ≤ X ≤ 7).
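To make the point about precise values concrete, the sketch below computes the probability of an interval that shrinks around a single point. The standard normal distribution, the point 1, and the helper name `interval_probability` are assumptions chosen for illustration only.

```python
from statistics import NormalDist

X = NormalDist(mu=0, sigma=1)  # assumed example distribution

def interval_probability(dist, a, b):
    """Pr(a <= X <= b) as a difference of two cumulative probabilities."""
    return dist.cdf(b) - dist.cdf(a)

# Shrink the interval around the point 1; the probability shrinks with it.
for half_width in (1.0, 0.1, 0.001, 0.0):
    p = interval_probability(X, 1 - half_width, 1 + half_width)
    print(half_width, p)
```

When the half-width reaches zero the interval is a single point and the computed probability is zero.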
Properties of Continuous Probability Distributions
The probability distribution of a continuous random variable possesses the following two characteristics:
–The probability that X assumes a value in any interval lies in the range of 0 to 1 (like all probabilities)
–The total probability of all the mutually exclusive intervals within which X can assume a value, is 1
*The second criterion means that the area under the curve of f(x), the probability density function, is equal to 1.
The Normal Distribution
is one of the many distributions that a cts random variable can possess
*however it is the most widely used continuous distribution
*A large number of phenomena in the real world are either exactly or approximately normally distributed
A continuous random variable X having the probability density function
f(x) = (1 / (σ√(2π))) · e^(-(x - μ)² / (2σ²)), for -∞ < x < +∞
is said to have a Normal Distribution.
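The usual formula for the normal density can be written as a short function. This is a minimal sketch: the function name `normal_pdf` is hypothetical, and the comparison against `statistics.NormalDist` is only a sanity check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density: exp(-(x - mu)^2 / (2 sigma^2)) / (sigma * sqrt(2 pi))."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)

# Compare the hand-written formula with the standard library at one point.
by_formula = normal_pdf(55, mu=50, sigma=10)
by_library = NormalDist(mu=50, sigma=10).pdf(55)
print(by_formula, by_library)
```

Note that the value returned is a density (the height of the curve), not a probability.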
The Normal Curve
is the graph of the normal distribution
It is bell shaped and symmetric
– It is centred at the mean value μ
– Its tails extend indefinitely, i.e. from -∞ on the left to +∞ on the right, without touching or crossing the horizontal axis
We can identify certain properties of the normal distribution:
– The mean, median and mode of the distribution coincide at x = μ
– The curve is symmetrical about a vertical axis through the point x = μ
– The total area under the curve is equal to one
* The symmetry about the mean implies that the area under the curve to the left of the mean equals 0.5; similarly, the area under the curve to the right of the mean is also 0.5. The higher the peak of the curve, the smaller the standard deviation, and vice versa
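These properties can be checked numerically. The sketch below assumes two example curves with the same mean (50) and standard deviations of 5 and 10; the parameter values are illustrative, not from the lecture.

```python
from statistics import NormalDist

# Assumed example parameters: same mean, different standard deviations.
narrow = NormalDist(mu=50, sigma=5)
wide = NormalDist(mu=50, sigma=10)

left_area = wide.cdf(50)          # area to the left of the mean
right_area = 1 - wide.cdf(50)     # area to the right of the mean

# Symmetry: equal heights at equal distances either side of the mean.
height_below = wide.pdf(50 - 7)
height_above = wide.pdf(50 + 7)

# Peak height at the mean for each curve.
peak_narrow = narrow.pdf(50)
peak_wide = wide.pdf(50)

print(left_area, right_area)
print(height_below, height_above)
print(peak_narrow, peak_wide)
```

Halving the standard deviation doubles the height of the peak, since the total area must remain equal to one.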
The Parameters of Normal Distribution
μ and σ are called the parameters of the normal distribution.
*Each combination of μ and σ gives rise to a unique normal curve referred to as N(μ , σ).
* No probability can be computed without values for μ and σ.
Calculating Probabilities with the Normal Distribution
Recall that, with cts random variables and cts distributions such as the Normal Distribution, we cannot speak about X being EQUAL TO a value
*A cts random variable can assume an infinite number of values within an INTERVAL, so the probability that it is exactly EQUAL to any one value is zero
*It is therefore only possible to determine the probability associated with INTERVALS on the real line, for example, Pr(X ≤ 5), or Pr(-3 ≤ X ≤ 7).
*These probabilities can be found by calculating the relevant AREA under the normal curve. The probability density function of the Normal Distribution is given by
f(x) = (1 / (σ√(2π))) · e^(-(x - μ)² / (2σ²)), for -∞ < x < +∞
*In the absence of any other information, calculating the probability that X lies in a particular interval requires calculating the relevant area under the Normal Curve
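To show what "calculating the relevant area" involves, the sketch below approximates the area under a normal curve with the midpoint rule and compares it with the cumulative probabilities from the standard library. The parameters (mean 2, standard deviation 4), the interval (-3, 7), and the function names are assumptions for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def area_between(a, b, mu, sigma, steps=10_000):
    """Midpoint-rule approximation of the area under the normal curve on [a, b]."""
    width = (b - a) / steps
    heights = (normal_pdf(a + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return sum(heights) * width

# Assumed example: Pr(-3 < X < 7) for a normal variable with mean 2, std dev 4.
prob = area_between(-3, 7, mu=2, sigma=4)

# Reference value from cumulative probabilities.
dist = NormalDist(mu=2, sigma=4)
reference = dist.cdf(7) - dist.cdf(-3)
print(prob, reference)
```

The numerical integration works, but it has to be repeated for every interval and every pair of parameters, which is the motivation for using tables instead.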
Calculating Probabilities with the Normal Distribution
We don't want to have to use the formula for the probability density function directly, as it has no closed-form antiderivative and is very clumsy to integrate
*The areas under the Normal Curve can be presented in a cumulative probability table. If we had this information, we could then use the tables to calculate the required probabilities
*However, every Normal Curve will be different, depending on the values of the parameters μ and σ
*Therefore, there exists an infinitely large family of Normal Curves based on different combinations of μ and σ
*Does this suggest that we need to access a book containing infinitely many cumulative probability tables? NO it does not.
Standard Normal Distribution- short definition
We can adopt a practice that allows us to reduce any Normal Distribution probability into a standard metric.
Standard Normal Distribution- long definition
is the special case of the Normal Distribution where μ = 0 and σ = 1
*The random variable that possesses the Standard Normal Distribution is called the Standard Normal Variable and it is denoted by Z
*Therefore, μ = E(Z) = 0, σ = Std Dev of Z = 1, and σ² = Var(Z) = 1
* The values of Z are located on the horizontal axis of the Standard Normal Curve.
* The values of Z are also called Z scores or standard scores.
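A few rows of a cumulative Standard Normal table can be generated directly. This sketch uses the known identity Pr(Z ≤ z) = 0.5 · (1 + erf(z / √2)); the function name `phi` and the chosen z values are illustrative.

```python
import math

def phi(z):
    """Cumulative probability Pr(Z <= z) for the standard normal variable Z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Print a small table of z values and their cumulative probabilities.
for z in (-1.5, -1.0, 0.0, 0.5, 1.0, 1.5, 2.5):
    print(f"{z:5.2f}  {phi(z):.4f}")
```

Printed tables differ in layout: some list the area to the left of z, others the upper-tail area, so it is worth checking which one a given table reports.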
Standardisation
In general, a normal distribution has a mean of μ (not necessarily equal to zero as in the standard case) and a variance of σ² (not necessarily equal to 1).
*Yet the tables discussed above are valid only for that standard case where μ = 0 and σ = 1
*How then can we use the Standard Normal tables to calculate probabilities for variables that follow a Normal but NOT a Standard Normal Distribution?
*The way to do this is to “STANDARDISE”
For a random variable X following a normal distribution with mean μ and standard deviation σ, a particular value of X can be converted to its corresponding Z value by using the formula
Z = (X - μ) / σ
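The standardisation formula and its inverse can be sketched as two small functions. The names `standardise` and `unstandardise` and the example values (mean 60, standard deviation 5) are hypothetical.

```python
def standardise(x, mu, sigma):
    """Convert a value x of a normal variable (mean mu, std dev sigma) to its Z score."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

def unstandardise(z, mu, sigma):
    """Convert a Z score back to the original scale."""
    return mu + z * sigma

z = standardise(70, mu=60, sigma=5)       # 70 is two standard deviations above 60
x = unstandardise(z, mu=60, sigma=5)      # recovers the original value
print(z, x)
```

A positive Z score lies above the mean and a negative one below it; the magnitude is the distance from the mean measured in standard deviations.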
Standardisation Example
Let X be a cts random variable that has a normal distribution with a mean of 50 and a standard deviation of 10. Convert the following X values to Z values and find the probability to the left of these points.
*(1) X = 55
*(2) X = 35
Solution
X ~ N(50, 10), where 10 is the standard deviation
X = 55
Z = (55 - 50) / 10 = 0.5
P(Z < 0.5) = 1 - P(Z > 0.5) = 1 - 0.3085 = 0.6915
X = 35
Z = (35 - 50) / 10 = -1.50
P(Z < -1.50) = P(Z > 1.50) = 0.0668 (by symmetry)
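The worked example can be checked with the standard library. The sketch below computes each left-hand probability twice: once through the Z score and the standard normal distribution, and once directly from the distribution of X. The variable names are illustrative.

```python
from statistics import NormalDist

X = NormalDist(mu=50, sigma=10)   # distribution from the example
Z = NormalDist()                  # standard normal: mean 0, std dev 1

results = {}
for x in (55, 35):
    z = (x - X.mean) / X.stdev    # standardise
    left_via_z = Z.cdf(z)         # area to the left using the Z score
    left_direct = X.cdf(x)        # same area computed on the original scale
    results[x] = (z, left_via_z, left_direct)
    print(x, z, round(left_via_z, 4), round(left_direct, 4))
```

Both routes give the same area, which is why a single Standard Normal table is enough for any normal distribution.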
Applications of the Normal Distribution: Activity
The monthly share deposits of members of a Credit Union are normally distributed with mean $500 and standard deviation $150.
Find the probability that in any month the deposits will range between $250 and $875.
Let X represent the monthly share deposits of members
*X ~ N(500, 150)
*We therefore need to find P(250<X<875)
*Standardizing:
*Z = (250 - 500) / 150 = -1.67 (to two decimal places)
*Z = (875 - 500) / 150 = 2.5
*Now we have the two corresponding Z values, hence we can use the Standard Normal Distribution and its table
*The tail areas are P(Z > 2.5) = 0.00621 and P(Z < -1.67) = 0.0475
*Our resultant probability is: 1 - (0.00621 + 0.0475) = 0.946
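As a cross-check on the table lookup, the sketch below computes the same probability with unrounded Z values. Working without rounding gives roughly 0.946, which agrees with the table-based answer to about two decimal places. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 500, 150               # mean and standard deviation of deposits
Z = NormalDist()                   # standard normal

z_low = (250 - mu) / sigma         # about -1.67
z_high = (875 - mu) / sigma        # exactly 2.5

# Pr(250 < X < 875) = Pr(z_low < Z < z_high)
probability = Z.cdf(z_high) - Z.cdf(z_low)
print(round(z_low, 2), z_high, round(probability, 3))
```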
The Normal Approximation to the Binomial Distribution
This approximation is a special case of the very famous Central Limit Theorem (which we will meet again soon), and is of both practical and theoretical importance.
*In particular, it remains very useful notwithstanding the widespread use of electronic computers.
*We have already seen that if X ~ Bin(n, p), then E(X) = np and Var(X) = npq, where q = 1 - p
*If n is large, we can approximate X by a Normal Distribution with mean np and variance npq
Remember, we are approximating a DISCRETE distribution by a CONTINUOUS one
*Before we approximate, we must apply what is known as a Continuity Correction, to convert the discrete random variable into a continuous one.
*The continuity correction is made by subtracting 0.5 from the lower limit of the interval and/or adding 0.5 to the upper limit of the interval.
*For example, if X is a discrete random variable that follows a Binomial Probability Distribution and we are required to find Pr(X ≤ 9), then the binomial probability Pr(X ≤ 9) will be approximated by the normal probability Pr(X ≤ 9.5), adding 0.5 to the upper limit (there is no lower limit).
*Similarly, Pr(X ≥ 10) will become Pr(X ≥ 9.5), and Pr(5 ≤ X ≤ 8) will become Pr(4.5 ≤ X ≤ 8.5).
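The continuity correction rule can be sketched as a small helper. It assumes the interval is given with inclusive integer limits (≤ and ≥), so a strict inequality must first be rewritten, e.g. X > 20 as X ≥ 21. The function name `continuity_correct` and the use of `None` for a missing limit are hypothetical choices.

```python
def continuity_correct(lower=None, upper=None):
    """Widen an inclusive integer interval [lower, upper] by 0.5 on each side.

    A limit of None means the interval is unbounded on that side.
    """
    new_lower = None if lower is None else lower - 0.5
    new_upper = None if upper is None else upper + 0.5
    return new_lower, new_upper

print(continuity_correct(upper=9))            # Pr(X <= 9)       -> upper limit 9.5
print(continuity_correct(lower=10))           # Pr(X >= 10)      -> lower limit 9.5
print(continuity_correct(lower=5, upper=8))   # Pr(5 <= X <= 8)  -> limits 4.5 and 8.5
print(continuity_correct(lower=21))           # Pr(X > 20) = Pr(X >= 21) -> lower limit 20.5
```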
The Normal Approximation to the Binomial Distribution
75% of students on the U.W.I campus are known to be female. A sample of 100 students is drawn. What is the probability that there will be more than 20 male students?
*The proportion of male students is 0.25 (the value of p). If we use the Binomial distribution, we must evaluate: Pr(X>20) = Pr(X=21) + Pr(X=22) + … + Pr(X=100)
*This is a Herculean task which should only be carried out using MINITAB or some other statistical software. Doing so yields a value of Pr(X>20) = 0.8512
*Since n =100 is a relatively large number, we may use the normal distribution to calculate the value of Pr(X>20).
The normal distribution in this case would have a mean of np = 100 × 0.25 = 25
*and a variance of npq = 100 × 0.25 × 0.75 = 18.75
*Since Pr(X > 20) = 1 - Pr(X ≤ 20), we must evaluate Pr(X ≤ 20). Employing the continuity correction discussed above, we must evaluate Pr(X ≤ 20.5) as follows:
*Pr(X ≤ 20.5) = Pr(Z ≤ (20.5 - 25) / √18.75) = Pr(Z ≤ -1.04) = 0.1492, so that, finally, Pr(X > 20) = 1 - 0.1492 = 0.8508
*This value is reasonably close to that obtained using the Binomial distribution.
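The comparison can be reproduced in a few lines. The sketch below sums the exact binomial probabilities and then applies the normal approximation with the continuity correction; it uses only the standard library rather than MINITAB, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, p = 100, 0.25        # sample size and proportion of male students
q = 1 - p

# Exact binomial: Pr(X > 20) = Pr(X = 21) + ... + Pr(X = 100)
exact = sum(math.comb(n, k) * p**k * q**(n - k) for k in range(21, n + 1))

# Normal approximation with mean np and variance npq,
# using the continuity-corrected limit 20.5.
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
approx = 1 - approx_dist.cdf(20.5)

print(round(exact, 4), round(approx, 4))
```

The approximate value differs slightly from the table-based figure because the code does not round Z to two decimal places before looking up the area.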