continuous random variables Flashcards
continuous random variables
All random variables assign a number to
each outcome in a sample space. Whereas discrete random variables take on a discrete set
of possible values, continuous random variables have a continuous set of values.
Computationally, to go from discrete to continuous we simply replace sums by integrals. It
will help you to keep in mind that (informally) an integral is just a continuous sum.
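The idea that an integral is a continuous sum can be checked numerically. The sketch below is illustrative only: the integrand 𝑥² on [0, 1] and the helper name riemann_sum are assumptions chosen for the demonstration, not part of the original notes. It adds up many thin slices (a discrete sum) and compares the result with the exact integral, 1/3.

```python
def riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] using n equal-width slices."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) * width for i in range(n))


def square(x):
    """Illustrative integrand (an assumption): x squared."""
    return x * x


# The exact integral of x^2 over [0, 1] is 1/3.
exact = 1.0 / 3.0
approximations = {n: riemann_sum(square, 0.0, 1.0, n) for n in (10, 100, 1000)}
for n, value in approximations.items():
    print(n, value, abs(value - exact))
```

As the number of slices grows, the discrete sum settles on the value of the integral.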
Example 1. Since time is continuous, the amount of time Jon is early (or late) for class is
a continuous random variable. Let’s go over this example in some detail.
Suppose you measure how early Jon arrives to class each day (in units of minutes). That
is, the outcome of one trial in our experiment is a time in minutes. We’ll assume there are
random fluctuations in the exact time he shows up. Since in principle Jon could arrive, say,
3.43 minutes early, or 2.7 minutes late (corresponding to the outcome −2.7), or at any other
time, the sample space consists of all real numbers. So the random variable which gives the
outcome itself has a continuous range of possible values.
It is too cumbersome to keep writing ‘the random variable’, so in future examples we might
write: Let 𝑇 = “time in minutes that Jon is early for class on any given day.”
two views of a definite integral
what is the connection between the two views of a definite integral
range of values
which may be finite or infinite in
extent. Here are a few examples of ranges: [0, 1], [0, ∞), (−∞, ∞), [𝑎, 𝑏]
formal definition of continuous random variable
probability density function (pdf)
pmf vs pdf
The probability density function 𝑓(𝑥) of a continuous random variable is the analogue of the probability mass function 𝑝(𝑥) of a discrete random variable.
Here are two important differences:
1. Unlike 𝑝(𝑥), the pdf 𝑓(𝑥) is not a probability. You have to integrate it to get probability.
2. Since 𝑓(𝑥) is not a probability, there is no restriction that 𝑓(𝑥) be less than or equal to 1.
Note: In Property 2, we integrated over (−∞, ∞) since we did not know the range of values taken by 𝑋. Formally, this makes sense because we just define 𝑓(𝑥) to be 0 outside of the range of 𝑋. In practice, we would integrate between bounds given by the range of 𝑋.
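A small numerical sketch may help with both differences. It assumes the density 𝑓(𝑥) = 3 on [0, 1/3] (the one used in Example 2); the helper integrate and its midpoint rule are illustrative choices, not the notes' method. The density takes the value 3, which could not be a probability, yet the total area under it is 1, whether we integrate over a wide interval or only over the range.

```python
def pdf(x):
    """Density equal to 3 on [0, 1/3] and 0 outside that range."""
    return 3.0 if 0.0 <= x <= 1.0 / 3.0 else 0.0


def integrate(f, a, b, n=30_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / n
    return sum(f(a + (i + 0.5) * width) for i in range(n)) * width


density_value = pdf(0.2)                      # a density value, larger than 1
area_wide = integrate(pdf, -1.0, 2.0)         # stands in for (-inf, inf)
area_range = integrate(pdf, 0.0, 1.0 / 3.0)   # bounds given by the range of X
print(density_value, area_wide, area_range)
```

The wide integral is only approximate because the midpoint rule smears the jumps at 0 and 1/3; integrating over the range itself avoids that.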
graphical view of probability
probability mass
Why do we use the terms mass and density to describe the pmf and pdf? What is the
difference between the two? The simple answer is that these terms are completely analogous
to the mass and density you saw in physics and calculus. We’ll review this first for the
probability mass function and then discuss the probability density function.
mass as a sum
mass as an integral of density
Example 2. Suppose 𝑋 has pdf 𝑓(𝑥) = 3 on [0, 1/3] (this means 𝑓(𝑥) = 0 outside of [0, 1/3]). Graph the pdf and compute 𝑃 (0.1 ≤ 𝑋 ≤ 0.2) and 𝑃 (0.1 ≤ 𝑋 ≤ 1).
𝑃 (0.1 ≤ 𝑋 ≤ 0.2) for pdf 𝑓(𝑥) = 3 on [0, 1/3]
𝑃 (0.1 ≤ 𝑋 ≤ 1) for pdf 𝑓(𝑥) = 3 on [0, 1/3]
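Because the density in Example 2 is constant, each probability is the area of a rectangle: height 3 times the width of the part of the interval that lies inside [0, 1/3]. The sketch below works this out with exact fractions; the function name probability and the constants are hypothetical names introduced for illustration.

```python
from fractions import Fraction

LOW, HIGH = Fraction(0), Fraction(1, 3)   # range of X
HEIGHT = Fraction(3)                      # constant density on the range


def probability(a, b):
    """P(a <= X <= b) for the constant density 3 on [0, 1/3]."""
    left = max(Fraction(a), LOW)          # clip the interval to the range
    right = min(Fraction(b), HIGH)
    return HEIGHT * max(right - left, Fraction(0))


p_first = probability(Fraction(1, 10), Fraction(2, 10))   # 3 * (2/10 - 1/10)
p_second = probability(Fraction(1, 10), Fraction(1))      # 3 * (1/3 - 1/10)
print(p_first, p_second)
```

The second interval extends past 1/3, where the density is 0, so only the part from 0.1 to 1/3 contributes.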
Notation for continuous random variable
We can define a random variable by giving its range and probability density function. For example, we might say, let 𝑋 be a random variable with range [0,1] and pdf 𝑓(𝑥) = 2𝑥.
Implicitly, this means that 𝑋 has no probability density outside of the given range. If we wanted to be absolutely rigorous, we would say explicitly that 𝑓(𝑥) = 0 outside of [0,1], but in practice this won’t be necessary.
Example 4. Let 𝑋 be the random variable in Example 3. Find 𝑃 (𝑋 ≤ 1/2).
In words, the above questions get at the fact that the probability that a random person’s height is exactly 5’9” (to infinite precision, i.e. no rounding!) is 0.
Yet it is still possible that someone’s height is exactly 5’9”. So the answers to the thinking questions are 0, 0, and No.
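One way to see why a single exact value has probability 0 is to shrink an interval around it. The cards give no density for heights, so the sketch below uses the density 𝑓(𝑥) = 3 on [0, 1/3] from Example 2 as a stand-in; the point 0.2 and the function name interval_probability are assumptions made for illustration.

```python
def interval_probability(center, half_width):
    """P(center - h <= X <= center + h) for the density 3 on [0, 1/3]."""
    left = max(center - half_width, 0.0)           # clip to the range of X
    right = min(center + half_width, 1.0 / 3.0)
    return 3.0 * max(right - left, 0.0)


# Shrink the interval around 0.2 by a factor of 10 at each step.
half_widths = [10.0 ** -k for k in range(1, 8)]
probabilities = [interval_probability(0.2, h) for h in half_widths]
for h, p in zip(half_widths, probabilities):
    print(h, p)
```

Each probability is positive, so values near 0.2 can occur, but the probabilities shrink toward 0 as the interval closes in on the single point.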
cumulative distribution function (cdf)
notes on cdf (3)
- For discrete random variables, we defined the cumulative distribution function but did not have much occasion to use it. The cdf plays a far more prominent role for continuous random variables.
- As before, we started the integral at −∞ because we did not know the precise range of 𝑋. Formally, this still makes sense since 𝑓(𝑥) = 0 outside the range of 𝑋. In practice, we’ll know the range and start the integral at the start of the range.
- In practice we often say ‘𝑋 has distribution 𝐹 (𝑥)’ rather than ‘𝑋 has cumulative distribution function 𝐹 (𝑥).’
Example 5. Find the cumulative distribution function for the pdf 𝑓(𝑥) = 3 on [0, 1/3].
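A possible way to work Example 5: integrating the constant density 3 from 0 up to 𝑏 gives 3𝑏 for 𝑏 in [0, 1/3]; below the range the cdf is 0, and above it the cdf is 1. The sketch below encodes that piecewise function (the name cdf is an illustrative choice) and checks it against the first probability in Example 2.

```python
def cdf(b):
    """F(b) = P(X <= b) for the density 3 on [0, 1/3]."""
    if b < 0.0:
        return 0.0          # no probability to the left of the range
    if b > 1.0 / 3.0:
        return 1.0          # all of the probability lies to the left of b
    return 3.0 * b          # integral of the constant 3 from 0 to b


values = {b: cdf(b) for b in (-1.0, 0.0, 0.1, 0.2, 1.0 / 3.0, 1.0)}
difference = cdf(0.2) - cdf(0.1)    # should match P(0.1 <= X <= 0.2)
print(values)
print(difference)
```

The difference of two cdf values recovers the probability of the interval between them.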
Properties of cumulative distribution functions (cdfs) (6)
algebraic proof of property 5 of cdfs
what is another name for property 6 of cdfs?
the fundamental theorem of calculus.
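The fundamental theorem of calculus links the cdf back to the pdf: the slope of the cdf at a point is the density there. The sketch below checks this numerically, assuming the density 𝑓(𝑥) = 3 on [0, 1/3] and its cdf from Example 5; the helper numerical_derivative and the sample points are illustrative assumptions.

```python
def pdf(x):
    """Density equal to 3 on [0, 1/3] and 0 outside that range."""
    return 3.0 if 0.0 <= x <= 1.0 / 3.0 else 0.0


def cdf(b):
    """F(b) = P(X <= b) for the density above."""
    if b < 0.0:
        return 0.0
    if b > 1.0 / 3.0:
        return 1.0
    return 3.0 * b


def numerical_derivative(g, x, h=1e-6):
    """Central-difference estimate of the derivative of g at x."""
    return (g(x + h) - g(x - h)) / (2.0 * h)


# Sample points chosen away from 0 and 1/3, where the cdf has corners.
points = (-0.5, 0.1, 0.2, 0.3, 0.8)
slopes = {x: numerical_derivative(cdf, x) for x in points}
for x in points:
    print(x, slopes[x], pdf(x))
```

At the endpoints of the range the cdf has a corner, so the derivative is not defined there; the sample points avoid those two values.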
geometric proof of property 5 of cdfs
Probability density as a dartboard
We find it helpful to think of sampling values from a continuous random variable as throwing darts at a funny dartboard.
Consider the region underneath the graph of a pdf as a dartboard.
Divide the board into small equal size squares and suppose that when you throw a dart you are equally likely to land in any of the squares.
The probability the dart lands in a given region is the fraction of the total area under the curve taken up by the region.
Since the total area equals 1, this fraction is just the area of the region. If 𝑋 represents the 𝑥-coordinate of the dart, then the probability that the dart lands with 𝑥-coordinate between 𝑎 and 𝑏 is just 𝑃 (𝑎 ≤ 𝑋 ≤ 𝑏) = area under 𝑓(𝑥) between 𝑎 and 𝑏 = ∫_𝑎^𝑏 𝑓(𝑥) 𝑑𝑥.
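The dartboard picture can be simulated directly. The sketch below is a minimal illustration under stated assumptions: the density 𝑓(𝑥) = 2𝑥 on [0, 1], the interval from 0.2 to 0.5, and the function name throw_darts are all chosen for the demonstration and do not come from the cards. Darts are thrown uniformly at the rectangle [0, 1] × [0, 2]; only darts landing under the graph of the density count as hitting the board.

```python
import random


def pdf(x):
    """Illustrative density (an assumption): f(x) = 2x on [0, 1]."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def throw_darts(n, rng):
    """Return the x-coordinates of n darts that land under the graph of pdf."""
    hits = []
    while len(hits) < n:
        x = rng.uniform(0.0, 1.0)   # horizontal position of the dart
        y = rng.uniform(0.0, 2.0)   # vertical position; 2 is the maximum of pdf
        if y <= pdf(x):             # keep only darts that land on the board
            hits.append(x)
    return hits


rng = random.Random(0)              # fixed seed so that runs are repeatable
xs = throw_darts(100_000, rng)
a, b = 0.2, 0.5
estimate = sum(a <= x <= b for x in xs) / len(xs)   # fraction of darts in the strip
exact = b * b - a * a               # integral of 2x from a to b
print(estimate, exact)
```

The fraction of darts whose 𝑥-coordinate falls between 𝑎 and 𝑏 should be close to the area under the density over that interval, which is what the dartboard description predicts.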