Continuous probability distributions Flashcards
difference between discrete and continuous data
Remember: discrete data has a countable number of possible values –>discrete probability distributions can be put in tables
Continuous data can take any value within an interval, so there are uncountably many possible values –>we use a smooth function, f(x), to describe the probabilities
what is a good way to represent continuous data?
via histograms
how do observations influence the shape of the histogram?
more observations result in a smoother histogram
how to represent a continuous random variable?
a continuous random variable X is described by a probability density function f(x), such that the probability that X lies in an interval (a, b) is the area under the curve f(x) from a to b

what are features of a probability density function?
f(x) must satisfy the following:
- f(x)≥0 for all x, that is, it must be non-negative.
- The total area under the curve f(x) must equal 1.
what do areas on the histogram represent?
In each case above, the % areas of the histogram boxes (that is, the area of a box as a % of the total area of all the boxes) are providing estimates of the probabilities of intervals.
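The idea that histogram areas estimate interval probabilities can be checked with a small simulation. This is a minimal sketch, assuming equal-width boxes (so a box's share of the total area equals the fraction of observations in it) and assuming Uniform(0, 1) data; the helper name `histogram_fractions` is made up for illustration.

```python
import random

def histogram_fractions(samples, edges):
    """Fraction of samples falling in each bin [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for s in samples:
        for i in range(len(counts)):
            if edges[i] <= s < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(samples) for c in counts]

random.seed(0)  # fixed seed so the run is repeatable
samples = [random.random() for _ in range(20_000)]  # assumed Uniform(0, 1) data
edges = [0.0, 0.25, 0.5, 0.75, 1.0]  # four equal-width boxes

fractions = histogram_fractions(samples, edges)
print(fractions)  # each fraction should be close to the true probability 0.25
```

With more observations the fractions settle closer to 0.25, which matches the card on smoother histograms.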
what are possible shapes of f(x)?

how to evaluate area under curve?

for a continuous probability density function, when X takes a single specific value, e.g. X=2, what is the area?
area = 0, so P(X=2) = 0
describe the mean and variance of a continuous random variable
The mean measures the location of the distribution, the variance measures the spread of the distribution.
Find P(X>0.5)

¼
Find P(X<0.75)

1-P(X>0.75)=1-(½*¼*½)=15/16
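The figure for the two cards above is missing. The answers ¼ and 15/16 are both consistent with a triangular density f(x) = 2(1 - x) on 0 ≤ x ≤ 1, so the sketch below assumes that density; it is an inference from the answers, not something the cards state. Areas are computed as triangles, using exact fractions.

```python
from fractions import Fraction

def density(x):
    """Assumed density (not given in the cards): f(x) = 2(1 - x) on [0, 1]."""
    return 2 * (1 - x) if 0 <= x <= 1 else 0

def upper_tail(c):
    """P(X > c) for 0 <= c <= 1: triangle with base (1 - c) and height f(c)."""
    return Fraction(1, 2) * (1 - c) * density(c)

p_gt_half = upper_tail(Fraction(1, 2))                 # P(X > 0.5)
p_lt_three_quarters = 1 - upper_tail(Fraction(3, 4))   # P(X < 0.75)

print(p_gt_half, p_lt_three_quarters)  # 1/4 15/16
```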
what is uniform distribution?
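A special distribution for continuous data in which the density is constant, so all values between a and b are equally likely.
Described by the function
f(x) = 1/(b-a) for a ≤ x ≤ b, and f(x) = 0 otherwise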

what are the expectation and variance of the uniform distribution?
E(X) = (a+b)/2; V(X) = (b-a)²/12

The length of time patients wait to see a doctor is uniformly distributed between 40 minutes and 3 hours.
Let X be the waiting time in minutes.
Since f(x) = 1/(b-a), with a = 40 and b = 180 minutes:
f(x) = 1/(180-40)
f(x) = 1/140 for 40 ≤ x ≤ 180

The length of time patients wait to see a doctor is uniformly distributed between 40 minutes and 3 hours.
Let X be the waiting time in minutes.
Find the probability of waiting between 50 minutes and 2 hours.
P(50 ≤ X ≤ 120) = (120-50) × 1/140 = 0.5

The length of time patients wait to see a doctor is uniformly distributed between 40 minutes and 3 hours.
Let X be the waiting time in minutes.
Find the mean and variance of the distribution of waiting times.
E(X) = (40+180)/2 = 110 minutes
V(X) = (180-40)²/12 = 19600/12 ≈ 1633.3 minutes²

The length of time patients wait to see a doctor is uniformly distributed between 40 minutes and 3 hours.
Let X be the waiting time in minutes.
Find the probability of having to wait exactly one hour.
P(X=60) = 0
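The waiting-time cards can be checked with exact arithmetic. This is a minimal sketch: the helper name `prob_between` is made up for illustration, and the mean and variance use the standard uniform-distribution formulas (a+b)/2 and (b-a)²/12.

```python
from fractions import Fraction

a, b = 40, 180                 # waiting-time bounds in minutes (3 hours = 180)
height = Fraction(1, b - a)    # constant density f(x) = 1/(b - a)

def prob_between(lo, hi):
    """P(lo <= X <= hi): rectangle area, with the interval clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0) * height

mean = Fraction(a + b, 2)               # (a + b) / 2
variance = Fraction((b - a) ** 2, 12)   # (b - a)^2 / 12

print(height)                 # 1/140
print(prob_between(50, 120))  # 1/2
print(prob_between(60, 60))   # 0, a single point has zero width
print(mean, variance)         # 110 4900/3
```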

what is another special sort of distribution for continuous data?
The Normal Distribution
The general form of the pdf (probability density function) is given by:
f(x) = (1/(σ√(2π))) exp(-(x-µ)²/(2σ²)), for -∞ < x < ∞

describe normal distribution
Bell-shaped, symmetric about µ, reaches highest point at x=µ, tends to zero as x→±∞.

About the Normal Distribution
- E(X) = µ; V(X) = σ².
- Area under curve = 1
- Different means – shift the curve left or right along the x-axis
- Different variances – a smaller variance makes the curve taller and narrower; a larger variance makes it flatter and wider
- Shorthand notation: X~N(µ, σ²).
what do different means on a normal distribution look like?

what do different variances on normal distribution look like?

what is the standard normal distribution?
the Normal distribution with μ=0 and σ²=1
Find P(Z<-1.08)
P(Z<-1.08) = P(Z>1.08) by symmetry
= 1 – P(Z<1.08) = 1 – 0.8599 = 0.1401
OR P(Z<-1.08) = 0.1401 from tables directly.

Find P(-1.51<Z<1.08)
= P(Z<1.08) – P(Z<-1.51)
= 0.8599 – 0.0655
= 0.7944
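The table values used in these cards can be reproduced numerically. This is a minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8+) in place of printed Z tables; results are rounded to four decimal places, as tables are.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal, Z ~ N(0, 1)

p_below_neg = Z.cdf(-1.08)               # P(Z < -1.08)
p_above_pos = 1 - Z.cdf(1.08)            # P(Z > 1.08), equal to the above by symmetry
p_between = Z.cdf(1.08) - Z.cdf(-1.51)   # P(-1.51 < Z < 1.08)

print(round(p_below_neg, 4), round(p_above_pos, 4), round(p_between, 4))
```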

what do you call a random variable from a standard normal distribution?
A Standard Normal r.v., written Z~N(0,1)
how do you convert any Normal random variable to a Standard Normal Random Variable?
If X~N(μ,σ²), then use the linear transformation Z = (X-μ)/σ
The standardised value is known as the Z score.
So, for ANY random variable that comes from a normal distribution, if we subtract the mean and divide by the standard deviation, we get a r.v.~N(0,1).

Find P(Z>1.08)
= 1 – P(Z<1.08)
= 1 – 0.8599 (from tables)
= 0.1401
Suppose X~N(50, 100²). Find P(X<200)

The lengths of metallic strips produced by a machine are normally distributed with a mean of 100cm and a variance of 2.25cm². Only strips that are between 98 and 103cm are acceptable. What proportion of strips are acceptable?

A battery lasts an average of 3 years with a standard deviation of 0.5 years. Assuming that battery lives are normally distributed, find the probability that a given battery will last less than 2.3 years.
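The three Normal-distribution problems above (X~N(50, 100²), the metallic strips, and the batteries) all follow the same recipe: convert to a Z score, then look up the standard normal probability. This is a minimal sketch, using `statistics.NormalDist` instead of printed tables; the helper name `z_score` is made up for illustration, and answers from tables may differ slightly in the third or fourth decimal place because tables round z to two decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, Z ~ N(0, 1)

def z_score(x, mu, sigma):
    """Standardise x: subtract the mean, divide by the standard deviation."""
    return (x - mu) / sigma

# X ~ N(50, 100^2): the second parameter is the variance, so sigma = 100
p_x_below_200 = Z.cdf(z_score(200, 50, 100))    # z = 1.5

# Strips: mean 100 cm, variance 2.25 cm^2, so sigma = 1.5 cm
sigma_strips = 2.25 ** 0.5
p_acceptable = (Z.cdf(z_score(103, 100, sigma_strips))
                - Z.cdf(z_score(98, 100, sigma_strips)))

# Batteries: mean 3 years, standard deviation 0.5 years
p_battery_short = Z.cdf(z_score(2.3, 3, 0.5))   # z = -1.4

print(round(p_x_below_200, 4), round(p_acceptable, 4), round(p_battery_short, 4))
```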

Salaries of workers in a factory are normally distributed with mean $48,000 and standard deviation $3,500. What is the minimum salary of the top 20% of workers?
Let a be the minimum salary of the top 20% of workers, so P(X>a) = 0.2.
Standardising: P(Z > (a-48000)/3500) = 0.2
From tables, P(Z<0.84) = 0.8 (area below)
So P(Z>0.84) = 0.2 (area above)
Therefore (a-48000)/3500 = 0.84
a = 48000 + 0.84 × 3500 = $50,940
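The salary cut-off is an inverse lookup: find the value with 80% of the distribution below it. This is a minimal sketch, using `NormalDist.inv_cdf` from the Python standard library in place of reading the table backwards. Because tables round z to 0.84, the table answer of 50940 differs from the unrounded value by a few dollars.

```python
from statistics import NormalDist

salaries = NormalDist(mu=48000, sigma=3500)

z_80 = NormalDist().inv_cdf(0.8)    # z with 80% of the area below it, about 0.84
cutoff = salaries.inv_cdf(0.8)      # minimum salary of the top 20%

# Same calculation done by hand with the rounded table value z = 0.84
cutoff_from_tables = 48000 + 0.84 * 3500

print(round(z_80, 2), round(cutoff, 2), cutoff_from_tables)
```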

features of continuous probability distributions
- f(x)≥0 for all x
- Area under curve = 1
describe symmetry found in Z tables
Symmetry → P(Z<-a) = P(Z>a)
The time of failure for a continuous-operation monitoring device for air quality has a uniform distribution over a 24-hour day. If the device has a self-checking computer chip that determines whether the device is operational every hour on the hour, what is the probability that it will take at least 40 minutes to detect that a failure has occurred?
P(failure takes at least 40 minutes to detect) = P(failure occurs between 12-12.20am; or 1-1.20am; or 2-2.20am …)
There are 24 windows, each 20 minutes wide, during which the failure could occur and take at least 40 minutes to detect. Hence P(failure takes at least 40 minutes to detect) = 24 * 1/24 * 20/60 = 1/3
The time of failure for a continuous-operation monitoring device for air quality has a uniform distribution over a 24-hour day. If the device has a self-checking computer chip that determines whether the device is operational every hour on the hour, what is the probability that a failure will be detected within 10 minutes of its occurrence?
P(failure detected within 10 minutes) = P(failure occurs between 12.50-1am; or 1.50-2am; or 2.50-3am …)
There are 24 windows, each 10 minutes wide, during which the failure could occur and be detected within 10 minutes. Hence P(failure detected within 10 minutes) = 24 * 1/24 * 10/60 = 1/6
The time of failure for a continuous-operation monitoring device for air quality has a uniform distribution over a 24-hour day. If a failure occurs on a day when it is daylight between 5.55am and 7.38pm, what is the probability that the failure will occur during daylight hours?
Let X be the time of failure, past midnight, in hours.
X has a uniform distribution, with positive probability from a lower bound of 0 to an upper bound of 24.
f(x) = 1/24, 0 < x < 24
P(failure occurs in daylight) = P(failure is between 5.55am and 7.38pm)
This probability is calculated by finding the area of the rectangle with height 1/24 and base length (5.55am to 7.38pm) = 13 hours and 43 minutes = 13 43/60 hours.
Hence P(failure in daylight) = 1/24 * (13 43/60) = 823/1440
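The three monitoring-device answers all come from adding up rectangle areas under a constant density. This is a minimal sketch working in minutes rather than hours, so the density is 1/1440 per minute; the helper name `prob_of_windows` is made up for illustration.

```python
from fractions import Fraction

MINUTES_PER_DAY = 24 * 60
density = Fraction(1, MINUTES_PER_DAY)  # uniform density per minute of the day

def prob_of_windows(n_windows, width_minutes):
    """Total probability of n disjoint windows, each width_minutes wide."""
    return n_windows * width_minutes * density

# At least 40 minutes to detect: failure in the first 20 minutes of any hour
p_at_least_40 = prob_of_windows(24, 20)

# Detected within 10 minutes: failure in the last 10 minutes of any hour
p_within_10 = prob_of_windows(24, 10)

# Daylight: one window from 5.55am to 7.38pm
daylight_start = 5 * 60 + 55    # minutes past midnight
daylight_end = 19 * 60 + 38
p_daylight = prob_of_windows(1, daylight_end - daylight_start)

print(p_at_least_40, p_within_10, p_daylight)  # 1/3 1/6 823/1440
```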