Probability Distributions Flashcards
If ________ is collected from an experiment it can be modelled by a _____________ ______________. The probability of each outcome determines the shape (or distribution).
data
probability distribution
What is the definition of Random Variables, X?
A variable whose outcomes (values) are determined by the result of an experiment or observation. Cannot be predicted in advance.
What notation is used in probability distributions for random variables?
Uppercase X represents the name of a random variable (e.g. X=result of flipping a coin). Lowercase 𝑥 represents the values X can take.
What are the 3 types of random variables?
1) Descriptive
2) Discrete
3) Continuous
What is the definition of a Descriptive random variable?
Usually a word or label, e.g X=colour, breed of sheep etc (not included in this unit of work)
What is the definition of a Discrete random variable?
Usually whole numbers. E.g X=number of pets in a household. It is possible to calculate P(X=𝑥) e.g P(X=2) is the probability the number of pets in a household is 2.
What is the definition of a Continuous random variable?
Determined by a measurement, e.g. X=time taken to run 100m. Can take on any value on a scale. We cannot find the probability of one exact value, e.g. P(X=13sec), because values are rounded and usually recorded in groups, so instead we find the probability of an interval, e.g. P(X<15) or P(10<X<12) (in seconds).
When describing distributions what 3 things do you need to include?
1) The highest & lowest values
2) The mode (most common/peak)
3) The shape
What are Parameters?
Values calculated from the data that describe a distribution, e.g. median, mean etc.
What is the Mean?
Average
What is Standard Deviation?
Spread
What is the full definition of the Mean?
The expected value of a random variable (a kind of theoretical AVERAGE of the values).
Expected value = E(X) = mean = μ (Greek letter mu, pronounced “mew”)
What is the notation for the Mean?
- Expected value E(X)
- μ (Greek letter mu, pronounced “mew”)
- Σ 𝑥 × P(X=𝑥)
Find K:
_________________________
𝑥 | 1 | 2 | 3 | 4 | 5 |
_________________________
P(X=𝑥) | 0.1 | K | 0.2 | 0.1 | 0.3 |
_________________________
(the probabilities must all add to 1)
0.1 + 0.2 + 0.1 + 0.3 = 0.7
1 - 0.7 = 0.3
K = 0.3
Find E(X) (The expected value of X):
_________________________
𝑥 | 1 | 2 | 3 | 4 | 5 |
_________________________
P(X=𝑥) | 0.1 | 0.3 | 0.2 | 0.1 | 0.3 |
_________________________
(K = 0.3, so the probabilities all add to 1)
E(X) = (1 × 0.1) + (2 × 0.3) + (3 × 0.2) + (4 × 0.1) + (5 × 0.3) = 3.2
E(X)=3.2
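As a quick check of the two cards above, here is a minimal Python sketch. The function names `missing_probability` and `expected_value` are made up for this illustration; they are not from the formula sheet or the calculator.

```python
def missing_probability(known_probs):
    """Return the one unknown probability, using the fact that all probabilities sum to 1."""
    return 1 - sum(known_probs)


def expected_value(values, probs):
    """E(X) = sum of x * P(X = x) over every value x."""
    return sum(x * p for x, p in zip(values, probs))


# Table from the cards: x = 1..5 with P(X = 2) = K unknown.
k = missing_probability([0.1, 0.2, 0.1, 0.3])
values = [1, 2, 3, 4, 5]
probs = [0.1, k, 0.2, 0.1, 0.3]
mean = expected_value(values, probs)

print(round(k, 4), round(mean, 4))  # 0.3 3.2
```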
What is Standard Deviation and Variance?
Measures of spread
What is the symbol and notation for Standard Deviation?
σ (sigma)
SD(X)
What are the 3 equations used for standard deviation?
σ = SD(X)
= √[ Σ(𝑥 - μ)² × P(X=𝑥) ]
= √( E(X²) - [E(X)]² )
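The two square-root formulas always give the same answer. The sketch below checks this on the table from the "Find K" card (with K = 0.3); the function names are hypothetical and chosen only for this example.

```python
import math


def mean(values, probs):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in zip(values, probs))


def sd_from_deviations(values, probs):
    """SD(X) = sqrt( sum of (x - mu)^2 * P(X = x) )."""
    mu = mean(values, probs)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in zip(values, probs)))


def sd_from_squares(values, probs):
    """SD(X) = sqrt( E(X^2) - [E(X)]^2 )."""
    mu = mean(values, probs)
    e_x2 = sum(x ** 2 * p for x, p in zip(values, probs))
    return math.sqrt(e_x2 - mu ** 2)


values = [1, 2, 3, 4, 5]
probs = [0.1, 0.3, 0.2, 0.1, 0.3]

sd_a = sd_from_deviations(values, probs)
sd_b = sd_from_squares(values, probs)
print(round(sd_a, 4), round(sd_b, 4))  # 1.4 1.4
```

Here E(X) = 3.2 and E(X²) = 12.2, so the variance is 12.2 - 3.2² = 1.96 and the standard deviation is 1.4.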
What is the definition of Standard deviation?
Standard deviation tells us how spread out the data is from the mean (the greater the spread, the higher the SD(X)). See the formula sheet.
What do you need to keep in mind for Variance?
Variance is found by squaring the standard deviation. It is used purely for its link to the standard deviation.
What is the symbol and notation for Variance?
VAR(X)
σ²
What are the steps for using the Graphics Calculator to find E(X) and SD(X)?
Go to STAT menu
To clear table: F6 then F4 (Above Del A) then F1
To enter n values into table List 1 (0 then EXE then 1 then EXE…) Then enter into List 2
F2 to select Calc
F6 to select Set
1 Var X list: List 1
1 Var Freq: List 2
EXE
F1 = 1 Var (calculation answers)
Expectation Algebra: What happens if all values are doubled?
_________________________
y | 2 | 4 | 6
_________________________
P(Y=y) | 3/8 | 1/8 | 4/8 |
_________________________
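Here Y = 2X, where X takes the values 1, 2, 3 with the same probabilities, so E(X) = 2.125 and VAR(X) = 0.859.
new E(Y) = 2 × E(X) = 2 × 2.125 = 4.25
new VAR(Y) = 2² × VAR(X) = 2² × 0.859 = 3.44 (2dp)
new SD(Y) = √VAR(Y) = 1.854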
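The doubling rule can be checked directly from the table. This sketch assumes the original X takes the values 1, 2, 3 (each y value halved) with the same probabilities; the helper names are hypothetical.

```python
import math


def mean(values, probs):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """VAR(X) = E(X^2) - [E(X)]^2."""
    mu = mean(values, probs)
    return sum(x ** 2 * p for x, p in zip(values, probs)) - mu ** 2


probs = [3 / 8, 1 / 8, 4 / 8]
x_values = [1, 2, 3]                   # assumed original values
y_values = [2 * x for x in x_values]   # every value doubled: 2, 4, 6

e_x, var_x = mean(x_values, probs), variance(x_values, probs)
e_y, var_y = mean(y_values, probs), variance(y_values, probs)
sd_y = math.sqrt(var_y)

print(e_x, round(var_x, 3))                   # 2.125 0.859
print(e_y, round(var_y, 4), round(sd_y, 3))   # 4.25 3.4375 1.854
```

The mean is multiplied by 2, the variance by 2² = 4, and the standard deviation by 2.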
For combinations of random variables, what are the 2 equations?
1) E(aX + bY) = aE(X) + bE(Y)
2) Var(aX + bY) = a²Var(X) + b²Var(Y) (only when X and Y are independent)
Where are the equations for independent random variables X and Y found?
On the formula sheet
Question:
If E(X) = $4.20
Var(X) = 0.80
E(Y) = $3.50
Var(Y) = 1.20
and X and Y are independent, find:
a) E(X + Y)
b) E(2X + 7Y + 1)
c) Var(2X + 7Y + 1)
d) SD (3X - 2Y - 5)
a) E (X + Y) = $4.20 + $3.50 = $7.70
b) E(2X + 7Y + 1) = 2 × 4.20 + 7 × 3.50 + 1 = $33.90
c) Var(2X + 7Y + 1) = 2² × 0.80 + 7² × 1.20 = 62 (adding 1 does not change the variance)
d) SD(3X - 2Y - 5) =
Step 1 find Var: Var(3X - 2Y - 5) = 3² × 0.8 + (-2)² × 1.2 = 7.2 + 4.8 = 12 (variances always add, because the coefficient is squared)
Step 2 use: SD = √Var
SD(3X - 2Y - 5) = √12 = 3.46 (2dp)
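The four answers can be checked with a short sketch of the two combination rules. The function names `combo_mean` and `combo_var` are invented for this example, and `combo_var` is only valid when X and Y are independent. Note that a negative coefficient is squared, so the variances are added, never subtracted.

```python
import math


def combo_mean(a, mean_x, b, mean_y, c=0):
    """E(aX + bY + c) = a*E(X) + b*E(Y) + c."""
    return a * mean_x + b * mean_y + c


def combo_var(a, var_x, b, var_y):
    """Var(aX + bY + c) = a^2*Var(X) + b^2*Var(Y), for independent X and Y.

    The constant c shifts every value equally, so it does not affect the spread.
    """
    return a ** 2 * var_x + b ** 2 * var_y


e_x, var_x = 4.20, 0.80
e_y, var_y = 3.50, 1.20

ans_a = combo_mean(1, e_x, 1, e_y)                 # E(X + Y)
ans_b = combo_mean(2, e_x, 7, e_y, 1)              # E(2X + 7Y + 1)
ans_c = combo_var(2, var_x, 7, var_y)              # Var(2X + 7Y + 1)
ans_d = math.sqrt(combo_var(3, var_x, -2, var_y))  # SD(3X - 2Y - 5)

print(round(ans_a, 2), round(ans_b, 2), round(ans_c, 2), round(ans_d, 2))
# 7.7 33.9 62.0 3.46
```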
Question:
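Var(X) = 1.78 and Var(Y) = 2.1
If Var(X + Y) = 3.24 are X and Y independent?
If X and Y are independent Var(X + Y) = 1.78 + 2.1 = 3.88
However Var(X + Y) = 3.24
3.24 ≠ 3.88 therefore X and Y are NOT independent. (Note: E(X + Y) = E(X) + E(Y) is always true, so only the variance can be used for this check.)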
What are the 2 types of data used in distribution models?
Discrete and Continuous
What are the 2 distributions used for Discrete data?
1) Poisson
2) Binomial
What are the 3 distributions used for Continuous data?
1) Triangular
2) Rectangular
3) Normal
What is Rectangular Distribution (a.k.a. uniform)?
It is for continuous data. The probability is the same for all intervals of equal width. A minimum value and a maximum value are given (note: no mode is given).
What is ƒ(𝑥)?
Probability Density Function
What are the 2 parameters for Rectangular Distribution?
a = minimum value
b = maximum value
What is the formula for rectangular distribution?
ƒ(𝑥) = 1/(b - a) for a ≤ 𝑥 ≤ b
ƒ(𝑥) = 0 elsewhere
Question:
What is the probability a person waits for fewer than 12.5 mins on a Monday? (fewer than 12.5 mins = between 0 and 12.5 min)
Note:
a=min time=0
b=max time=20
c=lower bound of desired interval=0
d=upper bound of desired interval=12.5
Formula to find probability = (d - c) / (b - a) for rectangular
P(X<12.5) = (12.5 - 0) / (20 - 0) = 0.625
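The same calculation as a minimal sketch. The function name `rectangular_prob` is made up here; clipping the interval to [a, b] is an extra safety step, not something the card asks for.

```python
def rectangular_prob(a, b, c, d):
    """P(c < X < d) for a rectangular (uniform) distribution on [a, b].

    Equals (d - c) / (b - a) when the interval lies inside [a, b].
    """
    # The density is 0 outside [a, b], so only the overlapping part counts.
    lower, upper = max(a, c), min(b, d)
    return max(upper - lower, 0) / (b - a)


# Waiting time between 0 and 20 minutes; probability of waiting under 12.5 minutes.
p = rectangular_prob(a=0, b=20, c=0, d=12.5)
print(p)  # 0.625
```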
What is Triangular Distribution?
For continuous data. Min and Max AND a Mode is given. Parameters:
a=lowest value
b=highest value
c=mode (peak)
How do you find peak height in Triangular Distribution?
h = 2 / (b - a)
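The peak height comes from the total area of the triangle being 1 (area = ½ × base × height). A small sketch with hypothetical values a = 10 and b = 30, which are not from the cards:

```python
def triangular_peak_height(a, b):
    """Height of the triangular density at the mode: h = 2 / (b - a)."""
    return 2 / (b - a)


a, b = 10, 30  # hypothetical lowest and highest values
h = triangular_peak_height(a, b)

# The area under the whole triangle must be 1: half * base * height.
area = 0.5 * (b - a) * h
print(h, area)  # 0.1 1.0
```

The height does not depend on where the mode c sits, only on the distance between a and b.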
Where are the equations for triangular distribution?
On the formula sheet (too hard to put on here)
What is Normal Distribution?
Used for continuous data. No UPPER OR LOWER limit to the values it can take. The distribution is BELL-SHAPED and SYMMETRICAL
What are the parameters for Normal Distribution?
μ = mean (the middle, because of symmetry)
σ = standard deviation (spread)
Ƶ = the number of standard deviations a value “𝑥” is from the mean (to the left or right).
What are Ƶ values?
Our 𝑥 values are not always exactly 1,2 or 3σ from the mean. Ƶ VALUES tell us HOW MANY STANDARD DEVIATIONS an 𝑥 value is from the mean.
What is the formula for calculating a Ƶ value?
Ƶ = (𝑥 - μ) / σ
What are the steps to finding the Ƶ value?
1) Plug numbers into formula
2) calculate
3) round to (3dp)
4) Look up the number in the table on formula sheet
5) That’s your answer
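The steps above can be sketched in Python, using the standard library's `statistics.NormalDist` in place of the printed table. The numbers (μ = 170, σ = 8, 𝑥 = 180) are hypothetical and chosen only to illustrate the method.

```python
from statistics import NormalDist


def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the mean: z = (x - mu) / sigma."""
    return (x - mu) / sigma


mu, sigma, x = 170, 8, 180   # hypothetical values
z = z_value(x, mu, sigma)

# Area to the left of z under the standard normal curve, P(Z < z).
p_below = NormalDist().cdf(z)
# Some printed tables give the area between 0 and z instead, P(0 < Z < z).
p_zero_to_z = p_below - 0.5

print(z, round(p_below, 4), round(p_zero_to_z, 4))  # 1.25 0.8944 0.3944
```

Check which area your own table gives before reading the answer off it.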
What is Inverse Normal Distribution?
Normal Distribution in reverse.
Recall: Ƶ = (𝑥 - μ) / σ
In these questions you will be given 2 of 𝑥, μ or σ as well as a probability from which you can use the normal distribution tables in reverse to get the Ƶ value. You will then use the above equation to work out the missing value.
What are the 4 steps for Inverse Normal Distribution?
1) Draw graph.
2) Look up the probability in the middle of the table and work back to the Ƶ value.
3) Calculate missing value using Ƶ = (𝑥 - μ) / σ
4) Write in context
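A minimal sketch of an inverse normal problem, again using `statistics.NormalDist` instead of the table. The values μ = 50, σ = 4 and the probability 0.90 are hypothetical; here 𝑥 is the missing value.

```python
from statistics import NormalDist

mu, sigma = 50, 4   # hypothetical mean and standard deviation
p = 0.90            # hypothetical: we want x such that P(X < x) = 0.90

# Step 2: work back from the probability to the z value.
z = NormalDist().inv_cdf(p)

# Step 3: rearrange z = (x - mu) / sigma to get x = mu + z * sigma.
x = mu + z * sigma

print(round(z, 4), round(x, 2))  # 1.2816 55.13
```

If μ or σ is the missing value instead, the same equation is rearranged for that letter.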
What is Poisson Distribution (Discrete data)?
It is for rare events, so the distribution is usually skewed to the right. It is used for discrete data collected in a “finite continuous space”, i.e. a given time period, a specified area etc. E.g. number of shark attacks at a beach in one year, or number of tornadoes in a district in a season.
What is the parameter for Poisson Distribution?
(lambda) λ = the average number of occurrences in a finite continuous space.
λ = variance also
(Therefore √λ = standard deviation).
What are values of λ estimated from?
Real life data. E.g λ = 2 shark attacks a year.
What are the steps for Poisson Distribution in the Graphics Calculator?
Menu then Stat then F5 (dist) then F6 (▻) then F1 (Poisn) then F1 (Ppd) or F2 (Pcd)
What is Ppd?
The probability of just one value, P(X=𝑥)
What is Pcd?
The cumulative probability of more than one value, P(X≤𝑥)
Question:
Quitline receives on average 8 calls per hour. What is the probability that it receives exactly 3 calls per hour? Hint use Ppd
Data : Variable
λ : 8
𝑥 : 3
P(X=3) = 0.0286
Question:
Quitline receives on average 8 calls per hour. What is the probability that it receives less than 5 calls per hour? Hint use Pcd
λ : 8
𝑥 : 4
P(X<5) = 0.0996 (4dp)
X<5 is the same as X≤4
Question:
Quitline receives on average 8 calls per hour. What is the probability that it receives at least 6 calls per hour? Hint use Pcd
λ : 8
at least 6: 6, 7, 8, 9………..
𝑥 : 5 (G.C works out less than or equal to 𝑥)
P(X≥6) = 1 - 0.1912 (4dp) = 0.8088 (4dp)
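The three Quitline answers can be reproduced without a calculator. This sketch uses the standard Poisson formula P(X=𝑥) = e^(-λ) × λ^𝑥 / 𝑥!; the function names `poisson_pmf` and `poisson_cdf` are invented for this example and stand in for the calculator's Ppd and Pcd.

```python
import math


def poisson_pmf(x, lam):
    """P(X = x): probability of exactly x occurrences (like Ppd)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x): probability of x or fewer occurrences (like Pcd)."""
    return sum(poisson_pmf(i, lam) for i in range(x + 1))


lam = 8  # average of 8 calls per hour

p_exactly_3 = poisson_pmf(3, lam)        # P(X = 3)
p_fewer_than_5 = poisson_cdf(4, lam)     # P(X < 5) = P(X <= 4)
p_at_least_6 = 1 - poisson_cdf(5, lam)   # P(X >= 6) = 1 - P(X <= 5)

print(round(p_exactly_3, 4), round(p_fewer_than_5, 4), round(p_at_least_6, 4))
# 0.0286 0.0996 0.8088
```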
What symbol does a graphics calculator use to depict standard form?
G.C uses:
2.0006ᴇ⁻³
Instead of:
2.0006 x10⁻³
What are the 4 different conditions required for Poisson?
1) Each occurrence is independent of (doesn’t influence) the others.
2) Events must not occur simultaneously (not for events over a long period).
3) Events must occur randomly and unpredictably.
4) For a small interval the probability of an event occurring is proportional to the size of the interval.
What is the formula you use for an Inverse Poisson Problem?
To find λ, need P(X=0)
Then use formula:
λ = -ln(P(X=0)), because P(X=0) = e^(-λ)
ln means natural log
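For a Poisson distribution P(X=0) = e^(-λ), so taking natural logs gives λ = -ln(P(X=0)). A minimal sketch, with a hypothetical P(X=0) = 0.1353 and a made-up function name:

```python
import math


def poisson_rate_from_p_zero(p_zero):
    """Recover lambda from P(X = 0), using P(X = 0) = e^(-lambda)."""
    return -math.log(p_zero)


lam = poisson_rate_from_p_zero(0.1353)  # hypothetical probability of no occurrences
print(round(lam, 2))  # 2.0
```

The minus sign matters: P(X=0) is less than 1, so its natural log is negative, and λ must come out positive.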
What is Binomial Distribution?
For discrete data. There are only 2 outcomes, we refer to these as success or failure.
E.g effectiveness of a drug → effective or not
Rolling a dice and getting a 6 → getting a 6 or not
What are the 4 conditions for Binomial Distribution?
1) There is a fixed number of identical trials
2) the trials are independent
3) there are only 2 possible outcomes
4) the probability of success on any trial is constant (doesn’t change)
What are the parameters for Binomial Distribution?
π or p = probability of success (i.e favourable outcome occurring)
n = number of trials
μ = mean = np (or nπ)
on the formula sheet it’s μ=nπ
σ = standard deviation = √(np(1-p)) or √(nπ(1-π)) ← that one is what’s on the formula sheet
Therefore variance = np(1-p)
Binomial Distribution graphs points to remember:
- Unimodal (one peak)
- Could be skewed to the right or left or symmetrical
- Fixed number of trials
- 2 outcomes
- Independent trials
- Probability success=constant
Steps for using Binomial Distribution on graphics calculator
Menu then Stat then F5 (dist) then F5 (Binm) then F1 (Bpd) or F2 (Bcd)
𝑥 = number of interest
numtrial = how many in the trial, n
p = probability success of 1 trial
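The calculator's Bpd and Bcd results can be reproduced with the standard binomial formula P(X=𝑥) = C(n, 𝑥) × p^𝑥 × (1-p)^(n-𝑥). The sketch below uses a hypothetical example of n = 10 coin flips with p = 0.5; the function names are invented for this illustration.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n trials (like Bpd)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def binomial_cdf(x, n, p):
    """P(X <= x): probability of x or fewer successes (like Bcd)."""
    return sum(binomial_pmf(i, n, p) for i in range(x + 1))


n, p = 10, 0.5   # hypothetical: 10 flips of a fair coin, success = heads

mean = n * p                       # mu = np
sd = math.sqrt(n * p * (1 - p))    # sigma = sqrt(np(1 - p))
p_exactly_3 = binomial_pmf(3, n, p)
p_at_most_3 = binomial_cdf(3, n, p)

print(mean, round(sd, 4))                           # 5.0 1.5811
print(round(p_exactly_3, 4), round(p_at_most_3, 4)) # 0.1172 0.1719
```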
What do you need to know for Inverse Binomial Distribution?
Must know P(X=0) (can get it from P(X≥1) if needed, since P(X=0) = 1 - P(X≥1)) or P(X=n), e.g. if there are 30 trials then need P(X=30).
You need either n (the number of trials/items) or p/π (the probability of success) to find the other.
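These problems work because P(X=0) = (1-p)^n, which can be rearranged for either p or n. A minimal sketch with hypothetical numbers (P(X=0) = 0.32768, n = 5, p = 0.2) and made-up function names:

```python
import math


def p_from_p_zero(p_zero, n):
    """Find p from P(X = 0) = (1 - p)^n, given n: p = 1 - P(X = 0)^(1/n)."""
    return 1 - p_zero ** (1 / n)


def n_from_p_zero(p_zero, p):
    """Find n from P(X = 0) = (1 - p)^n, given p: n = ln(P(X = 0)) / ln(1 - p)."""
    return math.log(p_zero) / math.log(1 - p)


p_zero = 0.32768   # hypothetical P(X = 0)

p = p_from_p_zero(p_zero, n=5)
n = n_from_p_zero(p_zero, p=0.2)

print(round(p, 4), round(n, 4))  # 0.2 5.0
```

When P(X=n) is given instead, the matching relationship is P(X=n) = p^n.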