Lecture Five Flashcards
What is discrete probability
When the random variable can take only a finite (or countable) number of outcomes
An example of a discrete probability
Flipping a coin: 2 outcomes, heads or tails
What is an example of continuous probability
The normal distribution
What can be calculated once a probability distribution is constructed
The probability of any event of interest, i.e. any subset of the real number line
Why is it difficult to identify individual outcomes in the real world
Because the sample space is complex and large
Why are there many types of probability distributions, and what is the result of this
Standard distributions exist as analytic functions for various types of random variables
=we are able to calculate probabilities for events of interest without constructing the probability distribution ourselves
What is the pdf for a random variable
The probability distribution function describes how probabilities are distributed over the values of a random variable
-for a discrete random variable, it tells us the probability of a particular value coming out
What is the pdf of a discrete random variable also known as
Probability mass function (pmf)
What is the cdf for a random variable
The cumulative distribution function of a random variable describes how probability accumulates over the values of the random variable
-tells us how likely the value is to be less than or equal to some given value
What would cdf be
a sum over a range
-cdf at x = sum of the pdf over all values up to and including x
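The running-sum idea can be sketched in a few lines of Python. The fair six-sided die is an assumed example, not one from the lecture, and the names `pmf` and `cdf` are illustrative.

```python
from itertools import accumulate

# Assumed example: a fair six-sided die, each face has probability 1/6.
pmf = {face: 1 / 6 for face in range(1, 7)}

# The cdf at x is the running sum of the pmf over all values <= x.
cdf = dict(zip(pmf.keys(), accumulate(pmf.values())))

print(cdf[3])  # P(X <= 3), roughly one half
print(cdf[6])  # P(X <= 6), roughly 1 because all outcomes are included
```

The last cdf value should always come out at (approximately) 1, which is a quick check that the pmf was set up correctly.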
What is the Bernoulli model a building block for
binomial distribution
What does the Bernoulli model answer
Whether a single trial is a success or a failure
What type of data does the Bernoulli model take
discrete
-takes one of two outcomes, success or failure, with nothing in between
What must the probabilities in the Bernoulli model sum to
The two probabilities must sum to 1
What would the value of success be and the value of failure
Success = 1
P(1) = p
Failure = 0
P(0) = 1-p
-we want to find P(X = x), the probability that X takes the value x
How to compute the mean of the Bernoulli model
E(X) = p
E(X) = 0(1-p) + 1(p) = p
How to compute the variance for the Bernoulli model
Probability of success * probability of failure
p*(1-p)
What is the Bernoulli model based on
One specific trial
How to work out the mean and variance if the probability of success is 0.4
Mean is 0.4
-because mean = p
The variance: 1-0.4 = 0.6
0.6 x 0.4 = 0.24
v = 0.24
-because variance = p*(1-p)
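The worked example above can be checked with a small Python sketch. The function name `bernoulli_mean_var` is hypothetical, chosen only for illustration.

```python
def bernoulli_mean_var(p):
    """Return (mean, variance) of a Bernoulli variable with success probability p."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    mean = 0 * (1 - p) + 1 * p   # E(X) = 0*P(0) + 1*P(1) = p
    variance = p * (1 - p)       # success probability times failure probability
    return mean, variance

mean, variance = bernoulli_mean_var(0.4)
print(mean, variance)  # about 0.4 and 0.24, up to floating-point rounding
```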
What is the binomial distribution?
important generalization of the Bernoulli distribution
What are the properties of the binomial distribution?
1- the experiment consists of a sequence of n identical trials (more than 1 trial/round)
2- it has 2 outcomes: success/failure for each trial
3- The probability of success is denoted by p. It does not change from trial to trial
4- The trials are independent of each other
What are we interested in for the binomial distribution
The number of successes occurring in the n trials
What denotes the number of successes
X
Formula of binomial distribution
P(X = x) = nCx * p^x * (1-p)^(n-x)
n: The number of trials
x: The number of successes
p: The probability of success in a single trial
q: The probability of failure in a single trial, which is equal to 1 - p
nCx: The number of combinations of x items chosen from n
Mean of the binomial distribution
Probability * number of trials
Variance of binomial distribution
p*(1-p) multiplied by the number of trials: n*p*(1-p)
What is the formula of the binomial distribution using combinations
(n!/((n-x)! x!)) * p^x * (1-p)^(n-x)
(n!/((n-x)! x!)) : refers to the number of experimental outcomes providing exactly x successes in n trials
p^x * (1-p)^(n-x) : is the probability of a particular sequence of trial outcomes with x successes in n trials
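As a rough illustration, the formula maps directly onto Python's `math.comb`. The function name `binomial_pmf` and the numbers n = 3, p = 0.5 are assumptions for the example, not values from the lecture.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x): probability of exactly x successes in n independent trials."""
    ways = comb(n, x)                     # n! / ((n-x)! x!)
    one_sequence = p**x * (1 - p)**(n - x)  # probability of one particular sequence
    return ways * one_sequence

# Assumed example: 3 fair coin flips, exactly 2 heads.
print(binomial_pmf(2, 3, 0.5))

# The probabilities over x = 0..n should add up to (approximately) 1.
print(sum(binomial_pmf(x, 3, 0.5) for x in range(4)))
```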
What does the binomial distribution formula for combinations do?
Counts all the ways you can get exactly x successes in n trials
-all the combinations that give x successes are computed here
-then this count is multiplied by the probability of one particular sequence with x successes and n-x failures
How to compute the binomial formula in Excel, and explain each element
=BINOM.DIST(number_s, trials, probability_s, cumulative)
-number_s = the number of successes in the trials (the x in the formula)
-trials = the number of independent trials (n in the formula)
-probability_s = the probability of success
-cumulative is the logical value that determines the form of the function:
-TRUE = BINOM.DIST returns the cumulative distribution function (cdf)
-FALSE = BINOM.DIST returns the probability mass function (pdf)
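For checking spreadsheet results outside Excel, a minimal Python stand-in might look like the following. `binom_dist` is a hypothetical helper that only mirrors the four arguments described above. It is not Excel's implementation, and the values 2, 5 and 0.5 are assumed for illustration.

```python
from math import comb

def binom_dist(number_s, trials, probability_s, cumulative):
    """Hypothetical stand-in that mirrors the four arguments described above."""
    def pmf(x):
        return comb(trials, x) * probability_s**x * (1 - probability_s)**(trials - x)

    if cumulative:
        # cdf: add up the pmf for 0, 1, ..., number_s
        return sum(pmf(x) for x in range(number_s + 1))
    return pmf(number_s)

exactly_two = binom_dist(2, 5, 0.5, False)  # P(X = 2)
at_most_two = binom_dist(2, 5, 0.5, True)   # P(X <= 2)
print(exactly_two, at_most_two)
```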
How to work out the mean of the Binomial Probability Distribution
E(X) = np
=the probability x number of trials
-the mean of Bernoulli (which is also equal to p) times the trials
How to work out the variance of the Binomial Probability Distribution
mean * (1-p)
=np*(1-p)
How to work out the standard deviation of the binomial distribution
square root of np*(1-p)
-square root of the variance
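The three summary formulas can be sketched together in Python. The function name `binomial_summary` and the values n = 10, p = 0.4 are assumptions for illustration.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) for n trials with success probability p."""
    mean = n * p                 # E(X) = np
    variance = mean * (1 - p)    # np(1-p)
    std_dev = sqrt(variance)     # square root of the variance
    return mean, variance, std_dev

mean, variance, std_dev = binomial_summary(10, 0.4)
print(mean, variance, std_dev)
```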
How does the variance link to the probability of success
variance is largest when p = 0.5 (the outcome is most uncertain)
variance shrinks towards 0 as p gets close to 0 or 1 (the outcome is almost certain)
When do u use the binomial distribution
-when there are two outcomes: yes/no, true/false etc
What to check before computing the binomial distribution
1- application has many trials that each have only two outcomes
2- probability is the same for each trial
3- the probability of one trial doesn’t affect the probability of other trials
What do you do if you want to find the probability of less than a certain value for the binomial distribution in Excel
want to work out e.g. P(less than 2)
-then that means P(x<2) ALSO means P(x≤1)
-because binomial is discrete: it can't take decimal values
so:
=BINOM.DIST(1,n,p,TRUE)
-1 because you want to find values less than or equal to 1
-want 0,1
What do you do if you want to find the probability of more than a certain value for the binomial distribution in Excel
-want to work out e.g. P(more than 4)
-then means P(x>4) MEANS 1-P(x≤4)
-takes away the probabilities of the values up to and including 4
So:
=1-BINOM.DIST(4,n,p,TRUE)
-minus because you don't want the values up to and including 4, only those above it
-want 5,6,7,etc
What do you do if you want to find the probability of at least a certain value for the binomial distribution in Excel
-want to work out P(at least 3)
-want nothing less than 3
-P(x ≥ 3) ALSO MEANS 1-P(x≤2)
So:
=1-BINOM.DIST(2,n,p,TRUE)
because want to work out nothing less than 3, want 3,4,5 etc
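The three cases (less than, more than, at least) can be checked side by side in Python. This is a sketch with assumed values n = 10 and p = 0.5, and `binom_cdf` is a hypothetical helper playing the role of the TRUE form of the spreadsheet function.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for a binomial variable with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1))

n, p = 10, 0.5  # assumed values for illustration

less_than_2 = binom_cdf(1, n, p)       # P(X < 2)  = P(X <= 1), covers 0,1
more_than_4 = 1 - binom_cdf(4, n, p)   # P(X > 4)  = 1 - P(X <= 4), covers 5,6,...
at_least_3 = 1 - binom_cdf(2, n, p)    # P(X >= 3) = 1 - P(X <= 2), covers 3,4,...

print(less_than_2, more_than_4, at_least_3)
```

The only thing that changes between the cases is which cutoff goes into the cdf and whether the result is subtracted from 1.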
what values can a random variable take in a continuous distribution
it can take any possible value inside a range
are there limited numbers in continuous
no, there are infinitely many values, e.g. between 0-1 you can have 0.1,0.2,0.3 but also 0.11,0.12 etc
What is important when working out the continuous probability distribution
compute probability over a range of values, not for specific values (the probability of any single exact value is 0)
-ask whether the random variable X is between 2 numbers
-the probability is the chance that X falls between these two numbers
what does the CDF express as a probability
expresses the probability that X does not exceed the value x:
F(x) = P(X ≤ x)
-it accumulates probability over the whole range of values up to x
-F(x) itself always lies between 0 and 1
How to find the probability that continuous random variables fall into a specific range
-need to find difference between the CDF at the upper end of the range and the CDF at the lower end of the range
P(a < X < b) = F(b) - F(a)
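A quick way to try this is Python's `statistics.NormalDist`, which provides a `cdf` method. The standard normal distribution and the range from -1 to 1 are assumptions chosen only for illustration.

```python
from statistics import NormalDist

# Assumed example: standard normal distribution (mean 0, standard deviation 1).
dist = NormalDist(mu=0, sigma=1)

a, b = -1, 1  # lower and upper end of the range

# P(a < X < b) = F(b) - F(a)
prob_between = dist.cdf(b) - dist.cdf(a)
print(round(prob_between, 4))  # roughly 0.68 for one standard deviation either side
```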
Why do we subtract lower from upper in trying to find the probability range using cdf
because the cumulative function never decreases from left to right
-if b>a then the cumulative function F(b) has to be ≥ the cumulative function F(a)
=because F(b) already contains all the probability accumulated up to a
-has to be at least that value OR larger
Explain in graph how we would find the specific range
area up to the upper bound - area up to the lower bound = the middle area
probability of the random variable assuming a value within some range
=the area under the pdf (probability density function) curve over that range
What are the two common continuous distributions
Uniform and Normal
What is Uniform distribution
it gives the same probability density to every value inside the range from a (lower bound) to b (upper bound)
when is uniform distribution used
When probabilities of outcomes in the same sample space are the same
What is the pdf of uniform distribution
1/(b-a) for a ≤ x ≤ b, and 0 outside the range
1 / (upper bound - lower bound)
what is the mean of the uniform distribution
(a+b) / 2
-sum of the bounds / 2, i.e. the midpoint of the range
What is the variance of the pdf uniform
squared difference of the upper and lower bound / 12
(b-a)^2 / 12
What to do if you want to find the probability of a new range of values c to d for the uniform distribution
new range has to be inside of the old range (a ≤ c < d ≤ b)
P(c ≤ X ≤ d) = (d-c)/(b-a)
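The uniform formulas can be collected into one small Python sketch. The bounds a = 0 and b = 10, the sub-range 2 to 5, and all function names are assumptions for illustration.

```python
def uniform_pdf(a, b):
    """Height of the uniform density on [a, b]."""
    return 1 / (b - a)

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

def uniform_range_prob(c, d, a, b):
    """P(c <= X <= d) when [c, d] lies inside [a, b]."""
    if not (a <= c <= d <= b):
        raise ValueError("the new range must sit inside the old range")
    return (d - c) / (b - a)

a, b = 0, 10  # assumed bounds
print(uniform_pdf(a, b), uniform_mean(a, b), uniform_variance(a, b))
print(uniform_range_prob(2, 5, a, b))  # width 3 out of a total width of 10
```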
What is normal distribution
the most important probability distribution
-describes a continuous random variable
What is normal distribution used for
Approximating the binomial distribution
-many real-world quantities are modelled with the normal distribution, e.g. DNA tests, weight
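To see the approximation at work, the sketch below compares an exact binomial probability with its normal counterpart. The values n = 100, p = 0.5 and the cutoff 55 are assumed for illustration. Evaluating the normal cdf at 55.5 rather than 55 is a continuity correction, a common refinement that is an addition here and not part of the lecture notes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # assumed values for illustration

# Exact binomial: P(X <= 55) as a sum of the pmf over 0..55.
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(56))

# Normal with the same mean np and standard deviation sqrt(np(1-p)).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# 55.5 instead of 55: continuity correction (an assumption, see lead-in).
approx = approx_dist.cdf(55.5)

print(exact, approx)  # the two values should be close for a large n like this
```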
What shape is the graph
bell
What are the main reasons for the normal distribution's wide application
1- closely approximates the probability distributions of a WIDE RANGE of random variables
2- Distributions of sample means approach a normal distribution GIVEN a LARGE sample size
3- computation of probabilities is direct and elegant
4- Lead to good business decisions due to the number of applications