Chapter 8 Discrete Probability Distributions Flashcards
What is the inverse of the CDF called? What is its input and output? P 70
The inverse of the CDF is called the percentage-point function (PPF); it takes a probability as input and returns the discrete outcome whose cumulative probability is less than or equal to that value.
What is a Bernoulli trial? Give an example P 71
A Bernoulli trial is an experiment or case whose outcome follows a Bernoulli distribution. An example is the 🧨single🧨 flip of a coin that may have a heads (0) or a tails (1) outcome.
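A minimal sketch of a single Bernoulli trial, using the same NumPy binomial() function the book introduces later; setting the number of trials to 1 reduces the binomial to a Bernoulli distribution (the fair-coin probability of 0.5 is an assumption for illustration):

```python
# simulate a single Bernoulli trial (one coin flip)
from numpy.random import binomial

# probability of success (assumed fair coin)
p = 0.5
# a single trial (n=1) makes the binomial draw a Bernoulli draw
outcome = binomial(1, p)
print('Outcome: %d' % outcome)  # either 0 or 1
```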
What’s a Bernoulli process? P 71
The repetition of multiple independent Bernoulli trials is called a Bernoulli process. As such, the Bernoulli distribution would be a Binomial distribution with a single trial.
How can we simulate the Bernoulli process in python? P 71
We can simulate the Bernoulli process with randomly generated cases and count the number of successes over the given number of trials. This can be achieved via the 🧨binomial() NumPy function🧨. This function takes the total number of trials and probability of success as arguments and returns the number of successful outcomes across the trials for one simulation.
👩💻
# example of simulating a binomial process and counting success
from numpy.random import binomial
# define the parameters of the distribution
p = 0.3
k = 100
# run a single simulation
success = binomial(k, p)
print('Total Success: %d' % success)
👩💻
How can we calculate moments of a Bernoulli process? P 72
We can calculate the moments of this distribution, specifically the expected value (mean) and the variance, using the binom.stats() SciPy function.
👩💻
# calculate moments of a binomial distribution
from scipy.stats import binom
# define the parameters of the distribution
p = 0.3
k = 100
# calculate moments
mean, var, _, _ = binom.stats(k, p, moments='mvsk')
print('Mean=%.3f, Variance=%.3f' % (mean, var))
Mean=30.000, Variance=21.000
👩💻
Mahsa 👧🏻: the expected value of a binomial variable is the expected number of successes out of multiple trials, which results in: np (n = #trials, p = probability of success)
https://www.statisticshowto.com/probability-and-statistics/expected-value/#binomial
What is Multinoulli distribution? Give an example. P 74
It is a generalization of the Bernoulli distribution from a binary variable to a categorical variable.
A single roll of a die that will have an outcome in {1, 2, 3, 4, 5, 6}, e.g. K = 6.
What is a common example of Multinoulli distribution in machine learning? P 74
A common example of a Multinoulli distribution in machine learning might be a multiclass classification of a single example into one of K classes, e.g. one of three different species of the iris flower.
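A Multinoulli draw can be sketched as a multinomial with a single trial; a minimal example with K = 3 categories standing in for the three iris species (the class probabilities are assumed, not from the book):

```python
# a Multinoulli trial: one draw over K=3 categories (e.g. three iris species)
from numpy.random import multinomial

# assumed class probabilities for the three categories
p = [0.2, 0.5, 0.3]
# n=1 trial yields a one-hot vector: exactly one category is selected
outcome = multinomial(1, p)
print(outcome)
```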
What is a Multinomial distribution? Give an example P 74
The repetition of multiple independent Multinoulli trials will follow a multinomial distribution.
The multinomial distribution is a generalization of the binomial distribution for a discrete variable with K outcomes. An example of a multinomial process is a sequence of independent dice rolls.
What is an example of using Multinominal distribution in machine learning? P 74
A common example of the multinomial distribution is the occurrence counts of words in a text document, from the field of natural language processing.
How can we simulate a Multinominal distribution in python? P 74
We can use the multinomial() NumPy function to simulate 100 independent trials and summarize the number of times that the event resulted in each of the given categories.
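A sketch of that simulation, assuming three equally likely events as in the book's running example:

```python
# simulate 100 independent Multinoulli trials and count each outcome
from numpy.random import multinomial

# equal probability for each of the K=3 events
p = [1.0/3.0, 1.0/3.0, 1.0/3.0]
k = 100
# counts of how many of the 100 trials fell into each category
cases = multinomial(k, p)
for i in range(len(cases)):
    print('Case %d: %d' % (i + 1, cases[i]))
```

The three counts always sum to the number of trials (100), since every trial lands in exactly one category.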
What is the use of multinomial.pmf() SciPy function? P 75
Using multinomial.pmf() we can calculate the probability of a specific combination of outcome counts. For example, with three equally likely events, we might expect the idealized case of 100 trials to result in 33, 33, and 34 cases for events 1, 2, and 3 respectively. We can calculate the probability of this specific combination occurring in practice using the probability mass function, i.e. the multinomial.pmf() SciPy function.
How can we calculate the pmf of [33,33,34] number of outcomes for each event in a multinomial distribution in 100 trials using python? What can we conclude from it? P 75
The complete example is listed below.
👩💻
# calculate the probability for a given number of events of each type
from scipy.stats import multinomial
# define the parameters of the distribution
p = [1.0/3.0, 1.0/3.0, 1.0/3.0]
k = 100
# define the distribution
dist = multinomial(k, p)
# define a specific number of outcomes from 100 trials
cases = [33, 33, 34]
# calculate the probability for the case
pr = dist.pmf(cases)
# print as a percentage
print('Case=%s, Probability: %.3f%%' % (cases, pr*100))
👩💻
Output: Case=[33, 33, 34], Probability: 0.813%
Running the example reports the probability of less than 1% for the idealized number of cases of [33, 33, 34] for each event type.