2. Probability Flashcards
Discrete random variables
- discrete random variable
- probability mass function
- state space
p. 28
A discrete random variable can take on any value from a finite or countably infinite state space.
Fundamental rules
- probability of a union of two events
- sum rule
- product rule
- chain rule
- conditional probability
p. 29
Bayes’ rule
- generative vs discriminative classifier
p. 29
A generative classifier specifies how to generate the data using the class-conditional density p(x|y) and the class prior p(y). A discriminative classifier directly fits the class posterior p(y|x).
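As a minimal sketch of how a generative classifier turns p(x|y) and p(y) into a posterior with Bayes' rule: the function name `posterior`, the class labels, and all numbers below are made up for illustration, and the likelihoods are assumed to be already evaluated at one fixed observation x.

```python
def posterior(prior, likelihood):
    """Return p(y|x) given the class prior p(y) and p(x|y) evaluated at a fixed x."""
    # Numerator of Bayes' rule: p(x|y) * p(y) for every class y.
    joint = {y: prior[y] * likelihood[y] for y in prior}
    # Denominator: p(x) = sum over y of p(x|y) * p(y) (sum rule).
    evidence = sum(joint.values())
    return {y: joint[y] / evidence for y in joint}

# Hypothetical two-class example.
prior = {"spam": 0.2, "ham": 0.8}        # p(y)
likelihood = {"spam": 0.6, "ham": 0.1}   # p(x|y) at the observed x
post = posterior(prior, likelihood)
print(post)
```

A discriminative classifier would skip the two ingredients and model p(y|x) directly.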
Independence and conditional independence
- unconditional (marginal) independence
- conditional independence
p. 31
Continuous random variables
- cumulative distribution function
- probability density function
- p(a < X ≤ b) in terms of cdf and pdf
p. 32
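For reference, the interval probability can be written as follows. The symbols F (cdf) and f (pdf) are notation chosen here, and F is assumed to be differentiable.

```latex
F(q) \triangleq P(X \le q), \qquad f(x) = \frac{d}{dx} F(x)

P(a < X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx
```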
- quantile
- quartile
- tail area probabilities
p. 33
Mean and variance
- E[X]
- var[X]
- E[X^2]
- std[X]
p. 34
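The four quantities above are linked by the standard identities below, written for a continuous variable with density p(x); for a discrete variable the integral becomes a sum. The symbols μ and σ² are notation chosen here.

```latex
\mu \triangleq \mathbb{E}[X] = \int x\,p(x)\,dx

\sigma^2 \triangleq \mathrm{var}[X] = \mathbb{E}\big[(X-\mu)^2\big] = \mathbb{E}[X^2] - \mu^2

\mathbb{E}[X^2] = \mu^2 + \sigma^2, \qquad \mathrm{std}[X] = \sqrt{\mathrm{var}[X]}
```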
The binomial and Bernoulli distributions
- pmf, mean, var
- binomial coefficient
p. 34
The multinomial and multinoulli distributions
- pmf
- dummy and one-hot encoding
p. 35
The Poisson distribution
p. 37
The empirical distribution
- empirical distribution def
- Dirac measure
p. 37
Gaussian (normal) distribution
- pdf, mean, mode, var
- precision
- cdf
- error function
- cdf in terms of error function
p. 38
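A small sketch of the Gaussian cdf written in terms of the error function, using `math.erf` from the Python standard library. The function name `gaussian_cdf` is chosen here for illustration.

```python
import math

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """cdf of N(mu, sigma^2): Phi(x) = 0.5 * (1 + erf(z / sqrt(2))), z = (x - mu) / sigma."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probability mass within one standard deviation of the mean: F(b) - F(a).
within_one_sigma = gaussian_cdf(1.0) - gaussian_cdf(-1.0)
print(within_one_sigma)
```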
Degenerate pdf
- Dirac delta function
- sifting property
p. 39
The Student’s t distribution
- pdf, mean, mode, var
- Cauchy (Lorentz) distribution
p. 39
https://en.wikipedia.org/wiki/Student%27s_t-distribution#Non-standardized_Student.27s_t-distribution
The Laplace distribution
- pdf, mean, mode, var
p. 41
The gamma distribution
- pdf, mean, mode, var
- gamma function
- exponential distribution
- Erlang distribution
- Chi-squared distribution
- inverse gamma distribution (pdf, mean, mode, var)
p. 41
The beta distribution
- pdf, mean, mode, var
- beta function
p. 43
Pareto distribution
- pdf, mean, mode, var
p. 43
Covariance and correlation
- covariance and covariance matrix
- correlation coefficient and correlation matrix
p. 45
The multivariate Gaussian
- precision matrix (concentration matrix)
- number of covariance matrix parameters (full, diagonal, isotropic)
p. 46
Multivariate Student t distribution
- pdf, mean, mode
- scale matrix
p. 47
Dirichlet distribution
- pdf, mean, mode, var
- symmetric Dirichlet prior
p. 49
Linear transformations
- definition of linear transformation
- expected value
- variance
- when do the first two moments suffice to completely define the transformed distribution?
p. 49
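For reference, the standard results for a linear (affine) transformation, with μ and Σ denoting the mean and covariance of x (notation chosen here):

```latex
\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{b}, \qquad
\boldsymbol{\mu} = \mathbb{E}[\mathbf{x}], \quad \boldsymbol{\Sigma} = \mathrm{cov}[\mathbf{x}]

\mathbb{E}[\mathbf{y}] = \mathbf{A}\boldsymbol{\mu} + \mathbf{b}, \qquad
\mathrm{cov}[\mathbf{y}] = \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}^{\top}

% scalar-valued case y = a^T x + b
\mathbb{E}[y] = \mathbf{a}^{\top}\boldsymbol{\mu} + b, \qquad
\mathrm{var}[y] = \mathbf{a}^{\top}\boldsymbol{\Sigma}\,\mathbf{a}
```

If x is Gaussian, y is Gaussian as well, so its mean and covariance determine its distribution completely.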
General transformations
- change of variables formula
- Jacobian matrix def
- what does the |det J| measure?
- pdf of the transformed variables using the Jacobian (eq. 2.89)
p. 50
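The change of variables formula in its commonly stated form, assuming an invertible, differentiable map from x to y. The notation below is chosen here and may differ from the book's eq. 2.89.

```latex
% scalar case
p_y(y) = p_x(x)\,\left|\frac{dx}{dy}\right|

% multivariate case: J is the Jacobian of the inverse map y -> x
J_{ij} = \frac{\partial x_i}{\partial y_j}, \qquad
p_y(\mathbf{y}) = p_x(\mathbf{x})\,\left|\det \mathbf{J}\right|
```

Here |det J| is the factor by which a small volume element changes under the map.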
Central limit theorem
p. 52
Monte Carlo approximation
- Monte Carlo integration formula (eq. 2.98)
- mean, variance
p. 53
Accuracy of Monte Carlo approximation
- standard error
p. 55
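A minimal sketch of Monte Carlo integration together with its standard error, using only the standard library. The helper name `monte_carlo_mean`, the test function x², the uniform sampler, and the seed are all illustrative choices, not taken from the book.

```python
import math
import random

def monte_carlo_mean(f, sampler, num_samples):
    """Approximate E[f(X)] by averaging f over num_samples draws from sampler()."""
    values = [f(sampler()) for _ in range(num_samples)]
    mean = sum(values) / num_samples
    # Empirical variance of f(X), used to estimate the standard error of the mean.
    var = sum((v - mean) ** 2 for v in values) / num_samples
    std_err = math.sqrt(var / num_samples)
    return mean, std_err

rng = random.Random(0)  # fixed seed so the run is reproducible
# E[X^2] for X ~ Uniform(0, 1); the exact value is 1/3.
estimate, std_err = monte_carlo_mean(lambda x: x * x, rng.random, 100_000)
print(estimate, std_err)
```

The standard error shrinks like 1/sqrt(S) in the number of samples S, so halving it requires four times as many samples.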
Entropy
- definition of information entropy for a discrete variable
- which discrete distribution has the highest entropy?
- binary entropy function
p. 57
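A short sketch of the entropy of a discrete distribution and the binary entropy function, measured in bits (log base 2). The function names are chosen here, and terms with zero probability are skipped under the usual convention 0 log 0 = 0.

```python
import math

def entropy(p):
    """H(p) = -sum_k p_k * log2(p_k), in bits; zero-probability terms contribute 0."""
    return -sum(pk * math.log2(pk) for pk in p if pk > 0)

def binary_entropy(theta):
    """Entropy of a Bernoulli(theta) variable."""
    return entropy([theta, 1.0 - theta])

uniform_4 = entropy([0.25, 0.25, 0.25, 0.25])  # uniform over K=4 states: log2(4) bits
skewed_4 = entropy([0.7, 0.1, 0.1, 0.1])       # any non-uniform choice has lower entropy
print(uniform_4, skewed_4, binary_entropy(0.5))
```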
KL divergence
- what does the Kullback-Leibler divergence (KL divergence) or relative entropy measure?
- definition of the KL divergence
- definition of the KL divergence in terms of entropy
- definition of the cross entropy
- Jensen’s inequality
- proof of information inequality theorem
- principle of insufficient reason (also principle of indifference)
p. 58
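A minimal sketch relating the KL divergence, cross entropy, and entropy for discrete distributions, in bits. The function names and the example distributions are illustrative; the code assumes q is positive wherever p is positive.

```python
import math

def entropy(p):
    """H(p) = -sum_k p_k log2 p_k."""
    return -sum(pk * math.log2(pk) for pk in p if pk > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_k p_k log2 q_k."""
    return -sum(pk * math.log2(qk) for pk, qk in zip(p, q) if pk > 0)

def kl_divergence(p, q):
    """KL(p || q) = sum_k p_k log2(p_k / q_k) = H(p, q) - H(p)."""
    return sum(pk * math.log2(pk / qk) for pk, qk in zip(p, q) if pk > 0)

p = [0.5, 0.25, 0.25]
q = [1 / 3, 1 / 3, 1 / 3]
kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)  # generally differs from kl_pq: KL is not symmetric
print(kl_pq, kl_qp)
```

Both values are non-negative, consistent with the information inequality, and KL(p || p) is zero.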
Mutual information
- definition of the mutual information (MI)
- definition of the MI in terms of joint and conditional entropies
- definition of the conditional entropy
- definition of the pointwise mutual information (PMI)
- what does PMI measure?
- what is the maximal information coefficient (MIC)?
p. 59
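A small sketch of mutual information computed from a joint probability table, in bits. The function name and the two example tables are made up for illustration; rows index values of X and columns index values of Y.

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) p(y)) )."""
    px = [sum(row) for row in joint]          # marginal p(x): sum over columns
    py = [sum(col) for col in zip(*joint)]    # marginal p(y): sum over rows
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                # The log term is the pointwise mutual information of (x, y);
                # MI is its expectation under the joint distribution.
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

independent = [[0.25, 0.25], [0.25, 0.25]]  # p(x,y) = p(x) p(y)
dependent = [[0.5, 0.0], [0.0, 0.5]]        # Y is a copy of X
mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
print(mi_independent, mi_dependent)
```

The independent table gives zero bits, while the fully dependent table gives one bit, which equals the entropy of X in that case.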