Lecture 1 - Normal Distribution Flashcards
What is precision (β) in the context of normal distributions?
- inverse of the variance
- 1/(σ^2)
- represents how concentrated/sharp the distribution is
How does precision change as variance increases or decreases?
- When variance (σ^2) increases, precision (β) decreases.
- As variance approaches infinity, precision approaches zero.
- When variance decreases, precision increases, resulting in a sharper peak.
Why is precision (β) used instead of variance (σ^2) in some cases?
- For brevity and convenience
- In Bayesian statistics and machine learning, expressing inverse variance directly simplifies equations.
How are precision and variance related?
- Precision is the reciprocal of variance
- σ^2 = 1/β
- β = 1/(σ^2)
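The reciprocal relationship can be checked in a couple of lines (the value σ² = 4 is just an arbitrary example, not from the lecture):

```python
# A worked example of the reciprocal relationship (sigma2 = 4.0 is arbitrary).
sigma2 = 4.0              # variance sigma^2
beta = 1.0 / sigma2       # precision beta = 1 / sigma^2 = 0.25
sigma2_back = 1.0 / beta  # and the variance is recovered as 1 / beta
```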
What does the Probability Density Function (PDF) of a univariate normal distribution tell us?
- The relative likelihood (density) of a given value of x, centered at the mean μ and spreading out according to the variance σ^2.
- The height of the distribution curve at a given value of x
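The univariate density can be sketched directly from its standard formula (a minimal implementation, not code from the lecture):

```python
import math

def normal_pdf(x, mu, sigma2):
    """Height of the N(mu, sigma^2) density curve at the point x."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# For a standard normal the peak height at the mean is 1/sqrt(2*pi) ~ 0.3989,
# and the density falls off as x moves away from mu.
peak = normal_pdf(0.0, 0.0, 1.0)
```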
What does (x−μ)^2 represent in the normal distribution?
- The squared distance from the mean.
- Values further from the mean have lower density, creating a tapering effect on both sides of the curve.
What is the expected value of a univariate normal distribution?
- The expected value of x (first moment)
- E[x] = μ
What is the expected value of x^2 in a univariate normal distribution?
- The expected value of x^2 (second moment)
- E[x^2] = μ^2 + σ^2
- It exceeds μ^2 by exactly the variance σ^2, since spread around the mean inflates the second moment.
How is variance derived from the moments of the normal distribution?
- Variance is the second moment minus the square of the first moment
- var(x) = E[x^2] - (E[x])^2 = σ^2
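Both moment identities can be sanity-checked by Monte Carlo sampling (μ = 2, σ = 3 are arbitrary example values, not from the lecture):

```python
import random

# Monte Carlo check of E[x] = mu, E[x^2] = mu^2 + sigma^2,
# and var(x) = E[x^2] - (E[x])^2.
random.seed(0)
mu, sigma = 2.0, 3.0
xs = [random.gauss(mu, sigma) for _ in range(200_000)]

m1 = sum(xs) / len(xs)                   # first moment, estimates mu
m2 = sum(x * x for x in xs) / len(xs)    # second moment, estimates mu^2 + sigma^2
var = m2 - m1 ** 2                       # should be close to sigma^2 = 9
```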
What are the properties of the normal distribution PDF?
- The PDF is always positive (N(x|μ,σ^2) > 0)
- the total area under the curve is 1
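Both properties can be checked numerically: the density is strictly positive, and a simple Riemann sum over [-10, 10] (which contains essentially all the mass of a standard normal) comes out at ~1. A rough sketch, not lecture code:

```python
import math

def normal_pdf(x, mu, sigma2):
    """Standard univariate normal density formula."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Riemann sum of the standard normal density over [-10, 10].
n, lo, hi = 10_000, -10.0, 10.0
h = (hi - lo) / n
area = sum(normal_pdf(lo + i * h, 0.0, 1.0) for i in range(n)) * h
```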
What is a multivariate normal distribution?
- A multivariate normal distribution is a generalization of the univariate normal distribution to higher dimensions.
- It describes the distribution of a vector of variables.
What are the key parameters of a multivariate normal distribution?
- μ: A D-dimensional mean vector.
- Σ: A D×D covariance matrix (symmetric and positive definite).
- D: The dimensionality of the data.
- ∣Σ∣: The determinant of the covariance matrix.
- x: a vector of variables
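These parameters fit together in the multivariate density formula, which can be sketched as follows (a minimal implementation for illustration, not code from the lecture):

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Density of a D-dimensional normal N(x | mu, Sigma)."""
    x, mu, Sigma = np.asarray(x), np.asarray(mu), np.asarray(Sigma)
    D = mu.shape[0]                                  # dimensionality
    diff = x - mu
    quad = diff @ np.linalg.inv(Sigma) @ diff        # (x-mu)^T Sigma^-1 (x-mu)
    norm = np.sqrt((2 * np.pi) ** D * np.linalg.det(Sigma))  # uses |Sigma|
    return float(np.exp(-0.5 * quad) / norm)

# At the mean of a 2-D standard normal the density is 1 / (2*pi).
d0 = mvn_pdf([0.0, 0.0], [0.0, 0.0], np.eye(2))
```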
What does (x−μ)^T Σ^−1 (x−μ) represent in the multivariate normal distribution?
- It represents the “distance” of the vector x from the mean μ, scaled by the spread and orientation of the distribution.
- Larger values of this term correspond to points x that are less likely under the distribution.
What happens when the covariance matrix of a multivariate distribution is an identity matrix?
- When the covariance matrix is an identity matrix, the variables are uncorrelated (and, being jointly Gaussian, therefore independent), each with unit variance.
- The distribution is spherical, with equal spread in all directions.
What changes about the multivariate distribution when covariances are introduced?
- When covariances are introduced, relationships between variables emerge.
- The distribution becomes elliptical, indicating correlations between variables.
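The spherical-vs-elliptical contrast shows up directly in sampled data (covariance 0.8 is an arbitrary example value):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.zeros(2)
n = 50_000

spherical = rng.multivariate_normal(mu, np.eye(2), size=n)
elliptical = rng.multivariate_normal(mu, [[1.0, 0.8], [0.8, 1.0]], size=n)

# The sample covariance between the two coordinates is ~0 in the spherical
# case and ~0.8 once the covariance term is introduced.
c_sph = float(np.cov(spherical.T)[0, 1])
c_ell = float(np.cov(elliptical.T)[0, 1])
```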
What are the properties of marginal and conditional distributions for jointly Gaussian variables?
- If two sets of variables are jointly Gaussian, the conditional distribution of one set given the other is also Gaussian.
- Similarly, the marginal distribution of either set is Gaussian.
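For a 2-D joint, the standard Gaussian conditioning formulas (standard results, not stated on the card; the block values S11, S12, S22 are arbitrary examples) make this concrete:

```python
# Conditioning one jointly Gaussian variable on another stays Gaussian.
# With partitioned (co)variances S11, S12, S22, the standard result is:
#   mean(x1 | x2) = mu1 + S12 / S22 * (x2 - mu2)
#   var(x1 | x2)  = S11 - S12**2 / S22
mu1, mu2 = 0.0, 0.0
S11, S12, S22 = 2.0, 1.2, 1.5   # example values

def conditional(x2):
    """Mean and variance of the Gaussian p(x1 | x2)."""
    mean = mu1 + S12 / S22 * (x2 - mu2)
    var = S11 - S12 ** 2 / S22
    return mean, var

cond_mean, cond_var = conditional(1.0)
```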
What is the precision matrix (Λ)?
- The precision matrix (Λ) is the inverse of the covariance matrix (Σ): Λ=Σ^−1
- It represents the concentration of the multivariate normal distribution.
What are the properties of the precision matrix (Λ)?
- Symmetric: The precision matrix is always symmetric.
- Inverse of the covariance matrix: It provides information about how tightly the variables are distributed around the mean.
How are the covariance matrix and precision matrix related?
The covariance matrix describes the spread of the data, while the precision matrix (its inverse) describes how concentrated the distribution is around the mean.
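The relationship and the symmetry property are easy to verify numerically (the 2×2 covariance below is an arbitrary positive-definite example):

```python
import numpy as np

# A symmetric, positive-definite covariance and its precision matrix.
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Lam = np.linalg.inv(Sigma)   # Lambda = Sigma^-1

sym = bool(np.allclose(Lam, Lam.T))                    # precision is symmetric
round_trip = bool(np.allclose(np.linalg.inv(Lam), Sigma))  # inverts back
```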
What is the Mahalanobis distance (Δ) in the Gaussian distribution?
- The Mahalanobis distance (Δ) measures the distance from a point x to the mean μ, taking into account the spread and orientation of the data described by the covariance matrix Σ:
- Δ^2 = (x−μ)^T Σ^−1 (x−μ)
- It reduces to the Euclidean distance when Σ is the identity matrix.
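A small sketch of this definition (the points and covariances are arbitrary examples):

```python
import numpy as np

def mahalanobis(x, mu, Sigma):
    """Delta = sqrt((x - mu)^T Sigma^-1 (x - mu))."""
    diff = np.asarray(x) - np.asarray(mu)
    return float(np.sqrt(diff @ np.linalg.inv(Sigma) @ diff))

# With Sigma = I this is just the Euclidean distance: sqrt(3^2 + 4^2) = 5.
d_euclid = mahalanobis([3.0, 4.0], [0.0, 0.0], np.eye(2))

# Inflating the variance along the first axis shrinks distance along it.
d_scaled = mahalanobis([3.0, 4.0], [0.0, 0.0], np.diag([4.0, 1.0]))
```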
What is “completing the square” in the context of Gaussian distributions?
Completing the square is a method used to rewrite a quadratic form in the exponent of a Gaussian distribution to identify the corresponding mean (μ) and covariance (Σ).
What is the general form of the exponent in a multivariate Gaussian distribution?
(-1/2) (x−μ)^T Σ^−1 (x−μ)
How can the mean and covariance be derived by completing the square?
- The second-order term in x corresponds to Σ^−1, so the covariance matrix Σ is identified by inverting it.
- The coefficient of the linear term in x corresponds to Σ^−1 μ, from which the mean μ can be obtained.
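The matching procedure can be sketched numerically: starting from an exponent expanded into the generic form −1/2 x^T A x + b^T x + const, read off Σ and μ (the matrix A and vector b below are arbitrary example coefficients):

```python
import numpy as np

# Matching -1/2 x^T A x + b^T x + const
# against  -1/2 (x - mu)^T Sigma^-1 (x - mu) gives
#   Sigma = A^-1   and   mu = Sigma @ b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])    # second-order coefficient = Sigma^-1
b = np.array([1.0, 0.5])      # linear coefficient = Sigma^-1 mu

Sigma = np.linalg.inv(A)
mu = Sigma @ b
```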
What does the term “const” refer to in the completed square expression?
Terms that are independent of x, which do not affect the form of the Gaussian distribution but may influence normalization factors.
Why is completing the square useful in machine learning?
Completing the square is useful for deriving the parameters of Gaussian distributions in probabilistic models, such as in Bayesian inference and maximum likelihood estimation.