Week 3: Joint Distributions, Independent Random Variables, Conditional Distribution, Transformations of Random Variables Flashcards

1
Q

Joint Distributions

A

Joint distributions describe the probability distribution of multiple random variables simultaneously, mapping each outcome in the probability space to a point in the combined (product) target space of all the variables.

This allows for the study of their interdependence and marginal probabilities.

2
Q

When would a Binomial random variable be part of a joint distribution, and when would it not be?

A

Univariate Binomial random variable (not a joint distribution): X ~ Binomial(n, p), a single test consisting of n independent trials.

Multivariate (joint distribution): two tests, X1 ~ Binomial(n1, p1) and X2 ~ Binomial(n2, p2). Ex: successes in sales in region A and successes in sales in region B.

3
Q

For which type of random variable does one use a joint PMF?

A

Discrete (die, coins, yes/no)

4
Q

How to calculate Joint PMF Values:

A

Ex: Two independent tests - a biased coin toss (60% heads) and a 3-sided die (1/3 each for 1, 2, 3). Since the tests are independent, each joint probability is the product of the marginals:

P(X1 = Heads, X2 = 1) = 0.6 * (1/3) = 0.2
P(X1 = Heads, X2 = 2) = 0.6 * (1/3) = 0.2
…..
P(X1 = Tails, X2 = 3) = 0.4 * (1/3) ≈ 0.133

Sum of all these probabilities = 1
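
A minimal Python sketch of this table (the coin and die probabilities are the card's numbers; the code itself is illustrative):

```python
# Joint PMF of a biased coin (60% heads) and a 3-sided die, assumed independent
p_coin = {"Heads": 0.6, "Tails": 0.4}
p_die = {1: 1/3, 2: 1/3, 3: 1/3}

# Independence: each joint probability is the product of the two marginals
joint = {(c, d): pc * pd for c, pc in p_coin.items() for d, pd in p_die.items()}

print(joint[("Heads", 1)])  # 0.2
print(sum(joint.values()))  # ~1.0 (the probabilities cover the whole space)
```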

5
Q

Calculate the marginal PMF of X1 here:

Two tests - Biased coin toss (60% heads), 3 sided die (33% for 1,2,3)

P(X1 = Heads, X2 = 1) = 0.6 * (1/3) = 0.2
P(X1 = Heads, X2 = 2) = 0.6 * (1/3) = 0.2
…..
P(X1 = Tails, X2 = 3) = 0.4 * (1/3) ≈ 0.133

A

Marginal PMF of X1 (sum the joint probabilities over all values of X2):
px1(Heads) = 0.2 + 0.2 + 0.2 = 0.6
px1(Tails) = 0.133 + 0.133 + 0.133 ≈ 0.4
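
A short Python sketch of the marginalization step, reusing the joint table idea from the previous card (illustrative code, not from the deck):

```python
# Joint PMF of the independent coin (60% heads) and 3-sided die
p_coin = {"Heads": 0.6, "Tails": 0.4}
p_die = {1: 1/3, 2: 1/3, 3: 1/3}
joint = {(c, d): pc * pd for c, pc in p_coin.items() for d, pd in p_die.items()}

# Marginal PMF of X1: sum the joint probabilities over all values of X2
marginal_x1 = {}
for (x1, _x2), p in joint.items():
    marginal_x1[x1] = marginal_x1.get(x1, 0.0) + p

print(marginal_x1)  # ~{'Heads': 0.6, 'Tails': 0.4}
```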

6
Q

For which type of random variable does one use a joint PDF?

A

Continuous (weight, height, etc)

7
Q

How to calculate the Joint PDF values

A

For a joint PDF that is constant over the region of interest, the probability of a rectangular sub-region is:

fx1x2(x1, x2) * (length of the specified x1 interval) * (length of the specified x2 interval)

Ex: (1/2) * (1 - 0.5) * (1 - 0.5) = 0.125, where 1/2 is the value of fx1x2(x1, x2) specified in the PDF function for that range.
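
A quick numerical check in Python, assuming (hypothetically) that the density is 1/2 over the region and the specified interval is [0.5, 1] for both variables, matching the card's arithmetic:

```python
from scipy.integrate import dblquad

density = 0.5  # constant value of the joint PDF over the region

# P(0.5 <= X1 <= 1, 0.5 <= X2 <= 1) by direct multiplication...
p_direct = density * (1.0 - 0.5) * (1.0 - 0.5)

# ...and by integrating the constant density over the same rectangle
p_integral, _err = dblquad(lambda x2, x1: density, 0.5, 1.0,
                           lambda x1: 0.5, lambda x1: 1.0)

print(p_direct, p_integral)  # both 0.125
```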

8
Q

Find the marginal PDFs of the joint PDF f(x, y) = x + y on the unit square [0, 1] x [0, 1]:

A

f(x) = Integral from 0 to 1 of (x + y) dy = [xy + y^2/2] = x + 1/2 (I get this by plugging 1 and 0 into the antiderivative for y and subtracting the results)
f(y) = Integral from 0 to 1 of (x + y) dx = [x^2/2 + xy] = 1/2 + y

From f(x), if I continue to integrate x + 1/2 by dx, the result is x^2/2 + x/2. Plugging in 1 and 0 for x and subtracting gives 1.
A total of 1 is indicative of the coverage of the full sample space.
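
The same computation done symbolically in Python with sympy (a sketch; x + y on the unit square is the card's example):

```python
import sympy as sp

x, y = sp.symbols("x y")
f = x + y  # joint PDF on the unit square

f_x = sp.integrate(f, (y, 0, 1))      # marginal of X: x + 1/2
f_y = sp.integrate(f, (x, 0, 1))      # marginal of Y: y + 1/2
total = sp.integrate(f_x, (x, 0, 1))  # 1, i.e. the full sample space

print(f_x, "|", f_y, "|", total)
```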

9
Q

How to test whether random variables with a joint PDF are independent

A

f(x, y) = f(x) * f(y) for all x and y if independence is true, where f(x, y) is the given joint PDF and:
f(x) = Integral of f(x, y) dy (marginal of X)
f(y) = Integral of f(x, y) dx (marginal of Y)
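
A symbolic sketch of this test in Python (sympy), assuming joint densities defined on the unit square; x + y and 4xy are illustrative examples, not from the card:

```python
import sympy as sp

x, y = sp.symbols("x y")

def is_independent(f):
    """Check f(x, y) == f(x) * f(y) on the unit square."""
    f_x = sp.integrate(f, (y, 0, 1))
    f_y = sp.integrate(f, (x, 0, 1))
    return sp.simplify(f - f_x * f_y) == 0

print(is_independent(x + y))      # False: (x + 1/2)(y + 1/2) != x + y
print(is_independent(4 * x * y))  # True: (2x)(2y) == 4xy
```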

10
Q

Conditional Distribution for Joint PMF Formula

A

pX|Y(x | y) = pXY(x, y) / pY(y)
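
A small Python sketch of this formula with a made-up joint PMF table (the numbers are hypothetical):

```python
# Hypothetical joint PMF pXY(x, y) for x, y in {0, 1}
joint = {(0, 0): 0.1, (0, 1): 0.3,
         (1, 0): 0.2, (1, 1): 0.4}

def conditional_x_given_y(joint, y):
    # pY(y): marginalize the joint over x
    p_y = sum(p for (_x, yv), p in joint.items() if yv == y)
    # pX|Y(x | y) = pXY(x, y) / pY(y)
    return {xv: p / p_y for (xv, yv), p in joint.items() if yv == y}

print(conditional_x_given_y(joint, 1))  # {0: 0.3/0.7, 1: 0.4/0.7}
```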

11
Q

Use of the Uniform distribution in creating random samples from other distributions

A

I can take a value between 0 and 1 from a uniform distribution and plug it into the quantile function (inverse CDF) of any distribution type (ex: Exponential), and the output will be a random sample from that desired distribution.
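
A minimal Python sketch of this trick for the Exponential distribution (the rate 2.0 is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 2.0  # rate of the target Exponential(lam) distribution

u = rng.uniform(size=100_000)  # U ~ Uniform(0, 1)
x = -np.log(1.0 - u) / lam     # quantile function (inverse CDF) of Exponential

print(x.mean())  # ~1/lam = 0.5, as expected for Exponential(2)
```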

12
Q

How to execute the change of variables formula for discrete random variables (univariate)

A

Start from the PMF of the original variable X (ex: P(X = 1) = 0.2, P(X = 2) = 0.5, P(X = 3) = 0.3).

Consider another variable Y = g(X) that is based on the original one. The PMF of Y is found by summing P(X = x) over all x that map to the same value of Y.

Ex: g(x) is the parity of X: Y = 0 if X is even, Y = 1 if X is odd.
P(Y = 1) = P(X = 1) + P(X = 3) = 0.2 + 0.3 = 0.5
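
A short Python sketch of this discrete change of variables, using the card's PMF and the parity map (the code itself is illustrative):

```python
from collections import defaultdict

p_x = {1: 0.2, 2: 0.5, 3: 0.3}  # PMF of X from the card
g = lambda x: x % 2             # Y = g(X): 1 if X is odd, 0 if even

# PMF of Y: sum P(X = x) over all x that map to the same y
p_y = defaultdict(float)
for x, p in p_x.items():
    p_y[g(x)] += p

print(dict(p_y))  # {1: 0.5, 0: 0.5}
```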

13
Q

How to execute the change of variables formula for continuous random variables (univariate)

A

PDF: fy(y) = fx(g^-1(y)) * |d/dy g^-1(y)|

Ex:
fx(x) is a uniform distribution over [0, 2], so fx(x) = 1/2.
y = g(x) = x^2 for y in [0, 4]; invert it to get x = g^-1(y) = √y.

Take g^-1(y) = √y and differentiate: d/dy (√y) = 1 / (2√y)

fy(y) = fx(√y) * |d/dy g^-1(y)| = (1/2) * (1 / (2√y))
fy(y) = 1 / (4√y) for y in [0, 4]
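
A simulation sketch in Python that checks this result: sample X uniformly on [0, 2], square it, and compare the empirical CDF of Y with the analytic one (integrating 1/(4√t) from 0 to y gives √y / 2):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=1_000_000)  # X ~ Uniform[0, 2]
y = x ** 2                                 # Y = X^2, supported on [0, 4]

# Empirical CDF of Y vs. the analytic CDF F(y) = sqrt(y) / 2
for q in (1.0, 2.0, 3.0):
    print((y <= q).mean(), np.sqrt(q) / 2)  # each pair should roughly match
```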

14
Q

How to find the PDF of X if I know the PDF of Y, where Y = g(X) and g is strictly monotonic (always increasing or always decreasing) and continuously differentiable.

A

fx(x) = fy(g(x)) * |g’(x)|

15
Q

For discrete independent random variables X & Y, the probability mass function (PMF) of their sum Z=X+Y is calculated by summing over all possible values x in the domain of X, using the formula:

A

pz(z) = Sum (px(x) * py(z-x)) , where p is a PMF
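
A minimal Python sketch of this convolution for the sum of two fair dice (a standard illustrative example, not from the card):

```python
import numpy as np

p_x = np.full(6, 1 / 6)  # PMF of a fair die over values 1..6
p_y = np.full(6, 1 / 6)

# PMF of Z = X + Y: discrete convolution; index 0 corresponds to z = 2
p_z = np.convolve(p_x, p_y)
for z, p in enumerate(p_z, start=2):
    print(z, round(p, 4))  # peaks at z = 7 with probability 6/36
```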

16
Q

For continuous independent random variables X and Y, the probability density function (PDF) of their sum Z=X+Y is found by integrating over all possible values of x, given by the convolution

A

fz(z) = Integral (fx(x) * fy(z-x)) dx
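
A numerical sketch of the convolution integral in Python for X, Y ~ Uniform[0, 1] (an illustrative choice), whose sum has the triangular density on [0, 2]:

```python
from scipy.integrate import quad

def f(t):
    """PDF of Uniform[0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0

def f_z(z):
    # convolution integral: integrate f(x) * f(z - x) over x
    return quad(lambda x: f(x) * f(z - x), 0.0, 1.0)[0]

print(f_z(0.5), f_z(1.0), f_z(1.5))  # ~0.5, ~1.0, ~0.5 (triangular shape)
```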

17
Q

Positive Definite Matrix

A

A symmetric matrix A that satisfies (x^T)(Ax) > 0 for all non-zero vectors x ∈ R^d.
This definition has important consequences:

  1. Invertibility: Positive definite matrices are always invertible, and their inverse A^−1 is also positive definite.
  2. Eigenvalues: All eigenvalues of a positive definite matrix are positive.
  3. Cholesky Decomposition: Positive definite matrices have a unique Cholesky decomposition, meaning there exists an upper triangular matrix R such that A = (R^T) * R
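
A quick Python check of these properties on a small (hypothetical) matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # symmetric and positive definite

print(np.all(np.linalg.eigvalsh(A) > 0))  # True: all eigenvalues positive

# numpy returns the lower-triangular factor L with A = L @ L.T,
# so R = L.T is the upper-triangular factor with A = R.T @ R
L = np.linalg.cholesky(A)
R = L.T
print(np.allclose(A, R.T @ R))            # True

print(np.allclose(np.linalg.inv(A) @ A, np.eye(2)))  # invertible
```
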
18
Q

Univariate ZCA (Zero Phase Component Analysis)

A

One can take a standard normal variable Z, scale it by a standard deviation σ, and shift it by a mean μ to obtain X = σZ + μ, a normal variable with mean μ and variance σ^2 (it follows the normal distribution N(μ, σ^2)).

Conversely, one can standardize a normal variable by subtracting its mean and dividing by its standard deviation, Z = (X - μ)/σ, to obtain a standard normal variable.
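
A two-way Python sketch of scaling/shifting and standardizing (μ = 3 and σ = 2 are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 3.0, 2.0

z = rng.standard_normal(100_000)  # Z ~ N(0, 1)
x = sigma * z + mu                # X = sigma * Z + mu ~ N(mu, sigma^2)
z_back = (x - mu) / sigma         # standardize back to N(0, 1)

print(x.mean(), x.std())            # ~3.0, ~2.0
print(z_back.mean(), z_back.std())  # ~0.0, ~1.0
```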

19
Q

Coloring Transform

A

Involves taking white data (uncorrelated, identity covariance, typically standard normal) and transforming it into a Gaussian distribution with a desired mean and covariance. This is done with a linear transformation, e.g., multiplying by a matrix A with A * A^T = C (such as a Cholesky factor of the target covariance C) and adding the target mean.
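
A Python sketch of a coloring transform via a Cholesky factor (the mean and covariance values are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])    # target mean
cov = np.array([[2.0, 0.6],
                [0.6, 1.0]])  # target covariance (positive definite)

z = rng.standard_normal((100_000, 2))  # white data: mean 0, covariance I
L = np.linalg.cholesky(cov)            # cov = L @ L.T
x = z @ L.T + mu                       # colored: mean mu, covariance cov

print(x.mean(axis=0))  # ~[1, -2]
print(np.cov(x.T))     # ~cov
```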

20
Q

Whitening

A

This is essentially the reverse process of the coloring transform. Whitening transforms data to have a mean of zero and a covariance matrix that is the identity matrix, effectively making the data “white” or uncorrelated. This process removes correlations between the variables and scales the data, making it suitable for certain statistical analyses or machine learning algorithms.
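
A Python sketch of ZCA (Mahalanobis) whitening on correlated sample data (the input distribution is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.multivariate_normal([1.0, -2.0],
                            [[2.0, 0.6], [0.6, 1.0]], size=100_000)

xc = x - x.mean(axis=0)           # center the data
cov = np.cov(xc.T)
vals, vecs = np.linalg.eigh(cov)  # cov = U diag(vals) U^T

# ZCA whitening matrix: W = U diag(vals^(-1/2)) U^T
W = vecs @ np.diag(vals ** -0.5) @ vecs.T
z = xc @ W.T

print(np.cov(z.T))  # ~identity: whitened, uncorrelated data
```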

21
Q

Different transform operations:

A
  1. Mahalanobis or ZCA
  2. Cholesky
  3. PCA
22
Q

For a sum of independent Gaussian random variables, what is its mean?

A

The sum of the means of the individual random variables (applies in both the univariate and multivariate cases)

23
Q

For a sum of independent Gaussian random variables (univariate), what is its variance?

A

The sum of the variances (σ^2) of the individual random variables

24
Q

For a sum of independent multivariate Gaussian random variables, each characterized by a symmetric positive definite covariance matrix, what is its covariance?

A

The sum of the individual covariance matrices: this sum is the covariance matrix C of the resulting variable (the sum)
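
A simulation sketch in Python verifying the mean and variance rules for a sum of two independent univariate Gaussians (the parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

x1 = rng.normal(1.0, 2.0, size=1_000_000)   # N(1, 4)
x2 = rng.normal(-3.0, 1.5, size=1_000_000)  # N(-3, 2.25)
z = x1 + x2

print(z.mean())  # ~1 + (-3) = -2 (sum of the means)
print(z.var())   # ~4 + 2.25 = 6.25 (sum of the variances)
```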