1.2,.3: Expectation, covariance and Matrix Algebra Flashcards
What is the expected value of a random variable?
The expected value of a random variable X is its mean in the population
E(X) = ∑_x x P(X = x)
i.e., multiply each possible value x (e.g. a number of hours of sleep) by the proportion of people in the population who have that value, then sum over all values
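As a small illustration, the sum can be written out in Python. The sleep distribution below is invented for the example and is not part of the course material; `expected_value` is a hypothetical helper name.

```python
# Hypothetical distribution of hours of sleep: value -> proportion of people.
pmf = {6: 0.2, 7: 0.5, 8: 0.3}

def expected_value(pmf):
    """Weight each value x by its probability P(X = x) and sum."""
    return sum(x * p for x, p in pmf.items())

mean_sleep = expected_value(pmf)  # 6*0.2 + 7*0.5 + 8*0.3
```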
In the equation for the expected value:
E(X) = ∑_x x P(X = x)
what are the rules if…
- Y = bX
- Y = X + a
- If Y = bX then E(Y ) = E(bX) = bE(X).
- If y = some constant times x, then the expected value of y is equal to the constant times the expected value of x
- If Y = X + a then E(Y ) = E(X + a) = E(X) + a.
- If y = some constant plus x, then the expected value of y is equal to the constant plus the expected value of x
Example: if E(X) = 0.5 and Y = 2 X + 3, then
E(Y ) = 2 E(X) + 3 = 2 ·0.5 + 3 = 4
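The example can be checked numerically. This sketch assumes a simple two-valued X with E(X) = 0.5 (a fair coin coded 0/1, chosen only for illustration) and compares the direct computation of E(2X + 3) with the shortcut from the two rules.

```python
# Assumed distribution with E(X) = 0.5: X is 0 or 1 with equal probability.
pmf_x = {0: 0.5, 1: 0.5}

def expected_value(pmf, g=lambda x: x):
    """E[g(X)] for a discrete X; g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())

e_x = expected_value(pmf_x)
e_y_direct = expected_value(pmf_x, lambda x: 2 * x + 3)  # transform first, then average
e_y_rule = 2 * e_x + 3                                   # apply the two rules to E(X)
```

Both routes should give the same number, which is the point of the rules: the expectation of a linear transformation never requires going back to the distribution.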
What is the expected value of Y if Y = X^2?
No rules for other transformations, e.g., E(X^2) or E[g(X)], exist.
How do we deal with these other transformations when they arise?
Once we arrive at one of these expressions we have to stop, because it cannot be simplified any further. This is true for any expression E[g(X)] unless g is built only from the previous two rules (multiplying by a constant and adding a constant).
An example that involves E(X^2) is the variance:
Var(X) = E[(X −μ)^2],
We can simplify using the rules of expectation, together with E(X) = μ:
Var(X) = E[(X − μ)^2]
= E(X^2 − 2μX + μ^2)
= E(X^2) − 2μE(X) + μ^2
= E(X^2) − 2μ^2 + μ^2
= E(X^2) − μ^2.
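A quick check of this identity, assuming a fair six-sided die as the distribution (an arbitrary choice for illustration). Exact fractions are used so that the two results can be compared without rounding error.

```python
from fractions import Fraction

# Fair six-sided die, chosen only as an illustration.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expect(g):
    """E[g(X)] under the die distribution."""
    return sum(g(x) * p for x, p in pmf.items())

mu = expect(lambda x: x)                          # E(X) = 7/2
e_x2 = expect(lambda x: x ** 2)                   # E(X^2) = 91/6
var_definition = expect(lambda x: (x - mu) ** 2)  # E[(X - mu)^2]
var_shortcut = e_x2 - mu ** 2                     # E(X^2) - mu^2
```

Note that E(X^2) and E(X)^2 differ here, which is why E(X^2) has to be left as it is.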
What does the covariance of two variables quantify?
The covariance of two random variables X and Y quantifies how they co-vary in the population.
What is the equation for covariance?
Cov(X,Y ) = E[(X −EX)(Y −EY )]
i.e., the expected value of the product of the deviations of each random variable from its own population mean
Also give the short hand notation and the special case for covariance.
• Short hand: 〈X,Y 〉
• Special case: 〈X,X〉= E[(X −EX)^2] = Var(X)
- If you take the covariance of a random variable with itself, the product of deviations is (X − EX)(X − EX) = (X − EX)^2, and its expected value is the variance
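The definition and the special case can be sketched in Python. The joint distribution below is invented for the example; `expect` is a hypothetical helper that averages a function of (x, y) over that distribution.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X = x, Y = y); the numbers are invented.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex = expect(lambda x, y: x)                         # E(X)
ey = expect(lambda x, y: y)                         # E(Y)
cov_xy = expect(lambda x, y: (x - ex) * (y - ey))   # Cov(X, Y)
cov_xx = expect(lambda x, y: (x - ex) * (x - ex))   # Cov(X, X)
var_x = expect(lambda x, y: x ** 2) - ex ** 2       # Var(X) = E(X^2) - mu^2
```

The covariance is positive here because X and Y take the same value most of the time.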
Name and give the four rules of covariance
- 〈X,Y 〉= 〈Y,X〉 (symmetry)
- 〈X,X〉= Var(X) (variance)
- 〈bX,Y 〉= b〈X,Y 〉 (scalar multiplication)
- 〈X,Y + Z〉= 〈X,Y 〉+ 〈X,Z〉 (sum)
Suppose Y = −4X and Z = 2 + X. Then
〈Y, Z〉=〈−4X, 2 + X〉= −4〈X, 2 + X〉(rule 3)
= −4(〈X, 2〉+ 〈X, X〉) (rule 4, also covariance with a constant = 0)
= −4(0 + Var(X)) = −4Var(X) (rule 2)
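The result can be checked on data. This sketch uses the sample covariance (n − 1 denominator), which obeys the same four rules as the population covariance; the data values and the helper name `sample_cov` are assumptions made for the example.

```python
def sample_cov(a, b):
    """Sample covariance: summed products of deviations, divided by n - 1."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

x = [1.0, 2.0, 4.0, 7.0]        # arbitrary illustrative data
y = [-4 * xi for xi in x]       # Y = -4X
z = [2 + xi for xi in x]        # Z = 2 + X

cov_yz = sample_cov(y, z)
var_x = sample_cov(x, x)        # rule 2: covariance with itself is the variance
```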
We often need to compute the variance of a sum over random variables.
• Let U = X + Y + Z. What is Var(U)?
Var(U) = 〈U,U〉
= 〈U,X + Y + Z〉
= 〈U,X〉+ 〈U,Y〉+ 〈U,Z〉
= 〈X + Y + Z,X〉+ 〈X + Y + Z,Y〉+ 〈X + Y + Z,Z〉
= 〈X,X〉+ 〈X,Y〉+ 〈X,Z〉
+ 〈Y,X〉+ 〈Y,Y〉+ 〈Y,Z〉
+ 〈Z,X〉+ 〈Z,Y〉+ 〈Z,Z〉
= 〈X,X〉+ 〈Y,Y〉+ 〈Z,Z〉+ 2〈X,Y〉+ 2〈X,Z〉+ 2〈Y,Z〉
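The nine-term expansion is the sum of all entries of a 3 × 3 table of covariances. A minimal sketch of that observation, again using sample covariances on made-up data (the values and helper names are assumptions):

```python
def sample_cov(a, b):
    """Sample covariance: summed products of deviations, divided by n - 1."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

# Arbitrary illustrative data for X, Y and Z.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 4.0]
z = [0.0, 3.0, 3.0, 6.0]
u = [xi + yi + zi for xi, yi, zi in zip(x, y, z)]  # U = X + Y + Z

variables = [x, y, z]
# Table of all pairwise covariances; the diagonal holds the variances.
cov_table = [[sample_cov(a, b) for b in variables] for a in variables]

var_u_direct = sample_cov(u, u)
var_u_from_rules = sum(sum(row) for row in cov_table)
```

The same code works unchanged for any number of variables, since only the `variables` list grows.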
Imagine that with U = X1 + X2 + ···+ Xp! What makes it easier to carry out these computations?
Matrix algebra
Why is matrix algebra often useful when it comes to data?
Data often comes in the form of a matrix, and much of the math is easier in terms of matrix algebra.
Many multivariate techniques involve operations on covariance matrices:
• PCA & Factor Analysis
• MANOVA
What is a matrix?
A matrix is simply a list of numbers laid out in two dimensions, e.g.:
A = ( a11 a12 a13 a14 )
    ( a21 a22 a23 a24 )
(the two rows are meant to share one pair of large brackets)
Describe five conventions for matrices: how variables denoting matrices are written, how we describe the structure of a particular matrix, what we call this structure, how individual elements are denoted, and what we write instead of the full matrix
• variables denoting matrices are bold upper case
• we say A is a “two by four matrix” (i.e., rows first)
• number of rows and number of columns are the dimensions
• the element of A in row i and column j is denoted aij
• we often write
A = (aij )
instead of the full matrix
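These conventions map onto code roughly as follows. The sketch stores a matrix as a list of rows and uses placeholder entries whose digits encode their position (so a23 is 23); `element` is a hypothetical helper that translates the 1-based notation into Python's 0-based indexing.

```python
# A "two by four" matrix stored as a list of rows; entry ij is written as the number ij.
A = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
]

# Dimensions: rows first, then columns.
n_rows, n_cols = len(A), len(A[0])

def element(matrix, i, j):
    """Return a_ij using the 1-based row and column indices of the notation."""
    return matrix[i - 1][j - 1]

a_23 = element(A, 2, 3)  # row 2, column 3
```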
What do a3. and a.8 refer to, respectively?
- The i-th row of A is the row matrix ai.
- The j-th column of A is the column matrix a.j
- We can write A = (a.1; a.2; a.3; a.4)
a3. would be the 3rd row of a matrix
a.8 would be the 8th column of a matrix
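Row and column extraction can be sketched like this, assuming the matrix is stored as a list of rows. The helper names `row` and `column` are made up for the example, and the indices are 1-based to match the notation.

```python
A = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
]

def row(matrix, i):
    """The i-th row a_i. (1-based)."""
    return list(matrix[i - 1])

def column(matrix, j):
    """The j-th column a_.j (1-based): the j-th entry of every row."""
    return [r[j - 1] for r in matrix]

a_2_dot = row(A, 2)      # second row
a_dot_3 = column(A, 3)   # third column
```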
What is meant by the transpose of A?
The transpose of A swaps rows and columns:
A^T = (aij )^T = (aji)
A = ( a11 a12 a13 a14 )
    ( a21 a22 a23 a24 )
=>
A^T = ( a11 a21 )
      ( a12 a22 )
      ( a13 a23 )
      ( a14 a24 )
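A minimal sketch of the transpose for a matrix stored as a list of rows, using the built-in `zip` to regroup the entries by column. The placeholder entries encode their position (a12 is 12) and `transpose` is a hypothetical helper name.

```python
A = [
    [11, 12, 13, 14],
    [21, 22, 23, 24],
]

def transpose(matrix):
    """Swap rows and columns: entry (i, j) of the result is entry (j, i) of the input."""
    return [list(col) for col in zip(*matrix)]

A_T = transpose(A)  # 2 x 4 becomes 4 x 2
```

Transposing twice gives back the original matrix.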
Why would we transpose a matrix?
Two matrices A and B can be added or subtracted only if their shapes are the same. We can therefore transpose a matrix so that its shape matches that of another and the calculation can be carried out.
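As an illustration, the sketch below adds a 2 × 4 matrix to a 4 × 2 matrix after transposing the second one. The matrices and the helper names `shape`, `transpose` and `add` are assumptions made for the example.

```python
def shape(matrix):
    """(number of rows, number of columns)."""
    return (len(matrix), len(matrix[0]))

def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]

def add(a, b):
    """Element-wise sum; only defined when both shapes agree."""
    if shape(a) != shape(b):
        raise ValueError(f"shape mismatch: {shape(a)} vs {shape(b)}")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

A = [[1, 2, 3, 4],
     [5, 6, 7, 8]]                            # 2 x 4
B = [[10, 50], [20, 60], [30, 70], [40, 80]]  # 4 x 2

total = add(A, transpose(B))                  # transpose(B) is 2 x 4, so the sum is defined
```

Calling `add(A, B)` directly would raise the `ValueError`, because the shapes (2, 4) and (4, 2) differ.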