Chapter 17 Introduction to Multivariate Statistics Flashcards
What’s covariance?
P 157
In probability, covariance is the measure of the joint probability for two random variables. It describes how the two variables change together.
cov(X, Y ) = E[(X − E[X]) × (Y − E[Y ])]
What does the sign of the covariace indicate?
P 157
The sign of the covariance can be interpreted as whether the two variables increase together (positive) or decrease together (negative).
NumPy does not have a function to calculate the covariance between two variables directly. Instead, it has a function for calculating a ____ called ____ that we can use to retrieve the covariance.
P 157
covariance matrix, cov()
The magnitude of the covariance is not easily interpreted. A covariance value of zero indicates that both variables are completely ____.
P 157
independent
How can we access the covariance, using the covariance matrix?
P 157
We access just the covariance for the two variables as the [0, 1] element of the square covariance matrix returned.
How can the covariance (obtained from covariance matrix) be normalized to a score between -1 and 1 to make the magnitude interpretable?
P 158
r =
cov(X, Y )/(SDX × SDY)
Where r is the correlation coefficient of X and Y , cov(X, Y ) is the sample covariance of X and Y and SX and SY are the standard deviations of X and Y respectively.
NumPy provides the ____function for calculating the correlation between two variables directly.
P 158
corrcoef()
Note: .corr() is NOT for numpy, it’s for pandas dataframes
Like cov(), it returns a matrix, in this case a correlation matrix. As with the results from cov() we can access just the correlation of interest from the [0,1] value from the returned squared matrix
What’s a covariance matrix?
P 159
The covariance matrix is a square and symmetric matrix that describes the covariance between two or more random variables.
The diagonal of the covariance matrix are the ____, as such it is often called the variance-covariance matrix.
P 159
variances of each of the random variables
How can we find the covarience between a 2D-array’s columns, using the cov() function?
P 159
The cov() function can be called with a single 2D array where each sub-array contains a feature (e.g. column). If this function is called with your data defined in a normal matrix format (rows then columns), then a transpose of the matrix will need to be provided to the function in order to correctly calculate the covariance of the columns.
The covariance matrix is not widely used in multivariate analysis. True/False
P 160
False
The covariance matrix is widely used in multivariate analysis.