Chapter 17 Introduction to Multivariate Statistics Flashcards

Question 1

Q

What’s covariance?

P 157

Answer

A

In probability, covariance is the measure of the joint probability for two random variables. It describes how the two variables change together.

cov(X, Y ) = E[(X − E[X]) × (Y − E[Y ])]

Question 2

Q

What does the sign of the covariace indicate?

P 157

Answer

A

The sign of the covariance can be interpreted as whether the two variables increase together (positive) or decrease together (negative).

Question 3

Q

NumPy does not have a function to calculate the covariance between two variables directly. Instead, it has a function for calculating a ____ called ____ that we can use to retrieve the covariance.

P 157

Answer

A

covariance matrix, cov()

Question 4

Q

The magnitude of the covariance is not easily interpreted. A covariance value of zero indicates that both variables are completely ____.

P 157

Answer

A

independent

Question 5

Q

How can we access the covariance, using the covariance matrix?

P 157

Answer

A

We access just the covariance for the two variables as the [0, 1] element of the square covariance matrix returned.

Question 6

Q

How can the covariance (obtained from covariance matrix) be normalized to a score between -1 and 1 to make the magnitude interpretable?

P 158

Answer

A

r =
cov(X, Y )/(SD_X × SD_Y)
Where r is the correlation coefficient of X and Y , cov(X, Y ) is the sample covariance of X and Y and S_X and S_Y are the standard deviations of X and Y respectively.

Question 7

Q

NumPy provides the ____function for calculating the correlation between two variables directly.

P 158

Answer

A

corrcoef()

Note: .corr() is NOT for numpy, it’s for pandas dataframes
Like cov(), it returns a matrix, in this case a correlation matrix. As with the results from cov() we can access just the correlation of interest from the [0,1] value from the returned squared matrix

Question 8

Q

What’s a covariance matrix?

P 159

Answer

A

The covariance matrix is a square and symmetric matrix that describes the covariance between two or more random variables.

Question 9

Q

The diagonal of the covariance matrix are the ____, as such it is often called the variance-covariance matrix.

P 159

Answer

A

variances of each of the random variables

Question 10

Q

How can we find the covarience between a 2D-array’s columns, using the cov() function?

P 159

Answer

A

The cov() function can be called with a single 2D array where each sub-array contains a feature (e.g. column). If this function is called with your data defined in a normal matrix format (rows then columns), then a transpose of the matrix will need to be provided to the function in order to correctly calculate the covariance of the columns.

Question 11

Q

The covariance matrix is not widely used in multivariate analysis. True/False

P 160

Answer

A

False
The covariance matrix is widely used in multivariate analysis.

Chapter 17 Introduction to Multivariate Statistics Flashcards

(11 cards)