Chapter 17 Introduction to Multivariate Statistics Flashcards

1
Q

What’s covariance?

P 157

A

In probability, covariance is a measure of the joint variability of two random variables. It describes how the two variables change together.

cov(X, Y) = E[(X − E[X]) × (Y − E[Y])]
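
The definition above can be evaluated by hand on a small data set. The following is a minimal sketch; the arrays `x` and `y` are made-up illustration data, not values from the book. It shows both the population form (mean of the products of deviations) and the sample form (divide by n − 1), the latter being what NumPy's cov() returns by default.

```python
import numpy as np

# Illustrative data (assumed for this example only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Deviations of each variable from its own mean.
dx = x - x.mean()
dy = y - y.mean()

# Population covariance: the expectation of the product of deviations.
cov_pop = np.mean(dx * dy)

# Sample covariance: same sum, divided by n - 1 instead of n.
cov_sample = np.sum(dx * dy) / (len(x) - 1)

print(cov_pop, cov_sample)
```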

2
Q

What does the sign of the covariance indicate?

P 157

A

The sign of the covariance indicates whether the two variables tend to move in the same direction (positive) or in opposite directions (negative).
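
A short sketch of the sign behaviour, using made-up linear data. The helper `sample_cov` is a hypothetical function written for this example, not a NumPy API.

```python
import numpy as np

def sample_cov(a, b):
    """Sample covariance of two 1-D arrays (divides by n - 1)."""
    return np.sum((a - a.mean()) * (b - b.mean())) / (len(a) - 1)

x = np.arange(10, dtype=float)
y_same = 2.0 * x + 1.0        # rises as x rises
y_opposite = -3.0 * x + 5.0   # falls as x rises

cov_same = sample_cov(x, y_same)
cov_opposite = sample_cov(x, y_opposite)

# The first covariance is positive, the second negative.
print(cov_same, cov_opposite)
```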

3
Q

NumPy does not have a function to calculate the covariance between two variables directly. Instead, it has a function for calculating a ____ called ____ that we can use to retrieve the covariance.

P 157

A

covariance matrix, cov()

4
Q

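The magnitude of the covariance is not easily interpreted. A covariance value of zero indicates that the two variables are ____.

P 157

A

uncorrelated, meaning they have no linear relationship. Independent variables always have zero covariance, but zero covariance does not by itself imply independence.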
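
A caution worth illustrating: zero covariance only rules out a linear relationship. In the sketch below (made-up data), `y` is completely determined by `x`, yet the covariance is zero because the relationship is symmetric around the mean of `x`.

```python
import numpy as np

# x is symmetric around zero; y depends on x exactly, but not linearly.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2

# Off-diagonal element of the 2x2 covariance matrix.
cov_xy = np.cov(x, y)[0, 1]

print(cov_xy)
```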

5
Q

How can we access the covariance, using the covariance matrix?

P 157

A

We access just the covariance for the two variables as the [0, 1] element of the square covariance matrix returned.
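
As a minimal sketch with made-up data, calling cov() on two 1-D arrays returns a 2×2 matrix, and the covariance of the pair sits in the off-diagonal position:

```python
import numpy as np

# Illustrative data (assumed for this example only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# 2x2 matrix: variances on the diagonal, covariance off the diagonal.
matrix = np.cov(x, y)

# The covariance between x and y.
cov_xy = matrix[0, 1]

print(matrix.shape, cov_xy)
```

Because the matrix is symmetric, `matrix[1, 0]` holds the same value.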

6
Q

How can the covariance (obtained from the covariance matrix) be normalized to a score between -1 and 1 to make the magnitude interpretable?

P 158

A

r = cov(X, Y) / (sX × sY)
where r is the correlation coefficient of X and Y, cov(X, Y) is the sample covariance of X and Y, and sX and sY are the sample standard deviations of X and Y, respectively.
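
The normalization can be sketched as follows, using made-up data. Note the assumption that the standard deviations use `ddof=1` so that they match the sample covariance returned by cov(); mixing the two conventions would give a slightly different value.

```python
import numpy as np

# Illustrative data (assumed for this example only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Sample covariance from the covariance matrix.
cov_xy = np.cov(x, y)[0, 1]

# Sample standard deviations (ddof=1 matches the n - 1 divisor in np.cov).
s_x = np.std(x, ddof=1)
s_y = np.std(y, ddof=1)

# Correlation coefficient: covariance scaled by both standard deviations.
r = cov_xy / (s_x * s_y)

print(r)
```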

7
Q

NumPy provides the ____ function for calculating the correlation between two variables directly.

P 158

A

corrcoef()

Note: .corr() is not a NumPy function; it is a method on pandas DataFrames.
Like cov(), corrcoef() returns a matrix, in this case a correlation matrix. As with the result from cov(), the correlation of interest is the [0, 1] element of the returned square matrix.
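
A minimal sketch of the call, with made-up data. The diagonal of the returned matrix is 1 for each variable, since every variable is perfectly correlated with itself.

```python
import numpy as np

# Illustrative data (assumed for this example only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# 2x2 correlation matrix for the pair.
corr = np.corrcoef(x, y)

# Correlation between x and y.
r_xy = corr[0, 1]

print(corr.shape, r_xy)
```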

8
Q

What’s a covariance matrix?

P 159

A

The covariance matrix is a square and symmetric matrix that describes the covariance between two or more random variables.

9
Q

The diagonal elements of the covariance matrix are the ____; as such, it is often called the variance-covariance matrix.

P 159

A

variances of each of the random variables
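
This can be checked directly. In the sketch below (made-up data, three variables stored one per row), the diagonal of the covariance matrix matches the sample variance of each row, and the matrix equals its own transpose.

```python
import numpy as np

# Three illustrative variables, one per row (assumed data).
data = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 5.0, 4.0, 5.0],
    [5.0, 3.0, 4.0, 1.0, 2.0],
])

# 3x3 covariance matrix; rows are treated as variables by default.
C = np.cov(data)

# Diagonal entries versus per-variable sample variances (ddof=1).
diagonal = np.diag(C)
variances = data.var(axis=1, ddof=1)

print(diagonal, variances)
```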

10
Q

How can we find the covariance between a 2D array’s columns using the cov() function?

P 159

A

The cov() function can be called with a single 2D array in which each sub-array (row) contains one feature. If your data is laid out in the usual matrix format (observations in rows, features in columns), pass the transpose of the matrix to cov() so that the covariance of the columns is calculated correctly.
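
A sketch of the transpose step, using a made-up 5×2 array with observations in rows and features in columns. As an alternative to transposing, NumPy's cov() also accepts `rowvar=False`, which tells it to treat columns as the variables; both routes are shown for comparison.

```python
import numpy as np

# 5 observations (rows) of 2 features (columns); assumed data.
data = np.array([
    [1.0, 2.0],
    [2.0, 4.0],
    [3.0, 5.0],
    [4.0, 4.0],
    [5.0, 5.0],
])

# Transpose so that each row holds one feature.
cov_cols = np.cov(data.T)

# Equivalent: keep the layout and tell cov() the columns are the variables.
cov_rowvar = np.cov(data, rowvar=False)

# Without either step, each of the 5 rows is treated as a variable.
cov_unintended = np.cov(data)

print(cov_cols.shape, cov_unintended.shape)
```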

11
Q

The covariance matrix is not widely used in multivariate analysis. True/False

P 160

A

False
The covariance matrix is widely used in multivariate analysis.
