Chapter 16 Singular Value Decomposition Flashcards

1
Q

What’s matrix decomposition/matrix factorization?

P 139

A

Matrix decomposition, also known as matrix factorization, involves describing a given matrix using its constituent elements.

2
Q

Perhaps the best-known and most widely used matrix decomposition method is ____.

P 139

A

the Singular-Value Decomposition, or SVD

3
Q

What makes SVD more stable than other methods of matrix factorization?

P 139

A

All matrices have an SVD, which makes it more stable than other methods, such as the eigendecomposition (which exists only for certain square matrices).

4
Q

The formula for the SVD matrix decomposition is as below:
A = U · Σ · Vh
What is the size of each constituent part?
And what are the column vectors of U and V called?

P 140

A

A is the real m × n matrix that we wish to decompose,
U is an m × m matrix,
Σ (sigma) is an m × n diagonal matrix,
Vh is the transpose of an n × n matrix V.

The columns of U are called the left-singular vectors of A, and the columns of V are called the right-singular vectors of A.

5
Q

The diagonal values in the Σ matrix are known as ____

P 140

A

the singular values of the original matrix A.

6
Q

The SVD can be calculated by calling the ____ function (from scipy). The function takes a matrix and returns the ____, ____ and ____ elements.

P 140

A

svd()
U, Σ and Vh
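
For reference, a minimal sketch of calling svd() on a small made-up matrix (the values are arbitrary):

from numpy import array
from scipy.linalg import svd

A = array([[1, 2], [3, 4], [5, 6]])  # a 3x2 matrix
U, s, Vh = svd(A)
print(U.shape, s.shape, Vh.shape)  # (3, 3) (2,) (2, 2)

Note that s comes back as a vector of singular values, not as the full Σ matrix.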

7
Q

Can we reconstruct the original matrix directly, using what the svd() function returns?

P 141

A

No. The elements returned by svd() cannot be multiplied directly: s comes back as a vector of singular values, so it must first be converted into a diagonal matrix using the diag() function and then placed into an m × n matrix of zeros (when A is not square).
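
A minimal sketch of the reconstruction (matrix values are made up):

from numpy import array, diag, zeros
from scipy.linalg import svd

A = array([[1, 2], [3, 4], [5, 6]])  # 3x2, so m=3, n=2
U, s, Vh = svd(A)
# place the singular values on the diagonal of an m x n matrix of zeros
Sigma = zeros(A.shape)
Sigma[:A.shape[1], :A.shape[1]] = diag(s)
B = U.dot(Sigma).dot(Vh)  # B matches A up to floating-point error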

8
Q

What’s the pseudoinverse?

P 143

A

The pseudoinverse is the generalization of the matrix inverse for square matrices to rectangular matrices, where the number of rows and the number of columns are not equal.

9
Q

How is the pseudoinverse of a rectangular matrix calculated?

P 143

A

The pseudoinverse is denoted as A+, where A is the matrix being inverted. The pseudoinverse is calculated using the singular value decomposition of A:
A+ = Vh^T · D+ · U^T
where D+ is the pseudoinverse of the diagonal matrix Σ (formed by taking the reciprocal of each non-zero element of Σ and transposing the result), U^T is the transpose of U, and Vh^T is the transpose of Vh.
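
A sketch of computing the pseudoinverse via SVD by hand (the matrix is made up, and this assumes no singular value is zero):

from numpy import array, diag, zeros
from scipy.linalg import svd

A = array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])  # 4x2
U, s, Vh = svd(A)
# D+ : reciprocals of the non-zero singular values on the diagonal, then transpose
D = zeros(A.shape)
D[:A.shape[1], :A.shape[1]] = diag(1.0 / s)
A_pinv = Vh.T.dot(D.T).dot(U.T)  # A+ = Vh^T . D+ . U^T, a 2x4 matrix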

10
Q

NumPy provides the function ____ for calculating the pseudoinverse of a rectangular matrix.

P 144

A

pinv()
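
Minimal usage sketch (the matrix is made up):

from numpy import array
from numpy.linalg import pinv

A = array([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]])
print(pinv(A))  # the 2x4 pseudoinverse of the 4x2 matrix A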

11
Q

How is dimensionality reduction using SVD done?

P 145

A

To do this, we can perform an SVD operation on the original data and select the k largest singular values in Σ. These columns can be selected from Σ, and the respective rows selected from Vh.

12
Q

If B is A after reducing its dimensionality to k using SVD, how is B constructed? (formula)

P 145

A

It’s constructed as below:
B = U · Σ_k · Vh_k

In practice, we can retain and work with a descriptive subset of the data called T.
This is a dense summary of the matrix, or a projection:
T = U · Σ_k
This transform can be calculated and applied to the original matrix A, as well as to other similar matrices:
T = A · Vh_k^T
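
A runnable sketch of these formulas (the 3×10 matrix and k = 2 are made up):

from numpy import array, diag, zeros
from scipy.linalg import svd

A = array([[i + j for j in range(10)] for i in range(3)])  # 3x10
U, s, Vh = svd(A)
Sigma = zeros(A.shape)
Sigma[:A.shape[0], :A.shape[0]] = diag(s)

k = 2
Sigma_k = Sigma[:, :k]  # first k columns of Sigma
Vh_k = Vh[:k, :]        # first k rows of Vh
B = U.dot(Sigma_k).dot(Vh_k)  # rank-k approximation of A, still 3x10
T = U.dot(Sigma_k)            # dense 3x2 summary; equals A.dot(Vh_k.T)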

13
Q

scikit-learn provides a ____ class that implements SVD dimensionality reduction directly.

P 147

A

TruncatedSVD

from sklearn.decomposition import TruncatedSVD
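
Minimal usage sketch (the data and n_components=2 are made up):

from numpy import array
from sklearn.decomposition import TruncatedSVD

A = array([[i + j for j in range(10)] for i in range(3)])
tsvd = TruncatedSVD(n_components=2)
tsvd.fit(A)
T = tsvd.transform(A)  # plays the same role as T = U . Sigma_k
print(T.shape)  # (3, 2)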

14
Q

Do we need to use all the components of Sigma to reconstruct the original matrix A?

P 146 code

A

No, we can use the first k columns of Sigma (containing the largest singular values) and, correspondingly, the first k rows of Vh, to reconstruct the original matrix with good precision.

The maximum k needed to reconstruct the original matrix exactly is min(A.shape); using more than that makes no difference, because the extra values in the Sigma matrix would all be zeros.

15
Q

Singular values in the Sigma matrix of an SVD decomposition are sorted from largest to smallest. True/False

P 145

A

True

16
Q

The s vector returned by the svd() function of scipy.linalg has length min(A.shape), where A is the original matrix. True/False

P 146 code

A

True

I think this is why SVD is used for sparse matrices: the number of features can be bigger than the number of observations, and SVD-based dimensionality reduction gives us a way to represent the original matrix with fewer features (columns) than the original when we use T = U · Σ_k, since k is at most equal to the number of observations.

17
Q

Comparing the manual method of dimensionality reduction (with the svd() function) against the sklearn TruncatedSVD method, we can see that the signs of some values may differ while the magnitudes match. Does this cause issues?

P 147

(Recall: for reducing the dimensionality of matrix A with SVD, T = A · Vh_k^T or T = U · Σ_k.)

A

We can expect some instability in the signs, given the nature of the calculations involved and the differences in the underlying libraries and methods used. This sign instability should not be a problem in practice, as long as the transform (of the sklearn method) is trained once and reused.
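
A sketch comparing the two transforms (data made up; expect the columns to agree up to sign):

from numpy import array, diag, zeros
from scipy.linalg import svd
from sklearn.decomposition import TruncatedSVD

A = array([[i + j for j in range(10)] for i in range(3)], dtype=float)

U, s, Vh = svd(A)
Sigma = zeros(A.shape)
Sigma[:A.shape[0], :A.shape[0]] = diag(s)
T_manual = U.dot(Sigma[:, :2])  # manual T = U . Sigma_k

T_sklearn = TruncatedSVD(n_components=2).fit_transform(A)
print(abs(T_manual) - abs(T_sklearn))  # near zero: values match, signs may flip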

18
Q

The ____ approach to solving linear regression by minimizing squared error using matrix decomposition is the de facto standard.
This is because it is stable and works with most datasets. NumPy provides a convenience function named ____ that solves the linear least squares problem using the ____ approach.

P 178

A

pseudoinverse via SVD, lstsq(), SVD
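
A sketch of lstsq() on a made-up toy dataset:

from numpy import array
from numpy.linalg import lstsq

X = array([[1, 1], [1, 2], [1, 3], [1, 4]])  # a column of ones plus one feature
y = array([1.1, 1.9, 3.2, 3.8])
coeffs, residuals, rank, singular_values = lstsq(X, y, rcond=None)
print(coeffs)  # intercept and slope that minimize the squared error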