data analysis Flashcards
What is an orthogonal matrix?
In linear algebra, an orthogonal matrix (or real orthogonal matrix) is a square matrix with real entries whose columns and rows are orthogonal unit vectors (i.e., orthonormal vectors). Its transpose is equal to its inverse.
How do you find the covariance matrix when given a set of vectors?
Subtract the mean vector from every data vector, sum the outer products of the centred vectors, and at the end divide by N−1.
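The steps above can be sketched as follows; the data values are illustrative, and each column of X is taken to be one data vector:

```python
import numpy as np

# Sample covariance: centre the data, multiply by the transpose,
# divide by N-1 at the end.  Each column of X is one data vector.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])   # 2-D data, N = 4 samples

N = X.shape[1]
Xc = X - X.mean(axis=1, keepdims=True)  # subtract the mean vector
C = (Xc @ Xc.T) / (N - 1)               # divide by N-1

# np.cov uses the same N-1 convention, so the results agree
assert np.allclose(C, np.cov(X))
```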

After finding a covariance matrix C, you might apply eigenvector decomposition so that C = UDU^T. What information can you find from U and D?
The columns of U are a new basis {u1, …, uN} for R^N. The basis vectors un point in the directions of maximum variance of X. The eigenvalue dn (the nth diagonal entry of D) is the variance of the data in the direction un.
Suppose that each vector x ∈ X is replaced by its 1-dimensional projection x' onto the direction defined in part (5). What is the average squared error between x and x'?
Each point is projected onto the direction of greatest variance. There will be some error between the original points and their projections. The average squared error is the variance in the orthogonal direction, which is the other eigenvalue.
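A quick numerical check of this claim for 2-D data (the random data here is made up for the illustration): project each centred point onto the leading eigenvector, and the squared error, averaged with the same N−1 convention as the covariance, equals the smaller eigenvalue.

```python
import numpy as np

# Project 2-D data onto the direction of maximum variance and verify
# that the residual error equals the other (smaller) eigenvalue.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 500))
X[0] *= 3.0                             # stretch one direction

Xc = X - X.mean(axis=1, keepdims=True)
C = (Xc @ Xc.T) / (X.shape[1] - 1)
d, U = np.linalg.eigh(C)                # eigenvalues ascending: d[0] <= d[1]

u1 = U[:, 1]                            # direction of maximum variance
proj = np.outer(u1, u1 @ Xc)            # 1-D projections x'
err = np.sum((Xc - proj) ** 2) / (X.shape[1] - 1)
assert np.allclose(err, d[0])           # error = the other eigenvalue
```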
Is translation a linear transformation?
No. A linear transformation must map the zero vector to itself, and a non-zero translation does not.
How would you use vector augmentation to perform a transformation?
Append a 1 to each vector (e.g. [x; y] becomes [x; y; 1]) so that translations, as well as linear maps, can be written as multiplication by a single augmented matrix.
If you wanted to rotate a vector through an angle about point A, what would you do?
Augment the vector with a 1 (e.g. a column vector [1; 2] becomes [1; 2; 1]) so it can be multiplied by 3×3 matrices. Build an augmented matrix M that translates A to the origin, rotates through the angle, and translates back; then compute Mx, where x is the augmented vector you wish to rotate.
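The translate-rotate-translate recipe can be sketched with augmented matrices; the point A, the angle and the test vector below are illustrative choices:

```python
import numpy as np

# Rotate about a point A using augmented (homogeneous) vectors:
# translate A to the origin, rotate, translate back.
theta = np.pi / 2                 # 90 degrees
A = np.array([1.0, 1.0])          # centre of rotation

c, s = np.cos(theta), np.sin(theta)
T_to = np.array([[1, 0, -A[0]], [0, 1, -A[1]], [0, 0, 1]])  # A -> origin
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])            # rotate
T_back = np.array([[1, 0, A[0]], [0, 1, A[1]], [0, 0, 1]])  # origin -> A
M = T_back @ R @ T_to                                        # combined map

x = np.array([2.0, 1.0, 1.0])     # [2; 1] augmented with a 1
y = M @ x                         # rotating (2,1) by 90° about (1,1) gives (1,2)
```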

What are all the DFT basis vectors in relation to each other?
Orthogonal. However, they all have magnitude √N rather than 1, so they form an orthogonal basis but not an orthonormal one.
If you had a 4×1 vector f, how would you find the DFT, F, of f?
Find the DFT basis vectors for N = 4 and stack them as the rows of a matrix D.
Then F = Df.
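This F = Df procedure can be sketched for N = 4; the sign convention in the exponent below matches np.fft.fft, and some courses use the opposite sign:

```python
import numpy as np

# Build the N = 4 DFT matrix row by row from the basis vectors
# b_k[n] = exp(-2j*pi*k*n/N), then compute F = D f.
N = 4
n = np.arange(N)
D = np.exp(-2j * np.pi * np.outer(n, n) / N)   # D[k, n] = e^{-2πi kn/N}

f = np.array([1.0, 2.0, 3.0, 4.0])
F = D @ f
assert np.allclose(F, np.fft.fft(f))           # agrees with the library FFT
```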
For a given n, how do you find the corresponding DFT basis vector bn?
Its kth entry is e^(−2πi·nk/N) for k = 0, …, N−1 (the sign in the exponent is a convention).
What are some of the hints that make finding a DFT basis vector easier?
b0 consists entirely of 1s.
bN/2 alternates 1, −1, 1, −1, …
What is the frequency resolution?
sample rate / frame length
The frame length is the number of sample points, N
What is the frame rate?
The reciprocal of the time between the starts of successive frames (the hop time).
How do you find the number of frames processed?
The duration of the signal multiplied by the frame rate (equivalently, the duration divided by the time between successive frames).
How do you perform a reflection about a line at an angle to the horizontal?
For a line through the origin at angle θ to the horizontal, multiply by the matrix [cos 2θ, sin 2θ; sin 2θ, −cos 2θ]. For a line not through the origin, translate first (e.g. using augmented vectors).
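A minimal sketch of reflecting in a line through the origin, using a 45° line as an illustrative check (reflecting (1, 0) in that line should give (0, 1)):

```python
import numpy as np

# Reflection about a line through the origin at angle theta:
# the matrix is [[cos 2θ, sin 2θ], [sin 2θ, -cos 2θ]].
theta = np.pi / 4                       # the 45° line y = x
R = np.array([[np.cos(2 * theta), np.sin(2 * theta)],
              [np.sin(2 * theta), -np.cos(2 * theta)]])

y = R @ np.array([1.0, 0.0])            # reflect (1, 0) -> approx (0, 1)
```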
What does a markov process consist of?
A set of states, a stochastic transition matrix and an initial state distribution. A matrix whose rows consist of non-negative numbers that sum to 1 is called a stochastic matrix.
In a Markov process, how can you find the probability of being in certain places at a time t?
p_t = A^T p_(t−1)
p_t = (A^T)^t p_0
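Propagating a distribution this way can be sketched as follows, assuming a row-stochastic A (a_ij = probability of moving from state i to state j) so that column distribution vectors evolve as p_t = A^T p_(t−1); the matrix and initial state are illustrative:

```python
import numpy as np

# Two-state Markov chain: rows of A sum to 1.
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])
p0 = np.array([1.0, 0.0])            # start in state 0 with certainty

t = 3
p_t = np.linalg.matrix_power(A.T, t) @ p0   # p_t = (A^T)^t p_0
assert np.isclose(p_t.sum(), 1.0)            # still a distribution
```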
On a Markov transition matrix, what does the a_ij entry denote?
The probability of going from state i to state j.
When you have a set of data, and centroids, how do you find some better centroids?
Assign each data point to its nearest centroid, then replace each centroid with the mean of the points assigned to it. This is one iteration of K-means clustering.
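One K-means update, assuming the Euclidean metric d2, can be sketched as below; the data points and initial centroids are illustrative:

```python
import numpy as np

# One iteration of K-means: assign points to nearest centroid,
# then move each centroid to the mean of its assigned points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
C = np.array([[0.0, 0.5], [4.0, 4.0]])        # K = 2 initial centroids

dists = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)  # (N, K)
labels = dists.argmin(axis=1)                 # nearest centroid per point
C_new = np.array([X[labels == k].mean(axis=0) for k in range(len(C))])
# new centroids: (0, 0.5) and (5.5, 5)
```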
How do you use the different distortion metrics, e.g. d1(u, v), d2(u, v) and d∞(u, v)?
d1 is the sum of the absolute differences of the components (Manhattan distance), d2 is the Euclidean distance, and d∞ is the maximum absolute difference of the components.
How do you find the distortion from a set of centroids?
Sum, over all data points, the distance from each point to its nearest centroid.
Let X be a set of data points and C0 a set of K centroids in D-dimensional space. Question: let C1 be the set of K centroids after 1 iteration of K-means clustering applied to C0, X and the Euclidean metric d2. Is the centroid set C1 globally optimal? In other words, is it true that for any set of K centroids C, Dist(C, X) ≥ Dist(C1, X)?
No. The centroid set that you obtain from one iteration of K-means clustering depends on the initial estimates of the centroids. Different initial estimates will give different solutions.
Let X be a set of data points and C0 a set of K centroids in D-dimensional space:
Suppose that after n iterations the K-means clustering algorithm converges to a set C (in other words, applying K-means clustering to C with the data set X and the Euclidean metric d2 does not change C). Is the centroid set C globally optimal (as defined above)?
No. As in the previous part of this question, the solution that the K-means algorithm converges to also depends on the initial estimates of the centroids. Remember the MATLAB demonstration from the lecture (which you can run yourself using the code on Canvas): the final set of centroids after 20 iterations of K-means clustering (the blue circles) depends on the initial estimates (black circles).
How do you find the dot product with complex vectors?
Conjugate one of the vectors before multiplying and summing: u · v = Σ u_n · conj(v_n). Which vector is conjugated is a convention.
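A small sketch of the complex dot product; note that np.vdot conjugates its first argument, and the example values are illustrative:

```python
import numpy as np

# Complex dot product: conjugate one vector before summing.
u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1 + 1j])

manual = np.sum(np.conj(u) * v)      # conjugate the first argument
assert np.isclose(manual, np.vdot(u, v))   # vdot does the same
```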