data analysis Flashcards
What is an orthogonal matrix?
In linear algebra, an orthogonal matrix (or real orthogonal matrix) is a square matrix with real entries whose columns and rows are orthogonal unit vectors (i.e., orthonormal vectors). Its transpose is equal to its inverse.
How do you find the covariance matrix when given a set of vectors?
Subtract the mean vector from every data vector, sum the outer products of the centred vectors, and at the end divide by N−1.
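The steps above can be sketched as follows; the data values are illustrative, and each column of X is taken to be one data vector:

```python
import numpy as np

# Sample covariance: centre the data, multiply by the transpose,
# divide by N-1 at the end.  Each column of X is one data vector.
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 4.0, 3.0]])   # 2-D data, N = 4 samples

N = X.shape[1]
Xc = X - X.mean(axis=1, keepdims=True)  # subtract the mean vector
C = (Xc @ Xc.T) / (N - 1)               # divide by N-1

# np.cov uses the same N-1 convention, so the results agree
assert np.allclose(C, np.cov(X))
```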

After finding a covariance matrix C, you might apply eigenvector decomposition so that C = UDU^T. What information can you find from U and D?
The columns of U are a new basis {u1, …, uN} for R^N. The basis vectors un point in the directions of maximum variance of X. The eigenvalue dn (the nth diagonal entry of D) is the variance of the data in the direction un.
Suppose that each vector x ∈ X is replaced by its 1-dimensional projection x' onto the direction defined in part (5). What is the average squared error between x and x'?
Each point is projected onto the direction of greatest variance. There will be some error between the original points and their projections. The average squared error is the variance in the orthogonal direction, which is the other eigenvalue.
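A quick numerical check of this claim for 2-D data (the random data here is made up for the illustration): project each centred point onto the leading eigenvector, and the squared error, averaged with the same N−1 convention as the covariance, equals the smaller eigenvalue.

```python
import numpy as np

# Project 2-D data onto the direction of maximum variance and verify
# that the residual error equals the other (smaller) eigenvalue.
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 500))
X[0] *= 3.0                             # stretch one direction

Xc = X - X.mean(axis=1, keepdims=True)
C = (Xc @ Xc.T) / (X.shape[1] - 1)
d, U = np.linalg.eigh(C)                # eigenvalues ascending: d[0] <= d[1]

u1 = U[:, 1]                            # direction of maximum variance
proj = np.outer(u1, u1 @ Xc)            # 1-D projections x'
err = np.sum((Xc - proj) ** 2) / (X.shape[1] - 1)
assert np.allclose(err, d[0])           # error = the other eigenvalue
```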
Is translation a linear transformation?
No. A linear transformation must map the zero vector to itself, and a non-zero translation does not.
How would you use vector augmentation to perform a transformation?
Append a 1 to each vector (e.g. [x; y] becomes [x; y; 1]) so that translations, as well as linear maps, can be written as multiplication by a single augmented matrix.
If you wanted to rotate a vector through an angle about point A, what would you do?
Augment the vector with a 1 (e.g. a column vector [1; 2] becomes [1; 2; 1]) so it can be multiplied by 3×3 matrices. Build an augmented matrix M that translates A to the origin, rotates through the angle, and translates back; then compute Mx, where x is the augmented vector you wish to rotate.
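The translate-rotate-translate recipe can be sketched with augmented matrices; the point A, the angle and the test vector below are illustrative choices:

```python
import numpy as np

# Rotate about a point A using augmented (homogeneous) vectors:
# translate A to the origin, rotate, translate back.
theta = np.pi / 2                 # 90 degrees
A = np.array([1.0, 1.0])          # centre of rotation

c, s = np.cos(theta), np.sin(theta)
T_to = np.array([[1, 0, -A[0]], [0, 1, -A[1]], [0, 0, 1]])  # A -> origin
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])            # rotate
T_back = np.array([[1, 0, A[0]], [0, 1, A[1]], [0, 0, 1]])  # origin -> A
M = T_back @ R @ T_to                                        # combined map

x = np.array([2.0, 1.0, 1.0])     # [2; 1] augmented with a 1
y = M @ x                         # rotating (2,1) by 90° about (1,1) gives (1,2)
```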

What are all the DFT basis vectors in relation to each other?
Orthogonal. However, they all have magnitude √N rather than 1, so they form an orthogonal basis but not an orthonormal one.
If you had a 4×1 vector f, how would you find the DFT, F, of f?
Find the DFT basis vectors for N = 4 and stack them as the rows of a matrix D.
Then F = Df.
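This F = Df procedure can be sketched for N = 4; the sign convention in the exponent below matches np.fft.fft, and some courses use the opposite sign:

```python
import numpy as np

# Build the N = 4 DFT matrix row by row from the basis vectors
# b_k[n] = exp(-2j*pi*k*n/N), then compute F = D f.
N = 4
n = np.arange(N)
D = np.exp(-2j * np.pi * np.outer(n, n) / N)   # D[k, n] = e^{-2πi kn/N}

f = np.array([1.0, 2.0, 3.0, 4.0])
F = D @ f
assert np.allclose(F, np.fft.fft(f))           # agrees with the library FFT
```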
For a given n, how do you find the corresponding DFT basis vector bn?
Its kth entry is e^(−2πi·nk/N) for k = 0, …, N−1 (the sign in the exponent is a convention).
What are some of the hints that make finding a DFT basis vector easier?
b0 consists entirely of 1s.
bN/2 alternates 1, −1, 1, −1, …
What is the frequency resolution?
sample rate / frame length
The frame length is the number of sample points, N
What is the frame rate?
The reciprocal of the time between the starts of successive frames (the hop time).
How do you find the number of frames processed?
The duration of the signal multiplied by the frame rate (equivalently, the duration divided by the time between successive frames).
How do you perform a reflection about a line at an angle to the horizontal?
For a line through the origin at angle θ to the horizontal, multiply by the matrix [cos 2θ, sin 2θ; sin 2θ, −cos 2θ]. For a line not through the origin, translate first (e.g. using augmented vectors).
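A minimal sketch of reflecting in a line through the origin, using a 45° line as an illustrative check (reflecting (1, 0) in that line should give (0, 1)):

```python
import numpy as np

# Reflection about a line through the origin at angle theta:
# the matrix is [[cos 2θ, sin 2θ], [sin 2θ, -cos 2θ]].
theta = np.pi / 4                       # the 45° line y = x
R = np.array([[np.cos(2 * theta), np.sin(2 * theta)],
              [np.sin(2 * theta), -np.cos(2 * theta)]])

y = R @ np.array([1.0, 0.0])            # reflect (1, 0) -> approx (0, 1)
```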
What does a markov process consist of?
A set of states, a stochastic transition matrix and an initial state distribution. A matrix whose rows consist of non-negative numbers that sum to 1 is called a stochastic matrix.
In a Markov process, how can you find the probability of being in certain places at a time t?
p_t = A^T p_(t−1)
p_t = (A^T)^t p_0
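Propagating a distribution this way can be sketched as follows, assuming a row-stochastic A (a_ij = probability of moving from state i to state j) so that column distribution vectors evolve as p_t = A^T p_(t−1); the matrix and initial state are illustrative:

```python
import numpy as np

# Two-state Markov chain: rows of A sum to 1.
A = np.array([[0.9, 0.1],
              [0.5, 0.5]])
p0 = np.array([1.0, 0.0])            # start in state 0 with certainty

t = 3
p_t = np.linalg.matrix_power(A.T, t) @ p0   # p_t = (A^T)^t p_0
assert np.isclose(p_t.sum(), 1.0)            # still a distribution
```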
On a Markov transition matrix, what does the a_ij entry denote?
The probability of going from state i to state j.
When you have a set of data, and centroids, how do you find some better centroids?
Assign each data point to its nearest centroid, then replace each centroid with the mean of the points assigned to it. This is one iteration of K-means clustering.
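One K-means update, assuming the Euclidean metric d2, can be sketched as below; the data points and initial centroids are illustrative:

```python
import numpy as np

# One iteration of K-means: assign points to nearest centroid,
# then move each centroid to the mean of its assigned points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
C = np.array([[0.0, 0.5], [4.0, 4.0]])        # K = 2 initial centroids

dists = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)  # (N, K)
labels = dists.argmin(axis=1)                 # nearest centroid per point
C_new = np.array([X[labels == k].mean(axis=0) for k in range(len(C))])
# new centroids: (0, 0.5) and (5.5, 5)
```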
How do you use the different distortion metrics, e.g. d1(u, v), d2(u, v) and d∞(u, v)?
d1 is the sum of the absolute differences of the components (Manhattan distance), d2 is the Euclidean distance, and d∞ is the maximum absolute difference of the components.
How do you find the distortion from a set of centroids?
Sum, over all data points, the distance from each point to its nearest centroid.
Let X be a set of data points and C0 a set of K centroids in D-dimensional space. Question: let C1 be the set of K centroids after 1 iteration of K-means clustering applied to C0, X and the Euclidean metric d2. Is the centroid set C1 globally optimal? In other words, is it true that for any set of K centroids C, Dist(C, X) ≥ Dist(C1, X)?
No. The centroid set that you obtain from one iteration of K-means clustering depends on the initial estimates of the centroids. Different initial estimates will give different solutions.
Let X be a set of data points and C0 a set of K centroids in D-dimensional space:
Suppose that after n iterations the K-means clustering algorithm converges to a set C (in other words, applying K-means clustering to C with the data set X and the Euclidean metric d2 does not change C). Is the centroid set C globally optimal (as defined above)?
No. As in the previous part of this question, the solution that the K-means algorithm converges to also depends on the initial estimates of the centroids. Remember the MATLAB demonstration from the lecture (which you can run yourself using the code on Canvas): the final set of centroids after 20 iterations of K-means clustering (the blue circles) depends on the initial estimates (black circles).
How do you find the dot product with complex vectors?
Conjugate one of the vectors before multiplying and summing: u · v = Σ u_n · conj(v_n). Which vector is conjugated is a convention.
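A small sketch of the complex dot product; note that np.vdot conjugates its first argument, and the example values are illustrative:

```python
import numpy as np

# Complex dot product: conjugate one vector before summing.
u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1 + 1j])

manual = np.sum(np.conj(u) * v)      # conjugate the first argument
assert np.isclose(manual, np.vdot(u, v))   # vdot does the same
```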