data analysis Flashcards

1
Q

What is an orthogonal matrix?

A

In linear algebra, an orthogonal matrix (or real orthogonal matrix) is a square matrix with real entries whose columns and rows are orthogonal unit vectors (i.e., orthonormal vectors). Equivalently, its transpose is equal to its inverse: Q^T Q = Q Q^T = I.
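
As a quick numerical check of this definition, the sketch below tests whether Q^T Q is the identity for a 2×2 rotation matrix and for a shear matrix. The helper names (`transpose`, `matmul`, `is_orthogonal`) and the tolerance are illustrative choices, not part of the original card.

```python
import math

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def is_orthogonal(q, tol=1e-12):
    """Return True if q^T q equals the identity matrix to within tol."""
    product = matmul(transpose(q), q)
    n = len(q)
    return all(
        abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )

theta = math.radians(30)
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta),  math.cos(theta)]]
shear = [[1.0, 1.0],
         [0.0, 1.0]]

print(is_orthogonal(rotation))  # rotation matrices are orthogonal
print(is_orthogonal(shear))     # a shear is not
```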

2
Q

How do you find the covariance matrix when given a set of vectors?

A

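Subtract the mean vector from each vector, then sum the outer products of the centred vectors with themselves. At the end you divide by N-1, where N is the number of vectors.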
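
A minimal sketch of that calculation, assuming the vectors are given as equal-length tuples and that the N-1 normalisation from the card is wanted. The function name `covariance_matrix` and the sample data are made up for illustration.

```python
def covariance_matrix(vectors):
    """Sample covariance matrix of a list of equal-length vectors (divides by N-1)."""
    n = len(vectors)
    dim = len(vectors[0])
    # Mean of each coordinate.
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Subtract the mean from every vector.
    centred = [[v[d] - mean[d] for d in range(dim)] for v in vectors]
    # Entry (i, j) sums the products of coordinates i and j, then divides by N-1.
    return [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]

data = [(1.0, 2.0), (3.0, 6.0), (5.0, 10.0)]
C = covariance_matrix(data)
print(C)  # [[4.0, 8.0], [8.0, 16.0]]
```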

3
Q

After finding a covariance matrix C, you might apply eigenvector decomposition so that C = U D U^T. What information can you find from U and D?

A

The columns of U form a new orthonormal basis {u_1, ..., u_N} for R^N. The basis vectors u_n point in the directions of maximum variance of X. The eigenvalue d_n (the n-th diagonal element of D) is the variance of the data in the direction u_n.

4
Q

Suppose that each vector x ∈ X is replaced by its 1-dimensional projection x’ onto the direction defined in part (5). What is the average squared error between x and x’?

A

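Each point is projected onto the vector that gives the greatest variance (the eigenvector with the largest eigenvalue).

The projected points differ from the original points only in the orthogonal direction.

For 2-dimensional data, the average squared error is therefore the variance in the orthogonal direction, which is the other (smaller) eigenvalue. In higher dimensions it is the sum of the remaining eigenvalues.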
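
The claim can be checked numerically. The sketch below is an illustration for 2-dimensional data only: it uses the closed-form eigenvalues of a symmetric 2×2 matrix, centres the data first, and averages with N-1 so that the error matches the covariance normalisation. The function names and the sample points are assumptions made for this example.

```python
import math

def pca_2d(points):
    """Return centred points, eigenvalues (largest first) and the first principal direction."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centred = [(p[0] - mx, p[1] - my) for p in points]
    # Entries of the covariance matrix [[a, b], [b, d]], dividing by N-1.
    a = sum(x * x for x, _ in centred) / (n - 1)
    b = sum(x * y for x, y in centred) / (n - 1)
    d = sum(y * y for _, y in centred) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    eigenvalues = (half_trace + radius, half_trace - radius)
    # Unit eigenvector belonging to the larger eigenvalue.
    angle = 0.5 * math.atan2(2 * b, a - d)
    u1 = (math.cos(angle), math.sin(angle))
    return centred, eigenvalues, u1

def mean_squared_projection_error(centred, u):
    """Average squared distance (dividing by N-1) between points and their projections onto u."""
    total = 0.0
    for x, y in centred:
        t = x * u[0] + y * u[1]  # coordinate along u
        total += (x - t * u[0]) ** 2 + (y - t * u[1]) ** 2
    return total / (len(centred) - 1)

points = [(2.0, 0.0), (0.0, 1.0), (-2.0, 0.0), (0.0, -1.0), (1.0, 1.0), (-1.0, -1.0)]
centred, (lam1, lam2), u1 = pca_2d(points)
error = mean_squared_projection_error(centred, u1)
print(lam2, error)  # the two values agree up to rounding
```

If the error were averaged with N instead of N-1, it would equal the smaller eigenvalue scaled by (N-1)/N.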

5
Q

Is translation a linear transformation?

A

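No. A translation T(x) = x + b with b ≠ 0 does not map the zero vector to itself, which every linear transformation must do.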

6
Q

How would you use vector augmentation to perform a transformation?

A
7
Q

If you wanted to rotate a vector through an angle about point A, what would you do?

A

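Build the augmented 3×3 matrix that translates the point A to the origin, rotates through the angle, and translates back. You would then multiply this matrix by x, where x is the augmented vector you wish to rotate.

If x is a column vector like [1;2], you append a 1 to make [1;2;1] so that it can be multiplied by the 3×3 matrix.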
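
One way to sketch this with augmented (homogeneous) coordinates is shown below. It assumes 2-D points, an anticlockwise rotation, and the usual translate, rotate, translate-back composition; the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def translation(tx, ty):
    """3x3 augmented matrix that translates by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

def rotation(theta):
    """3x3 augmented matrix that rotates anticlockwise about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotate_about(point, theta):
    """Move the point to the origin, rotate, then move back."""
    ax, ay = point
    return matmul(translation(ax, ay), matmul(rotation(theta), translation(-ax, -ay)))

def apply(matrix, vector):
    """Augment a 2-D vector with a 1, multiply, and drop the extra coordinate."""
    augmented = [[vector[0]], [vector[1]], [1.0]]
    result = matmul(matrix, augmented)
    return (result[0][0], result[1][0])

# Rotate (2, 1) by 90 degrees about the point (1, 1); the result is close to (1, 2).
x_new, y_new = apply(rotate_about((1.0, 1.0), math.pi / 2), (2.0, 1.0))
print(x_new, y_new)
```

Because the translation is folded into a matrix product, the whole operation becomes a single matrix multiplication on the augmented vector.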

8
Q

What are all the DFT basis vectors in relation to each other?

A

Orthogonal. However, they all have magnitude N^0.5 (that is, √N), so they form an orthogonal basis rather than an orthonormal one.
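
The sketch below checks both properties for N = 4. It assumes the common convention that entry m of basis vector k is exp(-2πi·k·m/N) and that the complex inner product conjugates its second argument; the cards do not state either convention explicitly.

```python
import cmath
import math

def dft_basis_vector(k, n):
    """k-th DFT basis vector of length n (sign convention assumed, see above)."""
    return [cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)]

def inner(u, v):
    """Complex inner product: multiply by the conjugate of the second vector."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def magnitude(v):
    """Length of a complex vector, the square root of its inner product with itself."""
    return math.sqrt(inner(v, v).real)

N = 4
basis = [dft_basis_vector(k, N) for k in range(N)]

for j in range(N):
    for k in range(N):
        if j != k:
            # Different basis vectors have (numerically) zero inner product.
            assert abs(inner(basis[j], basis[k])) < 1e-12

print([round(magnitude(b), 12) for b in basis])  # each is sqrt(4) = 2
```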

9
Q

If you had a 4×1 vector f, how would you find the DFT, F, of f?

A

Find the DFT basis vectors for N = 4 and assemble them into the 4×4 matrix D.

F = Df
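
A small sketch of F = Df for N = 4, assuming the convention D[k][m] = exp(-2πi·k·m/N), which the card does not spell out. The function names and the test vector are illustrative.

```python
import cmath

def dft_matrix(n):
    """n x n DFT matrix with entries exp(-2*pi*i*k*m/n) (assumed convention)."""
    return [[cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)] for k in range(n)]

def dft(f):
    """Compute F = D f by an explicit matrix-vector product."""
    d = dft_matrix(len(f))
    return [sum(d_km * f_m for d_km, f_m in zip(row, f)) for row in d]

f = [1.0, 0.0, -1.0, 0.0]
F = dft(f)
print([complex(round(x.real, 12), round(x.imag, 12)) for x in F])  # close to [0, 2, 0, 2]
```

Here f alternates with period 4, so only the odd-numbered components of F are non-zero.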

10
Q

For a given n, how do you find the corresponding DFT basis vector D?

A
11
Q

What are some of the hints that make finding a DFT basis vector easier?

A

b_0 consists only of 1s.

b_(N/2) (for even N) alternates: 1, -1, 1, -1, 1, …

12
Q

What is the frequency resolution?

A

sample rate / frame length

The frame length is the number of sample points, N
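
As a worked example with assumed numbers (a 16 kHz sample rate and a frame of 512 samples, neither of which comes from the card), the calculation looks like this:

```python
def frequency_resolution(sample_rate_hz, frame_length):
    """Spacing in Hz between adjacent DFT bins: sample rate / number of samples N."""
    return sample_rate_hz / frame_length

resolution = frequency_resolution(16000, 512)
print(resolution)  # 31.25 Hz between adjacent frequency bins
```

A longer frame gives finer frequency resolution at the same sample rate.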

13
Q

What is the frame rate?

A

The reciprocal of the time between the sampling bands.

14
Q

How do you find the number of frames processed?

A

Length of the sample (in seconds) × frame rate. Equivalently, divide the length of the sample by the time between frames.

15
Q

How do you perform a reflection about a line at an angle to the horizontal?

A
16
Q

What does a Markov process consist of?

A

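A set of states, an initial probability distribution over those states, and a state transition matrix. A matrix whose rows consist of non-negative numbers that sum to 1 is called a stochastic matrix; the transition matrix is one.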

17
Q

In a Markov process, how can you find the probability of being in certain places at a time t?

A

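p_t = A^T p_(t-1)

p_t = (A^T)^t p_0

Here A^T is the transpose of the transition matrix A, and p_0 is the initial probability vector.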
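
The recursion can be sketched as follows, assuming (as in these cards) that entry a_ij of A is the probability of going from state i to state j, so each row of A sums to 1 and the state vector is updated with the transpose. The two-state matrix and the function names are made up for the example.

```python
def step(a, p):
    """One update p_t = A^T p_(t-1): entry j collects probability flowing in from every state i."""
    n = len(p)
    return [sum(a[i][j] * p[i] for i in range(n)) for j in range(n)]

def distribution_at(a, p0, t):
    """Apply the update t times, which is equivalent to (A^T)^t p_0."""
    p = list(p0)
    for _ in range(t):
        p = step(a, p)
    return p

A = [[0.9, 0.1],   # from state 0: stay with 0.9, move to state 1 with 0.1
     [0.5, 0.5]]   # from state 1: move to state 0 with 0.5, stay with 0.5
p0 = [1.0, 0.0]    # start in state 0 with certainty

p2 = distribution_at(A, p0, 2)
print(p2)  # approximately [0.86, 0.14]
```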

18
Q

In a Markov transition matrix, what does the entry a_ij denote?

A

The probability of going from state i to state j.

19
Q

When you have a set of data, and centroids, how do you find some better centroids?

A
20
Q

How do you use the different distortion metrics, e.g. d1(u,v), d2(u,v) and d∞(u,v)?

A
21
Q

How do you find the distortion from some central centroids?

A
22
Q

Let X be a set of data points and C0 a set of K centroids in D-dimensional space. Let C1 be the set of K centroids after 1 iteration of K-means clustering applied to C0, X and the Euclidean metric d2. Is the centroid set C1 globally optimal? In other words, is it true that for any set of K centroids C, Dist(C, X) ≥ Dist(C1, X)?

A

No. The centroid set that you obtain from one iteration of k-means clustering depends on the initial estimates of the centroids. Different initial estimates will give different solutions.

23
Q

Let X be a set of data points and C0 a set of K centroids in D-dimensional space:

Suppose that after n iterations the K-means clustering algorithm converges to a set C (in other words, applying K-means clustering to C with the data set X and the Euclidean metric d2 does not change C). Is the centroid set C globally optimal (as defined above)?

A

No. As in the previous part of this question, the solution that the k-means algorithm converges to will also depend on the initial estimates of the centroids. Remember the MATLAB demonstration that I showed in the lecture (and which you can run yourself using the code on Canvas). The final set of centroids after 20 iterations of k-means clustering (the blue circles) depends on the initial estimates (black circles).
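
The dependence on the initial centroids can be reproduced with a tiny example. This is a plain-Python sketch, not the lecture's MATLAB code: it assumes Dist(C, X) is the sum of squared Euclidean distances from each point to its nearest centroid, and the four data points and both starting sets are invented for illustration.

```python
def dist2(u, v):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(points, centroids, iterations=20):
    """Repeat the two k-means steps: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
            clusters[nearest].append(p)
        # New centroid = mean of its cluster; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else old
            for cl, old in zip(clusters, centroids)
        ]
    return centroids

def distortion(points, centroids):
    """Sum of squared distances from each point to its nearest centroid (assumed definition)."""
    return sum(min(dist2(p, c) for c in centroids) for p in points)

# Four corners of a wide rectangle, K = 2.
X = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

left_right = kmeans(X, [(0.0, 0.5), (4.0, 0.5)])  # splits into left and right pairs
top_bottom = kmeans(X, [(2.0, 0.0), (2.0, 1.0)])  # splits into bottom and top pairs

print(distortion(X, left_right))  # 1.0
print(distortion(X, top_bottom))  # 16.0
```

Both runs have converged (another iteration changes nothing), yet the second has a much larger distortion, so a converged solution need not be globally optimal.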

24
Q

How do you find the dot product with complex vectors?

A
25
Q

How do you find the magnitude of a complex vector?

A
26
Q

What is the Inverse Document Frequency?

A

N is the total number of documents

27
Q

What is the term frequency?

A

The Term Frequency (TF) f_(t,d) of word t relative to document d is the number of times t occurs in d.

28
Q

How do you find the TF-IDF weight?

A
29
Q

What is the TF-IDF similarity between two documents?

A
30
Q
A

The 6th and 7th columns are vec(d1) and vec(d2).

Csim(d1,d2) is the dot product of vec(d1) and vec(d2) divided by the product of their magnitudes, i.e. the cosine of the angle between them.
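
A minimal sketch of a cosine similarity between two document vectors is given below. It assumes Csim denotes cosine similarity and that the vectors hold one weight per term; the weights used here are arbitrary numbers, not real TF-IDF values.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

vec_d1 = [1.0, 2.0, 0.0]
vec_d2 = [2.0, 4.0, 0.0]   # same direction as vec_d1, so similarity is close to 1
vec_d3 = [0.0, 0.0, 3.0]   # shares no terms with vec_d1, so similarity is 0

print(cosine_similarity(vec_d1, vec_d2))
print(cosine_similarity(vec_d1, vec_d3))
```

Dividing by the magnitudes means a long document and a short document with the same mix of terms still score as similar.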

31
Q

How does the linear dependence of vectors work?

A

Put the vectors in a matrix and apply Gaussian elimination. If you can reduce the matrix so that every vector has a non-zero entry on the diagonal (no row becomes all zeros), the vectors are independent. If a row of zeros appears, they are dependent.

If they are dependent, then one of the vectors can be written as a linear combination of the others.
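
The elimination test can be sketched as below. The code uses exact fractions to avoid rounding problems and counts the pivots that Gaussian elimination finds; the function names and the example vectors are illustrative only.

```python
from fractions import Fraction

def rank(vectors):
    """Number of pivots found by Gaussian elimination on the vectors taken as rows."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    pivots = 0
    for col in range(len(rows[0])):
        # Look for a row at or below the current pivot row with a non-zero entry in this column.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        # Subtract multiples of the pivot row to clear the column below it.
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    return pivots

def linearly_independent(vectors):
    """Vectors are independent exactly when every one of them yields a pivot."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 0, 0), (0, 1, 0), (1, 1, 1)]))  # True
print(linearly_independent([(1, 2, 3), (2, 4, 6), (0, 1, 0)]))  # False: second = 2 x first
```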

32
Q

What defines a basis?

A

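A basis is a set of linearly independent vectors that span the space. If, in addition, all the vectors are orthogonal to each other and have length 1, it is an orthonormal basis.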

33
Q

How do you apply the Gram-Schmidt algorithm?

A
34
Q

How is the amplitude spectrum related to F?

A
35
Q

How do you find the inverse DFT?

A

f = N^(-1) D’ F

D’ is the complex conjugate of D.
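
A round-trip check of this formula is sketched below, assuming the forward transform is F = Df with D[k][m] = exp(-2πi·k·m/N). The convention, function names and test vector are assumptions made for the example.

```python
import cmath

def dft_matrix(n):
    """n x n DFT matrix with entries exp(-2*pi*i*k*m/n) (assumed convention)."""
    return [[cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n)] for k in range(n)]

def matvec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def dft(f):
    """Forward transform F = D f."""
    return matvec(dft_matrix(len(f)), f)

def inverse_dft(F):
    """Inverse transform f = (1/N) * conj(D) * F."""
    n = len(F)
    d_conj = [[x.conjugate() for x in row] for row in dft_matrix(n)]
    return [x / n for x in matvec(d_conj, F)]

f = [1.0, 2.0, 0.0, -1.0]
recovered = inverse_dft(dft(f))
print([round(x.real, 12) for x in recovered])  # matches f up to rounding
```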

36
Q
A
37
Q

What would make a transformation t linear?

A
38
Q

What is the definition of a basis?

A