Exam Flashcards
How can we create a classifier if we have a large dataset that is only partially labeled?
- Train an autoencoder on the unlabeled data; the latent vector will represent features.
- Remove the decoder and train a classifier on the latent vectors from the encoder. The classifier and encoder can be trained jointly.
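A minimal two-phase sketch in PyTorch, assuming flattened 784-dim inputs, a 32-dim latent space, 10 classes, and hypothetical `unlabeled_loader`/`labeled_loader` data loaders:

```python
# Sketch only: dimensions and loader names are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 32))
decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784))
classifier = nn.Linear(32, 10)

# Phase 1: train encoder + decoder on the UNLABELED data (reconstruction loss).
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))
for x in unlabeled_loader:               # hypothetical loader of unlabeled batches
    loss = F.mse_loss(decoder(encoder(x)), x)
    opt.zero_grad(); loss.backward(); opt.step()

# Phase 2: drop the decoder; train the classifier on the latent vectors,
# fine-tuning the encoder jointly, on the small LABELED subset.
opt = torch.optim.Adam(list(encoder.parameters()) + list(classifier.parameters()))
for x, y in labeled_loader:              # hypothetical loader of labeled batches
    loss = F.cross_entropy(classifier(encoder(x)), y)
    opt.zero_grad(); loss.backward(); opt.step()
```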
How can the L2 regularizer be interpreted in:
1) A MAP setting?
2) A geometric setting?
3) As noise?
1) As a Gaussian prior on the weights
2) Attenuation of the weights along directions with small eigenvalues of the Hessian
3) Noise added to the input data, x' = x + e where e is Gaussian.
For sigmoid and tanh, how can we explain that small weights reduce the model capacity?
The activation functions will operate mostly in their (near-)linear region around zero, so the layers behave more like linear maps.
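A tiny NumPy check of the claim (interval chosen arbitrarily):

```python
# For small pre-activations (small weights), tanh is nearly the identity,
# so stacked layers collapse toward a single linear map.
import numpy as np

z = np.linspace(-0.1, 0.1, 5)
print(np.max(np.abs(np.tanh(z) - z)))   # ~3e-4: tanh(z) is close to z near zero
```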
What are the effects of these regularizers:
1) Early stopping
2) Dropout
3) Batch Normalization
1) Restricts the weights, similar to L2 regularization
2) Ensemble learning (an implicit ensemble of subnetworks)
3) Reduces internal covariate shift
For a random variable x_n approximating a random variable x:
1) What does it mean that x_n is unbiased?
2) What does it mean that x_n asymptotically unbiased?
3) What is the MSE in terms of bias and variance?
4) What does it mean that x_n is consistent?
5) What does it mean that x_n is efficient?
1) E[x_n - x] = 0
2) lim_{n -> inf} E[x_n - x] = 0
3) MSE = bias^2 + variance
4) lim_{n -> inf} P(|x_n - x| > e) = 0
5) No other estimator has a lower MSE as n -> inf.
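A quick Monte Carlo check of 3), using the sample mean of n = 5 draws from N(2, 1) as the estimator (a sketch; the constants are arbitrary):

```python
# MSE of the sample mean should equal bias^2 + variance (here about 1/n = 0.2).
import numpy as np

rng = np.random.default_rng(0)
mu, n, trials = 2.0, 5, 200_000
est = rng.normal(mu, 1.0, size=(trials, n)).mean(axis=1)   # one x_n per trial
bias = est.mean() - mu
var = est.var()
mse = ((est - mu) ** 2).mean()
print(mse, bias**2 + var)               # both close to 0.2, and equal to each other
```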
What is the formula for the optimal parameters in linear regression?
w = (X^TX)^{-1}X^Ty
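A minimal NumPy sketch on toy data; `np.linalg.solve` on the normal equations is used instead of forming the inverse explicitly, which is the numerically preferable route:

```python
# Solve (X^T X) w = X^T y on synthetic data with known weights.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                        # 100 samples, 3 features
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)                                             # close to [1.0, -2.0, 0.5]
```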
What is:
1) Estimation error or Generalization gap?
2) Approximation error?
3) Optimization error?
1) Error from using the empirical loss instead of the expected loss
2) Error from restricting the model class
3) Error from not finding the exact global optimum
What is a sufficient condition for a point being a local minimum?
The gradient is zero and the Hessian is positive definite. (Alternatively, if the function is convex, a zero gradient alone suffices, and the point is then a global minimum.)
What is the definition of:
1) A convex set?
2) A convex function?
1) For all x, y in A, lx + (1-l)y is also in A for l in [0,1].
2) f(lx_1 + (1-l)x_2) <= lf(x_1) + (1-l)f(x_2) for l in [0,1].
Or equivalently, if the epigraph is convex.
If a function L has Hessian H and:
1) Strong convexity, so H >= mI
2) Lipschitz gradient, so H <= MI
3) we use a step size a = 2/(m+M)
What can we say about the convergence rate?
Linear convergence with rate c = 1 - m/M, i.e., the error shrinks by at least a constant factor c per step.
What does it mean that something has linear convergence, and what is the overall complexity of getting the error smaller than e with a linear convergence rate?
Linear convergence means the error shrinks by a constant factor each iteration, so we need O(log(1/e)) steps to get L(x_k) - L(x*) < e.
With a full gradient costing O(n) per step, the overall complexity is O(n log(1/e)).
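A small sketch of linear convergence: gradient descent on a quadratic with Hessian eigenvalues m = 1 and M = 10 and step size a = 2/(m+M). The error norm contracts by the same constant factor every step (here (M-m)/(M+m) = 9/11, within the stated rate c = 1 - m/M):

```python
# Gradient descent on L(x) = 0.5 x^T H x; grad L(x) = H x.
import numpy as np

m, M = 1.0, 10.0
H = np.diag([m, M])                     # Hessian with eigenvalues m and M
a = 2.0 / (m + M)
x = np.array([1.0, 1.0])
prev = np.linalg.norm(x)
for k in range(5):
    x = x - a * (H @ x)                 # one gradient step
    cur = np.linalg.norm(x)
    print(cur / prev)                   # constant 9/11 = 0.818... each step
    prev = cur
```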
What is the complexity of getting the expected error smaller than e with one-sample stochastic gradient descent?
O(1/e): the convergence is sublinear, but each iteration costs O(1), independent of n.
In stochastic gradient descent, what is the tradeoff between:
1) small batch size vs. large batch size
2) small step size vs. large step size
1) Small batch size:
- Cheaper iterations, but stalls at less accurate results.
- For non-convex optimization, can escape local minima more easily.
2) Small step size:
- Slower initial convergence, but a better final result.
- For non-convex optimization, harder to escape local minima.
How can we decrease
1) Approximation error?
2) Estimation error?
3) Optimization error?
1) Use a more expressive model (for example, less regularization)
2) Use a less expressive model (for example, more regularization), or use a larger sample size
3) Run more iterations
How does the number of samples needed to approximate a Lipschitz function to e accuracy increase with the number of dimensions, d, of the input?
O((1/e)^d)
What are the ideas behind:
1) SGD with momentum
2) Adagrad
3) RMSProp (or AdaDelta)
4) Adam
1) Add a fraction of the former gradients at each step
2) Divide the gradient elementwise by the accumulated magnitude of all former gradients
3) Same as Adagrad, but with a decaying average.
4) Combination of 1) and 3), plus bias correction.
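A sketch of a single Adam step in NumPy, combining the momentum of 1) with the decaying squared-gradient average of 3), plus bias correction; `adam_step` is an illustrative helper and the defaults follow the Adam paper:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad             # decaying average of gradients (momentum)
    v = b2 * v + (1 - b2) * grad**2          # decaying average of squared gradients
    m_hat = m / (1 - b1**t)                  # bias correction (moments start at zero)
    v_hat = v / (1 - b2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```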
How do you get the circulant matrix C(x) from a vector x?
By shifting x over the rows of C(x).
How can a convolution be described by circulant matrices?
x *’ y = C(x)y
What are some important properties of circulant matrices?
1) Commutativity
2) Associativity
3) The product of circulants is circulant
These can be proven by using that circulant matrices are equivalent to convolutions.
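A small NumPy sketch that builds C(x) from shifted copies of x, then checks the convolution identity and commutativity numerically:

```python
import numpy as np

def circulant(x):
    n = len(x)
    # Column k of C(x) is x cyclically shifted by k positions.
    return np.array([np.roll(x, k) for k in range(n)]).T

x, y = np.array([1., 2., 3., 4.]), np.array([0., 1., 0., 0.])
Cx, Cy = circulant(x), circulant(y)
print(Cx @ y)                            # circular convolution x * y -> [4, 1, 2, 3]
print(np.allclose(Cx @ Cy, Cy @ Cx))     # True: circulant matrices commute
```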
What is a shift matrix, and name 2 important properties of shift matrices.
A shift matrix is an identity matrix with its rows shifted cyclically.
1) S has orthogonal eigenvectors
2) S^TS = SS^T = I
What is the connection between circulant and shift matrices?
A matrix X is circulant iff it commutes with the shift matrix: SX = XS.
Is convolution shift invariant or equivariant?
Equivariant
What is the discrete Fourier transform, and how is it connected to the shift matrix eigenvectors?
The discrete Fourier transform of a vector x is Vx, where V is the matrix with columns
v_k = (1/sqrt(n)) [1, e^(2*pi*i*k/n), ..., e^(2*pi*i*k(n-1)/n)], i.e., entry j of v_k is (1/sqrt(n)) e^(2*pi*i*k*j/n).
V is the matrix of eigenvectors of the shift matrix.
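A numerical check in NumPy that the Fourier matrix V diagonalizes the cyclic shift matrix S:

```python
import numpy as np

n = 8
j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
V = np.exp(2j * np.pi * j * k / n) / np.sqrt(n)   # V[j, k] = entry j of v_k
S = np.roll(np.eye(n), 1, axis=0)                 # cyclic shift matrix
D = V.conj().T @ S @ V                            # V is unitary, so this is V^{-1} S V
print(np.allclose(D, np.diag(np.diag(D))))        # True: D is diagonal
```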
What is a physical interpretation of the Fourier basis?
It is the basis that minimizes the Dirichlet energy x^T Z x, where Z is the second-order difference matrix (circulant with shifted rows of [1, -2, 1]).
What is the idea behind the FFT?
Split the vector into odd and even indices and perform a Fourier transform on each half separately. This can be done recursively and reduces the computational complexity from O(n^2) to O(n log n).
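A textbook radix-2 Cooley-Tukey sketch (power-of-two length only), verified against np.fft.fft; note that np.fft uses the e^(-2*pi*i...) sign convention:

```python
import numpy as np

def fft(x):
    n = len(x)
    if n == 1:
        return x
    even, odd = fft(x[::2]), fft(x[1::2])              # recurse on even/odd indices
    twiddle = np.exp(-2j * np.pi * np.arange(n // 2) / n)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

x = np.random.rand(16).astype(complex)
print(np.allclose(fft(x), np.fft.fft(x)))              # True
```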
When are two matrices, A and B, jointly diagonalizable (diagonalized by the same set of eigenvectors)?
Iff they commute: AB = BA.
How can we do 2D shifting on an n x n matrix A?
1) Stack the columns of A into a vector of length n^2
2) Extend the shift matrix to n^2 x n^2
3) Multiply, then reshape the result back into a matrix.
How can we do a 2D Fourier transform?
Apply the Fourier transform row-wise, then column-wise (the 2D transform is separable).
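A one-glance NumPy check of this separability:

```python
import numpy as np

A = np.random.rand(4, 4)
step1 = np.fft.fft(A, axis=1)                 # transform each row
step2 = np.fft.fft(step1, axis=0)             # then each column
print(np.allclose(step2, np.fft.fft2(A)))     # True
```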
What is the number of multiplications for a convolution if the input image is 128x128x3, with 10 filters of size 5x5 (spanning all 3 input channels), no padding and stride 1?
Multiplications: 124 * 124 * 3 * 10 * 5 * 5 = 11,532,000 (the output is 124x124x10, and each output value takes 5*5*3 multiplications).
How can we get approximate shift/deformation invariance in convolutional networks?
Use pooling
What do we mean by factorized convolutions?
Use several layers with small kernels (e.g., 3x3) instead of one layer with a large kernel.
What types of unsupervised learning do we have?
1) Density estimation
2) Clustering
3) Feature learning
4) Dimensionality reduction
5) Data generation
What is kernel density estimation?
It creates a smoothed histogram by placing a kernel (e.g., a Gaussian) on each datapoint and summing the contributions.
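A minimal Gaussian-KDE sketch in NumPy (the bandwidth h and the data are arbitrary choices):

```python
import numpy as np

def kde(query, data, h=0.3):
    # Average a Gaussian bump of width h centered on every datapoint.
    diffs = (query[:, None] - data[None, :]) / h
    return np.exp(-0.5 * diffs**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

data = np.array([-1.0, -0.8, 0.0, 1.1, 1.3])
grid = np.linspace(-3, 3, 7)
print(kde(grid, data))                        # estimated density on the grid
```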
What is the general GAN architecture?
z -> (Gen) -> x’
x’ and x -> Disc -> Classification (Real/ Fake)
What is the general cGAN architecture?
[z, c’] -> Gen -> x’
[x’, c’] and [x, c] -> Disc -> Classification (Real/ Fake)
What is the GAN Loss?
For the discriminator (to maximize):
sum log(D(x)) + log(1 - D(G(z)))
For the generator (to maximize, non-saturating form):
sum log(D(G(z)))
Describe the algorithm for training GANs.
1) Draw random batch from real samples
2) Draw random z batch and create fake samples G(z)
3) Do gradient descent on discriminator on both batches
4) Run discriminator on samples from generator
5) Do gradient descent on generator
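A minimal PyTorch training-loop sketch matching steps 1-5; `gen`, `disc`, `real_loader`, and `z_dim` are hypothetical and must be defined elsewhere:

```python
import torch

opt_d = torch.optim.Adam(disc.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(gen.parameters(), lr=2e-4)

for x_real in real_loader:                     # 1) batch of real samples
    z = torch.randn(x_real.size(0), z_dim)     # 2) random z batch -> fakes G(z)
    x_fake = gen(z)
    # 3) discriminator step on both batches (detach so G is not updated here)
    loss_d = -(torch.log(disc(x_real)).mean()
               + torch.log(1 - disc(x_fake.detach())).mean())
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # 4-5) run D on generator samples and take a generator step
    #      (non-saturating loss: maximize log D(G(z)))
    loss_g = -torch.log(disc(x_fake)).mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```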
Name some architecture guidelines that are usually an advantage for GANs.
1) Use ReLU in the generator and Leaky ReLU in the discriminator
2) Use a fully convolutional network
3) Use batch norm in all layers except the generator's output layer and the discriminator's input layer
4) Use tanh on the generator output
5) Replace pooling with strided and fractionally strided convolutions
What is an adversarial autoencoder?
A normal autoencoder where we also have a discriminator on the latent z. The discriminator can force the z-distribution to match a pre-chosen distribution.
How can we train a network for image-to-image translation (sketch to image, …)?
Use a GAN. Feed the sketch to the generator, and feed pairs of [sketch, generated image] and [sketch, true image] to the discriminator.
Regarding RNNs, name at least one example of:
1) Many to one problems
2) One to many problems
3) Many to Many problems
1) Sentiment analysis
2) Image captioning
3) Translation
What is:
1) Lateral feedback?
2) Indirect feedback?
3) Direct feedback?
1) Feedback to a cell in the same layer
2) Feedback connection to a cell in an earlier layer
3) Feedback from the output to the input of the same cell
What is a Hopfield network?
A recurrent network with symmetric connections between all cells.
How can we make it easier to train deep RNNs?
Batch normalization, dropout, bi-directional RNN.
What is the best way to deal with vanishing gradients in RNNs?
Change from the basic RNN cell to LSTM or GRU.
In an LSTM, what is the purpose of:
1) The forget gate
2) The input gate
3) The output gate
1) f = sigmoid(W_f [h', x] + b_f) is multiplied elementwise with the cell state; it can reduce the importance of elements of the previous state.
2) i = sigmoid(W_i [h', x] + b_i) is multiplied elementwise with the tanh of the candidate input, controlling what is added to the cell state.
3) o = sigmoid(W_o [h', x] + b_o) is multiplied elementwise with the tanh of the cell state, controlling what is sent to the output.
What is the formula for the new cell state?
c = f * c' + i * g, where * is elementwise multiplication, g = tanh(W_g [h', x] + b_g), and ' indicates the previous time step.
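A one-step LSTM cell sketch in NumPy using exactly these gates; the W_*/b_* parameters (acting on the concatenated [h', x]) are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def lstm_step(x, h_prev, c_prev, W_f, b_f, W_i, b_i, W_o, b_o, W_g, b_g):
    hx = np.concatenate([h_prev, x])   # [h', x]
    f = sigmoid(W_f @ hx + b_f)        # forget gate: scales the old cell state
    i = sigmoid(W_i @ hx + b_i)        # input gate: scales the candidate update
    o = sigmoid(W_o @ hx + b_o)        # output gate: scales the emitted state
    g = np.tanh(W_g @ hx + b_g)        # candidate cell update
    c = f * c_prev + i * g             # new cell state (elementwise products)
    h = o * np.tanh(c)                 # new hidden state / output
    return h, c
```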
What are peephole connections?
Connections from the previous cell state to the inputs of all gates.
What gates does the GRU have?
Reset gate and update gate.
Name two ways we can implement many-to-one.
We can use just the last output, or the mean/max/sum over all outputs.
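Both readouts in a short PyTorch sketch (shapes are illustrative):

```python
import torch
import torch.nn as nn

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)              # 4 sequences, 10 time steps, 8 features
out, _ = rnn(x)                        # out: (4, 10, 16), one vector per step
last = out[:, -1, :]                   # option 1: use only the last output
pooled = out.mean(dim=1)               # option 2: mean over all outputs
```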
When would we use CTC (Connectionist temporal classification)?
When we have an input where we don’t want to label each “frame”, but the whole input at once. For example, audio-to-text.
What is the CTC loss?
L = -ln sum_p Pr(p|X), summed over all paths p that collapse to the correct output.
How is the CTC loss calculated?
Using the forward-backward algorithm. If a(s) denotes the total probability of all paths reaching node s, and b(s) the total probability of all paths continuing from s, then L = -ln sum a(s)b(s) for all s in a correct path.
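For tiny inputs the CTC sum can be brute-forced, which makes the definition concrete; this sketch enumerates every frame-level path and keeps those that collapse to the target (the forward-backward algorithm computes the same sum efficiently):

```python
import itertools
import numpy as np

def collapse(path, blank=0):
    merged = [k for k, _ in itertools.groupby(path)]   # merge repeated labels
    return tuple(k for k in merged if k != blank)      # then remove blanks

def ctc_loss(probs, target, blank=0):
    # probs: (T, K) per-frame label distributions; target: tuple of labels.
    T, K = probs.shape
    total = sum(np.prod(probs[np.arange(T), list(path)])
                for path in itertools.product(range(K), repeat=T)
                if collapse(path, blank) == target)
    return -np.log(total)

probs = np.full((3, 3), 1 / 3)          # 3 frames, uniform over {blank, a, b}
print(ctc_loss(probs, (1,)))            # -ln(6/27): six paths collapse to "a"
```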
What is the problem with choosing the “best” path in CTC, and how can it be solved?
Choosing the most probable path does not necessarily give the most probable output, since many paths can collapse to the same output. This can be improved by beam search, prefix search decoding, and constrained decoding.
How can we solve problems with a high input dimension at each timestep and several timesteps?
CNN + RNN
What is BPTT (backpropagation through time)?
In BPTT we run the forward phase for the entire sequence (over time) without updating the weights. We then calculate the loss over all outputs of the RNN and let the gradient propagate back to earlier states (through time).
What is sequence representation learning with RNNs?
Use an RNN “encoder” to get a single output from a sequence, and an RNN “decoder” to turn that output back into a sequence. We can also use “attention”: a combination of the outputs from each cell in the encoder that is used as additional input to each cell in the decoder.
In the taxonomy of generative models, how is the:
1) VAE classified?
2) GAN classified?
1) Generative model -> Explicit density -> approximate density -> Variational
2) Generative model -> Implicit density -> Direct
What is the difference between a discriminative and a generative model?
A discriminative model attempts to estimate p(y|x), the probability of the target given the data.
A generative model attempts to estimate the joint p(y, x). This distribution can then be sampled from.
Both models can be used for both regression and classification problems.
What is the optimal discriminator value in GANs?
D*(x) = p_data(x) / (p_G(x) + p_data(x))
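A short derivation sketch: pointwise in x, the discriminator maximizes f(D) = p_data(x) log D + p_G(x) log(1 - D); setting the derivative to zero gives the answer:

```latex
f'(D) = \frac{p_{\text{data}}(x)}{D} - \frac{p_G(x)}{1 - D} = 0
\quad\Longrightarrow\quad
D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_G(x)}
```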