Impact of depth and design choice Flashcards
What does the Universal approximation theorem say?
A neural network with a single hidden layer of “squashing” activation units (e.g., sigmoid) and a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units.
In the worst case, an exponential number of hidden units may be required, possibly with one hidden unit for each input configuration that needs to be distinguished (on the order of 2^n units for n binary inputs).
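A minimal sketch (assuming NumPy, with a sigmoid as the squashing activation) of the function family the theorem talks about: one hidden layer feeding a single linear output unit. The width and weights below are arbitrary placeholders; the theorem only asserts that suitable ones exist.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 2, 64                  # width is the knob the theorem lets us grow

W = rng.normal(size=(n_hidden, n_in))   # hidden-layer weights
b = rng.normal(size=n_hidden)           # hidden-layer biases
v = rng.normal(size=n_hidden)           # linear output weights
c = 0.0                                 # output bias

def f(x):
    """f(x) = v . sigmoid(W x + b) + c  (single hidden layer, linear output)."""
    h = 1.0 / (1.0 + np.exp(-(W @ x + b)))   # squashing activation
    return v @ h + c                          # linear output unit

print(f(np.array([0.5, -1.0])))
```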
What is needed for supervised training?
- labeled training set
- vector of model parameters
- loss function L(fθ(x), y)
- Training = find θ that minimizes the total loss on the training set (see the sketch below)
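A minimal sketch of this recipe, assuming PyTorch; the toy labeled data, model, learning rate, and step count are made-up placeholders.

```python
import torch

X = torch.randn(100, 3)                              # labeled training set (inputs)
y = X @ torch.tensor([1.0, -2.0, 0.5]) + 0.1 * torch.randn(100)   # labels

model = torch.nn.Linear(3, 1)                        # f_θ; θ = model.parameters()
loss_fn = torch.nn.MSELoss()                         # L(f_θ(x), y)
opt = torch.optim.SGD(model.parameters(), lr=0.05)

for step in range(200):                              # find θ minimizing the total loss
    opt.zero_grad()
    loss = loss_fn(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()
```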
What is needed for unsupervised training?
- unlabeled training set
- vector of model parameters
- loss function L(fθ(x))
- Training = find θ that minimizes the total loss
What is a detection task? What output activation function is used?
Give an example
Only two possible classes, 0 and 1: the target is either detected or not.
Sigmoid
CAPTCHA
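A minimal sketch of a detection output, assuming PyTorch; the logit value and the 0.5 threshold are illustrative placeholders.

```python
import torch

logit = torch.tensor([1.3])        # raw score produced by the network
p = torch.sigmoid(logit)           # probability that the object is present, in (0, 1)
detected = bool(p.item() > 0.5)    # class 1 ("detected") or class 0 ("not detected")
print(p.item(), detected)
```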
What is a classification task? What output activation function is used?
Give an example
Three or more possible classes; similar to detection, but with more than two outputs.
Softmax activation function
What kind of animal is in the picture: leopard, Egyptian cat, jaguar, ...
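A minimal sketch of a classification output, assuming PyTorch; the class names and logit values are illustrative placeholders.

```python
import torch

classes = ["leopard", "egyptian_cat", "jaguar"]
logits = torch.tensor([2.1, 0.3, 1.7])       # one raw score per class
probs = torch.softmax(logits, dim=0)         # non-negative, sums to 1 over the classes
print(classes[probs.argmax().item()], probs.tolist())
```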
What is a regression task? What output activation function is used?
Give an example
The model is trained to learn the relationship between input variables and a continuous target variable.
Linear activation
estimating a house's price from features such as its location, size, etc.
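A minimal sketch of a regression output, assuming PyTorch; the feature vector and layer size are placeholders.

```python
import torch

features = torch.randn(8)          # e.g. encoded attributes of a house
head = torch.nn.Linear(8, 1)       # linear activation on the output
prediction = head(features)        # continuous, unbounded value (e.g. a price)
print(prediction.item())
```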
Define what an autoencoder is
Neural network trained to predict its input
It is unsupervised learning
How does the autoencoder work?
Consists of two parts:
* an encoder function h = f(x)
* a decoder function x̂ = r(h) such that x̂ ≈ x
The hidden activations h provide a nonlinear representation of the input, called an embedding (see the sketch below).
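A minimal autoencoder sketch, assuming PyTorch; the layer sizes, learning rate, and random data are placeholders.

```python
import torch

encoder = torch.nn.Sequential(torch.nn.Linear(16, 4), torch.nn.ReLU())   # h = f(x)
decoder = torch.nn.Linear(4, 16)                                         # x_hat = r(h)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)

X = torch.randn(256, 16)           # unlabeled data: the input is also the target
for step in range(100):
    opt.zero_grad()
    h = encoder(X)                 # embedding (4-dimensional here)
    x_hat = decoder(h)             # reconstruction
    loss = torch.nn.functional.mse_loss(x_hat, X)   # penalize x_hat far from x
    loss.backward()
    opt.step()
```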
Define what an undercomplete autoencoder is
An autoencoder whose embedding h has fewer dimensions than the input x.
What is the perk of a denoising autoencoder?
It forces the autoencoder to learn to undo the corruption, which makes it capture salient features of the data.
This makes it possible to use embeddings h whose dimension is bigger than that of the data x (an overcomplete autoencoder) without the network simply copying its input.
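A minimal sketch of the denoising criterion, assuming PyTorch; `encoder` and `decoder` stand for the modules from the previous sketch, and Gaussian corruption is just one possible choice of noise.

```python
import torch

def denoising_loss(x, encoder, decoder, noise_std=0.3):
    x_noisy = x + noise_std * torch.randn_like(x)    # corrupted input
    x_hat = decoder(encoder(x_noisy))                # reconstruct from the corrupted copy
    return torch.nn.functional.mse_loss(x_hat, x)    # loss against the CLEAN x
```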
Explain Synthetic data generation
Compute p(x) and draw samples from it.
For discrete data, treat generation as a series of classification tasks:
p(x) = p(x1) × p(x2 | x1) × ... × p(xn | x1, ..., xn−1).
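A minimal sketch of sampling under that factorization, assuming PyTorch; `conditional` is a hypothetical stand-in (uniform here) for a trained per-step classifier, and the vocabulary size is a placeholder.

```python
import torch

def conditional(prefix, vocab_size=5):
    """Hypothetical p(x_i | x_1, ..., x_{i-1}); a real model would be learned."""
    return torch.softmax(torch.zeros(vocab_size), dim=0)   # placeholder: uniform

def sample_sequence(n, vocab_size=5):
    x = []
    for i in range(n):
        probs = conditional(x, vocab_size)            # p(x_i | previous symbols)
        x.append(torch.multinomial(probs, 1).item())  # draw x_i, then condition on it
    return x

print(sample_sequence(4))
```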
What can you use Synthetic data generation for?
- Language modeling and text generation
- Augment existing data
- Testing and validation