IDL Flashcards

1
Q

How do neurons learn?

A

Neurons learn by changing the topology and strength (thickness) of their connections.

2
Q

In a “vanilla” recurrent neural network, what is the activation function in the output layer?

A

The output layer is activated by the softmax function, so the output can represent a probability distribution over words.
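
As a quick illustration (mine, not from the card), a minimal NumPy softmax showing that the output is a valid probability distribution:

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # approx. [0.659 0.242 0.099]
print(probs.sum())  # 1.0 -- a valid distribution over three "words"
```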

3
Q

In a vanilla recurrent network, what are s(t) and y(t)?

A

s(t) = f(U w(t) + W s(t-1)), where f is the sigmoid activation function

y(t) = g(V s(t)), where g is the softmax activation function
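
A minimal NumPy sketch of one forward step under these definitions (the matrix names U, W, V follow the card; the sizes are illustrative assumptions):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Assumed sizes: vocabulary of 10 words, hidden state of size 5.
rng = np.random.default_rng(0)
U = rng.normal(size=(5, 10))   # input -> hidden
W = rng.normal(size=(5, 5))    # hidden -> hidden (recurrence)
V = rng.normal(size=(10, 5))   # hidden -> output

w_t = np.zeros(10); w_t[3] = 1.0   # one-hot input word w(t)
s_prev = np.zeros(5)               # previous state s(t-1)

s_t = sigmoid(U @ w_t + W @ s_prev)   # s(t) = f(U w(t) + W s(t-1))
y_t = softmax(V @ s_t)                # y(t) = g(V s(t))
print(y_t.sum())                      # 1.0 -- distribution over words
```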

4
Q

Define the pocket convergence theorem

A

The pocket algorithm converges with probability 1 to optimal weights (those that misclassify the fewest training examples), even if the sets are not linearly separable.
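
A minimal sketch of the pocket idea (my own illustration, using the best-training-accuracy "ratchet" variant; names and defaults are assumptions):

```python
import numpy as np

def pocket_perceptron(X, y, epochs=1000, seed=0):
    """Perceptron updates, but keep ('pocket') the best weights seen.

    X: (n, d) inputs (include a bias column); y: labels in {-1, +1}.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    pocket_w, pocket_acc = w.copy(), 0.0
    for _ in range(epochs):
        i = rng.integers(len(X))                  # pick a random sample
        if np.sign(X[i] @ w) != y[i]:
            w = w + y[i] * X[i]                   # perceptron update
            acc = np.mean(np.sign(X @ w) == y)
            if acc > pocket_acc:                  # ratchet: only improve
                pocket_w, pocket_acc = w.copy(), acc
    return pocket_w, pocket_acc
```

On non-separable data the running weights w keep oscillating, while the pocketed weights only ever improve in training accuracy.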

5
Q

Define Cover’s theorem

A

Cover’s theorem answers the question: what is the probability that a randomly labeled set of N points in d dimensions is linearly separable?
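
For reference (the standard counting form, not stated on the card): for N points in general position, the number of labelings separable by a hyperplane through the origin in d dimensions, and the resulting probability, are

```latex
C(N, d) = 2 \sum_{k=0}^{d-1} \binom{N-1}{k},
\qquad
P(\text{separable}) = \frac{C(N, d)}{2^N}.
```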

6
Q

Apply Cover’s theorem in higher-dimensional space

A

If the number of points in d dimensions is less than 2d, they are almost always linearly separable (2d is the capacity of a linear classifier: at exactly N = 2d, half of all labelings are separable).
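
A quick numerical check of this threshold using the counting function above (my own illustration):

```python
from math import comb

def p_separable(N, d):
    # Fraction of the 2^N labelings of N points in general position
    # that a hyperplane through the origin in d dimensions can realize.
    return 2 * sum(comb(N - 1, k) for k in range(d)) / 2**N

d = 50
print(p_separable(80, d))    # N < 2d  -> ~0.99 (almost always separable)
print(p_separable(100, d))   # N = 2d  -> exactly 0.5
print(p_separable(150, d))   # N > 2d  -> ~0.0
```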

7
Q

What is Adaline? What is the activation function in Adaline?

A

ADALINE (Adaptive Linear Element).

The difference between Adaline and the standard (McCulloch–Pitts) perceptron is that in the learning phase the weights are adjusted according to the weighted sum of the inputs (the net). In the standard perceptron, the net is passed to the activation (transfer) function, and the function's output is used for adjusting the weights.

The identity function is the activation function.
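
A minimal sketch contrasting the two update rules (my own illustration; the learning-rate value is an assumption):

```python
import numpy as np

def adaline_update(w, x, y, lr=0.01):
    # Adaline (delta rule): the error is computed on the raw net input,
    # i.e. the identity activation applied to w.x.
    net = w @ x
    return w + lr * (y - net) * x

def perceptron_update(w, x, y, lr=0.01):
    # Perceptron: the error is computed on the thresholded output.
    out = 1.0 if w @ x >= 0 else -1.0
    return w + lr * (y - out) * x
```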

8
Q

Why is an LSTM better than a vanilla recurrent network?

A

The ability to learn which remote and recent information is relevant for a given task, and to use it when generating output. (A vanilla recurrent network struggles to retain remote information because gradients vanish over long sequences; the LSTM's gating mechanism mitigates this.)
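
For reference (the standard LSTM equations, not on the card), the gates that implement this selective memory:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate cell}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{cell state}\\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state / output}
\end{aligned}
```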
