Week 3: Shallow Neural Networks Flashcards

1
Q

NN - logistic regression

A

Two logistic regressions computed one after another: each layer applies a linear step followed by an activation.

2
Q

Parts of NN

A

Input layer - the feature values x; x = a[0], also called the "activations" of layer 0.

Hidden layer - a[1]; its values are not observed in the training set, hence "hidden". With 4 nodes it is a 4-dimensional vector with components a1[1] ... a4[1].

Output layer - a[2].

Layers 1 and 2 each have parameters w and b.

3
Q

Computing one NN node

A

One logistic regression for the first node of the hidden layer:

z1[1] = w1[1]^T x + b1[1]

a1[1] = sigmoid(z1[1])
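
A minimal NumPy sketch of one such node (assuming 3 input features; the names x, w1, b1 and the values are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 3.0])    # input features, x = a[0]
w1 = np.random.randn(3) * 0.01    # weights of hidden node 1 (small random values)
b1 = 0.0                          # bias of hidden node 1

z1 = w1 @ x + b1                  # z1[1] = w1[1]^T x + b1[1]
a1 = sigmoid(z1)                  # a1[1] = sigmoid(z1[1])
print(a1)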

4
Q

Vectorised version for the whole NN

A

For a two-layer NN with 3 inputs and 4 hidden nodes:

Z[1] - (4,1) matrix (4 nodes in the 1st layer)

W[1] - (4,3) matrix - 4 nodes and 3 input activations

x - (3,1) vector - 3 input activations (also a[0])

b[1] - (4,1) matrix (4 nodes in the 1st layer)

a[1] - (4,1) matrix (4 nodes in the 1st layer) = sigmoid(Z[1])

Z[1] = W[1] * a[0] + b[1]

a[1] = sigmoid(Z[1])

To compute the 2nd layer - same thing with different dimensions:

Z[2] - (1,1) matrix (1 node in the 2nd layer)

W[2] - (1,4) matrix - 1 node and 4 input activations

a[1] - (4,1) vector - 4 activations from layer 1

b[2] - (1,1) matrix (1 node in the 2nd layer)

a[2] - (1,1) matrix (1 node in the 2nd layer) = sigmoid(Z[2])

Z[2] = W[2] * a[1] + b[2]

a[2] = sigmoid(Z[2])
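
A minimal NumPy sketch of this forward pass for a single example (shapes as above; the 0.01 scale and the random input are illustrative choices):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.random.randn(3, 1)           # a[0]: (3,1) input
W1 = np.random.randn(4, 3) * 0.01   # (4,3)
b1 = np.zeros((4, 1))               # (4,1)
W2 = np.random.randn(1, 4) * 0.01   # (1,4)
b2 = np.zeros((1, 1))               # (1,1)

Z1 = W1 @ x + b1                    # (4,1)
A1 = sigmoid(Z1)                    # (4,1)
Z2 = W2 @ A1 + b2                   # (1,1)
A2 = sigmoid(Z2)                    # (1,1) - the output a[2]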

5
Q

Vectorizing across multiple examples x(1) ... x(m)

A

We need the network output a[2](i) for every example i = 1 ... m. Instead of looping, stack the m feature vectors as columns of a matrix X of shape (3, m); then Z[1] = W[1] X + b[1], A[1] = g(Z[1]), Z[2] = W[2] A[1] + b[2], A[2] = g(Z[2]) compute all examples at once, and column i of A[2] is a[2](i).
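
A minimal sketch of the matrix version (assuming m = 5 examples and the same 3-4-1 network; sigmoid is used throughout for simplicity):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

m = 5
X = np.random.randn(3, m)           # each column is one example x(i)
W1 = np.random.randn(4, 3) * 0.01
b1 = np.zeros((4, 1))
W2 = np.random.randn(1, 4) * 0.01
b2 = np.zeros((1, 1))

Z1 = W1 @ X + b1                    # (4,m); b1 broadcasts across the m columns
A1 = sigmoid(Z1)                    # (4,m)
Z2 = W2 @ A1 + b2                   # (1,m)
A2 = sigmoid(Z2)                    # (1,m); column i is a[2](i)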

6
Q

Derivatives of activation functions

A

Sigmoid: a = g(z) = 1 / (1 + e^(-z))

Derivative: g'(z) = d/dz g(z) = a(1 - a)

Tanh: a = g(z) = tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z))

Derivative: g'(z) = 1 - tanh(z)^2 = 1 - a^2

ReLU: g(z) = max(0, z)

Derivative: g'(z) = 0 if z < 0, 1 if z > 0 (undefined at z = 0; in practice use 0 or 1 there)
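
A minimal NumPy sketch of these derivatives, checked against a central finite-difference approximation (the test point z = 0.7 and eps are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dsigmoid(z):
    a = sigmoid(z)
    return a * (1 - a)              # a(1 - a)

def dtanh(z):
    return 1.0 - np.tanh(z) ** 2    # 1 - tanh(z)^2

def relu(z):
    return np.maximum(0.0, z)

def drelu(z):
    return float(z > 0)             # picks 0 at z = 0

z, eps = 0.7, 1e-6
print(dsigmoid(z), (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps))
print(dtanh(z), (np.tanh(z + eps) - np.tanh(z - eps)) / (2 * eps))
print(drelu(z), (relu(z + eps) - relu(z - eps)) / (2 * eps))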

7
Q

Random initialization

A

Parameter W needs to be initialized randomly: if all weights start at zero, every hidden node computes the same function and stays symmetric, so they never learn different features.

W[1] = np.random.randn(2, 2) * 0.001

Param b can be zero: b[1] = np.zeros((2, 1))

W should be small; otherwise, with a sigmoid (or tanh) activation, z lands in the flat saturated part of the curve, gradients are tiny, and learning is too slow (this matters for binary classification with a sigmoid output).
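
A minimal sketch of initializing the 3-4-1 network from the earlier cards (the 0.001 scale follows this card; 0.01 is another common small choice):

import numpy as np

W1 = np.random.randn(4, 3) * 0.001  # small random weights break symmetry
b1 = np.zeros((4, 1))               # biases can safely start at zero
W2 = np.random.randn(1, 4) * 0.001
b2 = np.zeros((1, 1))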
