Week 2. Logistic regression as NN Flashcards

1
Q

Logistic regression for binary classification

A

Linear regression computes ŷ = wT x + b, but it doesn't work well for binary classification because the output is unbounded and can take arbitrarily large or negative values. Passing the linear output through the sigmoid function, ŷ = σ(wT x + b), squashes it into the range 0 to 1, so it can be interpreted as a probability.

2
Q

Sigmoid function

A

σ(z) = 1 / (1 + e^(-z))
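A minimal NumPy sketch of this function (the name sigmoid is my choice):

import numpy as np

def sigmoid(z):
    # Element-wise sigmoid: maps any real z into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0))                       # 0.5
print(sigmoid(np.array([-10, 0, 10])))  # [~0.0, 0.5, ~1.0]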

3
Q

Why Cost function

A

We need to learn the parameters w and b given the inputs X and labels Y of the training examples; the cost function measures how well a given (w, b) fits the training data, giving gradient descent something to minimize.

4
Q

Logistic regression loss function

A

Better NOT to use the squared error 1/2 * (ŷ - y)^2 in this case: combined with the sigmoid it makes the optimization problem non-convex, so gradient descent can get stuck in local optima.

A good loss function for logistic regression is:

L(ŷ, y) = - ( y * log ŷ + (1 - y) * log(1 - ŷ) )
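A sketch of this per-example loss in NumPy (the name loss and the clipping guard are my additions; clipping avoids log(0)):

import numpy as np

def loss(y_hat, y, eps=1e-12):
    # Cross-entropy loss for one example.
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(loss(0.9, 1))  # small loss: confident and correct
print(loss(0.9, 0))  # large loss: confident but wrong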

5
Q

Cost function for logistic regression

A

The cost function averages the loss over all training examples:

J(w, b) = - 1/m * Sum(i=1..m) [ y(i) * log ŷ(i) + (1 - y(i)) * log(1 - ŷ(i)) ]
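A vectorized sketch of the cost (assumes A holds the predictions ŷ(i) and Y the labels, both of shape (1, m); the clipping guard is my addition):

import numpy as np

def cost(A, Y, eps=1e-12):
    m = Y.shape[1]
    A = np.clip(A, eps, 1 - eps)  # avoid log(0)
    return -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

A = np.array([[0.9, 0.2, 0.7]])  # predictions
Y = np.array([[1.0, 0.0, 1.0]])  # labels
print(cost(A, Y))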

6
Q

Differential calculus

A

Studies the rate of change of a function at each point on its curve, i.e. the slope of the curve at each point.

The derivative at a point is that slope: the height/width ratio of an infinitesimally small triangle drawn on the curve.
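A finite-difference sketch of "slope = height / width" (the test function and step size h are my choices):

def derivative(f, x, h=1e-6):
    # Slope of a tiny triangle on the curve: rise (height) over run (width).
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 2
print(derivative(f, 3.0))  # ~6.0, since d/dx x^2 = 2x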

7
Q

Integral calculus

A

Finds the area under the curve of a function down to the x-axis. If you cover that area with rectangles you can add up their areas to approximate it, and as the rectangles become infinitely thin the sum becomes the exact area under the curve.
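A Riemann-sum sketch of the rectangle idea (the function and interval are my choices):

def integral(f, a, b, n=100_000):
    # Sum the areas of n thin rectangles between a and b.
    width = (b - a) / n
    return sum(f(a + i * width) * width for i in range(n))

print(integral(lambda x: x ** 2, 0, 1))  # ~1/3, the exact area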

8
Q

Gradient descent

A

Repeatedly update w := w - alpha * dJ(w,b)/dw

and b := b - alpha * dJ(w,b)/db

where alpha is the learning rate. The derivative measures the slope of the cost function, so each step moves downhill.

In code we will use the variable name dw for dJ(w,b)/dw (and db for dJ(w,b)/db).
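A sketch of one update step (alpha and the stand-in gradient values are mine):

import numpy as np

alpha = 0.01                    # learning rate
w = np.zeros((2, 1)); b = 0.0   # current parameters
dw = np.array([[0.5], [-0.3]])  # stand-in gradient dJ/dw
db = 0.1                        # stand-in gradient dJ/db

# One gradient-descent step: move against the slope.
w = w - alpha * dw
b = b - alpha * db
print(w.ravel(), b)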

9
Q

Gradient descent for logistic regression - one example

A

Forward pass:

z = w1*x1 + w2*x2 + b

a = σ(z)

L(a, y)

Gradient descent (backward pass):

dz = a - y

dw1 = x1 * dz, dw2 = x2 * dz, db = dz
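A runnable sketch of this single-example pass (all the concrete values are made up for illustration):

import numpy as np

sigmoid = lambda z: 1 / (1 + np.exp(-z))

x1, x2, y = 1.0, 2.0, 1     # one training example (made up)
w1, w2, b = 0.1, -0.2, 0.0  # current parameters

# Forward pass
z = w1 * x1 + w2 * x2 + b
a = sigmoid(z)

# Backward pass: gradients of the loss
dz = a - y
dw1, dw2, db = x1 * dz, x2 * dz, dz
print(dz, dw1, dw2, db)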

10
Q

Vectorization

A

Saves a lot of time by using SIMD (single instruction, multiple data) hardware instructions instead of explicit Python loops.

z = wT x + b

z = np.dot(w, x) + b
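A quick timing sketch contrasting an explicit loop with np.dot (the array sizes and timing approach are my choices):

import time
import numpy as np

n = 1_000_000
w, x, b = np.random.rand(n), np.random.rand(n), 1.0

t0 = time.time()
z_loop = sum(w[i] * x[i] for i in range(n)) + b  # explicit Python loop
t1 = time.time()
z_vec = np.dot(w, x) + b                         # vectorized (SIMD)
t2 = time.time()

print(f"loop: {t1 - t0:.3f}s, vectorized: {t2 - t1:.4f}s")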

11
Q

Vectorizing logistic regression - Forward propagation

A

For one example: z(1) = wT x(1) + b and a(1) = σ(z(1))

Vectorized for m examples: Z = np.dot(w.T, X) + b

The scalar b is broadcast into a (1, m) row vector.

The matrix X has dimensions (nx, m), with one example per column.

A = σ(Z)
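A sketch of the vectorized forward pass (shapes follow the card: X is (nx, m) and w is (nx, 1); the random data is for illustration):

import numpy as np

n_x, m = 4, 5               # features, examples (my choices)
X = np.random.rand(n_x, m)  # each column is one example
w = np.random.rand(n_x, 1)
b = 0.5

Z = np.dot(w.T, X) + b      # shape (1, m); scalar b is broadcast
A = 1 / (1 + np.exp(-Z))    # sigmoid, element-wise
print(Z.shape, A.shape)     # (1, 5) (1, 5)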

12
Q

Vectorizing Logistic regression Gradient output

A

Per example: dz(1) = a(1) - y(1), … , dz(m) = a(m) - y(m)

stacked into a row vector dZ = [dz(1) … dz(m)]

Vectorized:

dZ = A - Y

dw = 1/m * np.dot(X, dZ.T)

db = 1/m * np.sum(dZ)
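The backward pass as runnable NumPy (A is a stand-in for predictions from the forward pass; the shapes are my choices):

import numpy as np

n_x, m = 4, 5
X = np.random.rand(n_x, m)
Y = (np.random.rand(1, m) > 0.5).astype(float)  # labels
A = np.random.rand(1, m)                        # stand-in predictions

dZ = A - Y                # shape (1, m)
dw = np.dot(X, dZ.T) / m  # shape (n_x, 1)
db = np.sum(dZ) / m       # scalar
print(dw.shape, db)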

13
Q

Logistic regression full vectorized

A

Z = wT X + b = np.dot(w.T, X) + b

A = σ(Z)

dZ = A - Y

dw = 1/m * np.dot(X, dZ.T)

db = 1/m * np.sum(dZ)

Then update w := w - alpha * dw and b := b - alpha * db.
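Putting the previous cards together, one vectorized training loop (the data, alpha, and iteration count are made up):

import numpy as np

n_x, m = 4, 100
X = np.random.rand(n_x, m)
Y = (np.random.rand(1, m) > 0.5).astype(float)
w, b, alpha = np.zeros((n_x, 1)), 0.0, 0.1

for _ in range(1000):
    A = 1 / (1 + np.exp(-(np.dot(w.T, X) + b)))  # forward pass
    dZ = A - Y                                   # backward pass
    dw = np.dot(X, dZ.T) / m
    db = np.sum(dZ) / m
    w -= alpha * dw                              # gradient-descent update
    b -= alpha * db

print(w.ravel(), b)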

14
Q

Practice: Common steps for pre-processing a new dataset are:

A

Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, …)

Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)

“Standardize” the data
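A sketch of these steps for an image dataset (the array names and shapes follow the course assignment convention but are assumptions here):

import numpy as np

# Assumed raw shape: (m, num_px, num_px, 3)
m_train, num_px = 209, 64
train_set_x_orig = np.random.randint(0, 256, (m_train, num_px, num_px, 3))

# 1. Figure out dimensions and shapes
print(train_set_x_orig.shape)

# 2. Reshape: each column becomes one flattened example
train_set_x_flatten = train_set_x_orig.reshape(m_train, -1).T
print(train_set_x_flatten.shape)  # (num_px * num_px * 3, m_train)

# 3. Standardize: for images, divide by the max pixel value
train_set_x = train_set_x_flatten / 255.0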
