wk7 Flashcards

1
Q

What is the cross entropy loss function

A

a measure of the difference / dissimilarity between two distributions, the target and the predicted output:

E(w) = -Sum over i: [ y^(i) ln(p_1(x^(i))) + (1 - y^(i)) ln(1 - p_1(x^(i))) ]

For logistic regression this simplifies to:

E(w) = -Sum over i: [ y^(i) w^T x^(i) - ln(1 + exp(w^T x^(i))) ]
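A minimal NumPy sketch of both forms of the loss (the function names and the `eps` clipping guard are my own additions, not from the card):

```python
import numpy as np

def cross_entropy(y, p):
    """Cross entropy between binary targets y and predicted probabilities p."""
    eps = 1e-12                      # guard against ln(0)
    p = np.clip(p, eps, 1 - eps)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def cross_entropy_from_logits(y, a):
    """Simplified form, working directly from the logits a = w^T x."""
    return -np.sum(y * a - np.log1p(np.exp(a)))
```

Both forms agree when p is the sigmoid of the logits a.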

2
Q

What is the derivative of the cross entropy loss function

A

Sum over i: (p_1(x^(i); w) - y^(i)) x^(i)
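This gradient can be sketched in NumPy (the function name is illustrative); the test below checks it against finite differences of the loss:

```python
import numpy as np

def grad_cross_entropy(w, X, y):
    """Gradient of the cross entropy loss: sum_i (p_1(x_i; w) - y_i) x_i."""
    p = 1.0 / (1.0 + np.exp(-X @ w))   # p_1 for each sample
    return X.T @ (p - y)
```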

3
Q

What are the two problems with using simple gradient descent for logistic regression

A

1) it can get stuck in local minima
2) if the features differ greatly in magnitude, the loss surface is elongated in one direction, so gradient descent zig-zags and converges inefficiently, see image

4
Q

What is the solution to the problems of simple gradient descent (overshooting and bouncing even in convex functions)

A

Use the Hessian matrix, the matrix of second-order partial derivatives of the function with respect to each pair of variables. It tells us how rapidly the gradient changes, which lets us adapt the step size.
-We use the Newton-Raphson method for estimating the weights but substitute in the Hessian matrix to get a new iterative update:
w = w_0 - H^-1(w_0) * (gradient of the cross entropy at w_0)
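The update above can be sketched as a NumPy loop, assuming a design matrix X and binary targets y (function names and the fixed iteration count are my own choices):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def newton_logistic(X, y, n_iter=20):
    """Newton-Raphson for logistic regression: w <- w - H^{-1} * gradient."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y)               # gradient of the cross entropy
        R = p * (1 - p)
        H = X.T @ (X * R[:, None])         # Hessian = X^T R X
        w = w - np.linalg.solve(H, grad)   # solve instead of forming H^{-1}
    return w
```

Solving H d = grad is preferred over explicitly inverting H for numerical stability.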

5
Q

How does iterative reweighted least squares change with regularisation

A

The cost function gains the regularisation term (lambda/2) * ||w||^2, so its derivatives change accordingly. For example, see the image

-Gradient of the cross entropy has lambda * w added to it
-Hessian has lambda * I added to it
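A sketch of the regularised Newton update in NumPy, with the lambda terms marked (names are illustrative):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def irls_regularised(X, y, lam=0.1, n_iter=20):
    """IRLS with L2 regularisation: gradient gains lam*w, Hessian gains lam*I."""
    n_features = X.shape[1]
    w = np.zeros(n_features)
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) + lam * w                          # + lambda * w
        R = p * (1 - p)
        H = X.T @ (X * R[:, None]) + lam * np.eye(n_features)   # + lambda * I
        w = w - np.linalg.solve(H, grad)
    return w
```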

6
Q

In the case of a multi-class logistic regression problem how do we calculate the probability for an individual class

A

p_i = exp(w_i^T x) / (1 + Sum over j = 1..M-1: exp(w_j^T x))

where class M is the reference class, with probability 1 / (1 + Sum over j = 1..M-1: exp(w_j^T x))
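A NumPy sketch, assuming W holds the M-1 weight vectors as rows and the last class is the reference class (the function name is my own):

```python
import numpy as np

def class_probs(W, x):
    """Probabilities for M classes given M-1 weight vectors (reference class last)."""
    scores = np.exp(W @ x)           # exp(w_j^T x) for j = 1..M-1
    denom = 1.0 + scores.sum()
    return np.append(scores / denom, 1.0 / denom)  # last entry = reference class
```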

7
Q

How do you check for multicollinearity in multi-class regression problems

A

Check the Pearson correlation between each pair of input variables:
-if the absolute correlation is greater than 0.8, there is multicollinearity
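A small NumPy sketch of this check (the 0.8 threshold follows the card; the helper name is my own):

```python
import numpy as np

def collinear_pairs(X, threshold=0.8):
    """Flag pairs of predictor columns whose |Pearson correlation| exceeds threshold."""
    corr = np.corrcoef(X, rowvar=False)   # columns of X are the variables
    n = corr.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(corr[i, j]) > threshold]
```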

8
Q

How do you compute the Hessian for a function

A

It is the matrix of second-order partial derivatives with respect to each pair of variables. For logistic regression the explicit values are found by multiplying p_1 by p_0 = 1 - p_1 for each sample and weighting the outer product x^(i) x^(i)^T:
H = Sum over i: p_1(x^(i)) p_0(x^(i)) x^(i) x^(i)^T = X^T R X, where R is diagonal with R_ii = p_1 p_0
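A NumPy sketch of the explicit form (the helper name is illustrative); the test checks the expected properties, symmetry and positive semi-definiteness:

```python
import numpy as np

def hessian(w, X):
    """Hessian of the cross entropy loss: X^T R X with R_ii = p1 * p0."""
    p1 = 1.0 / (1.0 + np.exp(-X @ w))
    p0 = 1.0 - p1
    R = p1 * p0
    return X.T @ (X * R[:, None])
```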
