Logistic Regression Flashcards

1
Q

What’s the equation for mean squared error? (multiple dimensions)

A

MSE(θ) = (1/m) Σᵢ (θᵀx⁽ⁱ⁾ − y⁽ⁱ⁾)², i.e. the average of the squared differences between the predictions θᵀx⁽ⁱ⁾ and the true values y⁽ⁱ⁾ (sometimes written with a factor of 1/2 for convenience)
2
Q

What’s the equation for the prediction of logistic regression?

A

hθ(x) = sigmoid(θᵀx) = 1 / (1 + e^(−θᵀx))
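A minimal sketch of this hypothesis in plain Python (the parameter values are illustrative, not fitted):

```python
import math

def sigmoid(z):
    # Logistic function: squashes any real z into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(theta, x):
    # Hypothesis h(x) = sigmoid(theta^T x)
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# theta^T x = 0 sits exactly on the decision boundary -> probability 0.5
p = predict_proba([0.0, 1.0], [1.0, 0.0])
```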

3
Q

What does logistic regression output?

A

It calculates the probability of each class from the values of a set of independent variables (features), then predicts the class with the highest probability.

4
Q

What is this?

A

The output of logistic regression

5
Q

What are some important characteristics to remember about logistic regression? (2)

A
  • easily interpretable
  • gives the probability of an event occurring, not just the predicted classification.
6
Q

Can you apply linear regression to a classification problem?

A

Usually it’s a bad idea: the fitted line is sensitive to points far from the boundary, and its outputs aren’t constrained to [0, 1], so they can’t be read as probabilities

7
Q

What is the output of logistic regression?

A

A probability for each class (each between 0 and 1); the predicted class is the argmax over those probabilities

8
Q

What is this?

A

The hypothesis of linear regression

9
Q

What is the hypothesis of logistic regression in:

  • words
  • equation form
A

The hypothesis of linear regression (θᵀx) fed into the sigmoid function: hθ(x) = sigmoid(θᵀx) = 1 / (1 + e^(−θᵀx))

10
Q

What does the graph of logistic regression look like?

A

Sigmoid function

11
Q
  • What is this?
  • How do you interpret it?
A
  • The probability expression of logistic regression’s output (before the argmax)
  • Probability that y=1, given x, parametrized by theta
12
Q

Do the outputs of logistic regression add up to exactly 1?

A

Yes. In the binary case, P(y=1|x) and P(y=0|x) = 1 − P(y=1|x) sum to 1 by construction; in the multiclass case, the softmax outputs sum to 1
13
Q

How should you think of the prediction of binary logistic regression?

A

Predict 1 when hθ(x) = sigmoid(θᵀx) >= 0.5 (equivalently, when θᵀx >= 0); otherwise predict 0

14
Q

How can you solve for the line of the decision boundary for binary logistic regression?

A

Setting θᵀx (the hypothesis of linear regression) equal to 0 gives the equation of the decision boundary.

Steps:

  1. Write out θᵀx.
  2. Plug the fitted parameter values (including the intercept) into θᵀx.
  3. Set the result equal to 0.
  4. Treat x2 as y and x1 as x, and solve for the equation of the line.
  5. The half-space where θᵀx >= 0 predicts 1; the half-space where θᵀx < 0 predicts 0. Plugging a test point (e.g. the origin) into θᵀx tells you which side is which.
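The steps above can be sketched in Python for the two-feature case (the parameter values below are illustrative):

```python
def boundary_line(theta):
    # For theta = (theta0, theta1, theta2), the boundary is
    # theta0 + theta1*x1 + theta2*x2 = 0; solving for x2 as a
    # function of x1 gives the line's slope and intercept.
    theta0, theta1, theta2 = theta
    slope = -theta1 / theta2
    intercept = -theta0 / theta2
    return slope, intercept

def predicts_one(theta, x1, x2):
    # The half-space where theta^T x >= 0 is the side that predicts 1
    theta0, theta1, theta2 = theta
    return theta0 + theta1 * x1 + theta2 * x2 >= 0

# With theta = (-3, 1, 1), the boundary is the line x2 = 3 - x1
slope, intercept = boundary_line((-3.0, 1.0, 1.0))
```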
15
Q

What is important to remember about the decision boundary of binary logistic regression?

A

It is the set of points where hθ(x) = 0.5, i.e. where θᵀx = 0

16
Q

Can logistic regression take on a nonlinear decision boundary? If so, how?

A

Yes, by adding higher-order polynomial term features

17
Q

Can binary logistic regression have a decision boundary that is a circle?

A
  • Yes, by using higher-order polynomial features (e.g. x1², x2²)
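A small sketch of how a circular boundary falls out of squared features (the parameter values are hypothetical, chosen to make the boundary the unit circle):

```python
def circle_predict(x1, x2):
    # Hypothetical parameters theta = (-1, 0, 0, 1, 1) on the feature
    # vector (1, x1, x2, x1^2, x2^2): theta^T x >= 0 exactly when
    # x1^2 + x2^2 >= 1, so the decision boundary is the unit circle.
    z = -1.0 + 0.0 * x1 + 0.0 * x2 + x1**2 + x2**2
    return 1 if z >= 0 else 0
```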
18
Q

For logistic regression, can we use the same cost function that linear regression uses?

A

No. Plugging the sigmoid (a nonlinear function) into the MSE equation makes the cost nonconvex, so gradient descent can get stuck in local minima

19
Q
  • What’s the cost function for logistic regression?
  • What does the graph look like?
  • What’s the intuition?
A

Cost(hθ(x), y) = −log(hθ(x)) if y = 1, and −log(1 − hθ(x)) if y = 0.

Intuition:

  • For y = 1, as hθ(x) approaches 0, the penalty goes to infinity. Same idea for y = 0, except the graph of the cost function is flipped horizontally
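The intuition can be checked numerically with the standard per-example logistic cost (the probe values 0.99 and 0.01 are illustrative):

```python
import math

def cost(h, y):
    # Per-example logistic cost:
    #   y = 1: -log(h)      -> penalty blows up as h -> 0
    #   y = 0: -log(1 - h)  -> the same curve flipped horizontally
    return -math.log(h) if y == 1 else -math.log(1.0 - h)

low = cost(0.99, 1)   # confident and correct: small penalty
high = cost(0.01, 1)  # confident and wrong: large penalty
```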
21
Q

What do we know about the cost function for logistic regression? (3)

A
  • It’s derived from the principle of MLE
  • It’s convex
  • No closed-form solution for logistic regression because of the nonlinearity of the sigmoid
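Because there is no closed-form solution, the parameters are typically fit iteratively. A bare-bones batch gradient-descent step (the dataset and learning rate are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def avg_cost(theta, X, y):
    # Average cross-entropy cost over the dataset
    total = 0.0
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        total += -math.log(h) if yi == 1 else -math.log(1.0 - h)
    return total / len(X)

def gradient_step(theta, X, y, lr=0.5):
    # One batch step; the gradient of the average cost is
    # (1/m) * sum_i (h(x_i) - y_i) * x_i
    m = len(X)
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        h = sigmoid(sum(t * v for t, v in zip(theta, xi)))
        for j, v in enumerate(xi):
            grad[j] += (h - yi) * v
    return [t - lr * g / m for t, g in zip(theta, grad)]

X = [[1.0, 2.0], [1.0, -2.0]]  # first column is the intercept feature
y = [1, 0]
theta0 = [0.0, 0.0]
theta1 = gradient_step(theta0, X, y)  # cost should go down after one step
```

Because the cost is convex, repeated steps like this converge toward the single global minimum.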
22
Q

What does learning in logistic regression do? Why?

A
  • We minimize the negative log conditional likelihood.
  • We can’t maximize likelihood (as in Naïve Bayes) because we don’t have a joint model p(x,y)
23
Q

What’s the cost function for logistic regression in compact form?

A

J(θ) = −(1/m) Σᵢ [ y⁽ⁱ⁾ log hθ(x⁽ⁱ⁾) + (1 − y⁽ⁱ⁾) log(1 − hθ(x⁽ⁱ⁾)) ]
24
Q

What do we know about the negative average conditional log likelihood for logistic regression?

A

It’s convex

26
Q

What’s the softmax function?

A

softmax(z)ⱼ = e^(zⱼ) / Σₖ e^(zₖ): it turns a vector of real-valued scores into a probability distribution over classes
27
Q

What’s the relation between the softmax function and the sigmoid function?

A
  • The sigmoid function is used for two-class logistic regression, whereas the softmax function is used for multiclass logistic regression
  • When the number of classes is 2, the softmax function reduces to the sigmoid used in binary logistic regression, so in some sense they’re the same
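The reduction can be verified numerically: with two classes scored (z, 0), softmax gives e^z / (e^z + e^0) = 1 / (1 + e^(−z)), which is exactly sigmoid(z). A quick sketch (the score 1.7 is arbitrary):

```python
import math

def softmax(zs):
    # Shift by the max for numerical stability (doesn't change the result)
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Two-class softmax over scores (z, 0) equals sigmoid(z)
two_class = softmax([1.7, 0.0])
```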
28
Q

What happens if theta^T * x = 0 during training for binary logistic regression?

A

If θᵀx = 0, the predicted probability is sigmoid(0) = 0.5: the example sits exactly on the decision boundary, and the model is maximally uncertain. The example still incurs a cost of −log(0.5) and contributes a gradient of (0.5 − y)x, so training continues to update θ.