11-logistic regression Flashcards
What is logistic regression?
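Logistic regression is a binary classification model. It is a probabilistic discriminative model, because it optimises P(Y|x) directly rather than modelling how the features are generated. Unlike naive Bayes, it doesn’t assume the features are conditionally independent given the class.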
What are the log odds?
The log odds (logit) is the transformation used to define the logistic regression formula. It is calculated as log(P(x) / (1 - P(x))), where P(x) is the probability of the positive class.
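As a quick illustration of the log odds calculation, here is a minimal sketch using only the standard library. The function name `log_odds` and the example probabilities are assumptions made for this card, not part of the original notes.

```python
import math

def log_odds(p: float) -> float:
    """Return log(p / (1 - p)) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

even = log_odds(0.5)      # zero: both outcomes are equally likely
likely = log_odds(0.8)    # positive: the event is more likely than not
unlikely = log_odds(0.2)  # negative: the mirror image of 0.8
```

Probabilities above 0.5 give positive log odds and probabilities below 0.5 give negative log odds, which is why a linear score can be mapped onto them.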
What is the logistic regression formula?
P(Y=1|x;theta) = 1/(1+e^-(theta^T x)), where theta^T x is the linear regression formula (a weighted sum of the features plus a bias)
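The formula can be sketched in a few lines of Python. This is only an illustrative sketch: the names `sigmoid` and `predict_proba`, the weights in `theta`, and the convention of a leading 1.0 in `x` for the bias are all assumptions for the example.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^-z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(theta, x):
    """P(y = 1 | x; theta): the sigmoid of the linear score theta . x."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

theta = [-1.0, 2.0]  # hypothetical parameters: [bias, weight]
x = [1.0, 0.5]       # the leading 1.0 pairs with the bias term
p = predict_proba(theta, x)  # linear score is -1 + 2 * 0.5 = 0
```

A linear score of zero sits exactly on the decision boundary, so the predicted probability is one half.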
How should the logistic regression function be interpreted?
If P(Y=1|x;theta) > 0.5, predict y = 1; otherwise predict y = 0
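The decision rule is a simple threshold on the predicted probability. The sketch below assumes a helper called `predict_label` with an adjustable `threshold` parameter; both names are hypothetical, and 0.5 is the default from the card above.

```python
def predict_label(p_positive: float, threshold: float = 0.5) -> int:
    """Return 1 if the predicted probability exceeds the threshold, else 0."""
    return 1 if p_positive > threshold else 0

confident_yes = predict_label(0.7)    # above 0.5
borderline = predict_label(0.5)       # not strictly greater, so class 0
stricter = predict_label(0.7, 0.8)    # a higher threshold flips the decision
```

Because the comparison is strict, a probability of exactly 0.5 falls into class 0 under this rule.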
How does multinomial logistic regression compare to binomial logistic regression?
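Binomial logistic regression uses the sigmoid function to produce a single probability for the positive class. Multinomial logistic regression handles more than two classes: each class gets its own score, and the scores are passed through the softmax function, a generalisation of the sigmoid function, to give the probability of each class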
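To make the softmax step concrete, here is a minimal sketch. The function name `softmax` and the example scores are assumptions; subtracting the maximum score is a common numerical-stability trick and does not change the result.

```python
import math

def softmax(scores):
    """Turn a list of real-valued class scores into probabilities that sum to 1."""
    m = max(scores)  # shift by the max so exp() cannot overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # hypothetical scores for three classes

# With two classes and the second score fixed at 0,
# the first probability matches the sigmoid of the first score.
two_class = softmax([1.5, 0.0])
sigmoid_value = 1.0 / (1.0 + math.exp(-1.5))
```

The two-class case shows in what sense softmax generalises the sigmoid.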
What are the pros of logistic regression?
It has a probabilistic interpretation
There are no restrictive assumptions on features
Often outperforms naive Bayes
Particularly suited to frequency-based features
What are the cons of logistic regression?
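It can only learn linear relationships between the features and the log odds, so its decision boundary is linear
Features on very different scales can cause problems, so feature scaling is often needed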
Often needs a lot of data to work well
Overfitting can be a big problem
What is cross-entropy loss and its relation to negative log likelihood?
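Cross-entropy measures the difference between two probability distributions, p and q.
H(p, q) = − ∑ p(x) log(q(x)), summing over all outcomes x
When p is the true (one-hot) label distribution and q is the model's predicted distribution, the cross-entropy equals the negative log likelihood of the true label, so minimising one minimises the other.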
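The link to negative log likelihood can be checked numerically. In this sketch, `cross_entropy`, `true_dist`, and `pred` are hypothetical names and values chosen for illustration; the true distribution is assumed to be one-hot, as is usual for classification labels.

```python
import math

def cross_entropy(p, q):
    """H(p, q) = -sum over x of p(x) * log(q(x)); terms with p(x) = 0 contribute nothing."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

true_dist = [0.0, 1.0, 0.0]  # one-hot: the correct class is index 1
pred = [0.2, 0.7, 0.1]       # hypothetical model output

loss = cross_entropy(true_dist, pred)
nll = -math.log(pred[1])     # negative log likelihood of the correct class
```

With a one-hot target, only the term for the correct class survives the sum, so `loss` and `nll` are the same quantity.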
What happens if the perceptron is applied to data that is not linearly separable?
It will not converge, because no weight vector classifies every point correctly. Instead, it keeps updating and oscillates between multiple solutions
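This behaviour can be seen on a tiny example. The sketch below assumes a basic perceptron with a fixed learning rate of 1, zero-initialised weights, and a cap of 100 epochs; `train_perceptron`, `AND_DATA`, and `XOR_DATA` are names invented for the illustration. AND is linearly separable and XOR is not.

```python
def train_perceptron(data, max_epochs=100):
    """Train on 2-D points with 0/1 labels; return (weights, converged)."""
    weights = [0.0, 0.0, 0.0]  # [bias, w1, w2]
    for _ in range(max_epochs):
        errors = 0
        for (x1, x2), y in data:
            inputs = (1.0, x1, x2)  # the leading 1.0 pairs with the bias
            z = sum(w * v for w, v in zip(weights, inputs))
            pred = 1 if z > 0 else 0
            if pred != y:
                errors += 1
                # Move the weights towards (or away from) the misclassified point.
                weights = [w + (y - pred) * v for w, v in zip(weights, inputs)]
        if errors == 0:  # a full pass with no mistakes means convergence
            return weights, True
    return weights, False

AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, and_converged = train_perceptron(AND_DATA)
_, xor_converged = train_perceptron(XOR_DATA)
```

On the AND data a mistake-free pass is reached after a few epochs, while on XOR every pass contains at least one mistake, so training only stops at the epoch cap.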
What is linear separability?
A dataset is linearly separable if its classes can be perfectly separated by a line (a hyperplane in higher dimensions).
What is a linear classifier?
A classifier is linear if its decision boundary is a linear function of the features
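A minimal sketch of a linear classifier, assuming hand-picked weights rather than learned ones. The name `linear_classify` and the values of `w` and `b` are hypothetical; they place the decision boundary on the line x1 + x2 = 1.5.

```python
def linear_classify(w, b, x):
    """Predict 1 if the point lies on the positive side of w . x + b = 0, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else 0

w, b = [1.0, 1.0], -1.5  # hypothetical weights and bias
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [linear_classify(w, b, x) for x in points]
```

Only the point (1, 1) has a positive score under these weights, so this particular boundary reproduces logical AND.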