ML | Linear models | Priority Flashcards
What is linear regression? What do the terms p-value, coefficient, and r-squared value mean? What is the significance of each of these components?
Statistics q4 p9
(See source material.) mu = sum_k[b_k*x_k]
(See source material.) MRLH mnemonic
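The relationship mu = sum_k[b_k*x_k] can be sketched numerically: a minimal ordinary-least-squares fit with NumPy on made-up data, recovering the coefficients and computing R-squared (p-values are omitted here; they additionally require the coefficient standard errors and a t-distribution):

```python
import numpy as np

# Made-up data: y depends linearly on two predictors plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

# Ordinary least squares: prepend an intercept column, solve for b.
A = np.column_stack([np.ones(len(X)), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)

# R-squared: fraction of the variance in y the fitted model explains.
resid = y - A @ b
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
```

With low noise, b should land close to (3, 2, -1) and r2 close to 1.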
Say you are running a multiple linear regression and believe there are several predictors that are correlated. How will the results of the regression be affected if they are indeed correlated? How would you deal with this problem?
(See source material.)
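One way to diagnose the problem (a sketch, not necessarily the source's remedy): with nearly collinear predictors, the variance inflation factor, VIF = 1 / (1 - R^2) from regressing one predictor on the others, blows up, signaling unstable coefficient estimates. Common fixes include dropping one of the predictors, combining them, or using ridge regression.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)  # nearly collinear with x1

# VIF for x1: regress x1 on the other predictor; VIF = 1 / (1 - R^2).
A = np.column_stack([np.ones(200), x2])
coef, *_ = np.linalg.lstsq(A, x1, rcond=None)
resid = x1 - A @ coef
r2 = 1 - resid @ resid / ((x1 - x1.mean()) @ (x1 - x1.mean()))
vif = 1 / (1 - r2)
```

A VIF well above 10 is a conventional red flag for multicollinearity.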
Equation for correlation coefficient r.
Machine Learning with PyTorch and Scikit-Learn Chapter 9 p276
(See source material.)
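For reference, Pearson's r is the covariance of x and y divided by the product of their standard deviations; a sketch on made-up data, checked against NumPy's built-in:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Pearson r: covariance of x and y over the product of their
# standard deviations (the normalizing n's cancel).
r = ((x - x.mean()) * (y - y.mean())).sum() / np.sqrt(
    ((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum())
```

This should agree with np.corrcoef(x, y)[0, 1] to machine precision.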
How does naive Bayes assign a class c to a document d (basic equation)? Why is this an example of a generative model?
(See source material.) Eq. (5.1). Uses a likelihood term that expresses how the features of a document would be generated if we knew it was of class c.
Jurafsky SLP3E 5. Logistic regression
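A minimal sketch of c-hat = argmax_c P(c) * prod_i P(w_i | c), with add-one smoothing, on a made-up toy corpus (the hypothetical data and class labels are illustrative only):

```python
import math
from collections import Counter

# Hypothetical toy corpus of (class, text) pairs.
docs = [("pos", "great fun great"), ("pos", "fun film"),
        ("neg", "boring film"), ("neg", "boring boring plot")]

classes = {c for c, _ in docs}
# Prior P(c): fraction of training documents with class c.
prior = {c: sum(1 for cc, _ in docs if cc == c) / len(docs) for c in classes}
# Per-class word counts for the likelihood P(w | c).
counts = {c: Counter(w for cc, d in docs if cc == c for w in d.split())
          for c in classes}
vocab = {w for _, d in docs for w in d.split()}

def score(c, text):
    # log P(c) + sum_i log P(w_i | c), with add-one smoothing: the
    # generative story of how class c would produce the document's words.
    total = sum(counts[c].values())
    return math.log(prior[c]) + sum(
        math.log((counts[c][w] + 1) / (total + len(vocab)))
        for w in text.split())

pred = max(classes, key=lambda c: score(c, "great fun"))
```

Working in log space turns the product of likelihoods into a sum and avoids underflow.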
3 equations to standardize input features.
(See source material.) Eq. (5.8)
Jurafsky SLP3E 5. Logistic regression
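Assuming the three equations are the mean, the standard deviation, and the z-score (x - mu) / sigma, a minimal sketch:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])

# Three steps: mean, standard deviation, then the z-score.
mu = x.mean()
sigma = x.std()
z = (x - mu) / sigma
```

The standardized feature has mean 0 and standard deviation 1.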
Equation to normalize input features.
(See source material.) Eq. (5.9)
Jurafsky SLP3E 5. Logistic regression
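Assuming min-max normalization to [0, 1], a minimal sketch:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
# Rescale to [0, 1]: subtract the minimum, divide by the range.
x_norm = (x - x.min()) / (x.max() - x.min())
```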
Equation for softmax.
(See source material.) Eq. (5.15)
Jurafsky SLP3E 5. Logistic regression: 5.2.4 Choosing a classifier
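Softmax, softmax(z)_k = exp(z_k) / sum_j exp(z_j), as a short sketch:

```python
import numpy as np

def softmax(z):
    # Subtract the max logit for numerical stability; the result is
    # unchanged because softmax is invariant to shifting all logits.
    e = np.exp(z - z.max())
    return e / e.sum()

p = softmax(np.array([1.0, 2.0, 3.0]))
```

The outputs sum to 1 and preserve the ordering of the logits.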
Equation for L2 regularized objective.
(See source material.) Eq. (5.37)
Jurafsky SLP3E 5. Logistic regression: 5.2.4 Choosing a classifier
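The L2-regularized objective maximizes log-likelihood minus alpha * sum_j w_j^2; equivalently, one minimizes the negative log-likelihood plus the penalty. A sketch of the minimization form for binary logistic regression (the data and weights are made up):

```python
import numpy as np

def l2_objective(w, X, y, alpha):
    # Binary logistic regression negative log-likelihood
    # plus the L2 penalty alpha * sum_j w_j^2.
    p = 1 / (1 + np.exp(-(X @ w)))
    nll = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return nll + alpha * np.sum(w ** 2)

# Made-up example: the penalty adds alpha * (0.5**2 + 0.5**2) to the loss.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 0.0])
w = np.array([0.5, -0.5])
```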
Equation for L1 regularized objective.
(See source material.) Eq. (5.39)
Jurafsky SLP3E 5. Logistic regression: 5.2.4 Choosing a classifier
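The L1 version swaps the squared penalty for the sum of absolute weights, which tends to drive some weights exactly to zero (sparsity). The same sketch with the L1 penalty:

```python
import numpy as np

def l1_objective(w, X, y, alpha):
    # Negative log-likelihood plus the L1 penalty alpha * sum_j |w_j|.
    p = 1 / (1 + np.exp(-(X @ w)))
    nll = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
    return nll + alpha * np.abs(w).sum()

# Made-up example: the penalty adds alpha * (|0.5| + |-0.5|) to the loss.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 0.0])
w = np.array([0.5, -0.5])
```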
Equation for a Gaussian prior on weights. How does this relate to regularization?
(See source material.) Eq. (5.40). L2 regularization corresponds to assuming that weights are distributed according to a Gaussian distribution with mean mu = 0.
Jurafsky SLP3E 5. Logistic regression: 5.2.4 Choosing a classifier
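The correspondence can be written out (a sketch, following the standard MAP derivation):

```latex
\hat{w} = \operatorname*{argmax}_{w}\;
  \sum_{i} \log P\!\left(y^{(i)} \mid x^{(i)}\right)
  + \sum_{j} \log \frac{1}{\sigma\sqrt{2\pi}}
      \exp\!\left(-\frac{w_j^2}{2\sigma^2}\right)
```

Dropping the terms constant in w, the prior contributes -(1 / (2 sigma^2)) * sum_j w_j^2, i.e. L2 regularization with alpha = 1 / (2 sigma^2); a wider prior (larger sigma) means weaker regularization.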
Multinomial logistic regression: The loss function for a single example x
The negative sum over the K output classes of the log of each estimated class probability, weighted by the true label y_k (Eq. 5.44). Because y is one-hot, this reduces to the negative log probability of the correct class c (Eq. 5.45). (See source material.)
Jurafsky SLP3E 5. Logistic regression: 5.2.4 Choosing a classifier
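A minimal worked instance of the cross-entropy loss for one example (the probability values are made up):

```python
import numpy as np

# One training example with K = 3 classes: y is the one-hot true label,
# yhat is the model's softmax output.
y = np.array([0.0, 1.0, 0.0])
yhat = np.array([0.2, 0.7, 0.1])

# L = -sum_k y_k * log(yhat_k).
loss = -np.sum(y * np.log(yhat))
# Because y is one-hot, this equals -log(yhat_c) for the true class c.
```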