Machine Learning NanoDegree Flashcards
apply bias and variance to ( underfitting || overfitting )
high bias - underfitting, high variance - overfitting
define the harmonic mean for (x, y)
2xy / (x + y)
what is the f1 score
the harmonic mean of precision and recall; it stays small (raises a flag) if either value is small
what is precision
the percentage of predicted positives that are actually positive: TP / (TP + FP)
what is recall
the percentage of actual positives that are predicted positive: TP / (TP + FN)
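
A minimal sketch tying the last three cards together (precision, recall, f1); the confusion-matrix counts are made-up example values:

```python
# Hypothetical confusion-matrix counts for a binary classifier.
tp, fp, fn = 30, 10, 20

precision = tp / (tp + fp)  # fraction of predicted positives that are truly positive
recall = tp / (tp + fn)     # fraction of actual positives that were predicted positive
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean: 2xy / (x + y)

print(precision, recall, f1)  # 0.75 0.6 0.666...
```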
What is fbeta score?
a generalization of the f1 score that allows weighting towards either precision or recall: beta = 1 gives the plain harmonic mean (f1), beta > 1 leans towards recall, 0 < beta < 1 leans towards precision
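
A sketch of the beta trade-off using scikit-learn's fbeta_score; the labels below are made-up examples:

```python
from sklearn.metrics import fbeta_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]  # precision = 2/3, recall = 1/2

# beta=1 is plain f1 (harmonic mean); beta>1 weights recall, beta<1 weights precision.
print(fbeta_score(y_true, y_pred, beta=1))    # ~0.571
print(fbeta_score(y_true, y_pred, beta=2))    # ~0.526, pulled towards recall
print(fbeta_score(y_true, y_pred, beta=0.5))  # 0.625, pulled towards precision
```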
what is an ROC curve and how do you interpret it?
a plot of the true positive rate against the false positive rate as the classification threshold varies. Interpret it via the area under the curve (AUC): close to 1 is good, 0.5 is random guessing
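
A sketch using scikit-learn's roc_auc_score; the labels and predicted probabilities are made up for illustration:

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]  # predicted probabilities of the positive class

print(roc_auc_score(y_true, y_scores))  # near 1 is good; 0.5 is random guessing
```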
What is r2 score and how do you interpret it?
it compares a regression model's squared error to the squared error of simply predicting the average of all the points: r2 = 1 - SS_res / SS_tot. Close to 1 is good; close to 0 means the model does no better than the mean
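
A minimal sketch of the r2 computation, assuming a tiny made-up dataset:

```python
import numpy as np

# Hypothetical targets and model predictions.
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.3, 8.6])

ss_res = np.sum((y_true - y_pred) ** 2)         # model's squared error
ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # squared error of always predicting the mean
r2 = 1 - ss_res / ss_tot
print(r2)  # ~0.985: the model beats simple averaging
```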
What is the point of having a bias node in a NN layer?
To provide the constant or intercept.
What is the ‘perceptron trick’ to get a line to move closer to a point?
subtract the point vector (with a 1 appended for the bias) times the learning rate from the line's coefficients if the point is labeled negative but classified positive; add it if the point is labeled positive but classified negative.
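
A minimal sketch of one update step of the trick; the line, point, and learning rate are assumed example values:

```python
# Line: w[0]*x1 + w[1]*x2 + b = 0; weights and learning rate are made-up examples.
w = [3.0, 4.0]
b = -10.0

def perceptron_step(w, b, point, label, learn_rate=0.1):
    """Move the line toward a misclassified point; leave it alone otherwise."""
    prediction = 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else 0
    if prediction == label:
        return w, b                 # correctly classified: no change
    sign = 1 if label == 1 else -1  # add for positive-labeled, subtract for negative-labeled
    w = [w[0] + sign * learn_rate * point[0],
         w[1] + sign * learn_rate * point[1]]
    b = b + sign * learn_rate       # bias uses the implicit appended coordinate 1
    return w, b

# (1, 1) is labeled positive but the line classifies it negative, so we add.
w, b = perceptron_step(w, b, point=(1.0, 1.0), label=1)
print(w, b)  # [3.1, 4.1] -9.9
```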
What is the formula for multi-class entropy?
entropy = -sum(p[i] * log2(p[i]) for each class i)
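
A minimal sketch of that formula in Python:

```python
from math import log2

def entropy(p):
    """Multi-class entropy: -sum(p_i * log2(p_i)); 0 * log2(0) is taken as 0."""
    return -sum(p_i * log2(p_i) for p_i in p if p_i > 0)

print(entropy([0.5, 0.5]))       # 1.0 bit: a fair coin
print(entropy([0.8, 0.1, 0.1]))  # ~0.92: a more predictable distribution
```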
what does ‘naive’ refer to in naive bayes?
assuming that all features are independent of one another given the class.
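
A minimal sketch of the naive factorization; the priors and per-word likelihoods are made-up illustrative values:

```python
# Naive assumption: P(x | class) factors into a product of per-feature likelihoods.
prior = {"spam": 0.3, "ham": 0.7}
likelihood = {
    "spam": {"free": 0.8, "meeting": 0.1},
    "ham":  {"free": 0.2, "meeting": 0.6},
}

def score(cls, words):
    p = prior[cls]
    for w in words:
        p *= likelihood[cls][w]  # independence lets us simply multiply
    return p

words = ["free", "meeting"]
scores = {c: score(c, words) for c in prior}
total = sum(scores.values())
print({c: s / total for c, s in scores.items()})  # normalized posteriors
```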
a function must be ___ not ___ in order to be optimized
continuous, discrete
describe l2 regularization, including its alternate name
also called ridge regression, l2 regularization adds the sum of the squares of the coefficients to the cost function, scaled by a parameter lambda. This penalizes the model for being too complex and reduces overfitting.
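
A minimal sketch using scikit-learn's Ridge, whose alpha parameter plays the role of lambda; the data is made up:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Tiny made-up dataset for illustration.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([3.0, 3.0, 7.0, 7.0])

# Larger alpha shrinks the coefficients harder.
model = Ridge(alpha=1.0)
model.fit(X, y)
print(model.coef_, model.intercept_)
```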
describe l1 regularization, including its alternate name
also called lasso regression, l1 regularization adds the sum of the absolute values of the coefficients to the cost function, scaled by a parameter lambda. This penalizes the model for being too complex and reduces overfitting. It also drives the coefficients of less important features to 0, which makes it useful for feature selection.
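
A minimal sketch using scikit-learn's Lasso on synthetic data where only two of five features matter:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=100)  # features 2-4 are irrelevant

model = Lasso(alpha=0.1)
model.fit(X, y)
print(model.coef_)  # coefficients of the three irrelevant features shrink to (or near) 0
```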