Basics Flashcards

Question

Root Mean Squared Error (RMSE)

Answer 1

The mean squared error (MSE) helps take the magnitude of errors into account, but because it squares the error values, the resulting metric no longer represents the quantity measured by the label. In other words, we can say that the MSE of our model is 6, but that doesn't measure its accuracy in terms of the number of ice creams that were mispredicted; 6 is just a numeric score that indicates the level of error in the validation predictions. Taking the square root of the MSE gives the RMSE, which is expressed in the same units as the label (here, roughly 2.45 ice creams).
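To make the difference in units concrete, here is a minimal sketch that computes the MSE and then its square root, the RMSE. The sales figures and function names are hypothetical, chosen only so that the MSE comes out to 6 as in the card.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the MSE, expressed in the same units as the label."""
    return math.sqrt(mean_squared_error(actual, predicted))

# Hypothetical daily ice cream sales and the model's predictions.
actual_sales = [10, 20, 30, 40, 50]
predicted_sales = [13, 19, 32, 40, 46]

mse = mean_squared_error(actual_sales, predicted_sales)        # squared units
rmse = root_mean_squared_error(actual_sales, predicted_sales)  # ice creams
print(f"MSE = {mse}, RMSE = {rmse:.2f}")
```

The squared errors here are 9, 1, 4, 0, and 16, so the MSE is 30 ÷ 5 = 6 and the RMSE is the square root of 6.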

Answer 2

All of the metrics so far compare the discrepancy between the predicted and actual values in order to evaluate the model. However, in reality, there's some natural random variance in the daily sales of ice cream that the model takes into account. In a linear regression model, the training algorithm fits a straight line that minimizes the mean variance between the function and the known label values. The coefficient of determination (more commonly referred to as R² or R-Squared) is a metric that measures the proportion of variance in the validation results that can be explained by the model, as opposed to some anomalous aspect of the validation data (for example, a day with a highly unusual number of ice cream sales because of a local festival).
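As a rough illustration, the coefficient of determination is commonly computed as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the data and the function name are hypothetical.

```python
def r_squared(actual, predicted):
    """Proportion of the variance in `actual` that the predictions explain."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    mean_actual = sum(actual) / len(actual)
    # Variation left unexplained by the model.
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total variation of the actual values around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    if ss_total == 0:
        raise ValueError("actual values have no variance")
    return 1 - ss_residual / ss_total

# Hypothetical daily ice cream sales and the model's predictions.
actual_sales = [10, 20, 30, 40, 50]
predicted_sales = [13, 19, 32, 40, 46]
print(f"R-Squared = {r_squared(actual_sales, predicted_sales):.2f}")
```

A value close to 1 means the model explains most of the variance in the validation labels.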

Answer 3

A matrix that compares predicted and actual labels by counting true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).

Answer 4

The simplest metric you can calculate from the confusion matrix is accuracy - the proportion of predictions that the model got right. Accuracy is calculated as: (TN+TP) ÷ (TN+FN+FP+TP)
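A minimal sketch of the accuracy calculation, assuming binary labels encoded as 0 (negative) and 1 (positive). The label lists and function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN for binary labels encoded as 0 and 1."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 0:
            tn += 1
        elif a == 0 and p == 1:
            fp += 1
        else:  # a == 1 and p == 0, given the 0/1 encoding assumed above
            fn += 1
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """(TN+TP) / (TN+FN+FP+TP): the proportion of correct predictions."""
    total = tn + fn + fp + tp
    if total == 0:
        raise ValueError("no predictions to score")
    return (tn + tp) / total

# Hypothetical validation labels and model predictions.
actual_labels = [0, 0, 1, 1, 1, 0]
predicted_labels = [0, 1, 0, 1, 1, 0]

counts = confusion_counts(actual_labels, predicted_labels)
print(counts, accuracy(*counts))
```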

Answer 5

Recall is a metric that measures the proportion of positive cases that the model identified correctly. In other words, compared to the number of patients who have diabetes, how many did the model predict to have diabetes? The formula for recall is: TP ÷ (TP+FN)
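The recall formula can be sketched as follows. The patient counts are hypothetical: 40 patients actually have diabetes, of whom the model finds 30.

```python
def recall(tp, fn):
    """TP / (TP+FN): share of actual positives the model identified."""
    if tp + fn == 0:
        raise ValueError("no actual positive cases")
    return tp / (tp + fn)

# Hypothetical counts: 30 diabetic patients detected, 10 missed.
true_positives = 30
false_negatives = 10
print(recall(true_positives, false_negatives))
```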

Answer 6

Precision is a similar metric to recall, but measures the proportion of predicted positive cases where the true label is actually positive. In other words, what proportion of the patients predicted by the model to have diabetes actually have diabetes? The formula for precision is: TP ÷ (TP+FP)
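A matching sketch for precision, again with hypothetical counts: the model predicts diabetes for 35 patients, and 30 of them actually have it.

```python
def precision(tp, fp):
    """TP / (TP+FP): share of predicted positives that are truly positive."""
    if tp + fp == 0:
        raise ValueError("no positive predictions")
    return tp / (tp + fp)

# Hypothetical counts: 30 correct positive predictions, 5 false alarms.
true_positives = 30
false_positives = 5
print(round(precision(true_positives, false_positives), 3))
```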

Answer 7

F1-score is an overall metric that combines recall and precision. The formula for F1-score is: (2 × Precision × Recall) ÷ (Precision + Recall)
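The formula translates directly into code. In this sketch the precision and recall values are hypothetical (they correspond to TP = 30, FP = 5, FN = 10), and returning 0 when both inputs are 0 is a convention assumed here, not something the card specifies.

```python
def f1_score(precision, recall):
    """(2 * Precision * Recall) / (Precision + Recall)."""
    if precision + recall == 0:
        return 0.0  # assumed convention when both metrics are zero
    return (2 * precision * recall) / (precision + recall)

# Hypothetical metric values for a diabetes classifier.
model_precision = 30 / 35
model_recall = 30 / 40
print(round(f1_score(model_precision, model_recall), 3))
```

Because it is a harmonic mean, the F1-score stays low unless both precision and recall are reasonably high.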

Answer 8

Another name for recall is the true positive rate (TPR), and there's an equivalent metric called the false positive rate (FPR) that is calculated as: FP ÷ (FP+TN)
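Both rates can be computed side by side from the same confusion-matrix counts. The counts below are hypothetical (100 patients in total), and the function names are illustrative.

```python
def true_positive_rate(tp, fn):
    """TPR (recall): TP / (TP+FN)."""
    if tp + fn == 0:
        raise ValueError("no actual positive cases")
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    """FPR: FP / (FP+TN), the share of actual negatives flagged as positive."""
    if fp + tn == 0:
        raise ValueError("no actual negative cases")
    return fp / (fp + tn)

# Hypothetical counts for a diabetes classifier.
tp, fn, fp, tn = 30, 10, 5, 55
print(true_positive_rate(tp, fn), round(false_positive_rate(fp, tn), 3))
```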

Answer 9

A process that mimics the human mind, in a limited fashion. We train a model (or fit a model) so that inputs have weights that impact the observation. A higher weight has more impact.
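One very simplified way to picture weights is a weighted sum of the inputs. The sketch below is an assumption for illustration only, with hypothetical weights and inputs. It shows that nudging the input with the larger weight moves the output more.

```python
def weighted_sum(inputs, weights, bias=0.0):
    """Combine inputs into one output; each input is scaled by its weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must be the same length")
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Hypothetical weights that training might have produced.
weights = [0.5, 3.0]

baseline = weighted_sum([1.0, 1.0], weights)
bump_first = weighted_sum([2.0, 1.0], weights)   # +1 on the low-weight input
bump_second = weighted_sum([1.0, 2.0], weights)  # +1 on the high-weight input

print(bump_first - baseline, bump_second - baseline)
```

The same one-unit change shifts the output by 0.5 for the first input and by 3.0 for the second.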

Answer 10

A loss function measures the overall variance, or loss, between predicted and actual label values.

Answer 11

A feature is a model input. Features are the fields in the data that drive an answer (a label).

Answer 12

A label is the answer of a model: a yes/no, a value from a list, or a number. We look at features to predict a label.

Answer 13

During preprocessing, you determine the features that influence the prediction.

Answer 14

k-means clustering

Answer 15

Feature selection, removing data outliers, imputing missing values, normalizing numeric features
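Two of these steps can be sketched in a few lines. The card does not say which techniques to use, so this example assumes mean imputation for missing values and min-max scaling for normalization. The column values and function names are hypothetical.

```python
def impute_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no known values")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale numeric values to the range 0 to 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # assumed convention for a constant column
    return [(v - low) / (high - low) for v in values]

# Hypothetical numeric feature column with one missing value.
raw_column = [10.0, None, 30.0, 20.0]

imputed = impute_missing_with_mean(raw_column)
normalized = min_max_normalize(imputed)
print(imputed, normalized)
```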

Answer 16

transparency

Answer 17

Azure Text Analytics supports chatbot integration, multilingual content, and confidence scoring. It recognizes about 120 languages. Document sizes must be under 5,120 characters.

Answer 18

You can add personality to a chatbot by providing answers that use a specific conversational tone. You use the chitchat feature to add the answers to a chatbot knowledge base.

Answer 19

Decision Forest

Answer 20

To simplify complex data sets by finding the most important features or dimensions.

(47 cards)