General Flashcards
What are “labels”?
The thing we’re predicting, e.g. the “y” variable in a simple linear regression.
What is a “feature”?
An input variable, e.g. the “x” in a simple linear regression. There may be multiple input variables, indicated as {x1, x2, … xN}.
What is an “example”?
An instance of data, either “labeled” or “unlabeled”. Labeled examples are used to train the model, which can then predict the labels on unlabeled examples.
What is “training” and “inference”?
Training is using labeled data to create, or “learn”, the model. Inference is using the model to make predictions on unlabeled data.
What is “regression” and “classification”?
A regression model predicts continuous values, e.g. prices. A classification model predicts discrete values, e.g. spam or not spam, or the animal in an image.
What is “linear regression”?
Linear regression is a method for finding the straight line or hyperplane that best fits a set of points.
What is “L2 Loss”?
Also known as “squared error”, this is simply the sum of “(prediction - actual)²” over the examples.
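A minimal sketch in Python (the function name and numbers are illustrative, not from the source):

```python
def l2_loss(predictions, actuals):
    """Sum of squared errors over all examples."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals))

# (3 - 2)² + (5 - 5)² = 1.0
print(l2_loss([3.0, 5.0], [2.0, 5.0]))  # 1.0
```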
How would you write the equation for a simple linear regression model?
y’ = b + w1 x1, where y’ is the predicted label, b is the bias (the y intercept), w1 is the weight of feature 1, and x1 is a feature (a known input). With more features, this extends to y’ = b + w1 x1 + w2 x2 + … + wN xN.
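As a sketch of the same equation in Python (the weights, bias, and inputs below are made-up values):

```python
def predict(x, weights, bias):
    """y' = b + w1*x1 + w2*x2 + ... for a single example."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

# hypothetical two-feature example: 1.0 + 0.5*1.0 + (-0.25)*2.0 = 1.0
print(predict([1.0, 2.0], weights=[0.5, -0.25], bias=1.0))  # 1.0
```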
What is the process of trying many example models to find one that minimizes loss called?
Empirical risk minimization.
What is MSE?
Mean squared error. The sum of the squared losses for each example, divided by the number of examples. It is commonly used, but it is neither the only loss function nor the best one for every case.
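A minimal sketch (same caveats as above; names and numbers are illustrative):

```python
def mse(predictions, actuals):
    """Mean squared error: total squared error divided by the number of examples."""
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(predictions)

print(mse([3.0, 5.0], [2.0, 5.0]))  # 1.0 / 2 = 0.5
```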
What does it mean that a model has “converged”?
During training, the loss has reached a point where additional iterations yield little or no improvement.
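One common convergence check, sketched with gradient descent on a one-weight model (the learning rate, tolerance, and data are all made-up values):

```python
# Fit y = w * x to the points (1, 2) and (2, 4) by minimizing MSE.
xs, ys = [1.0, 2.0], [2.0, 4.0]
w, learning_rate, tolerance = 0.0, 0.1, 1e-9

prev_loss = float("inf")
for step in range(1000):
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    if prev_loss - loss < tolerance:  # little or no improvement: converged
        break
    # d(MSE)/dw = (2/n) * sum(x * (w*x - y))
    grad = 2 * sum(x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad
    prev_loss = loss

print(step, w)  # w converges toward 2.0
```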
What is the shape of the “loss” vs “weight” plot for regression problems?
Convex (i.e. bowl-shaped). Convex problems have only one minimum, i.e. one place where the slope is exactly 0.
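A quick numeric illustration (reusing the toy data from the gradient-descent sketch above): sweeping the weight shows the loss falling to a single minimum and rising again.

```python
xs, ys = [1.0, 2.0], [2.0, 4.0]  # the best fit is w = 2
for w in [0.0, 1.0, 2.0, 3.0, 4.0]:
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    print(w, loss)  # 10.0, 2.5, 0.0, 2.5, 10.0 — one bowl, one minimum
```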
What is the symbol “Δ”?
The uppercase form of the Greek letter delta, it is used to represent “change in”, e.g. in the “slope-intercept” equation “y = mx + b”, we have “Δy = m Δx” (if m = 2, an increase of Δx = 3 gives Δy = 6).
What is “∂f / ∂x”?
The partial derivative of f with respect to x (the derivative of f considered as a function of x alone).
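A small check with sympy (assuming it is available; the function f below is an arbitrary example):

```python
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 * y + y
print(sp.diff(f, x))  # 2*x*y — y is treated as a constant
print(sp.diff(f, y))  # x**2 + 1
```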
What is “∇f”?
The gradient of a function, denoted as shown, is the vector of partial derivatives with respect to all of the independent variables.
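Continuing the sympy example above (same arbitrary f), the gradient just collects the partials into a vector:

```python
import sympy as sp

x, y = sp.symbols("x y")
f = x**2 * y + y
grad_f = [sp.diff(f, v) for v in (x, y)]  # ∇f: the vector of partial derivatives
print(grad_f)  # [2*x*y, x**2 + 1]
```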