Train And Evaluate Regression Models Flashcards
Regression is a commonly used kind of machine learning for predicting numeric values
Regression is where models predict a number.
In machine learning, the goal of regression is to create a model that can predict a numeric, quantifiable value, such as a price, amount, size, or other scalar number.
Regression is a statistical technique of fundamental importance to science because of its ease of interpretation, robustness, and speed of calculation.
Regression models provide an excellent foundation for understanding how more complex machine learning techniques work
In real-world situations, particularly when little data is available, regression models are very useful for making predictions.
For example, if a company that rents bicycles wants to predict the expected number of rentals on any given day in the future, a regression model can predict this number.
You could create a model using existing data, such as the number of bicycles that were rented on days where the season, day of the week, and so on were also recorded.
Regression works by establishing a relationship between variables in the data that represent characteristics (known as the features) of the thing being observed, and the variable we are trying to predict (known as the label).
To train the model we start with a data sample containing the features as well as the known values for the label
The data sample is split into two subsets:
A training dataset to which we apply an algorithm that determines a function encapsulating the relationship between the feature values and the known label values.
A validation or test dataset that we can use to evaluate the model, by using it to generate predictions for the label and comparing them to the actual known label values (a sketch of this split follows).
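As a minimal sketch of this split with scikit-learn (the feature values and labels here are hypothetical):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical sample: one feature column (e.g. temperature) and the known
# label values (e.g. bicycle rentals recorded on each day).
X = np.array([[10], [15], [20], [25], [30], [35]])
y = np.array([100, 150, 210, 250, 310, 340])

# Hold back 30% of the sample for validation/testing; train on the rest.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
```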
The use of historic data with known label values to train a model makes regression an example of supervised machine learning.
Note.
Machine learning is based in statistics and math, and it is important to be aware of specific terms that statisticians, mathematicians, and therefore data scientists use.
You can think of the difference between a predicted label value and the actual label value as a measure of error.
However, in practice, the actual values are based on sample observations, which themselves might be subject to some random variance.
To make it clear that we are comparing a predicted value with an observed value, we refer to the differences between them as the residuals.
We can summarise the residuals of all of the validation data predictions to calculate the overall loss in the model as a measure of its predictive performance.
One of the most common ways to measure the loss is to square the individual residuals, sum the squares, and calculate the mean.
Squaring the residuals has the effect of basing the calculation on absolute values (ignoring whether the difference is negative or positive) and giving more weight to larger differences.
This metric is called the mean squared error (MSE).
Sometimes it is more useful to express the loss in the same unit of measurement as the predicted label value itself.
It is possible to do this by calculating the square root of the MSE, which produces a metric known, unsurprisingly, as the root mean squared error (RMSE).
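As a minimal sketch of these calculations in NumPy (the predicted and actual values are hypothetical):

```python
import numpy as np

y_actual = np.array([100, 150, 210, 250])     # observed label values (hypothetical)
y_predicted = np.array([110, 145, 200, 260])  # model predictions (hypothetical)

residuals = y_predicted - y_actual  # differences between predicted and observed
mse = (residuals ** 2).mean()       # square the residuals and take the mean
rmse = np.sqrt(mse)                 # back in the same unit as the label
print(mse, rmse)                    # 81.25, ~9.01
```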
There are many other metrics that can be used to measure loss in a regression.
For example, R-squared (sometimes known as the coefficient of determination) is the correlation between X and Y squared.
This produces a value between 0 and 1 that measures the amount of variance that can be explained by the model.
Generally, the closer the value is to 1, the better the model predicts.
You can quantify the residuals by calculating a number of commonly used evaluation metrics; a sketch computing them in code follows this list.
Mean squared error (MSE):
The mean of the squared differences between predicted and actual values.
This yields a relative metric in which the smaller the value, the better the model's fit.
Root mean squared error (RMSE): the square root of the MSE.
This yields an absolute metric in the same unit as the label.
The smaller the value, the better the model; in a simplistic sense, it represents the average number by which the predictions are wrong.
Coefficient of determination (usually known as R-squared or R²):
A relative metric in which the higher the value, the better the model's fit.
In essence this metric represents how much of the variance between predicted and actual label values the model is able to explain
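A minimal sketch computing these metrics with scikit-learn's metrics functions (the arrays are hypothetical):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_test = np.array([100, 150, 210, 250])       # actual label values (hypothetical)
predictions = np.array([110, 145, 200, 260])  # model predictions (hypothetical)

mse = mean_squared_error(y_test, predictions)  # smaller is better (relative)
rmse = np.sqrt(mse)                            # same unit as the label (absolute)
r2 = r2_score(y_test, predictions)             # closer to 1 is better
print(f"MSE: {mse}, RMSE: {rmse:.2f}, R2: {r2:.3f}")
```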
Regression models are often chosen because they work with small data samples, are robust, are easy to interpret, and a variety exist.
Linear regression is the simplest form of regression, with no limit to the number of features used.
Linear regression comes in many forms, often named by the number of features used and the shape of the curve that fits.
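A minimal sketch of fitting and using a linear regression model with scikit-learn (the training data is hypothetical):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: one feature column and the known labels.
X_train = np.array([[10], [15], [20], [25]])
y_train = np.array([100, 150, 210, 250])

model = LinearRegression().fit(X_train, y_train)  # learn the line of best fit
print(model.predict(np.array([[30]])))  # predict the label for a new feature value
```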
Decision trees.
Decision trees take a step-by-step approach to predicting a variable. Thinking of our bicycle example, the decision tree might first split examples between those occurring in spring and summer and those in autumn and winter, then make a prediction based on the day of the week. A spring or summer Monday might have a bike rental rate of 100 per day, while an autumn or winter Monday might have a rental rate of 20 per day.
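A minimal sketch of the same idea using scikit-learn's DecisionTreeRegressor, with the season and weekday encoded as hypothetical numeric features:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical features: [season (0 = spring/summer, 1 = autumn/winter), weekday (0-6)]
X_train = np.array([[0, 0], [0, 3], [1, 0], [1, 3]])
y_train = np.array([100, 120, 20, 30])   # daily rentals (hypothetical)

tree = DecisionTreeRegressor().fit(X_train, y_train)
print(tree.predict(np.array([[0, 0]])))  # a spring/summer Monday -> ~100
```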
Simple models with small datasets can often be fit in a single step, while larger datasets and more complex models must be fit by repeatedly using the model with training data and comparing the output with the expected label.
If the prediction is accurate enough, we consider the model trained. If not, we adjust the model slightly and loop again.
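As an illustrative sketch of this loop (a hand-rolled single-weight model in NumPy, not a library API):

```python
import numpy as np

# Hypothetical data: one feature and known labels, roughly y = 2x.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

w = 0.0                                 # initial guess for the single model weight
for _ in range(100):                    # the training loop
    predictions = w * X                 # use the model with the training data
    residuals = predictions - y         # compare the output with the expected labels
    w -= 0.01 * (residuals * X).mean()  # adjust the model slightly and loop again
print(w)                                # converges towards roughly 2
```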
Improve models with hyperparameters.
Hyperparameters are values that change the way the model is fit during these loops.
Learning rate, for example, is a hyperparameter that sets how much the model is adjusted during each training cycle.
A high learning rate means a model can be trained faster, but if it is too high the adjustments can be so large that the model is never finely tuned and not optimal.
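A minimal sketch of setting a learning rate in scikit-learn's SGDRegressor (the eta0 value here is a hypothetical choice, not a recommendation):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

X_train = np.array([[0.1], [0.2], [0.3], [0.4]])  # hypothetical, already-scaled feature
y_train = np.array([1.0, 2.0, 3.0, 4.0])

# eta0 is the learning rate: how much the model is adjusted each training cycle.
model = SGDRegressor(learning_rate="constant", eta0=0.01, max_iter=1000)
model.fit(X_train, y_train)
```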
Pre-processing data.
Pre-processing refers to changes you make to your data before it’s passed to the model.
We previously read that pre-processing can involve cleaning your dataset.
While this is important, pre-processing can also include changing the format of your data so it is easier for the model to use.
For example, data described as red, orange, yellow, lime, and green might work better if converted into a format more native to computers, such as numbers stating the amount of red and the amount of green.
Scaling features.
The most common pre-processing step is to scale features so that they fall between 0 and 1.
For example, the weight of a bike and the distance a person travels on a bike may be two very different numbers, but scaling both to between 0 and 1 allows models to learn more effectively from the data.
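A minimal sketch using scikit-learn's MinMaxScaler (the weight and distance values are hypothetical):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical columns: bike weight (kg) and distance travelled (km).
X = np.array([[12.0, 2.5], [15.0, 30.0], [9.0, 12.0]])

scaled = MinMaxScaler().fit_transform(X)  # rescales each column to the 0-1 range
print(scaled)
```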
Using categories as features.
In machine learning you can also use categorical features, such as bicycle, skateboard, or car.
These features are represented by 0 or 1 values in one-hot vectors: vectors that have a zero or one for each possible value.
For example, bicycle, skateboard, and car might respectively be (1,0,0), (0,1,0), and (0,0,1).
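A minimal sketch using scikit-learn's OneHotEncoder (note that it orders columns by sorted category name, so the column order differs slightly from the example above):

```python
from sklearn.preprocessing import OneHotEncoder

# A hypothetical category column; OneHotEncoder expects a 2D array.
vehicles = [["bicycle"], ["skateboard"], ["car"]]
one_hot = OneHotEncoder().fit_transform(vehicles).toarray()
print(one_hot)
# Columns follow sorted category order (bicycle, car, skateboard), so
# bicycle -> [1, 0, 0], skateboard -> [0, 0, 1], car -> [0, 1, 0]
```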
You have created a model object using the scikit-learn LinearRegression class. What should you do to train the model?
Call the fit method of the model object, specifying the training feature and label arrays.
You train a regression model using scikit-learn. When you evaluate it with test data, you determine that the model achieves an R-squared metric of 0.95. What does this metric tell you about the model?
The model explains most of the variance between predicted and actual values.
While scikit-learn is a popular framework for writing code to train regression models, you can also create machine learning solutions for regression using the graphical tools in Microsoft Azure Machine Learning. You can learn more about developing regression models with Azure Machine Learning in the Create a regression model with Azure Machine Learning designer module.