Supervised Machine Learning – Regression and Classification Flashcards
What are the 2 main types of machine learning?
Supervised and unsupervised learning
What is supervised learning and what are two main types of it?
Supervised learning is a type of machine learning where the model is trained with paired input and output data (x, y), from which it learns to predict the output y for an input x that was never used in training. The two main types are regression and classification.
What is regression?
Regression is a form of supervised learning where the model predicts a specific number (ex. predict a mouse's weight based on its size)
What is classification?
Classification is a type of supervised learning where the model predicts a category for an input from a small set of options – ex. predict whether an image shows a dog or a cat.
What is unsupervised learning?
Unsupervised learning is a type of machine learning that tries to identify clusters or structure in unlabeled data. Compared to supervised learning, there are no labels marking the output; you only have a set of input data, and the algorithm tries to identify clusters without really knowing what they mean (ex. customer segmentation, or defining types of people based on their genome sequence)
What types of unsupervised learning do we have?
Clustering – identify clusters
Anomaly detection – ex. used for fraud detection
Dimensionality reduction – reduce a big dataset to a smaller one (compress data)
What is the most common supervised algorithm that is used worldwide and how does it work?
That algorithm is linear regression. It tries to fit a straight line through the data, and predictions are read off that line.
How do you mark a specific row in the training dataset?
With a superscript index: (x^(i), y^(i)) denotes the i-th training example.
Write down the linear regression function with one variable (univariate)
f(x^(i)) = W·x^(i) + b
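As a minimal sketch, the univariate model is just a line; the parameter values below are arbitrary placeholders:

```python
def f(x, w, b):
    """Univariate linear regression model: predicts y for input x."""
    return w * x + b

# Example with arbitrary parameters w=3, b=1
print(f(2.0, 3.0, 1.0))  # 7.0
```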
What is the most common error function that is used to calculate parameters of linear regression ?
Squared error cost function: J(W, b) = (1/2m) · Σ (f(x^(i)) − y^(i))², summed over all m training examples.
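A sketch of the squared error cost in plain Python, assuming the conventional 1/(2m) scaling (the function name is my own):

```python
def squared_error_cost(x, y, w, b):
    """Squared error cost J(w, b) with the conventional 1/(2m) factor."""
    m = len(x)
    total = 0.0
    for xi, yi in zip(x, y):
        total += (w * xi + b - yi) ** 2  # squared residual for example i
    return total / (2 * m)

# A perfect fit gives zero cost: the data below lies exactly on y = 2x + 1
x = [1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0]
print(squared_error_cost(x, y, 2.0, 1.0))  # 0.0
```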
How do you find values W and b in the linear regression function?
You find them at the minimum of the squared error cost function J(W, b).
What is the shape of a cost function with 2 parameters and how to visualize it in 2D?
It has a bowl shape. To visualize it in 2D, you can use a contour plot.
What is a gradient descent?
Gradient descent is an algorithm that provides a structured way to minimize a function toward a local minimum; in the case of linear regression, it minimizes the cost J(W1, W2, … Wn, b) of the model f(x) = W1x1 + W2x2 + … + Wnxn + b.
It starts with some initial values of the parameters, calculates the value of the function over all input/output pairs, and moves in the direction of steepest descent (adapting the parameters) until it reaches a local minimum.
There can be multiple local minima for some non-linear functions like neural networks; in that case, which minimum is reached depends on the initial parameters gradient descent started from.
What is a learning rate?
It is a constant that decides how big each gradient descent step is, i.e. how much the parameters W and b change in each step. Bigger steps mean faster convergence, but a higher chance that the algorithm overshoots the local minimum because the step is too big. It is marked with the Greek letter α. The closer you are to the local minimum, the smaller the slope (smaller derivative of J), which leads to smaller steps even with a fixed α.
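A toy sketch of this trade-off (not from the flashcards): minimizing J(w) = w², whose derivative is 2w, a small α shrinks w toward the minimum at 0, while a too-large α overshoots on every step and |w| grows:

```python
def gradient_step(w, alpha):
    """One gradient descent step on J(w) = w**2, whose derivative is 2*w."""
    return w - alpha * 2 * w

def run(alpha, steps=20, w0=1.0):
    """Run several gradient descent steps from w0 and return the final w."""
    w = w0
    for _ in range(steps):
        w = gradient_step(w, alpha)
    return w

print(abs(run(0.1)))  # small alpha: converges toward the minimum at w = 0
print(abs(run(1.1)))  # large alpha: each step overshoots, |w| keeps growing
```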
How to implement gradient descent?
An important thing to note is that the values W and b are updated simultaneously (at the same time). The incorrect way would be to update W first and then b, since the update of b would then use the new value of W instead of the old one.
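A sketch of the simultaneous update for univariate linear regression (variable names are my own): both gradients are computed from the old W and b before either parameter changes, which the tuple assignment guarantees:

```python
def gradient_descent(x, y, alpha=0.01, iters=10000):
    """Fit f(x) = w*x + b by gradient descent on the squared error cost."""
    m = len(x)
    w, b = 0.0, 0.0
    for _ in range(iters):
        # Both gradients are computed from the current (old) w and b
        dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
        db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
        # Simultaneous update: neither gradient used the new values
        w, b = w - alpha * dw, b - alpha * db
    return w, b

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]  # data lies exactly on y = 2x + 1
w, b = gradient_descent(x, y)
print(round(w, 3), round(b, 3))  # should approach 2.0 and 1.0
```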