ML exam 1 Flashcards
What is supervised learning?
to learn a model from labeled training data that allows us to make predictions about unseen or future data
Rosenblatt perceptron
- binary classification task
- positive class (1) vs negative class (-1)
- takes as its net input the dot product of the input values and the weights
step function
1 if z >= theta
-1 otherwise
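The step function on this card can be sketched in a few lines of Python. The function name `unit_step` and the default threshold of 0.0 are assumptions made for illustration, not part of the original notes.

```python
def unit_step(z, theta=0.0):
    """Return 1 if the net input z reaches the threshold theta, else -1."""
    return 1 if z >= theta else -1


# Values on either side of a threshold of 0.5
print(unit_step(0.7, theta=0.5))  # 1
print(unit_step(0.2, theta=0.5))  # -1
```

Note that a net input exactly equal to theta is assigned to the positive class, because the comparison is `>=`.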
what does z equal
the net input: the linear combination of the input values x and the weights w, z = w1x1 + ... + wmxm
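As a small worked example, the net input is a plain dot product. The function name `net_input` and the sample numbers are made up for illustration.

```python
def net_input(x, w):
    """Linear combination of inputs and weights: z = w1*x1 + ... + wm*xm."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x))


x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
# 0.5*1.0 + (-1.0)*2.0 + 0.25*3.0 = -0.75
print(net_input(x, w))  # -0.75
```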
Rosenblatt perceptron algorithm
- initialize the weights to 0 or small random numbers
- for each training sample x(i),
a. compute y hat, the output value
b. update weights
weight update rule
w(j) = w(j) + deltaw(j)
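perceptron learning rule
deltaw(j) = η(y(i) - y hat(i)) xj(i), where η is the learning rate, y(i) is the true class label, and y hat(i) is the predicted class label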
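The algorithm and update rule from the cards above can be put together in a minimal sketch. Several details here are assumptions rather than something the notes specify: the threshold is folded into a bias weight `w[0]` (so the step function compares against 0), the names `predict` and `train_perceptron` are invented, and the toy data is made up.

```python
def predict(x, w):
    # w[0] is a bias weight (assumption: it stands in for -theta)
    z = w[0] + sum(w_j * x_j for w_j, x_j in zip(w[1:], x))
    return 1 if z >= 0.0 else -1


def train_perceptron(X, y, eta=0.1, n_epochs=10):
    w = [0.0] * (len(X[0]) + 1)  # initialize all weights to 0
    for _ in range(n_epochs):
        for x_i, y_i in zip(X, y):
            # eta * (y - y_hat); this is 0 when the prediction is correct
            update = eta * (y_i - predict(x_i, w))
            w[0] += update
            for j, x_ij in enumerate(x_i):
                w[j + 1] += update * x_ij  # deltaw(j) = update * xj
    return w


# Toy, linearly separable data (made up for illustration)
X = [[2.0, 1.0], [3.0, 2.0], [-1.0, -1.5], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w = train_perceptron(X, y)
print([predict(x, w) for x in X])  # [1, 1, -1, -1]
```

The weights change only when a sample is misclassified, since a correct prediction makes y(i) - y hat(i) equal to 0.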
linear separability
the negative and positive classes can be separated perfectly by a straight line (a linear decision boundary)
convergence
convergence is guaranteed only if the two classes are linearly separable and the learning rate is sufficiently small
if classes cannot be separated,
Set a maximum number of passes over the training dataset (epochs)
and/or set a threshold for the number of tolerated misclassifications
Otherwise, the perceptron will never stop updating the weights (it never converges)
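Both stopping criteria can be illustrated with a short sketch. The parameter names `max_epochs` and `max_errors`, the bias weight `w[0]`, and the XOR-style data are assumptions chosen for illustration. Here "errors" counts the number of weight updates made during one pass.

```python
def predict(x, w):
    z = w[0] + sum(w_j * x_j for w_j, x_j in zip(w[1:], x))
    return 1 if z >= 0.0 else -1


def train_with_stopping(X, y, eta=0.1, max_epochs=20, max_errors=0):
    w = [0.0] * (len(X[0]) + 1)
    errors_per_epoch = []
    for _ in range(max_epochs):  # criterion 1: cap on epochs
        errors = 0
        for x_i, y_i in zip(X, y):
            update = eta * (y_i - predict(x_i, w))
            if update != 0.0:
                errors += 1
                w[0] += update
                for j, x_ij in enumerate(x_i):
                    w[j + 1] += update * x_ij
        errors_per_epoch.append(errors)
        if errors <= max_errors:  # criterion 2: tolerated misclassifications
            break
    return w, errors_per_epoch


# XOR-style labels cannot be separated by a line
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [-1, 1, 1, -1]
w, history = train_with_stopping(X, y, max_epochs=20)
print(len(history))  # 20: the epoch cap is what ends training
```

Because no line separates these labels, every epoch contains at least one update, so only the epoch cap stops the loop.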
diagram of Rosenblatt perceptron
see pic
Adaline
Weights are updated based on a linear activation function
Remember that the perceptron uses a unit step function
φ(z) is simply the identity function of the net input: φ(z) = z
Adaline diagram
see pic
Adaline vs. Rosenblatt perceptron
Adaline computes the weight update from all samples in the training set
The perceptron updates the weights incrementally after each sample
Adaline's approach is known as “batch” gradient descent
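The contrast with the perceptron can be seen in a minimal batch gradient descent sketch for Adaline. This is an illustration under assumptions, not a reference implementation: the bias weight `w[0]`, the learning rate, the function names, the factor 1/2 in the cost, and the toy data are all chosen here for convenience.

```python
def net_input(x, w):
    return w[0] + sum(w_j * x_j for w_j, x_j in zip(w[1:], x))


def train_adaline_batch(X, y, eta=0.01, n_epochs=50):
    w = [0.0] * (len(X[0]) + 1)
    costs = []
    for _ in range(n_epochs):
        # Identity activation: the output is the net input itself
        errors = [y_i - net_input(x_i, w) for x_i, y_i in zip(X, y)]
        # One weight update per epoch, accumulated over ALL samples
        w[0] += eta * sum(errors)
        for j in range(len(X[0])):
            w[j + 1] += eta * sum(e * x_i[j] for e, x_i in zip(errors, X))
        # Cost before this epoch's update (assumed convention: SSE / 2)
        costs.append(0.5 * sum(e ** 2 for e in errors))
    return w, costs


# Toy data (made up for illustration)
X = [[2.0, 1.0], [3.0, 2.0], [-1.0, -1.5], [-2.0, -1.0]]
y = [1, 1, -1, -1]
w, costs = train_adaline_batch(X, y)
print(costs[0], costs[-1])  # the cost should shrink over the epochs
```

Unlike the perceptron loop, the weights stay fixed while the errors for the whole training set are collected, and they change only once per epoch.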
cost function and equation
ML algorithms often define an objective function
This function is optimized during learning
It is often a cost function we want to minimize
Adaline uses a cost function J(·)
Learns the weights by minimizing the sum of squared errors (SSE) between the computed outputs and the true class labels
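A small sketch of how such a cost could be computed for a given weight vector. The factor 1/2 is a common convention that simplifies the gradient and is an assumption here, as are the function name `sse_cost`, the bias weight `w[0]`, and the sample numbers.

```python
def sse_cost(X, y, w):
    """J(w) = 1/2 * sum over samples of (y(i) - output(i))^2."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Adaline output: identity activation applied to the net input
        output = w[0] + sum(w_j * x_j for w_j, x_j in zip(w[1:], x_i))
        total += (y_i - output) ** 2
    return 0.5 * total


X = [[1.0], [2.0]]
y = [1, -1]
w = [0.0, 0.5]
# outputs: 0.5 and 1.0; errors: 0.5 and -2.0; SSE = 0.25 + 4.0 = 4.25
print(sse_cost(X, y, w))  # 2.125
```

Because the output is continuous rather than a class label, this cost is differentiable in the weights, which is what makes gradient descent applicable.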