empirical risk Flashcards
simple linear regression model
H(x) = w0 + w1x
slope in simple linear regression model
w1
intercept in simple linear regression model
w0
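A minimal sketch of this model in Python. The function name `predict` and the parameter values are illustrative assumptions, not part of the cards.

```python
def predict(x, w0, w1):
    """Simple linear regression prediction H(x) = w0 + w1 * x."""
    return w0 + w1 * x

# Hypothetical parameters: intercept w0 = 2.0, slope w1 = 0.5.
print(predict(4.0, w0=2.0, w1=0.5))  # 2.0 + 0.5 * 4.0 = 4.0
```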
loss function
quantifies how bad a prediction is for a single data point
if our prediction is close to the actual value
we should have low loss
if our prediction is far from the actual value
we should have high loss
error
difference between actual and predicted values: yi - H(xi)
squared loss function
computes (actual - predicted)^2
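As a rough illustration of squared loss on a single data point (the function name `squared_loss` and the numbers are made up for this sketch):

```python
def squared_loss(actual, predicted):
    """Squared loss for one data point: (actual - predicted)^2."""
    error = actual - predicted
    return error ** 2

print(squared_loss(10, 9))  # close prediction, low loss: 1
print(squared_loss(10, 4))  # far prediction, high loss: 36
```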
squared loss for the constant model
Lsq(yi, h) = (yi - h)^2
another term for average squared loss
mean squared error
best prediction, h*
the value of h that minimizes the average squared loss, $R_{sq}(h) = \frac{1}{n}\sum_{i=1}^{n}(y_i - h)^2$
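A small sketch of how the average squared loss of a constant prediction h could be computed. The name `mean_squared_error` and the data are assumptions for illustration.

```python
def mean_squared_error(h, ys):
    """Average squared loss of the constant prediction h over the data ys."""
    return sum((y - h) ** 2 for y in ys) / len(ys)

ys = [1, 2, 6]  # made-up data
print(mean_squared_error(3, ys))  # (4 + 1 + 9) / 3
```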
constant model
H(x) = h
simple linear regression
H(x) = w0 + w1x
how do we find h* that minimizes Rsq(h)
using calculus
minimize Rsq(h)
- take its derivative with respect to h
- set it equal to 0
- solve for the resulting h*
- perform a second derivative test to ensure we found a minimum
derivative of Rsq(h)
$\frac{d}{dh} R_{sq}(h) = -\frac{2}{n}\sum_{i=1}^{n}(y_i - h)$
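Carrying the remaining steps through for this derivative, as a worked sketch: set it to zero, solve for h*, then check the second derivative.

```latex
\begin{align*}
\frac{d}{dh} R_{sq}(h) = -\frac{2}{n}\sum_{i=1}^{n}(y_i - h) &= 0 \\
\sum_{i=1}^{n} y_i - nh &= 0 \\
h^* &= \frac{1}{n}\sum_{i=1}^{n} y_i \\
\frac{d^2}{dh^2} R_{sq}(h) = \frac{2}{n}\sum_{i=1}^{n} 1 = 2 &> 0
\end{align*}
```

The second derivative is positive everywhere, so h* (the mean of the data) is a minimum.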
Mean minimizes…
mean squared error
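A quick numerical check of this claim, assuming made-up data and a hypothetical grid of candidate constants: no candidate on the grid beats the mean.

```python
def mean_squared_error(h, ys):
    """Average squared loss of the constant prediction h over the data ys."""
    return sum((y - h) ** 2 for y in ys) / len(ys)

ys = [2, 3, 5, 10]        # made-up data
mean = sum(ys) / len(ys)  # 20 / 4 = 5.0

# Candidate constants 0.0, 0.1, ..., 12.0 (an arbitrary grid for illustration).
candidates = [i / 10 for i in range(0, 121)]
best = min(candidates, key=lambda h: mean_squared_error(h, ys))

print(mean, best)
```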
absolute loss
Labs(yi, H(xi)) = |yi - H(xi)|
average absolute loss
$R_{abs}(h) = \frac{1}{n}\sum_{i=1}^{n}|y_i - h|$
to minimize mean absolute error
- take its derivative with respect to h
- set it equal to 0
- solve for the resulting h*
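- check that the derivative changes sign from negative to positive at h* (the second derivative is 0 wherever it is defined, so the second derivative test is inconclusive here)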
derivative of |yi - h|
it is piecewise: -1 when h < yi, +1 when h > yi, and undefined when h = yi
derivative of Rabs(h)
$\frac{d}{dh}\left(\frac{1}{n}\sum_{i=1}^{n}|y_i - h|\right) = \frac{1}{n}\left[\#(h > y_i) - \#(h < y_i)\right]$
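The counting formula can be compared against a finite-difference slope. This is only a sketch: the function names, the data, and the choice of h are assumptions, and the formula applies only when h is not equal to any data point.

```python
def mean_absolute_error(h, ys):
    """Average absolute loss of the constant prediction h over the data ys."""
    return sum(abs(y - h) for y in ys) / len(ys)

def mae_slope(h, ys):
    """Slope of R_abs at h: (#(h > y_i) - #(h < y_i)) / n, for h not in ys."""
    below = sum(1 for y in ys if y < h)  # data points with h > y_i
    above = sum(1 for y in ys if y > h)  # data points with h < y_i
    return (below - above) / len(ys)

ys = [1, 3, 4, 8, 9]  # made-up data
h = 3.5               # 2 points below, 3 points above
eps = 1e-6

# Central finite difference as an independent estimate of the slope.
numeric = (mean_absolute_error(h + eps, ys) - mean_absolute_error(h - eps, ys)) / (2 * eps)
print(mae_slope(h, ys), numeric)
```

With more points above h than below it, the slope is negative, so increasing h lowers the mean absolute error.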
median minimizes
mean absolute error
best constant prediction in terms of mean absolute error
median
1. when n is odd, answer is unique
2. when n is even, any number between the middle two data points also minimizes mean absolute error
3. when n is even, define the median to be the mean of the middle two data points
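The odd and even cases can be illustrated with a short sketch. The data are made up, and `statistics.median` from the Python standard library is used because it follows the same convention of averaging the middle two points when n is even.

```python
from statistics import median

def mean_absolute_error(h, ys):
    """Average absolute loss of the constant prediction h over the data ys."""
    return sum(abs(y - h) for y in ys) / len(ys)

# Odd n: the median is the single middle point.
odd = [1, 3, 4, 8, 9]
print(median(odd), mean_absolute_error(median(odd), odd))

# Even n: every h between the middle two points (3 and 5) gives the same error.
even = [1, 3, 5, 9]
flat = [mean_absolute_error(h, even) for h in (3, 3.5, 4, 5)]
print(flat)

# By convention the median is the mean of the middle two points.
print(median(even))
```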
process for minimizing average loss
empirical risk minimization
another name for “average loss”
empirical risk
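One way to see empirical risk as "average loss" is to write a single function that takes the loss as an argument. The names `empirical_risk`, `squared_loss`, and `absolute_loss` and the data are assumptions for this sketch.

```python
def empirical_risk(loss, h, ys):
    """Average of loss(y, h) over the data ys for a constant prediction h."""
    return sum(loss(y, h) for y in ys) / len(ys)

def squared_loss(y, h):
    return (y - h) ** 2

def absolute_loss(y, h):
    return abs(y - h)

ys = [1, 2, 6]  # made-up data
print(empirical_risk(squared_loss, 3, ys))   # mean squared error of h = 3
print(empirical_risk(absolute_loss, 2, ys))  # mean absolute error of h = 2
```

Swapping the loss function changes which constant minimizes the risk: the mean for squared loss, the median for absolute loss.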