empirical risk Flashcards

1
Q

simple linear regression model

A

H(x) = w0 + w1x

2
Q

slope in simple linear regression model

A

w1

3
Q

intercept in simple linear regression model

A

w0

4
Q

loss function

A

quantifies how bad a prediction is for a single data point

5
Q

if our prediction is close to the actual value

A

we should have low loss

6
Q

if our prediction is far from the actual value

A

we should have high loss

7
Q

error

A

difference between actual and predicted values: yi - H(xi)

8
Q

squared loss function

A

computes (actual - predicted)^2

9
Q

squared loss for the constant model

A

Lsq(yi, h) = (yi - h)^2

10
Q

another term for average squared loss

A

mean squared error

11
Q

best prediction, h*

A

the h that minimizes the empirical risk Rsq(h) = (1/n) Σ(i=1 to n) (yi - h)^2

12
Q

constant model

A

H(x) = h

13
Q

simple linear regression

A

H(x) = w0 + w1x

14
Q

how do we find h* that minimizes Rsq(h)

A

using calculus

15
Q

minimize Rsq(h)

A
  1. take its derivative with respect to h
  2. set it equal to 0
  3. solve for the resulting h*
  4. perform a second derivative test to ensure we found a minimum
16
Q

derivative of Rsq(h)

A

dRsq/dh = (-2/n) Σ(i=1 to n) (yi - h)

17
Q

Mean minimizes…

A

mean squared error

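A quick numerical check of this fact (a NumPy sketch; the dataset y below is made up for illustration):

```python
import numpy as np

# Hypothetical data, chosen only for illustration.
y = np.array([1.0, 3.0, 7.0, 9.0])

def mse(h, y):
    """Empirical risk under squared loss: Rsq(h) = (1/n) Σ (yi - h)^2."""
    return np.mean((y - h) ** 2)

# Evaluate Rsq over a grid of candidate constant predictions h.
hs = np.linspace(y.min(), y.max(), 10001)
best_h = hs[np.argmin([mse(h, y) for h in hs])]
print(best_h, y.mean())  # the grid minimizer matches the mean
```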
18
Q

absolute loss

A

Labs(yi, H(xi)) = |yi - H(xi)|

19
Q

average absolute loss

A

Rabs(h) = (1/n) Σ(i=1 to n) |yi - h|

20
Q

to minimize mean absolute error

A
  1. take its derivative with respect to h
  2. set it equal to 0
  3. solve for the resulting h*
  4. perform a second derivative test to ensure we found a minimum
21
Q

derivative of |yi - h|

A

|yi - h| is piecewise: its derivative is -1 when h < yi, +1 when h > yi, and undefined at h = yi

22
Q

derivative of Rabs(h)

A

d/dh [(1/n) Σ(i=1 to n) |yi - h|] = (1/n) [#(h > yi) - #(h < yi)]

23
Q

median minimizes

A

mean absolute error

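The same kind of numerical check works here (a NumPy sketch; the data y is made up, with n odd so the minimizer is unique):

```python
import numpy as np

# Hypothetical data (n odd, so the minimizer is unique).
y = np.array([2.0, 3.0, 5.0, 11.0, 20.0])

def mae(h, y):
    """Empirical risk under absolute loss: Rabs(h) = (1/n) Σ |yi - h|."""
    return np.mean(np.abs(y - h))

hs = np.linspace(y.min(), y.max(), 10001)
best_h = hs[np.argmin([mae(h, y) for h in hs])]
print(best_h, np.median(y))  # the grid minimizer is (approximately) the median
```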
24
Q

best constant prediction in terms of mean absolute error

A

median
1. when n is odd, answer is unique
2. when n is even, any number between the middle two data points also minimizes mean absolute error
3. when n is even, define the median to be the mean of the middle two data points

25
Q

process for minimizing average loss

A

empirical risk minimization

26
Q

another name for “average loss”

A

empirical risk

27
Q

corresponding empirical risk when using the squared loss function

A

Rsq(h) = (1/n) Σ(i=1 to n) (yi - h)^2

28
Q

if L(yi, h) is any loss function the corresponding empirical risk is

A

R(h) = (1/n) Σ(i=1 to n) L(yi, h)

29
Q

Modeling recipe

A
  1. choose a model
  2. choose a loss function
  3. minimize average loss to find optimal model parameters
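The recipe above can be sketched numerically (a NumPy sketch with made-up data), showing how the choice of loss changes the optimal constant:

```python
import numpy as np

# Hypothetical data with one outlier, to make the mean/median contrast visible.
y = np.array([1.0, 2.0, 2.0, 3.0, 10.0])

# Step 1: choose a model -- here the constant model H(x) = h.
# Step 2: choose a loss function.
losses = {
    "squared":  lambda y, h: (y - h) ** 2,
    "absolute": lambda y, h: np.abs(y - h),
}

# Step 3: minimize average loss (empirical risk) over candidate h values.
hs = np.linspace(y.min(), y.max(), 20001)
minimizers = {name: hs[np.argmin([np.mean(L(y, h)) for h in hs])]
              for name, L in losses.items()}
print(minimizers)  # squared -> ~mean (3.6); absolute -> ~median (2.0)
```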
30
Q

empirical risk minimization

A

formal name for the process of minimizing average loss

31
Q

corresponding empirical risk to the squared loss Lsq(yi, h) = (yi - h)^2

A

Rsq(h) = (1/n) Σ(i=1 to n) (yi - h)^2

32
Q

For the mean

A

sum of distances below = sum of distances above

33
Q

Mean is the point where

A

Σ(i=1 to n) (yi - h) = 0

34
Q

Median is the point where

A

#(yi < h) = #(yi > h)

35
Q

Lp loss

A

Lp(yi, h) = |yi - h|^p

36
Q

Corresponding empirical risk to Lp

A

Rp(h) = (1/n) Σ(i=1 to n) |yi - h|^p

37
Q

midrange minimizes

A

mean L∞ (infinity) loss

38
Q

As p –> infinity,

A

the minimizer of mean Lp loss approaches the midpoint of the minimum and maximum values in the dataset, i.e., the midrange
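This limit can be checked numerically (a NumPy sketch; the data y and the chosen p values are illustrative):

```python
import numpy as np

# Hypothetical data; midrange = (min + max) / 2 = 5.0 here.
y = np.array([0.0, 1.0, 2.0, 3.0, 10.0])
midrange = (y.min() + y.max()) / 2

hs = np.linspace(y.min(), y.max(), 20001)
mins = {}
for p in [1, 2, 8, 64]:
    risks = [np.mean(np.abs(y - h) ** p) for h in hs]
    mins[p] = hs[np.argmin(risks)]
print(mins)  # the minimizer drifts toward the midrange as p grows
```

Note that p = 1 recovers the median and p = 2 recovers the mean, consistent with the earlier cards.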

39
Q

The general form of empirical risk for any loss function

A

R(h) = (1/n) Σ(i=1 to n) L(yi, h)

40
Q

input h* that minimizes R(h) is…

A

some measure of the center of the dataset

41
Q

minimum output R(h*) represents

A

some measure of the spread or variation in the dataset

42
Q

Minimum value of Rsq(h)

A

Rsq(h*) = Rsq(Mean(y1, y2, …, yn))
= (1/n) Σ(i=1 to n) (yi - Mean(y1, y2, …, yn))^2

43
Q

Variance

A

the minimum value of Rsq(h) is the mean squared deviation from the mean; it measures the squared distance of each data point from the mean, on average
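A one-line check that the minimum of Rsq(h) equals the variance (a NumPy sketch with made-up data; np.var uses the same 1/n convention by default):

```python
import numpy as np

# Hypothetical data.
y = np.array([4.0, 7.0, 7.0, 10.0, 12.0])

def r_sq(h, y):
    """Rsq(h) = (1/n) Σ (yi - h)^2."""
    return np.mean((y - h) ** 2)

# The minimum is attained at the mean, and its value is the variance.
min_risk = r_sq(y.mean(), y)
print(min_risk, np.var(y))  # both are the 1/n mean squared deviation
```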

44
Q

standard deviation

A

square root of variance

45
Q

empirical risk for absolute loss

A

Rabs(h) = (1/n) Σ(i=1 to n) |yi - h|

46
Q

Rabs(h) is minimized when

A

h* = Median(y1, y2,… yn)

47
Q

Minimum value of Rabs(h) is…

A

mean absolute deviation from the median:
(1/n) Σ(i=1 to n) |yi - Median(y1, y2, …, yn)|

48
Q

empirical risk for 0-1 Loss

A

R0,1(h) = (1/n) Σ(i=1 to n) { 0 if yi = h, 1 if yi ≠ h }

proportion (between 0 and 1) of data points not equal to h

49
Q

R0,1(h) is minimized when

A

h* = Mode(y1,y2…yn)

50
Q

the minimum value of R0,1(h)

A

proportion of data points not equal to mode
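This pair of facts can be verified directly (a sketch using NumPy and the standard library; the data y is made up):

```python
import numpy as np
from collections import Counter

# Hypothetical categorical-style data; the mode is 3.
y = np.array([1, 2, 2, 3, 3, 3, 5])

def r_01(h, y):
    """R0,1(h): proportion of data points not equal to h."""
    return np.mean(y != h)

mode = Counter(y.tolist()).most_common(1)[0][0]
# The mode attains the smallest 0-1 risk among all candidate values.
risks = {v: r_01(v, y) for v in set(y.tolist())}
print(mode, risks[mode])  # 3 is wrong for 4 of the 7 points, so the risk is 4/7
```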

51
Q

simple linear regression model

A

H(x) = w0 + w1x

52
Q

when using squared loss

A

h* = Mean(y1, y2… yn)
Rsq(h*) = Variance(y1, y2, … yn)

53
Q

When using absolute loss

A

h* = Median(y1, y2… yn)
Rabs(h*) = MAD from median

54
Q

R0,1(h) is minimized when

A

h* = Mode(y1,y2,… yn)
so therefore R0,1(h*) is the proportion of data points not equal to the mode

55
Q

minimum value of R0,1(h) is the

A

proportion of data points not equal to mode

56
Q

a higher minimum value of R0,1(h*) means

A

less of the data is clustered at the mode

57
Q

hypothesis function

A

H, takes in an x as input and returns a predicted y

58
Q

parameters define

A

the relationship between the input and output of a hypothesis function

59
Q

Since linear hypothesis functions are of the form H(x) = w0 + w1x, we can re-write Rsq

A

Rsq(w0, w1) = (1/n) Σ(i=1 to n) (yi - (w0 + w1xi))^2

60
Q

Minimize mean squared error

A

  1. take partial derivatives with respect to each variable
  2. set all partial derivatives to 0
  3. solve the resulting system of equations
  4. ensure that you’ve found a minimum, rather than a maximum or saddle point

61
Q

We have a system of two equations and two unknowns (w0 and w1)
(-2/n) Σ(i=1 to n) (yi - (w0 + w1xi)) = 0

(-2/n) Σ(i=1 to n) (yi - (w0 + w1xi)) xi = 0

A

solve for w0 in the first equation; the result, w0*, is the best intercept
plug w0* into the second equation and solve for w1*
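Solving that system gives the familiar closed-form slope and intercept, which can be cross-checked numerically (a NumPy sketch; the x and y data are made up):

```python
import numpy as np

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])

# Solving the two normal equations yields the usual closed forms:
#   w1* = Σ (xi - x_bar)(yi - y_bar) / Σ (xi - x_bar)^2
#   w0* = y_bar - w1* * x_bar
x_bar, y_bar = x.mean(), y.mean()
w1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
w0 = y_bar - w1 * x_bar

# Cross-check against NumPy's least-squares degree-1 fit.
w1_np, w0_np = np.polyfit(x, y, 1)
print(w0, w1)
```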

62
Q

correlation

A

linear association, pattern that looks like a line

63
Q

association

A

any pattern

64
Q

correlation coefficient ,r

A

measure of the strength of the linear association between two variables x and y; measures how tightly clustered a scatter plot is around a straight line; ranges between -1 and 1

65
Q

correlation coefficient, r is defined

A

average of the product of x and y when both are in standard units

66
Q

slope: w1* = r(σy / σx)

A

units of y per units of x
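The definition of r and the slope formula can be sketched together (NumPy; the data and the standard_units helper are illustrative):

```python
import numpy as np

# Hypothetical data.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])

def standard_units(v):
    # Population std (1/n), consistent with the 1/n used in the risks above.
    return (v - v.mean()) / v.std()

# r = average of the product of x and y when both are in standard units.
r = np.mean(standard_units(x) * standard_units(y))

# slope w1* = r * (sigma_y / sigma_x); intercept w0* = y_bar - w1* * x_bar
w1 = r * (y.std() / x.std())
w0 = y.mean() - w1 * x.mean()
print(r, w1, w0)
```

This reproduces the same w0* and w1* as solving the normal equations directly.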
