L7 - L1 Regression Flashcards

1
Q

What is the loss function under L1 regression?

A

Instead of taking the sum of squared deviations, we take the mean of the absolute deviations.
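A sketch of the loss in symbols (assuming the i-th data row is a_i and the observed response is b_i, matching the b - A*x notation used in the MATLAB card later):

$$\min_{x}\;\frac{1}{n}\sum_{i=1}^{n}\left|b_i - a_i^{\top}x\right|$$

in place of the L2 loss $\sum_{i=1}^{n}\left(b_i - a_i^{\top}x\right)^2$.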

2
Q

How do you minimise the L1 loss function?

A

You can't minimise it in the usual way because the absolute value is not differentiable, so you have to set it up as a linear program.

  • Introduce an auxiliary variable t for each absolute deviation and rewrite it as a pair of inequalities, as in the sketch below
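A minimal sketch of the reformulation (assuming one auxiliary variable t_i per data point, with a_i and b_i as before):

$$\min_{x,\,t}\;\sum_{i=1}^{n} t_i \qquad \text{s.t.}\qquad -t_i \;\le\; b_i - a_i^{\top}x \;\le\; t_i,\quad i=1,\dots,n.$$

At the optimum each t_i equals |b_i - a_i^\top x|, so minimising the sum of the t_i minimises the L1 loss.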
3
Q

What is the Matrix form of the L1 regression?

A

Stack the positive and negative forms of the inequality (one block for each side) into a single constraint matrix for the linear program.
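A sketch of that stacked system (assuming A has rows a_i^T, b is the response vector, t the vector of auxiliary variables and 1 the all-ones vector):

$$\min_{x,\,t}\;\mathbf{1}^{\top}t \qquad \text{s.t.}\qquad \begin{pmatrix} A & -I\\ -A & -I \end{pmatrix}\begin{pmatrix} x\\ t\end{pmatrix} \le \begin{pmatrix} b\\ -b\end{pmatrix}.$$

The top block is A x - t ≤ b and the bottom block is -A x - t ≤ -b, which together are the two sides of the inequality above.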

4
Q

Example of the L1 matrix form?

A
  • Stack the positive and negative forms of the inequality into the constraint matrix of the linear program, as in the matrix form above
5
Q

What are training and testing sets?

A

—The training set is the subset of the data which is used to train the model: this is the same as fitting. These data are used to solve the optimization problem.

—The solution x* is then used to measure the accuracy of the model for predictions (test).

—In this case, new data A1, A2, …, An are fed into the model and the difference between the model prediction and the truly registered data is measured.

  • We do this to prevent overfitting, where our model works well on data it has seen but performs poorly on data it hasn't seen
  • To do this we look at the test error instead of the training error (the error on the data we haven't seen, which shows us how good the model truly is); a MATLAB sketch follows this list
    • Training error < testing error (so we want to minimise the training error first)
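A minimal MATLAB sketch of the idea (assuming a data matrix A and response vector b; the 80/20 split and the least-squares fit via backslash are illustrative choices, not the course's prescribed ones):

n      = size(A, 1);            % number of observations
idx    = randperm(n);           % random permutation of the row indices
nTrain = round(0.8 * n);
train  = idx(1:nTrain);         % rows used for fitting
test   = idx(nTrain+1:end);     % rows held out for testing

x_star = A(train, :) \ b(train);                                      % fit on the training set only
trainErr = norm(A(train, :) * x_star - b(train)) / sqrt(nTrain);      % error on seen data
testErr  = norm(A(test, :)  * x_star - b(test))  / sqrt(n - nTrain);  % error on unseen data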
6
Q

How can we measure the accuracy of our regressions?

A

Using the following three methods:

—R^2, the coefficient of determination of the model

—The root mean squared error RMSE

—The mean absolute deviation MAD

7
Q

Formula for R^2?

A

—The denominator represents the total variability of the response values in the sample (apart from a "divided by n", it is in fact the sample variance of the y values).

—This denominator is often called the total sum of squares and abbreviated with SST.

—The numerator is a similar quantity but regarding the predicted rather than the observed response values: it measures the variability of the model predictions.

—The numerator is often referred to as the regression sum of squares, denoted as SSR.

—Hence R-squared can also be denoted as R2 = SSR/SST.
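A sketch of the formula in the usual notation (assuming ŷ_i are the model predictions, y_i the observed responses and ȳ their sample mean):

$$R^2 = \frac{\mathrm{SSR}}{\mathrm{SST}} = \frac{\sum_{i=1}^{n}\left(\hat{y}_i - \bar{y}\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}.$$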

8
Q

How can we interpret the R2 value?

A

—Since it is a proportion, R-squared is a value between 0 and 1.

—The closer to 1, the better the model (i.e. the predictors) is at predicting the response variable.

—The closer to 0, the less useful the model (i.e. the predictors) is in terms of predictive power.

—Remark: in our opinion, too much relevance is attributed to the R-squared in practice. The commonly accepted but misleading idea is that a regression model should be considered good only if it has a high R-squared. While it is true that the higher the R-squared, the more useful the predictions, remember that the R-squared value depends on our choice of the predictors.

9
Q

How do you calculate the Root Mean Squared Error?

A
  • It is (the root of) the loss function in Gauss's problem (L2 regression)
    • Don't use it to compare an L1 fit against an L2 fit, as the measure is biased towards L2
  • If you compute the root mean squared error on the training set only, you just recover the minimised objective/loss function rather than the predictive accuracy; see the formula below
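A sketch of the formula (assuming predictions a_i^T x and observations b_i, evaluated on the test set):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(b_i - a_i^{\top}x\right)^2}.$$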
10
Q

How do you calculate the Mean Absolute Deviation?

A
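A sketch of the standard definition, in the same notation as the RMSE card (assuming predictions a_i^T x and observations b_i):

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|b_i - a_i^{\top}x\right|.$$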
11
Q

How can you test a regression line in MATLAB?

A
  • A = [1, A1(1,:)]   % one row of the data matrix A1, with a leading 1 for the intercept term
    • (the 1 inside A1(1,:) selects which row of the A1 matrix is used)
  • b = B1(1,1)        % the corresponding observed response
  • b - A*x            % the residual: observed value minus the model prediction A*x
12
Q

How do you plot the dual feasible region?

A

LinProgPlot(c,A,b,'d')

13
Q

What is the Dantzig Selector?

A
  • If the loss function is the mean of the absolute deviations, we can reformulate the fitting problem as a linear program
    • It becomes a linear optimisation problem
    • So with L2 regression you use Gauss (least squares), and with L1 you use Dantzig (linear programming)