Regression Flashcards
How can Linear Regression be extended?
With regularisers, e.g. L2 regularisation (ridge regression)
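A minimal sketch of L2 regularisation for a single attribute, assuming the penalty is lam * w1^2 and the intercept is left unpenalised. The name ridge_fit_1d and the toy data are made up for illustration.

```python
def ridge_fit_1d(xs, ys, lam):
    """Fit y ~ w0 + w1 * x while penalising lam * w1**2 (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The penalty lam is added to the denominator, which shrinks the slope.
    w1 = sxy / (sxx + lam)
    w0 = mean_y - w1 * mean_x
    return w0, w1


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
plain = ridge_fit_1d(xs, ys, lam=0.0)   # ordinary least squares
shrunk = ridge_fit_1d(xs, ys, lam=5.0)  # slope pulled towards zero
```

With lam = 0 this is ordinary least squares; a larger lam gives a flatter line.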
What is regression even for?
For predicting continuous (numeric) target values, where classification does not apply
What is Linear Regression?
Linear Regression is an attempt to build a linear model to predict the target values, by finding a weight for each attribute.
Captures a relationship between two variables or attributes
It assumes that there is a linear relationship between the two variables
x = w_0 + sum_i (w_i * a_i)
x is the class (target value)
w_0, w_1, ..., w_n are the weights (w_0 is the intercept)
a_1, ..., a_n are the attribute values
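The prediction formula above can be sketched as follows. The function name predict and the example weights are assumptions for illustration only.

```python
def predict(weights, attributes):
    """weights = [w0, w1, ..., wn]; attributes = [a1, ..., an]."""
    w0, rest = weights[0], weights[1:]
    # Intercept plus the weighted sum of the attribute values.
    return w0 + sum(w * a for w, a in zip(rest, attributes))


example_weights = [1.0, 2.0, -0.5]   # w0, w1, w2 (made-up values)
example_instance = [3.0, 4.0]        # a1, a2
prediction = predict(example_weights, example_instance)  # 1 + 6 - 2 = 5.0
```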
How to choose the best line for linear regression?
option 1) Finding the line that minimises the distance between all points and the line
- Euclidean distance: d(a, b) = sqrt(sum_i (a_i - b_i)^2)
option 2) Least squares estimation: finding the line that minimises the sum of the squared vertical distances between the predicted and observed values
- minimise the Residual Sum of Squares (RSS) -> aka Sum of Squared Errors (SSE):
RSS(beta) = sum_i (y_i - beta * x_i)^2
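A small sketch of least squares for the one-parameter model y ~ beta * x (no intercept, matching the RSS formula above). Setting the derivative of RSS to zero gives beta = sum(x_i * y_i) / sum(x_i^2). The function names and data are illustrative assumptions.

```python
def rss(beta, xs, ys):
    """Residual Sum of Squares for the line y = beta * x."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys))


def least_squares_beta(xs, ys):
    """Closed-form minimiser of rss() for the no-intercept model."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
beta_hat = least_squares_beta(xs, ys)
best_rss = rss(beta_hat, xs, ys)
```

Any other value of beta gives a larger RSS on the same data, which is easy to check by nudging beta_hat up or down.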
Which metric to use for linear regression to find the best line?
- The actual choice of metric isn’t that important; they’re all pretty stable
Just use any of
- Root mean-squared error
- Root relative squared error
- Correlation coefficient
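The three metrics can be sketched as below, assuming the usual definitions: RMSE is the square root of the mean squared error, RRSE divides the squared error by that of always predicting the mean of the actual values, and the correlation coefficient is Pearson's r between predicted and actual values. Function names and data are illustrative.

```python
import math


def rmse(predicted, actual):
    """Root mean-squared error."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)


def rrse(predicted, actual):
    """Root relative squared error (relative to predicting the mean)."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    den = sum((a - mean_a) ** 2 for a in actual)
    return math.sqrt(num / den)


def correlation(predicted, actual):
    """Pearson correlation coefficient between predictions and targets."""
    n = len(actual)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (sd_p * sd_a)


actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]  # every prediction is off by +0.5
scores = (rmse(predicted, actual), rrse(predicted, actual),
          correlation(predicted, actual))
```

Note that the correlation coefficient ignores a constant offset in the predictions, while the two error measures do not.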
WTF is a Regression Tree?
Extension of Decision Trees, where the “class” (value) at each leaf is calculated by averaging over the values of all instances at that node
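A toy sketch of the leaf-averaging idea, using a one-split tree (a "stump"). The threshold is supplied by hand here as an assumption; a real regression tree would learn its splits from the data.

```python
def fit_stump(xs, ys, threshold):
    """One-split regression tree: each leaf stores the mean target value."""
    left = [y for x, y in zip(xs, ys) if x <= threshold]
    right = [y for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": sum(left) / len(left),
        "right": sum(right) / len(right),
    }


def predict_stump(stump, x):
    """Route x to a leaf and return that leaf's average."""
    return stump["left"] if x <= stump["threshold"] else stump["right"]


xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 6.0, 7.0, 20.0, 22.0, 24.0]
stump = fit_stump(xs, ys, threshold=5.0)
```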
WTF is a Model Tree?
Generalised regression trees where the class at each leaf is calculated via linear regression over training instances at that node
Basically partitioning our data set and applying linear regression to each partition
As you work down the tree, the leaf node you end up at tells you which linear regression model to apply to the instance
Regression vs Model trees
Model trees have advantages over regression trees in both compactness and prediction accuracy, because model trees can exploit local linearity in the data
Regression trees will never give a predicted value lying outside the range observed in the training cases, whereas model trees can extrapolate
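A rough sketch contrasting the two leaf types on one hand-picked split. The helper names, the threshold, and the data are assumptions for illustration; real model trees also learn where to split.

```python
def fit_line(xs, ys):
    """Simple least squares fit, returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def fit_model_stump(xs, ys, threshold):
    """One-split model tree: each leaf stores its own linear model."""
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": fit_line([x for x, _ in left], [y for _, y in left]),
        "right": fit_line([x for x, _ in right], [y for _, y in right]),
    }


def predict_model_stump(tree, x):
    """Pick the leaf for x, then apply that leaf's linear model."""
    w0, w1 = tree["left"] if x <= tree["threshold"] else tree["right"]
    return w0 + w1 * x


xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = [5.0, 6.0, 7.0, 20.0, 22.0, 24.0]
tree = fit_model_stump(xs, ys, threshold=5.0)

# Regression-tree leaf for the right partition: just the average target.
right_leaf_mean = sum(y for x, y in zip(xs, ys) if x > 5.0) / 3
# Model-tree prediction for an x beyond the training range.
extrapolated = predict_model_stump(tree, 15.0)
```

On this data the averaging leaf stays inside the observed range of targets, while the linear leaf predicts a value above the largest training target.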
How to translate a regression task into a simple classification task?
Can map a continuous class onto discrete classes via DISCRETISATION
- Set the range of the continuous variable that corresponds to each discrete class
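Discretisation can be sketched as a lookup against range boundaries. The boundaries and labels below are invented for the example.

```python
def discretise(value, boundaries, labels):
    """Map a numeric value to a label.

    boundaries[i] is the exclusive upper bound of labels[i];
    the final label catches everything at or above the last boundary.
    """
    for upper, label in zip(boundaries, labels):
        if value < upper:
            return label
    return labels[-1]


boundaries = [10.0, 25.0]          # hypothetical cut points
labels = ["cold", "mild", "hot"]   # one more label than boundaries
classes = [discretise(v, boundaries, labels) for v in (3.0, 10.0, 30.0)]
```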
How to translate a classification task into a suite of regression tasks?
MULTI-RESPONSE LINEAR REGRESSION
- Perform one regression per discrete class
- With all positive instances set to 1 and all negative instances set to 0
- Classify a given test instance by estimating its value relative to each class, and selecting the class with the highest value
- Approximates a numeric membership function for each class
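The steps above can be sketched with a single attribute and simple least squares per class. Two classes are used to keep the example short, and all names and data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Simple least squares fit, returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def fit_multi_response(xs, labels):
    """One regression per class: target 1 for that class, 0 otherwise."""
    models = {}
    for cls in sorted(set(labels)):
        targets = [1.0 if label == cls else 0.0 for label in labels]
        models[cls] = fit_line(xs, targets)
    return models


def classify(models, x):
    """Score x under every class model and pick the highest value."""
    scores = {cls: w0 + w1 * x for cls, (w0, w1) in models.items()}
    return max(scores, key=scores.get)


xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = ["small"] * 3 + ["large"] * 3
models = fit_multi_response(xs, labels)
```

Each fitted line acts as the numeric membership function for its class.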
WTF is Maximum Likelihood Estimation?
Goal is to search for a value of beta so that the probability P(y = 1 | x) = h_beta(x) is large when x belongs to the “1” class and small when x belongs to the “0” class (so that P(y = 0 | x) is large)
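A sketch of what "large for class 1, small for class 0" means numerically, assuming h_beta is the logistic (sigmoid) function of beta0 + beta1 * x as in Logistic Regression. The log-likelihood adds log P(y = 1 | x) for positive instances and log P(y = 0 | x) for negative ones. Names and data are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(beta0, beta1, xs, ys):
    """Sum of log-probabilities the model assigns to the observed labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(beta0 + beta1 * x)          # P(y = 1 | x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
good = log_likelihood(0.0, 2.0, xs, ys)    # slope agrees with the labels
bad = log_likelihood(0.0, -2.0, xs, ys)    # slope points the wrong way
```

Maximum likelihood estimation searches for the beta values that make this sum as large as possible.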
Linear Regression uses gradient descent; what does Logistic Regression use?
It tries to maximise the likelihood, so it uses gradient ascent
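A minimal gradient ascent sketch for Logistic Regression with one attribute, assuming the objective is the log-likelihood, whose gradient is the sum of (y - p) for the intercept and (y - p) * x for the slope. The learning rate, step count, and data are arbitrary choices for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=2000):
    """Climb the log-likelihood surface by repeatedly adding the gradient."""
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        errors = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        grad0 = sum(errors)
        grad1 = sum(e * x for e, x in zip(errors, xs))
        # Ascent: step in the direction of the gradient (note the plus sign).
        b0 += learning_rate * grad0
        b1 += learning_rate * grad1
    return b0, b1


# Overlapping classes, so the maximum is finite.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

Gradient descent on the negative log-likelihood would give the same updates with the sign flipped.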
Can Logistic Regression be applied to multi-class classification?
By default, no, only for binary classification.
However, can extend to multi-class classification by assuming a multinomial distribution.
- Multinomial Logistic Regression
Applies softmax - a generalisation of the logistic function to J dimensions.
- Results in a J-Dimensional vector of real values in the range (0,1) that add up to 1
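Softmax can be sketched in a few lines. Subtracting the maximum score before exponentiating is a common numerical-stability trick and does not change the result; the input scores are made up.

```python
import math


def softmax(scores):
    """Turn J real-valued scores into J probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])

# With J = 2 and the second score fixed at 0, softmax reduces to the
# logistic function of the first score.
two_class = softmax([1.5, 0.0])[0]
logistic = 1.0 / (1.0 + math.exp(-1.5))
```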
Logistic Regression Pros and Cons
PROS
- Simple yet low-bias classifier
- Unlike Naive Bayes, not confounded by diverse, correlated features
CONS
- Slow to train
- Some feature scaling issues
- Often needs a lot of data to work well
- Choosing a regulariser is a nuisance but important, since OVERFITTING is a problem; regularisation adds constraints on the parameter space
What is Regression? How is it similar to Classification, and how is it different?
Regression is used when the target attribute (class) is numeric (continuous). Like Classification, it learns from training instances whose target value is known. Unlike Classification, there is no finite set of classes, so we can't assess the likelihood of each class.