Session 10 - Elastic Net Flashcards
What is the strength of penalty determined by for regularised regression?
The value of the tuning parameter lambda (λ)
The lasso penalty is a function of….
Sum of absolute values of coefficients
For regularised regression the estimates of β are obtained by what?
Minimising the penalized RSS, RSS(λ)
Which method does not force coefficients to 0, so that it cannot be used for variable selection and is not easy to interpret?
Ridge regression
Lasso regression….
Produces coefficients that are exactly 0 and thus performs variable selection, which adds interpretability
Grouped variables: from several strongly correlated covariates, typically only one is selected.
If the number of variables (p) is larger than the sample size (n), the lasso selects at most n variables, which may be a disadvantage with very high-dimensional data
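The contrast between ridge and lasso shrinkage can be sketched for the simplest case. This is a minimal illustration, assuming a single predictor scaled so that its sum of squares is 1 (or, equivalently, an orthonormal design), with the penalty written as RSS + λ·Σ|β| for lasso and RSS + λ·Σβ² for ridge. The function names and the OLS values are made up for the example.

```python
def soft_threshold(value, threshold):
    """Move value towards 0 by threshold; return exactly 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ridge_shrink(ols_estimate, lam):
    # Minimiser of (b - ols_estimate)^2 + lam * b^2: proportional shrinkage.
    return ols_estimate / (1.0 + lam)


def lasso_shrink(ols_estimate, lam):
    # Minimiser of (b - ols_estimate)^2 + lam * |b|: soft-thresholding at lam / 2.
    return soft_threshold(ols_estimate, lam / 2.0)


ols_estimates = [3.0, 0.4, -0.2]  # hypothetical unpenalised estimates
lam = 1.0

ridge_estimates = [ridge_shrink(b, lam) for b in ols_estimates]
lasso_estimates = [lasso_shrink(b, lam) for b in ols_estimates]

print(ridge_estimates)  # [1.5, 0.2, -0.1]  (smaller, but none exactly 0)
print(lasso_estimates)  # [2.5, 0.0, 0.0]   (small effects set exactly to 0)
```

Under these assumptions ridge only rescales each coefficient, whereas lasso removes the two small ones entirely, which is the variable selection described in the cards above.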
Shrinking coefficient values has the effect of…..
Reducing the variance of the estimates (and so the effective complexity of the model), at the cost of adding some bias.
If we want to predict GP visits based on 8 predictor variables what regularised regression method should we choose?
Ridge
Ridge: allows correlated variables to be included in the model but does not perform variable selection
- so it is not useful when there is a large number of variables
We have a cohort database with several hundred variables collected at baseline from patients with an “at-risk mental state” (ARMS).
About 15% of the patients developed a psychosis.
Based on the available variables we want to build a prognostic model to predict the likelihood of developing a psychosis.
This model should be used in clinical practice for a risk assessment.
What regularised regression method should we choose?
Lasso: as this is a clinical assessment, we want to ensure clinicians do not have to take too many variables into account
Lasso: performs variable selection but has problems selecting groups of correlated variables, such as sets of genes or brain voxels
The aim of DNA microarray experiments is to detect differential gene expression.
E.g. to identify gene expression changes under different treatment conditions or among different types of cell samples.
Often thousands or hundreds of thousands of genes are tested on the array for expression changes.
Research question:
We want to identify predictors of depression using a case-control dataset with 521135 gene expressions.
What regularised regression method should we choose?
We need a method which allows variable selection but selects a set of correlated variables
A combination of the Ridge and Lasso would sometimes be useful
- Elastic net regression is a hybrid approach that blends penalization of both the L2 and L1 norms.
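The difference in how lasso and elastic net treat correlated predictors can be illustrated with a small coordinate-descent sketch. It assumes the objective RSS + λ1·Σ|β| + λ2·Σβ² with no intercept; the function name `elastic_net_cd`, the toy data and the penalty values are all invented for this example, and real analyses would use an established package rather than this loop.

```python
def soft_threshold(value, threshold):
    """Move value towards 0 by threshold; return exactly 0.0 if it is within the threshold."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_cd(columns, y, lam1, lam2, sweeps=200):
    """Minimise RSS + lam1 * sum|b_j| + lam2 * sum b_j^2 by cyclic coordinate descent."""
    p = len(columns)
    n = len(y)
    beta = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            # Residual with predictor j left out of the current fit.
            partial = [
                y[i] - sum(columns[k][i] * beta[k] for k in range(p) if k != j)
                for i in range(n)
            ]
            rho = sum(xj * r for xj, r in zip(columns[j], partial))
            z = sum(xj * xj for xj in columns[j])
            # Closed-form one-dimensional update for this objective.
            beta[j] = soft_threshold(rho, lam1 / 2.0) / (z + lam2)
    return beta


# Toy data (hypothetical): two strongly correlated predictors, correlation 0.9.
x1 = [-2.0, -1.0, 1.0, 2.0]
x2 = [-2.0, -1.0, 2.0, 1.0]
y = [2.0 * v for v in x1]

lasso_fit = elastic_net_cd([x1, x2], y, lam1=4.0, lam2=0.0)   # L1 penalty only
enet_fit = elastic_net_cd([x1, x2], y, lam1=4.0, lam2=10.0)   # L1 and L2 penalties

print(lasso_fit)  # x2 is dropped: its coefficient is exactly 0.0
print(enet_fit)   # both coefficients are non-zero
```

In this toy setting the lasso keeps only one of the two correlated predictors, while adding the L2 part spreads the effect over both, which is the grouping behaviour the cards describe.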
We want to predict autism based on brain activity differences when someone looks at the person
3-dimensional data: 64x64 voxel matrix in each of 43 slices in the brain = 176,128 voxels
About 1/3 of this area is studied: ≈ 50,000 voxels / hypothesis tests
Add time as a 4th dimension and we will get into the 100,000s
- Massive multiple comparisons problem!
What regularised regression method should we use?
We need a method which allows variable selection but selects a set of correlated variables
A combination of the Ridge and Lasso would sometimes be useful
- Elastic net regression is a hybrid approach that blends penalization of both the L2 and L1 norms.
What is elastic net regression?
Hybrid approach that blends penalization of both the L2 and L1 norms.
It is a generalization of ridge, lasso and unregularized linear regression
What is the elastic net regression formula?
The L1 (“lasso”) part of the penalty
- generates a sparse model and performs variable selection
The L2 (“ridge”) part of the penalty
- removes the limitation on the number of selected variables
- encourages a grouping effect: selects groups of correlated variables
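One common way of writing the elastic net criterion, assuming the λ1 (L1 part) and λ2 (L2 part) parameterisation used in these cards, is sketched here. The exact scaling constants differ between textbooks and software, so treat it as one convention rather than the only form.

```latex
\hat{\beta} = \arg\min_{\beta}
\left\{
  \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^2
  + \lambda_1 \sum_{j=1}^{p} \lvert \beta_j \rvert
  + \lambda_2 \sum_{j=1}^{p} \beta_j^2
\right\}
```

The first term is the RSS, the second is the L1 (“lasso”) penalty and the third is the L2 (“ridge”) penalty.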
What do we get if we set λ1 to 0 for elastic net?
Only the L2 penalty remains, i.e. ridge regression
What do we get if we set λ2 to 0 for elastic net?
Only the L1 penalty remains, i.e. lasso regression
What do we get if we set λ1 and λ2 to 0 for elastic net?
“Normal” linear regression
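The three special cases can be checked directly on the penalized criterion. This is a minimal sketch, assuming the objective RSS + λ1·Σ|β| + λ2·Σβ²; the function names and the numbers are hypothetical and only serve the illustration.

```python
def rss(y, fitted):
    """Residual sum of squares."""
    return sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))


def elastic_net_objective(y, fitted, beta, lam1, lam2):
    """Penalized RSS with an L1 (lasso) and an L2 (ridge) penalty term."""
    l1_penalty = sum(abs(b) for b in beta)
    l2_penalty = sum(b * b for b in beta)
    return rss(y, fitted) + lam1 * l1_penalty + lam2 * l2_penalty


# Made-up observations, fitted values and coefficients (intercept not penalized).
y = [1.0, 2.0, 3.0]
fitted = [1.5, 2.0, 2.5]
beta = [0.5, -1.0]

full = elastic_net_objective(y, fitted, beta, lam1=2.0, lam2=4.0)
ridge_only = elastic_net_objective(y, fitted, beta, lam1=0.0, lam2=4.0)
lasso_only = elastic_net_objective(y, fitted, beta, lam1=2.0, lam2=0.0)
unpenalized = elastic_net_objective(y, fitted, beta, lam1=0.0, lam2=0.0)

print(full)         # 8.5 = 0.5 (RSS) + 2 * 1.5 (L1) + 4 * 1.25 (L2)
print(ridge_only)   # 5.5 = RSS + L2 term only
print(lasso_only)   # 3.5 = RSS + L1 term only
print(unpenalized)  # 0.5 = plain RSS, as in ordinary least squares
```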