As presented, the L1 penalty has the advantage of making your coefficient estimates sparse; in effect it acts as a variable selector, since it tends to leave only the essential variables in the model. On the other hand, the selection itself tends to be unstable when the data changes, and it takes some effort to tune the C parameter correctly so that the selection is most effective. As we saw while discussing elastic net, the peculiarity lies in how Lasso behaves when two variables are highly correlated: depending on the structure of the data (noise and correlation with other variables), L1 regularization will choose just one of the two.
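The instability described above can be illustrated with a minimal sketch. It is not the chapter's own code: it uses a small hand-written coordinate-descent Lasso in pure Python instead of a library estimator, and the names `soft_threshold`, `lasso_cd`, the synthetic data, and the penalty value `alpha=0.3` are all assumptions made for illustration. Note that `alpha` here is a penalty strength, so it plays the inverse role of a C-style parameter, where smaller values mean stronger regularization. Two features are noisy copies of the same hidden signal, a third is pure noise, and the model is refit on bootstrap resamples to count how often each coefficient survives.

```python
import random


def soft_threshold(rho, alpha):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_cd(X, y, alpha, n_iter=50):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    No intercept is fitted; the synthetic data below is generated with zero mean.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # sum of squares of feature j
            for i in range(n):
                partial = y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                rho += X[i][j] * partial
                z += X[i][j] ** 2
            b[j] = soft_threshold(rho / n, alpha) / (z / n)
    return b


# Hypothetical data: features 0 and 1 are noisy copies of one latent signal,
# feature 2 is unrelated noise, and the target depends only on the latent signal.
rng = random.Random(0)
n = 60
latent = [rng.gauss(0, 1) for _ in range(n)]
X = [[s + 0.1 * rng.gauss(0, 1), s + 0.1 * rng.gauss(0, 1), rng.gauss(0, 1)]
     for s in latent]
y = [s + 0.5 * rng.gauss(0, 1) for s in latent]

# Refit on bootstrap resamples and count how often each coefficient is non-zero.
n_boot = 30
counts = [0, 0, 0]
for _ in range(n_boot):
    idx = [rng.randrange(n) for _ in range(n)]
    coefs = lasso_cd([X[i] for i in idx], [y[i] for i in idx], alpha=0.3)
    for j, c in enumerate(coefs):
        if c != 0.0:
            counts[j] += 1

freq = [c / n_boot for c in counts]
print("selection frequency per feature:", freq)
```

If the selection were stable, the first two features would be kept (or dropped) with the same frequency in every resample. In a setup like this one, the frequencies would typically differ and shift when the seed or the penalty changes, which is the behavior the paragraph describes.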
In bioinformatics studies (DNA, molecular studies), it is common to work with a large number of variables measured on only a few observations. Such problems are typically denoted p >> n (features are far more numerous than cases), and they present the necessity to select what features to...