Fast group lasso regularised linear models in a sklearn-style API.
Project description
The group lasso [1] regulariser is a well-known method to achieve structured sparsity in machine learning and statistics. The idea is to create non-overlapping groups of covariates, and recover regression weights in which only a sparse set of these covariate groups have non-zero components.
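As a sketch of what this means for a squared-error loss, the objective can be written in the form used in [1]. The symbols here are introduced for illustration only: X_g and w_g are the columns and weights of group g, p_g is the group size and λ is the regularisation strength. The 1/2 factor and the sqrt(p_g) weighting follow [1], and the scaling used by this package may differ.

```latex
\min_{w} \; \frac{1}{2} \Big\lVert y - \sum_{g=1}^{G} X_g w_g \Big\rVert_2^2
  + \lambda \sum_{g=1}^{G} \sqrt{p_g} \, \lVert w_g \rVert_2
```

Because the penalty uses the unsquared Euclidean norm of each group, a whole group w_g is either exactly zero or non-zero, which gives the group-level sparsity described above.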
There are several reasons why this might be a good idea. Say, for example, that we have a set of sensors and each of these sensors generates five measurements. We don't want to maintain an unnecessary number of sensors. If we try normal LASSO regression, then we will get sparse components. However, these sparse components might not correspond to a sparse set of sensors, since each sensor generates five measurements and a sensor is only redundant if all five of its weights are zero. If we instead use group LASSO with the measurements grouped by the sensor that produced them, then we will get a sparse set of sensors.
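The sensor example can be illustrated with a minimal NumPy sketch. It is not this package's API: the function names, the three hypothetical sensors and the sqrt(group size) weighting are assumptions made for illustration. It shows the group penalty and its proximal operator (group-wise soft thresholding), which removes a sensor only when all five of its weights are small.

```python
import numpy as np


def group_lasso_penalty(w, groups, reg=1.0):
    """Sum over groups of sqrt(group size) * Euclidean norm of the group's weights."""
    return reg * sum(np.sqrt(len(g)) * np.linalg.norm(w[g]) for g in groups)


def group_soft_threshold(w, groups, threshold):
    """Shrink the norm of each group; groups with a small norm become exactly zero."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        t = threshold * np.sqrt(len(g))
        if norm > t:
            out[g] = (1 - t / norm) * w[g]
    return out


# Three hypothetical sensors with five measurements each.
groups = [np.arange(5 * i, 5 * (i + 1)) for i in range(3)]
w = np.concatenate([np.full(5, 2.0), np.full(5, 0.1), np.full(5, -1.0)])

w_shrunk = group_soft_threshold(w, groups, threshold=0.5)

# Indices of the sensors that still have non-zero weights after shrinkage.
active_sensors = [i for i, g in enumerate(groups) if np.any(w_shrunk[g] != 0)]
```

Here the second sensor has uniformly small weights, so all five are set to zero together and `active_sensors` contains only the first and third sensor.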
About this project:
This project is developed by Yngve Mardal Moe and released under an MIT licence.
Todos:
The todos are, in decreasing order of importance:
Write a better readme
Code examples
Installation guide (after point 2.)
Better description of Group LASSO
Write more docstrings
Python 3.5 compatibility
Better scikit-learn compatibility
Use Mixins?
Use randomness correctly
Classification problems (I have an experimental implementation, but it’s not tested yet)
Unfortunately, the most interesting parts are the least important ones, so expect the list to be worked on from both ends simultaneously.
Implementation details
The problem is solved using the FISTA optimiser [2] with a gradient-based adaptive restarting scheme [3]. No line search is currently implemented, but I hope to look at that later.
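The following is a minimal sketch of FISTA with a gradient-based restart for a group lasso least-squares problem. It is not the code used in this package. It assumes a loss of the form (1/2n)||Xw - y||^2, a fixed step size of 1/L (no line search) and a sqrt(group size) weighting of the penalty, and all names are hypothetical.

```python
import numpy as np


def group_prox(w, groups, threshold):
    """Group-wise soft thresholding (proximal operator of the group penalty)."""
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        t = threshold * np.sqrt(len(g))
        if norm > t:
            out[g] = (1 - t / norm) * w[g]
    return out


def fista_group_lasso(X, y, groups, reg, n_iter=500):
    n = X.shape[0]
    # Lipschitz constant of the gradient of (1 / 2n) * ||Xw - y||^2.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    step = 1.0 / lipschitz
    w = np.zeros(X.shape[1])
    v = w.copy()  # extrapolated (momentum) point
    t = 1.0       # momentum parameter
    for _ in range(n_iter):
        grad = X.T @ (X @ v - y) / n
        w_new = group_prox(v - step * grad, groups, step * reg)
        if (v - w_new) @ (w_new - w) > 0:
            # Gradient-based restart: the momentum points uphill, so drop it.
            t = 1.0
            v = w_new.copy()
        else:
            t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
            v = w_new + ((t - 1) / t_new) * (w_new - w)
            t = t_new
        w = w_new
    return w


# Synthetic data: three groups of five columns, only the first group matters.
rng = np.random.default_rng(0)
groups = [np.arange(5 * i, 5 * (i + 1)) for i in range(3)]
X = rng.standard_normal((200, 15))
w_true = np.concatenate([np.ones(5), np.zeros(10)])
y = X @ w_true + 0.1 * rng.standard_normal(200)

w_hat = fista_group_lasso(X, y, groups, reg=0.1)
```

On this synthetic problem the weights of the two irrelevant groups end up exactly zero, while the weights of the first group are shrunk towards zero but remain non-zero.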
Although fast, the FISTA optimiser does not achieve as low loss values as the significantly slower second-order interior point methods. This might, at first glance, seem like a problem. However, it does recover the sparsity patterns of the data, and the selected features can then be used to train a new model on only that subset.
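Refitting on the recovered sparsity pattern can be sketched as follows. This is an illustration only: `refit_on_support` is a hypothetical helper, not part of this package, and `w_sparse` stands in for the shrunken weights a group lasso fit would return.

```python
import numpy as np


def refit_on_support(X, y, w_sparse):
    """Ordinary least squares restricted to the columns with non-zero weights."""
    support = np.flatnonzero(w_sparse)
    w_refit = np.zeros_like(w_sparse)
    if support.size:
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        w_refit[support] = coef
    return w_refit


rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
w_true = np.zeros(10)
w_true[:5] = 1.0
y = X @ w_true  # noiseless targets, to keep the example easy to check

# Stand-in for a group lasso estimate: correct support, shrunken values.
w_sparse = 0.8 * w_true

w_refit = refit_on_support(X, y, w_sparse)
```

The refitted weights no longer carry the shrinkage introduced by the regulariser, but they keep the sparsity pattern found by the first model.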
Also, even though the FISTA optimiser is not meant for stochastic optimisation, in my experience its performance has not dropped much when the mini-batches are large enough. I have therefore implemented mini-batch optimisation using FISTA, and have thus been able to fit models to data with ~500 columns and 10 000 000 rows on my moderately priced laptop.
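The idea behind mini-batch optimisation can be sketched by replacing the full gradient with an estimate computed on a random subset of the rows. This is an assumption about the general technique, not a description of how this package implements it, and `minibatch_gradient` is a hypothetical name.

```python
import numpy as np


def minibatch_gradient(X, y, w, batch_size, rng):
    """Least-squares gradient estimated from a random subset of the rows."""
    idx = rng.choice(X.shape[0], size=batch_size, replace=False)
    X_batch, y_batch = X[idx], y[idx]
    return X_batch.T @ (X_batch @ w - y_batch) / batch_size


rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 20))
y = X @ np.ones(20)
w = np.zeros(20)

full_grad = X.T @ (X @ w - y) / X.shape[0]
batch_grad = minibatch_gradient(X, y, w, batch_size=2_000, rng=rng)

# How far the mini-batch estimate is from the full gradient, relative to its size.
relative_error = np.linalg.norm(batch_grad - full_grad) / np.linalg.norm(full_grad)
```

The larger the batch, the closer the estimate is to the full gradient, which is consistent with the observation above that large mini-batches work well in practice.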
Finally, we note that since FISTA uses Nesterov acceleration, it is not a descent algorithm. We can therefore not expect the loss to decrease monotonically.
References
[1]: Yuan, M. and Lin, Y. (2006), Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68: 49-67. doi:10.1111/j.1467-9868.2005.00532.x
[2]: Beck, A. and Teboulle, M. (2009), A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems. SIAM Journal on Imaging Sciences 2009 2:1, 183-202. doi:10.1137/080716542
[3]: O'Donoghue, B. & Candès, E. (2015), Adaptive Restart for Accelerated Gradient Schemes. Found Comput Math 15: 715. doi:10.1007/s10208-013-9150-3