PASPAL: Proximal Algorithms for SPArse Learning (MATLAB code)

Sofia Mosci and Lorenzo Rosasco

Last updated October 27, 2010.

Description of the algorithm

This set of MATLAB toolboxes contains an implementation of the learning algorithms described in [1-4].

The considered algorithms are different regularization approaches to structured sparsity, where the empirical least squares error for a linear model (y = xβ) is penalized with a sparsity-enforcing penalty. In particular, we consider the following penalties (see [1,2] and references therein for details):
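Schematically (up to normalization constants, which follow the conventions in the references), each toolbox minimizes a functional of the form

    min over β of   (1/n) Σ_i (y_i - x_i β)²  +  τ pen(β),

where pen(β) is one of the penalties described below and τ > 0 is the regularization parameter.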

Lasso or l1 penalty: the sum of the absolute values of the regression coefficients β. It forces the solution to depend on a (small) subset of the variables.
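The proximity operator associated with the l1 penalty is the well-known soft-thresholding map. A minimal MATLAB sketch (an illustration, not the toolbox code; the threshold lambda absorbs the regularization parameter and the step size):

    % Soft-thresholding: the proximity operator of lambda*||.||_1,
    % applied componentwise to the vector v, with lambda >= 0.
    function z = soft_threshold(v, lambda)
        z = sign(v) .* max(abs(v) - lambda, 0);
    end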

(Naive) elastic net or l1-l2 penalty: a combination of the l1 norm and the squared l2 norm (sum of the squared values) of the regression coefficients. By varying a suitable "smoothness" parameter, l1l2 regularization allows one to trade sparsity for the inclusion of correlated variables: while the l1 term forces the solution to depend on a (small) subset of variables, the l2 term preserves correlation by forcing variables correlated with the relevant ones to be selected as well.
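The elastic net penalty also admits a closed-form proximity operator: soft-thresholding followed by a uniform shrinkage. A hedged sketch, assuming the parameterization tau*||β||_1 + mu*||β||_2^2 (the actual scaling of the two terms in the toolbox may differ):

    % Proximity operator of tau*||.||_1 + mu*||.||_2^2:
    % soft-threshold at tau, then shrink by 1/(1 + 2*mu).
    function z = prox_enet(v, tau, mu)
        z = sign(v) .* max(abs(v) - tau, 0) ./ (1 + 2 * mu);
    end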

Group Lasso penalty: given a group partition of the input variables, the group lasso penalty, also known as the l1/l2 penalty, is the sum (i.e. the l1 norm) of the l2 norms of the regression coefficients restricted to each group. It forces the support of the solution to be the union of a subset of the groups.
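For non-overlapping groups, the corresponding proximity operator acts blockwise, shrinking each group's sub-vector of coefficients towards zero (block soft-thresholding). A sketch, where "groups" is assumed to be a cell array of index vectors partitioning the variables:

    % Block soft-thresholding: the proximity operator of the group
    % lasso penalty lambda * sum_g ||v_g||_2 over a partition.
    function z = prox_group_lasso(v, groups, lambda)
        z = zeros(size(v));
        for g = 1:numel(groups)
            idx = groups{g};
            ng  = norm(v(idx));      % l2 norm of the group's coefficients
            if ng > lambda           % otherwise the whole group is set to zero
                z(idx) = (1 - lambda / ng) * v(idx);
            end
        end
    end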

Group Lasso with overlap penalty: given a set of possibly overlapping groups of input variables, the group lasso with overlap penalty generalizes the standard group lasso penalty to overlapping groups, in that it forces the support of the solution to be the union of a subset of the groups. Unlike the previous penalties, its proximity operator does not admit a closed form and is computed iteratively via the primal-dual scheme described in [2].

The implementation is based on proximal methods, in particular on the accelerated (à la Nesterov) optimization procedure that was independently proposed by Güler (1992) and by Beck and Teboulle (2009) under the name FISTA. Such a proximal method is an iterative procedure whose convergence rate is guaranteed. When computing the regularization path, i.e. the set of solutions corresponding to different values of the regularization parameter (τ_1 < τ_2 < ... < τ_max), we employ the continuation strategy proposed in Hale, Yin, and Zhang (2008): the first solution computed is the one corresponding to τ_max; solutions for decreasing values of τ are then computed, using the solution for τ_t as initialization for computing the solution for τ_{t-1}.
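For concreteness, here is a hedged sketch of the accelerated proximal iteration with warm-start continuation, for the lasso case. The function name, the 1/(2n) least-squares normalization, the fixed step size 1/L, and the fixed iteration count are illustrative assumptions, not the toolbox implementation:

    % Sketch: FISTA for (1/(2n))*||X*beta - y||^2 + tau*||beta||_1,
    % with warm starts along a decreasing sequence of tau values
    % (the continuation strategy described above). Illustrative only.
    function betas = lasso_path_fista(X, y, taus, maxiter)
        [n, d] = size(X);
        taus  = sort(taus, 'descend');    % start from tau_max
        L     = norm(X)^2 / n;            % Lipschitz constant of the gradient
        betas = zeros(d, numel(taus));
        beta  = zeros(d, 1);
        for j = 1:numel(taus)
            w = beta; t = 1;              % warm start from the previous tau
            for k = 1:maxiter
                grad     = X' * (X * w - y) / n;
                v        = w - grad / L;                            % gradient step
                beta_new = sign(v) .* max(abs(v) - taus(j) / L, 0); % prox step
                t_new    = (1 + sqrt(1 + 4 * t^2)) / 2;             % momentum
                w        = beta_new + ((t - 1) / t_new) * (beta_new - beta);
                beta = beta_new; t = t_new;
            end
            betas(:, j) = beta;
        end
    end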

Description of the code

The toolboxes are:

  • L1L2_TOOLBOX (lasso and elastic net regularization, AlgoName = "l1l2")
  • GROUP-LASSO_TOOLBOX (group lasso, AlgoName = "grl")
  • GLOPRIDU_TOOLBOX (group lasso with overlap, AlgoName = "glopridu")

Each toolbox contains:

  • AlgoName_algorithm.m: learning algorithm
  • AlgoName_regpath.m: function for computing the regularization path
  • AlgoName_kcv.m: function performing K-fold or leave-one-out (LOO) cross-validation
  • AlgoName_tau_max.m: function estimating the maximum value for the regularization parameter
  • AlgoName_learn.m: function for building a predictive model for a given value (or pair of values) of the regularization parameter(s).
  • AlgoName_pred.m: function for predicting the labels on a data set given the model returned by AlgoName_kcv.m

A "utilities" folder is also included, along with two demos:

  • demo_kcv.m: showing how to perform cross-validation with l1, l1l2, and group lasso.
  • demo_glopridu_kcv.m: showing how to perform cross-validation with group lasso with overlap.

For usage details, type "help FUN_NAME" (e.g. "help l1l2_kcv") at the MATLAB prompt.

Download and Install

DOWNLOAD HERE

To install, download the zip file, uncompress it in an appropriate directory ("DirName"), and add the toolbox directory and all its subdirectories to the MATLAB path:

addpath(genpath('DirName/PASPAL'))

Copyright Notice

Copyright 2010, Sofia Mosci and Lorenzo Rosasco

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License, Version 3, 29 June 2007, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

If you use PASPAL in your research, please cite [1].

References

  1. S. Mosci, L. Rosasco, M. Santoro, A. Verri and S. Villa;
    Solving Structured Sparsity Regularization with Proximal Methods;
    in J.L. Balcazar, F. Bonchi, A. Gionis and M. Sebag (Eds.), Machine Learning and Knowledge Discovery in Databases,
    European Conference, ECML PKDD 2010, Barcelona, Spain, September 20-24, 2010, Proceedings, Part II, p. 418.
  2. S. Mosci, S. Villa, A. Verri and L. Rosasco;
    A Primal-Dual Algorithm for Group Sparse Regularization with Overlapping Groups;
    to be presented at NIPS 2010.
  3. S. Mosci, L. Rosasco, M. Santoro, A. Verri and S. Villa;
    A New Algorithm to Learn an Optimal Kernel Based on Fenchel Duality;
    NIPS workshop "Kernel Learning: Automatic Selection of Optimal Kernel", December 13, 2008, Whistler, BC, Canada.
  4. M. Santoro, L. Rosasco, S. Villa, S. Mosci and A. Verri;
    Simple Algorithms to Solve Sparsity Based Regularization via Fenchel Duality;
    NIPS workshop "OPT08: Optimization for Machine Learning", December 12, 2008, Whistler, BC, Canada.