PASPAL: Proximal Algorithms for SPArse Learning (MATLAB code)
Sofia Mosci and Lorenzo Rosasco
Last updated October 27, 2010.
Content 
This set of MATLAB toolboxes contains an implementation of the learning algorithms described in [1-4]. The algorithms are different regularization approaches to structured sparsity, where the empirical least squares error for a linear model (y = x β) is penalized with a sparsity-enforcing penalty. In particular, we consider the following penalties (see [1,2] and references therein for details):

- Lasso or l1 penalty: the sum of the absolute values of the regression coefficients β. It forces the solution to depend on a (small) subset of the variables.

- (Naive) elastic net or l1l2 penalty: a combination of the l1 norm and the l2 norm (the sum of the squared values) of the regression coefficients. By varying a suitable "smoothness" parameter, l1l2 regularization allows trading sparsity for the inclusion of correlated variables: while the l1 penalty forces the solution to depend on a (small) subset of variables, the l2 term preserves correlation by forcing the variables correlated with the relevant ones to be selected as well.

- Group Lasso penalty: given a partition of the input variables into groups, the group lasso penalty, also known as the l1/l2 penalty, is the sum (i.e. the l1 norm) of the l2 norms of the regression coefficients restricted to each group. It forces the support of the solution to be the union of a subset of the groups.

- Group Lasso with overlap penalty: given a set of possibly overlapping groups of input variables, the group lasso with overlap penalty generalizes the standard group lasso to overlapping groups, in that it forces the support of the solution to be the union of a subset of the groups.

The implementation is based on proximal methods, in particular on the accelerated ("a la Nesterov") optimization procedure that was independently proposed by Güler (1992) and by Beck and Teboulle (2009) under the name FISTA. Such a proximal method is an iterative procedure whose convergence rate is guaranteed. When computing the regularization path, i.e. the set of solutions corresponding to different values of the regularization parameter (τ_1 < τ_2 < ... < τ_max), we employ the continuation strategy proposed in Hale, Yin and Zhang (2008): the first solution computed is the one corresponding to τ_max; solutions for decreasing values of τ are then computed using the solution for τ_t as initialization when computing the solution for τ_{t-1}.
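What makes proximal methods practical for these penalties is that each one admits a cheap, closed-form proximity operator. As an illustration only (this is a sketch, not the PASPAL implementation), the l1 prox is entry-wise soft-thresholding and the group lasso prox is group-wise soft-thresholding:

```matlab
% Proximity operator of tau*||beta||_1: entry-wise soft-thresholding.
% Illustrative sketch, not the PASPAL code.
function beta = prox_l1(beta, tau)
    beta = sign(beta) .* max(abs(beta) - tau, 0);
end

% Proximity operator of the group lasso (l1/l2) penalty: group-wise
% soft-thresholding. 'groups' is a cell array of index vectors that
% partition 1:length(beta).
function beta = prox_group(beta, tau, groups)
    for g = 1:numel(groups)
        idx = groups{g};
        ng  = norm(beta(idx));          % l2 norm of the group
        if ng > tau
            beta(idx) = beta(idx) * (1 - tau/ng);   % shrink the group
        else
            beta(idx) = 0;              % kill the group entirely
        end
    end
end
```

Note how the group prox either shrinks a whole group or sets it to zero, which is exactly why the support of the solution is a union of groups.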
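The FISTA iteration and the continuation (warm-start) strategy described above can be sketched as follows for the lasso case. All function names, the fixed step size, and the fixed iteration count are illustrative assumptions, not the toolbox's interface; PASPAL's actual functions are documented via "help":

```matlab
% Sketch of FISTA for the lasso, with warm starts over a decreasing
% sequence of regularization parameters (continuation strategy).
function betas = lasso_path(X, y, taus, maxiter)
    taus  = sort(taus, 'descend');       % start from the largest tau
    L     = norm(X)^2;                   % Lipschitz constant of the gradient
    beta  = zeros(size(X, 2), 1);        % initialization for tau_max
    betas = zeros(size(X, 2), numel(taus));
    for t = 1:numel(taus)
        % warm start: previous solution initializes the next problem
        beta = fista_l1(X, y, taus(t), beta, L, maxiter);
        betas(:, t) = beta;
    end
end

function beta = fista_l1(X, y, tau, beta0, L, maxiter)
    beta = beta0; h = beta0; s = 1;
    for k = 1:maxiter
        grad     = X' * (X * h - y);                 % gradient of the smooth term
        beta_new = soft_threshold(h - grad / L, tau / L);
        s_new    = (1 + sqrt(1 + 4 * s^2)) / 2;      % FISTA momentum sequence
        h        = beta_new + ((s - 1) / s_new) * (beta_new - beta);
        beta     = beta_new; s = s_new;
    end
end

function beta = soft_threshold(beta, tau)
    beta = sign(beta) .* max(abs(beta) - tau, 0);
end
```

Starting from τ_max, where the solution is very sparse (all zeros above a threshold), each subsequent problem is initialized close to its solution, which is what makes computing the whole path cheap.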
The toolboxes are:
Each toolbox contains:
A "utilities" folder and two demos are also included:
For usage details, type "help FUN_NAME" (e.g. "help l1l2_kcv") at the MATLAB prompt. 
To install, download the zip file, uncompress it into an appropriate directory ("DirName"), and add the toolbox directory and all its subdirectories to the MATLAB path: addpath(genpath('DirName/PASPAL'))
Copyright 2010, Sofia Mosci and Lorenzo Rosasco. This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License, Version 3, 29 June 2007, as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/. If you use PASPAL in your research, please cite [1]:
