Regularization Methods for High Dimensional Learning

This year the course will be held at the Bertinoro International Spring School (BiSS2012), 12-16 March 2012.

Course at a Glance

The course covers the foundations of Computational Learning as well as recent advances, with particular emphasis on the analysis of high dimensional data. It focuses on a set of core techniques, namely regularization methods. See the synopsis and the syllabus for more details.

Instructors

Francesca Odone -- University of Genova -- francesca.odone@unige.it
Lorenzo Rosasco -- IIT, MIT and University of Genova -- lrosasco@mit.edu

Synopsis

Understanding how intelligence works and how it can be emulated in machines has been an elusive problem for decades, and it is arguably one of the biggest challenges in modern science. Learning, its principles, and its computational implementations are at the very core of this endeavor. Only recently have we been able, for the first time, to develop artificial intelligence systems that solve complex tasks considered out of reach for several decades. Modern cameras can recognize faces, smartphones can recognize people's voices, cars equipped with cameras can detect pedestrians, and ATMs can automatically read checks. In most cases, at the root of these success stories there are machine learning algorithms, that is, software that is trained rather than programmed to solve a task.

Among the variety of approaches and ideas in modern computational learning, we focus on a core class of methods, namely regularization methods. They provide a fundamental set of concepts and techniques that make it possible to treat a huge class of diverse approaches in a unified way, while providing the tools to design new ones. Starting from classical notions of smoothness, shrinkage and margin, we will cover state-of-the-art techniques based on the concepts of geometry (e.g. manifold learning), sparsity, and low rank, which make it possible to design algorithms for tasks such as supervised learning, feature selection, structured prediction, multitask learning and model selection.
Practical applications will be discussed, primarily from the field of computational vision.

The classes will focus on algorithmic and methodological aspects, while trying to give an idea of the theoretical underpinnings.

Slides of the classes will be posted on this website. Scribe notes for most classes, as well as other material, can be found on the 9.520 course webpage at MIT.

Syllabus

The course will cover the following topics:

Introduction to machine learning - first part and second part

Reproducing Kernel Hilbert Spaces and Tikhonov Regularization - slides

Regularized Least Squares and Support Vector Machines - slides

Spectral methods for supervised learning - slides

Sparsity-based learning - slides

Multiple Kernel Learning - slides

Manifold regularization - slides

Multitask learning - slides

General references:

O. Bousquet, S. Boucheron, and G. Lugosi. Introduction to Statistical Learning Theory. Advanced Lectures on Machine Learning, Lecture Notes in Artificial Intelligence 3176, 169-207. Heidelberg, Germany, 2004.
F. Cucker and S. Smale. On The Mathematical Foundations of Learning. Bulletin of the American Mathematical Society, 2002.
T. Evgeniou, M. Pontil, and T. Poggio. Regularization Networks and Support Vector Machines. Advances in Computational Mathematics, 2000.
T. Poggio and S. Smale. The Mathematics of Learning: Dealing with Data. Notices of the AMS, 2003.
L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, 1997.
V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. Springer, 2001.
I. Steinwart and A. Christmann. Support Vector Machines. Springer, New York, 2008.
F. Cucker and D.-X. Zhou. Learning Theory: An Approximation Theory Viewpoint. With a foreword by Stephen Smale. Cambridge Monographs on Applied and Computational Mathematics. Cambridge University Press, Cambridge, 2007.

Exam

To pass the exam you have to earn a total of at least 20 points. Extra points may earn you a better grade. We envision three possibilities:

Solve problems from the proposed set (46 possible points) for a total of 10/46 points, AND write a short report (2 pages plus references) on one of the subjects treated in class.

Write a report (6 pages plus references) on one of the subjects treated in class.

Solve problems from the proposed set (46 possible points) for a total of 20/46 points.

Deadlines. There are two submission dates: 30/6/2012 and 31/12/2012. Submitted material will be graded only after each of these dates.

Note: problems may be solved in groups (mention your collaborators in your submission), while reports must be individual work.