Understanding how intelligence works and how it can be emulated in machines has been
an elusive problem for decades, and it is arguably one of the biggest
challenges in modern science. Learning, its principles, and its
computational implementations are at the very core of this endeavor.
Only recently have we been able, for the first time, to
develop artificial intelligence systems that
solve complex tasks long considered out of reach. Modern cameras
recognize faces, smartphones recognize voices, cars equipped with
cameras detect pedestrians, and ATMs automatically read checks.
In most cases, at the root of these success stories
are machine learning algorithms,
that is, software that is trained rather than programmed to
solve a task.
Among the variety of
approaches and ideas in modern computational learning, we focus on
a core class of methods, namely regularization methods, which
provide a fundamental set of concepts and
techniques for treating a huge variety of
diverse approaches in a unified way, while providing the tools
to design new ones. Starting from the classical notions of smoothness,
shrinkage, and margin, we will cover state-of-the-art
techniques based on the concepts of geometry (e.g., manifold
learning), sparsity, and low rank, which allow us to design algorithms for tasks
such as supervised learning, feature selection, structured
prediction, multitask learning, and model selection.
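For orientation, most of the methods just mentioned can be seen as instances of a single scheme, Tikhonov regularization; the following is a standard textbook formulation, stated here as background rather than quoted from the course material:

\[
\min_{f \in \mathcal{H}} \; \frac{1}{n}\sum_{i=1}^{n} V\big(y_i, f(x_i)\big) \;+\; \lambda\,\|f\|_{\mathcal{H}}^{2}
\]

Here \(V\) is a loss function, \(\mathcal{H}\) is the hypothesis space (typically a reproducing kernel Hilbert space), and \(\lambda > 0\) is the regularization parameter trading data fit against smoothness. Choosing the square loss \(V(y, f(x)) = (y - f(x))^2\) yields regularized least squares, while the hinge loss \(V(y, f(x)) = \max(0, 1 - y f(x))\) yields the support vector machine.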
Practical applications will also be discussed.
The classes will focus on
algorithmic and methodological aspects, while trying to give an idea of
the underlying theory.
Slides for the classes will be posted on this website; scribes of most
classes, as well as other material, can be found on the 9.520 course webpage.
The course will cover the following topics:
- Introduction to machine learning - first part and second part
- Reproducing Kernel Hilbert Spaces and Tikhonov Regularization - slides
- Regularized Least Squares and Support Vector Machines - slides (a short RLS code sketch follows this list)
- Spectral methods for supervised learning - slides
- Sparsity-based learning - slides
- Multiple Kernel Learning - slides
- Manifold regularization - slides
- Multitask learning - slides
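As a concrete companion to the regularized least squares topic above, here is a minimal, self-contained sketch of kernel RLS. It is an illustrative example only: the Gaussian kernel, the toy data, and all parameter values are our own choices, not taken from the course material.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian (RBF) kernel matrix between the rows of A and B.
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def rls_fit(X, y, lam=0.1, sigma=1.0):
    # Regularized least squares: solve (K + n*lam*I) c = y for c.
    n = X.shape[0]
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def rls_predict(X_train, c, X_new, sigma=1.0):
    # Prediction f(x) = sum_i c_i k(x_i, x).
    return gaussian_kernel(X_new, X_train, sigma) @ c

# Toy usage: fit a noisy sine curve and predict at two test points.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)
c = rls_fit(X, y, lam=0.01, sigma=0.5)
print(rls_predict(X, c, np.array([[0.0], [1.5]]), sigma=0.5))
```

By the representer theorem, the minimizer of the Tikhonov functional with the square loss has the form f(x) = sum_i c_i k(x_i, x), so fitting reduces to the single linear system solved above.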
Suggested reading:
- O. Bousquet, S. Boucheron and G. Lugosi. Introduction to Statistical Learning
Theory. Advanced Lectures on Machine Learning, Lecture Notes in
Artificial Intelligence 3176, 169-207. Springer, Heidelberg, Germany, 2004.
- F. Cucker and S. Smale. On the Mathematical Foundations of
Learning. Bulletin of the American Mathematical Society, 2002.
- T. Evgeniou, M. Pontil and T. Poggio. Regularization Networks
and Support Vector Machines. Advances in Computational Mathematics, 2000.
- T. Poggio and S. Smale. The Mathematics of Learning: Dealing
with Data. Notices of the AMS, 2003.
- L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory
of Pattern Recognition. Springer, 1997.
- V. N. Vapnik. Statistical Learning Theory. Wiley, 1998.
- T. Hastie, R. Tibshirani and J. H. Friedman. The Elements of
Statistical Learning. Springer, 2001.
- I. Steinwart and A. Christmann. Support Vector Machines. Springer, New York, 2008.
- F. Cucker and D.-X. Zhou. Learning Theory: An Approximation Theory
Viewpoint. With a foreword by Stephen Smale.
Cambridge Monographs on Applied and Computational Mathematics.
Cambridge University Press, Cambridge, 2007.
To pass the exam you have to gain a total of at least 20 points.
Extra points may buy you a better grade.
We envision three possibilities:
- Solve problems from the proposed set (46 possible points) for a total of at least 10/46 points, and write a short report (2 pages plus references) on one of the subjects treated in class.
- Write a report (6 pages plus references) on one of the subjects treated in class.
- Solve problems from the proposed set (46 possible points) for a total of at least 20/46 points.
Deadlines: there are two submission dates, 30/6/2012 and 31/12/2012. Submitted material will be corrected only after those two dates.
Notice: problems may be solved in groups (mention your co-workers in your submission), while reports are supposed to be individual.