Seminar Details
| Date |
12-9-2006 |
| Time |
16:00 |
| Room/Location |
DISI, Room 711 (Aula 711), 7th floor |
| Title |
Clustering, Subspace Clustering and Clustering Ensembles |
| Speaker |
Dr. Carlotta Domeniconi |
| Affiliation |
Department of Information and Software Engineering, George Mason University |
| Link |
http://ise.gmu.edu/~carlotta/
|
| Abstract |
Clustering suffers from the curse of dimensionality,
and similarity functions that use all input features
with equal relevance may not be effective.
In the first part of the talk we introduce an algorithm
that discovers clusters in subspaces spanned by different
combinations of dimensions via local weightings of features.
Our approach avoids the risk of information loss incurred
by global dimensionality reduction techniques, and does not
assume any data distribution model. Our method associates with
each cluster a weight vector, whose values capture the relevance
of features within the corresponding cluster. We experimentally
demonstrate the gain in performance our method achieves with
respect to competitive methods. In particular, we apply our
technique to clustering of documents, where cluster-dependent
keywords are also identified via the continuous term-weighting
mechanism.
In the second part of the talk we investigate the sensitivity
of subspace clustering to input parameters, and propose
a clustering ensemble approach to solve this problem.
Cluster ensembles can provide robust and stable solutions
by leveraging the consensus across multiple clustering results,
while averaging out emergent spurious structures that arise due
to the various biases to which each participating algorithm
is tuned. Experimental results show that our ensemble techniques
are capable of producing a partition that is as good as or better
than the best individual clustering.
Bio:
Carlotta Domeniconi received the Laurea Degree in computer science
from the University of Milano, Milan, Italy, in 1992, the M.S. degree
in information and communication technologies from the International
Institute for Advanced Scientific Studies, Salerno, Italy, in 1997, and
the Ph.D. degree in computer science from the University of California,
Riverside, in 2002.
She is currently an Assistant Professor in the Information and Software
Engineering Department at George Mason University. Her research interests
include machine learning, pattern recognition, data mining, and feature
relevance estimation, with applications in text mining and bioinformatics.
Dr. Domeniconi is a recipient of a 2004 Ralph E. Powe Junior Faculty
Enhancement Award. Her research is in part supported by an NSF CAREER Award
and a grant from the U.S. Army. |