Home Page Università di Genova

Seminar Details

Date 12-9-2006
Time 16:00
Room/Location DISI, Room 711, 7th floor
Title Clustering, Subspace Clustering and Clustering Ensembles
Speaker Dr. Carlotta Domeniconi
Affiliation Department of Information and Software Engineering, George Mason University
Link http://ise.gmu.edu/~carlotta/
Abstract Clustering suffers from the curse of dimensionality, and similarity functions that use all input features with equal relevance may not be effective. In the first part of the talk we introduce an algorithm that discovers clusters in subspaces spanned by different combinations of dimensions via local weightings of features. Our approach avoids the risk of information loss encountered in global dimensionality reduction techniques, and does not assume any data distribution model. Our method associates a weight vector with each cluster; its values capture the relevance of features within that cluster. We experimentally demonstrate the gain in performance our method achieves with respect to competitive methods. In particular, we apply our technique to the clustering of documents, where cluster-dependent keywords are also identified via the continuous term-weighting mechanism. In the second part of the talk we investigate the sensitivity of subspace clustering to input parameters, and propose a clustering ensemble approach to solve this problem. Cluster ensembles can provide robust and stable solutions by leveraging the consensus across multiple clustering results, while averaging out emergent spurious structures that arise from the various biases to which each participating algorithm is tuned. Experimental results show that our ensemble techniques are capable of producing a partition that is as good as or better than the best individual clustering.
Bio: Carlotta Domeniconi received the Laurea degree in computer science from the University of Milano, Milan, Italy, in 1992, the M.S. degree in information and communication technologies from the International Institute for Advanced Scientific Studies, Salerno, Italy, in 1997, and the Ph.D. degree in computer science from the University of California, Riverside, in 2002. She is currently an Assistant Professor in the Information and Software Engineering Department at George Mason University. Her research interests include machine learning, pattern recognition, data mining, and feature relevance estimation, with applications in text mining and bioinformatics. Dr. Domeniconi is a recipient of a 2004 Ralph E. Powe Junior Faculty Enhancement Award. Her research is supported in part by an NSF CAREER Award and a grant from the U.S. Army.