PuSH - Publication Server of Helmholtz Zentrum München

Finding the optimal subspace for clustering.

In: Proceedings (IEEE International Conference on Data Mining (ICDM), 14 - 17 December 2014, Shenzhen, China). 2014. 130-139
DOI
as soon as is submitted to ZB.
The ability to simplify and categorize things is one of the most important elements of human thought, understanding, and learning. The corresponding explorative data analysis techniques -- dimensionality reduction and clustering -- have initially been studied by our community as two separate research topics. Later algorithms like CLIQUE, ORCLUS, 4C, etc. Performed clustering and dimensionality reduction in a joint, alternating process to find clusters residing in low-dimensional subspaces. Such a low-dimensional representation is extremely useful, because it allows us to visualize the relationships between the various objects of a cluster. However, previous methods of subspace, correlation or projected clustering determine an individual subspace for each cluster. In this paper, we demonstrate that it is even much more valuable to find clusters in one common low-dimensional subspace, because then we can study not only the intra-cluster but also the inter-cluster relationships of objects, and the relationships of the whole clusters to each other. We develop the mathematical foundation ORT (Optimal Rigid Transform) to determine an arbitrarily-oriented subspace, suitable for a given cluster structure. Based on ORT, we propose FOSSCLU (Finding the Optimal Sub Space for Clustering), a new iterative clustering algorithm. Our extensive experiments demonstrate that FOSSCLU outperforms the previous methods even in both aspects: clustering and dimensionality reduction.
Altmetric
Additional Metrics?
Edit extra informations Login
Publication type Article: Conference contribution
Reviewing status