Given a dataset, traditional clustering algorithms often only provide a single partitioning or a single view of the dataset. On complex tasks, many different clusterings of a dataset exist, thus alternative clusterings which are of high quality and different from given trivial clusterings are asked to have complementary views. The task is therefore a clear multi-objective optimization problem. However, most approaches in the literature optimize these objectives sequentially (one after another one) or indirectly (by some heuristic combination). This can result in solutions which are not Pareto- optimal. The problem is even more difficult for high-dimensional datasets as clusters can be located in various subspaces of the original feature space. Besides, many practical applications require that subspace clusters can still overlap but the overlap must be below a predefined threshold. Nonetheless, most of the state-of-the-art subspace clustering algorithms can only generate a set of disjoint or significantly overlapping subspace clusters. To deal with the above issues, for full-space alternative clustering, we develop an algorithm which fully acknowledges the multiple objectives, optimizes them directly and simultaneously, and produces solutions approximating the Pareto front. As for non-redundant subspace clustering, we propose a general framework for generating K overlapping subspace clusters where the maximum overlap between them is guaranteed to be below a predefined threshold. In both cases, our algorithms can be applied for several domains as different analyzing models can be used without modifying the main parts of the algorithms.

Non-Redundant Overlapping Clustering: Algorithms and Applications / Truong, Duy Tin. - (2013), pp. 1-143.

Non-Redundant Overlapping Clustering: Algorithms and Applications

Truong, Duy Tin
2013-01-01

Abstract

Given a dataset, traditional clustering algorithms often only provide a single partitioning or a single view of the dataset. On complex tasks, many different clusterings of a dataset exist, thus alternative clusterings which are of high quality and different from given trivial clusterings are asked to have complementary views. The task is therefore a clear multi-objective optimization problem. However, most approaches in the literature optimize these objectives sequentially (one after another one) or indirectly (by some heuristic combination). This can result in solutions which are not Pareto- optimal. The problem is even more difficult for high-dimensional datasets as clusters can be located in various subspaces of the original feature space. Besides, many practical applications require that subspace clusters can still overlap but the overlap must be below a predefined threshold. Nonetheless, most of the state-of-the-art subspace clustering algorithms can only generate a set of disjoint or significantly overlapping subspace clusters. To deal with the above issues, for full-space alternative clustering, we develop an algorithm which fully acknowledges the multiple objectives, optimizes them directly and simultaneously, and produces solutions approximating the Pareto front. As for non-redundant subspace clustering, we propose a general framework for generating K overlapping subspace clusters where the maximum overlap between them is guaranteed to be below a predefined threshold. In both cases, our algorithms can be applied for several domains as different analyzing models can be used without modifying the main parts of the algorithms.
2013
XXVI
2012-2013
Ingegneria e scienza dell'Informaz (29/10/12-)
Information and Communication Technology
Battiti, Roberto
no
Inglese
Settore INF/01 - Informatica
File in questo prodotto:
File Dimensione Formato  
phdThesis.pdf

accesso aperto

Tipologia: Tesi di dottorato (Doctoral Thesis)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 2.74 MB
Formato Adobe PDF
2.74 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/368951
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact