Given a gene expression data matrix in which each cell holds the expression level of a gene under a certain condition, biclustering is the problem of finding a subset of genes that are co-regulated and co-expressed only under a subset of conditions. Because some genes belong to several functional categories, finding non-redundant overlapping biclusters is an important problem in biclustering. However, most recent algorithms produce either disjoint biclusters or redundant biclusters with significant overlap. In other words, these algorithms do not let users specify the maximum overlap between biclusters. In this paper, we propose a novel algorithm that generates K overlapping biclusters whose maximum overlap is below a predefined threshold. Unlike other approaches, which often generate all biclusters at once, our algorithm produces the biclusters sequentially: each newly generated bicluster is guaranteed to differ from the previous ones but can still overlap with them. Experiments on real datasets confirm that different meaningful overlapping biclusters are successfully discovered. Moreover, under the same constraints, our algorithm returns much larger and higher-quality biclusters than other state-of-the-art algorithms.
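To make the overlap constraint concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a bicluster is a pair of index sets (genes, conditions), that overlap is measured as the fraction of the smaller bicluster's cells shared with the other, and that candidates are already computed and ranked. The names `overlap`, `select_sequentially`, and `max_overlap` are hypothetical, and the abstract does not state which overlap measure or search procedure the paper uses.

```python
from typing import FrozenSet, List, Tuple

# Assumption: a bicluster is (gene indices, condition indices).
Bicluster = Tuple[FrozenSet[int], FrozenSet[int]]


def overlap(a: Bicluster, b: Bicluster) -> float:
    """Shared cells divided by the cell count of the smaller bicluster.

    This measure is an illustrative assumption, not taken from the paper.
    """
    shared = len(a[0] & b[0]) * len(a[1] & b[1])
    smaller = min(len(a[0]) * len(a[1]), len(b[0]) * len(b[1]))
    return shared / smaller if smaller else 0.0


def select_sequentially(
    candidates: List[Bicluster], k: int, max_overlap: float
) -> List[Bicluster]:
    """Accept candidates one at a time, in the given order.

    A candidate is kept only if its overlap with every previously
    accepted bicluster is at most max_overlap. Stops after k are kept.
    """
    chosen: List[Bicluster] = []
    for cand in candidates:
        if len(chosen) == k:
            break
        if all(overlap(cand, prev) <= max_overlap for prev in chosen):
            chosen.append(cand)
    return chosen


if __name__ == "__main__":
    a = (frozenset({0, 1, 2}), frozenset({0, 1}))
    b = (frozenset({0, 1, 2}), frozenset({0, 1, 2}))  # contains a: redundant
    c = (frozenset({2, 3, 4}), frozenset({1, 2}))     # shares one cell with a
    d = (frozenset({5, 6}), frozenset({3, 4}))        # disjoint from the rest
    print(select_sequentially([a, b, c, d], k=3, max_overlap=0.25))
```

In this toy input, `b` is rejected as redundant because it fully contains `a`, while `c` is accepted even though it shares a cell with `a`. That is the distinction between redundant and merely overlapping biclusters that the abstract draws.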
|Title:||Discovering Non-redundant Overlapping Biclusters on Gene Expression Data|
|Authors:||Duy Tin Truong; R. Battiti; M. Brunato|
|Title of the volume containing the paper:||2013 IEEE 13th International Conference on Data Mining|
|Place of publication:||Los Alamitos, CA|
|Publisher:||IEEE Computer Society CPS|
|Year of publication:||2013|
|Scopus identifier:||2-s2.0-84894678228|
|WOS identifier:||WOS:000332874200076|