Title
Clustering formal concepts to discover biologically relevant knowledge from gene expression data.
Abstract
The production of high-throughput gene expression data has created a pressing need for bioinformatics tools that generate biologically interesting hypotheses. Whereas many tools are available for extracting global patterns, less attention has been paid to local pattern discovery. We propose an original way to discover knowledge from gene expression data by means of the so-called formal concepts that hold in derived Boolean gene expression datasets. We first encoded the over-expression properties of genes in human cells using human SAGE data. This yielded a Boolean matrix from which we extracted the complete collection of formal concepts, i.e., all maximal sets of over-expressed genes, each associated with the maximal set of biological situations in which their over-expression is observed. Complete collections of such patterns tend to be huge. Since interpreting them is time-consuming, we propose a new method to rapidly visualize clusters of formal concepts. This designates a reasonable number of Quasi-Synexpression-Groups (QSGs) for further analysis. The interest of our approach is illustrated using human SAGE data and by interpreting one of the extracted QSGs. Assessing its biological relevance leads to the formulation of both previously proposed and new biological hypotheses.
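To make the notion of a formal concept concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a simple fixed threshold for encoding over-expression (the abstract does not state the actual encoding rule) and uses brute-force enumeration, which is only practical for toy matrices. The names `binarize` and `formal_concepts` are hypothetical, and the clustering of concepts into QSGs is not covered.

```python
from itertools import combinations


def binarize(expr, threshold):
    """Encode over-expression as True where a value exceeds the threshold.

    Assumption: a single global threshold; the paper's encoding may differ.
    """
    return [[value > threshold for value in row] for row in expr]


def formal_concepts(matrix):
    """Enumerate all formal concepts of a Boolean matrix by brute force.

    Rows are biological situations, columns are genes. Each concept is a pair
    (situations, genes) in which both sets are maximal with respect to each other.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    concepts = set()
    for k in range(n_rows + 1):
        for rows in combinations(range(n_rows), k):
            # Genes over-expressed in every chosen situation.
            genes = frozenset(
                j for j in range(n_cols) if all(matrix[i][j] for i in rows)
            )
            # All situations in which every one of those genes is over-expressed.
            situations = frozenset(
                i for i in range(n_rows) if all(matrix[i][j] for j in genes)
            )
            concepts.add((situations, genes))
    return concepts


# Toy data (invented for illustration): 3 situations x 3 genes.
toy_expression = [[5, 1, 7], [6, 0, 8], [0, 9, 9]]
toy_matrix = binarize(toy_expression, threshold=2)
toy_concepts = formal_concepts(toy_matrix)
```

Closing every subset of situations is exponential in the number of rows; real collections call for a dedicated closed-set mining algorithm, which this sketch deliberately leaves out.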
Year
2007
Venue
In Silico Biology
Keywords
formal concepts,pattern discovery,sage,closed sets,transcriptome,clustering,high throughput,gene expression
Field
Logical matrix,Computer science,Gene expression,Closed set,Bioinformatics,Cluster analysis
DocType
Journal
Volume
7
Issue
4-5
ISSN
1386-6338
Citations
19
PageRank
1.35
References
20
Authors
7
Name                      Order  Citations  PageRank
Sylvain Blachon           1      82         5.07
Ruggero G. Pensa          2      354        31.20
Jérémy Besson             3      407        24.00
Céline Robardet           4      696        60.36
Jean-François Boulicaut   5      1162       102.38
Olivier Gandrillon        6      176        12.53
bâtiment blaise pascal    7      24         1.81