Title
Large-scale annotation of small-molecule libraries using public databases.
Abstract
While many large publicly accessible databases provide excellent annotation for biological macromolecules, the same is not true for small chemical compounds. Commercial data sources also lack an annotation interface for large numbers of compounds and tend to be too costly to be widely available to biomedical researchers. Therefore, the use of annotation information to select lead compounds from a modern-day high-throughput screening (HTS) campaign currently occurs only on a very limited scale. The recent rapid expansion of the NIH PubChem database provides an opportunity to link existing biological databases with compound catalogs and provides relevant information that could potentially improve the information garnered from large-scale screening efforts. Using the 2.5 million compound collection at the Genomics Institute of the Novartis Research Foundation (GNF) as a model, we determined that approximately 4% of the library contained compounds with potential annotation in such databases as PubChem and the World Drug Index (WDI) as well as related databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and ChemIDplus. Furthermore, the exact structure match analysis showed that 32% of GNF compounds can be linked to third-party databases via PubChem. We also showed that annotations such as MeSH (Medical Subject Headings) terms can be applied to in-house HTS databases to identify signature biological inhibition profiles of interest and to expedite the assay validation process. The automated annotation of thousands of screening hits in batch is becoming feasible and has the potential to play an essential role in the hit-to-lead decision-making process.
Year: 2007
DOI: 10.1021/ci700092v
Venue: JOURNAL OF CHEMICAL INFORMATION AND MODELING
DocType: Journal
Volume: 47
Issue: 4
ISSN: 1549-9596
Citations: 7
PageRank: 3.96
References: 1
Authors: 7
Name                   Order  Citations  PageRank
Yingyao Zhou           1      98         17.55
Bin Zhou               2      54         16.55
Kai-Sheng Chen         3      30         8.16
S. Frank Yan           4      45         8.43
Frederick J. King      5      11         6.97
Shumei Jiang           6      7          4.29
Elizabeth A. Winzeler  7      43         6.44