Title
Characterization of the Chemical Space of Known and Readily Obtainable Natural Products.
Abstract
Natural products remain one of the most productive sources of chemical inspiration for the development of new drugs. The structures of more than 250 000 natural products are available from public databases. At least 10% of these compounds are readily obtainable for experimental testing from commercial vendors and public research institutions. While the physicochemical properties of known natural products have been thoroughly studied and compared to those of drugs and other types of small molecules, the information available on the content, coverage, and relevance of individual virtual and physical natural product libraries is clearly limited. The aim of this study was the development of a detailed understanding of the coverage of chemical space by known and readily obtainable natural products and by individual natural product databases. For this purpose, we compiled comprehensive data sets of known and readily obtainable natural products from 18 virtual databases (including the Dictionary of Natural Products), nine physical libraries, and the Protein Data Bank (PDB). We also developed and employed an algorithm ("SugarBuster") for the removal of sugars and sugar-like moieties, which are generally not in the focus of interest for drug discovery, from natural products. In addition, we devised a rule-based approach for the automated classification of natural products into natural product classes (alkaloids, steroids, flavonoids, etc.). Among the most important results of this study is the finding that the readily obtainable natural products are highly diverse and populate regions of chemical space that are of high relevance to drug discovery. In some cases, substantial differences in the coverage of natural product classes and chemical space by the individual databases are observed. More than 2000 natural products are identified for which at least one X-ray crystal structure of the compound in complex with a biomacromolecule is available from the PDB.
Year
2018
DOI
10.1021/acs.jcim.8b00302
Venue
JOURNAL OF CHEMICAL INFORMATION AND MODELING
DocType
Journal
Volume
58
Issue
8
ISSN
1549-9596
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Ya Chen | 1 | 0 | 0.34
Marina Garcia de Lomana | 2 | 0 | 0.34
Nils-Ole Friedrich | 3 | 5 | 2.06
Johannes Kirchmair | 4 | 216 | 42.23