Mining a chemical database for fragment co-occurrence: discovery of "chemical clichés". - Citegraph

Paper Info

Title
Mining a chemical database for fragment co-occurrence: discovery of "chemical clichés".

Abstract
Nowadays millions of different compounds are known, their structures stored in electronic databases. Analysis of these data could yield valuable insights into the laws of chemistry and the habits of chemists. We have therefore explored the public database of the National Cancer Institute (> 250 000 compounds) by pattern searching. We split the molecules of this database into fragments to find out which fragments exist, how frequent they are, and whether the occurrence of one fragment in a molecule is related to the occurrence of another, nonoverlapping fragment. It turns out that some fragments and combinations of fragments are so frequent that they can be called "chemical clichés". We believe that the fragment data can give insight into the chemical space explored so far by synthesis. The lists of fragments and their (co-)occurrences can help create novel chemical compounds by (i) systematically listing the most popular and therefore most easily used substituents and ring systems for synthesizing new compounds, (ii) being an easily accessible repository for rarer fragments suitable for lead compound optimization, and (iii) pointing out some of the yet unexplored parts of chemical space.
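
The counting step described in the abstract (how frequent each fragment is, and whether two fragments occur together more often than chance would suggest) can be illustrated with a minimal sketch. It assumes each molecule has already been split into a set of fragment labels; the labels, the toy data, and the function name `cooccurrence_stats` are hypothetical, and the independence baseline (count of A times count of B, divided by the number of molecules) is an illustrative assumption, not necessarily the statistic used in the paper.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_stats(molecules):
    """Count fragments and fragment pairs across molecules.

    molecules: iterable of sets of fragment labels (one set per molecule).
    Returns (fragment_counts, pair_stats), where pair_stats maps a sorted
    pair (a, b) to (observed, expected, ratio). "expected" assumes the two
    fragments occur independently; this baseline is an assumption of the
    sketch, not taken from the paper.
    """
    molecules = list(molecules)
    n = len(molecules)
    fragment_counts = Counter()
    pair_counts = Counter()
    for fragments in molecules:
        unique = sorted(set(fragments))  # count each fragment once per molecule
        fragment_counts.update(unique)
        pair_counts.update(combinations(unique, 2))

    pair_stats = {}
    for (a, b), observed in pair_counts.items():
        expected = fragment_counts[a] * fragment_counts[b] / n
        pair_stats[(a, b)] = (observed, expected, observed / expected)
    return fragment_counts, pair_stats


# Hypothetical toy data: four "molecules" described by made-up fragment labels.
toy_molecules = [
    {"benzene", "methyl"},
    {"benzene", "methyl", "hydroxyl"},
    {"benzene", "hydroxyl"},
    {"pyridine", "methyl"},
]

fragment_counts, pair_stats = cooccurrence_stats(toy_molecules)
for pair, (observed, expected, ratio) in sorted(pair_stats.items()):
    print(pair, observed, round(expected, 2), round(ratio, 2))
```

A ratio well above 1 marks a pair that appears together more often than the independence baseline predicts, which is the kind of combination the abstract calls a "chemical cliché"; a real analysis would also need a fragmentation step and a significance test, both omitted here.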

Year	2006
DOI	10.1021/ci050370c
Venue	Journal of Chemical Information and Modeling
Field	Molecule, Combinatorial chemistry, Chemistry, Co-occurrence, Chemical space, Bioinformatics, Chemical database
DocType	Journal
Volume	46
Issue	2
ISSN	1549-9596
Citations	6
PageRank	0.60
References	4
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Eric-Wubbo Lameijer	1	37	5.84
Joost N. Kok	2	1429	121.49
Thomas Bäck	3	629	86.94
Adriaan P IJzerman	4	190	18.68
