Title
Optical Structure Recognition Software To Recover Chemical Information: OSRA, An Open Source Solution.
Abstract
Until recently, most scientific and patent documents dealing with chemistry have described molecular structures either with systematic names or with graphical images of Kekulé structures. The latter method poses inherent problems for the automated processing that is needed when the number of documents runs into the hundreds of thousands or even millions, since graphical representations cannot be directly interpreted by a computer. To recover this structural information, which is otherwise all but lost, we have built OSRA, an optical structure recognition application based on modern advances in image processing as implemented in open source tools. OSRA reads documents in over 90 graphical formats, including GIF, JPEG, PNG, TIFF, PDF, and PS; automatically recognizes and extracts the graphical information representing chemical structures in such documents; and generates a SMILES or SD representation of each molecular structure image it encounters.
Year: 2009
DOI: 10.1021/ci800067r
Venue: Journal of Chemical Information and Modeling
Field: Computer science, Image processing, JPEG, Software, Bioinformatics, Computer graphics, Structure recognition
DocType: Journal
Volume: 49
Issue: 3
ISSN: 1549-9596
Citations: 37
PageRank: 2.59
References: 0
Authors: 2
Name | Order | Citations | PageRank
Igor V. Filippov | 1 | 74 | 7.42
Marc C. Nicklaus | 2 | 186 | 30.38