Title
ChemSchematicResolver: A Toolkit to Decode 2-D Chemical Diagrams with Labels and R-groups into Annotated Chemical Named Entities.
Abstract
The number of journal articles in the scientific domain has grown to the point where it has become impossible for researchers to capitalize on all findings in their relevant discipline. Information is stored in these articles in a number of ways, including figures that describe important results. In organic chemistry, these figures often present chemical schematic diagrams that graphically define the structures of carbon-based compounds. These diagrams are intuitive for an expert to comprehend, but they are not designed for machines. This work presents ChemSchematicResolver, a software tool that can be used to identify chemical schematic diagrams within the figure of a document, resolve any R-group substituents within them, and convert the resulting diagrams to a machine-readable format in a high-throughput, autonomous fashion. The tool includes a new algorithm that is used to identify relevant diagrams and a mechanism that combines these data with contextual information from the rest of the document for the creation of highly relational databases. It supports a variety of general R-group structures, a first for any open-source chemical schematic diagram extraction tool. It is presented alongside a self-generated evaluation set, on which precision, the most important assessment metric, reached 83-100% in all assessed areas. The ChemSchematicResolver tool is released under the MIT license and is available to download from www.chemschematicresolver.org.
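To make the R-group resolution step described in the abstract concrete, the following is a minimal sketch in Python. It is not the ChemSchematicResolver implementation or its API: the function name `resolve_r_groups`, the `ResolvedCompound` record, the `[R]` placeholder convention, and the small abbreviation table are all hypothetical, chosen only for illustration. The sketch assumes a diagram has already been reduced to a SMILES-like template with one placeholder, and that a label table read from the figure maps each compound label to a substituent abbreviation.

```python
from dataclasses import dataclass

# Hypothetical abbreviation table (an assumption for this sketch):
# substituent abbreviation -> SMILES fragment.
SUBSTITUENT_SMILES = {"H": "", "Me": "C", "OMe": "OC", "Cl": "Cl"}


@dataclass(frozen=True)
class ResolvedCompound:
    """One labelled compound with its R-group filled in."""
    label: str
    smiles: str


def resolve_r_groups(template, placeholder, label_table):
    """Substitute each labelled substituent into the template.

    template    -- SMILES-like string containing the placeholder once
    placeholder -- token standing for the R-group, e.g. "[R]"
    label_table -- mapping of compound label -> substituent abbreviation
    """
    resolved = []
    for label, substituent in label_table.items():
        fragment = SUBSTITUENT_SMILES[substituent]
        resolved.append(
            ResolvedCompound(label, template.replace(placeholder, fragment))
        )
    return resolved


# Illustrative input: a phenyl ring carrying one variable substituent.
records = resolve_r_groups(
    "c1ccccc1[R]", "[R]", {"1a": "Me", "1b": "OMe", "1c": "Cl"}
)
for record in records:
    print(record.label, record.smiles)
```

Each output record pairs a compound label with a fully specified structure string, which is the kind of label-to-structure association the abstract describes. A real tool would work from recognized image content instead of a hand-written template.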
Year: 2020
DOI: 10.1021/acs.jcim.0c00042
Venue: JOURNAL OF CHEMICAL INFORMATION AND MODELING
DocType: Journal
Volume: 60
Issue: 4
ISSN: 1549-9596
Citations: 2
PageRank: 0.38
References: 0
Authors: 2
Name                Order   Citations   PageRank
Edward Beard        1       2           0.38
Jacqueline M Cole   2       5           3.15