Analysis of Biomedical Text for Chemical Names: A Comparison of Three Methods - Citegraph

Paper Info

Title
Analysis of Biomedical Text for Chemical Names: A Comparison of Three Methods

Abstract
At the National Library of Medicine (NLM), a variety of biomedical vocabularies are found in data pertinent to its mission. In addition to standard medical terminology, there are specialized vocabularies, including that of chemical nomenclature. Normal language tools, including the lexically based ones used by the Unified Medical Language System® (UMLS®) to manipulate and normalize text, do not work well on chemical nomenclature. In order to improve NLM's capabilities in chemical text processing, two approaches to the problem of recognizing chemical nomenclature were explored. The first approach was a lexical one and consisted of analyzing text for the presence of a fixed set of chemical segments. The approach was extended with general chemical patterns and also with terms from NLM's indexing vocabulary, MeSH®, and the NLM SPECIALIST™ lexicon. The second approach applied Bayesian classification to n-grams of text via two different methods. The single lexical method and the two statistical methods were tested against data from the 1999 UMLS Metathesaurus®. One of the statistical methods had an overall classification accuracy of 97%.
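
The abstract does not say how the Bayesian n-gram classifiers were built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes character trigrams, a multinomial naive Bayes model with add-one smoothing, and a tiny hand-made training set; the names `char_ngrams`, `NgramNaiveBayes`, and the example terms are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(term, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded term."""
    padded = " " * (n - 1) + term.lower() + " " * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over character n-grams (illustrative assumption)."""

    def __init__(self, n=3):
        self.n = n
        self.counts = {}       # label -> Counter of n-gram frequencies
        self.totals = {}       # label -> total number of n-grams seen
        self.docs = Counter()  # label -> number of training terms
        self.vocab = set()     # all n-grams seen in training

    def fit(self, terms, labels):
        for term, label in zip(terms, labels):
            grams = char_ngrams(term, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        self.totals = {lab: sum(c.values()) for lab, c in self.counts.items()}
        return self

    def log_score(self, term, label):
        # Log prior plus the sum of smoothed log likelihoods of each n-gram.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        # Add-one smoothing; the extra +1 reserves mass for unseen n-grams.
        denom = self.totals[label] + len(self.vocab) + 1
        for gram in char_ngrams(term, self.n):
            score += math.log((self.counts[label][gram] + 1) / denom)
        return score

    def predict(self, term):
        return max(self.counts, key=lambda lab: self.log_score(term, lab))


# Hypothetical toy data; the paper evaluated on the 1999 UMLS Metathesaurus.
train_terms = [
    "2-chloroethanol", "methylbenzene", "sodium chloride",
    "ethyl acetate", "dimethyl sulfoxide",
    "heart failure", "blood pressure", "bone marrow",
    "kidney transplant", "chronic pain",
]
train_labels = ["chemical"] * 5 + ["other"] * 5

model = NgramNaiveBayes(n=3).fit(train_terms, train_labels)
print(model.predict("chloromethane"))
print(model.predict("heart pressure"))
```

Character n-grams are a natural fit here because chemical names are built from recurring fragments such as "chlor" or "meth", which a word-level lexicon tends to miss. A toy training set of this size says nothing about the accuracy reported in the abstract.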

Year	Venue	Field
1999	JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION	Medical terminology,Information retrieval,Naive Bayes classifier,Chemical nomenclature,Computer science,Search engine indexing,Lexicon,Natural language processing,Artificial intelligence,Vocabulary,Unified Medical Language System,Text processing
DocType	Issue	ISSN
Conference	SUPnan	1067-5027
Citations	PageRank	References
32	7.06	4
Authors
6

Authors (6 rows)

Name	Order	Citations	PageRank
W. John Wilbur	1	430	45.66
George F. Hazard Jr.	2	32	7.06
Guy Divita	3	138	24.59
James G. Mork	4	647	65.22
Alan R. Aronson	5	2551	260.67
Allen C. Browne	6	184	32.81
