Title
Classifying the precancers: a metadata approach.
Abstract
During carcinogenesis, precancers are the morphologically identifiable lesions that precede invasive cancers. In theory, the successful treatment of precancers would result in the eradication of most human cancers. Despite the importance of these lesions, there has been no effort to list and classify all of the precancers. The purpose of this study is to describe the first comprehensive taxonomy and classification of the precancers. As a novel approach to disease classification, terms and classes were annotated with metadata (data that describes the data) so that the classification could be used to link precancer terms to data elements in other biological databases.

Terms in the UMLS (Unified Medical Language System) related to precancers were extracted. Extracted terms were reviewed and additional terms added. Each precancer was assigned one of six general classes. The entire classification was assembled as an XML (eXtensible Mark-up Language) file. A Perl script converted the XML file into a browser-viewable HTML (HyperText Mark-up Language) file.

The classification contained 4700 precancer terms, 568 distinct precancer concepts and six precancer classes: 1) Acquired microscopic precancers; 2) Acquired large lesions with microscopic atypia; 3) Precursor lesions occurring with inherited hyperplastic syndromes that progress to cancer; 4) Acquired diffuse hyperplasias and diffuse metaplasias; 5) Currently unclassified entities; and 6) Superclass and modifiers.

This work represents the first attempt to create a comprehensive listing of the precancers, the first attempt to classify precancers by their biological properties and the first attempt to create a pathologic classification of precancers using standard metadata (XML). The classification is placed in the public domain, and comment is invited by the authors, who are prepared to curate and modify the classification.
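The XML-to-HTML step described in the abstract can be illustrated with a minimal sketch. The authors used a Perl script; the version below is written in Python instead, and the XML layout it assumes (the element names `classification`, `class`, `concept`, `term` and the attributes `id` and `name`) is hypothetical, since the abstract does not give the actual schema. The two class names are taken from the abstract; the concept and term strings are placeholders, not entries from the real classification.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML layout: element and attribute names are assumptions made
# for illustration; concept and term strings are placeholders.
SAMPLE_XML = """\
<classification>
  <class id="1" name="Acquired microscopic precancers">
    <concept name="example concept A">
      <term>example term A1</term>
      <term>example term A2</term>
    </concept>
  </class>
  <class id="5" name="Currently unclassified entities">
    <concept name="example concept B">
      <term>example term B1</term>
    </concept>
  </class>
</classification>
"""


def classification_to_html(xml_text: str) -> str:
    """Render each class as a heading and each concept as a list item."""
    root = ET.fromstring(xml_text)
    parts = ["<html><body>"]
    for cls in root.findall("class"):
        class_id = escape(cls.get("id", ""))
        class_name = escape(cls.get("name", ""))
        parts.append(f"<h2>{class_id}. {class_name}</h2>")
        parts.append("<ul>")
        for concept in cls.findall("concept"):
            # Join all synonym terms of a concept on one line.
            terms = ", ".join(escape(t.text or "") for t in concept.findall("term"))
            concept_name = escape(concept.get("name", ""))
            parts.append(f"<li><b>{concept_name}</b>: {terms}</li>")
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)


html_page = classification_to_html(SAMPLE_XML)
```

Keeping the classification in XML and generating the HTML view from it means the browser-viewable file never has to be edited by hand; only the XML source is curated.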
Year
2003
DOI
10.1186/1472-6947-3-8
Venue
BMC Med. Inf. & Decision Making
Keywords
public domain, health informatics, internet, biological database, programming languages, unified medical language system, medical informatics
Field
Disease classification, Metadata, Data mining, Information retrieval, Computer science, Biological database, Unified Medical Language System
DocType
Journal
Volume
3
Issue
1
ISSN
1472-6947
Citations
4
PageRank
0.48
References
0
Authors
2
Name | Order | Citations | PageRank
Jules J. Berman | 1 | 90 | 14.14
Donald E Henson | 2 | 7 | 1.42