Title
Biomedical text categorization with concept graph representations using a controlled vocabulary
Abstract
Recent work on graph representations for text categorization has shown promising performance compared with the conventional bag-of-words representation of text documents. In this paper we investigate a graph representation of texts for the task of text categorization. In our representation, we identify high-level concepts drawn from a database of controlled biomedical terms and build a rich graph structure that captures important concepts and the relationships among them. This procedure ensures that graphs are described with a regular vocabulary, which makes them easier to compare. We then classify document graphs by applying a set-based graph kernel that is intuitively sensible and able to deal with the disconnectedness of the constructed concept graphs. We compare this approach to standard approaches that use non-graph, text-based features. We also compare different kernels to see which performs better.
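The abstract does not spell out the kernel, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a tiny hypothetical vocabulary (`VOCAB`), a hypothetical set of concept links (`RELATED`), naive substring matching for concept lookup, and a normalized intersection kernel over node and edge sets. Because the kernel only compares sets, it never traverses the graph and so is unaffected by disconnected components.

```python
from math import sqrt

# Hypothetical toy vocabulary: surface term -> controlled concept label.
# A real system would look terms up in a controlled biomedical vocabulary.
VOCAB = {
    "heart attack": "Myocardial Infarction",
    "myocardial infarction": "Myocardial Infarction",
    "aspirin": "Aspirin",
    "stroke": "Stroke",
    "hypertension": "Hypertension",
}

# Hypothetical links between concepts (an assumption for illustration).
RELATED = {
    frozenset({"Aspirin", "Myocardial Infarction"}),
    frozenset({"Hypertension", "Stroke"}),
}


def concept_graph(text, related=RELATED):
    """Return (nodes, edges) for the concepts mentioned in `text`.

    Only links whose two concepts both occur in the document become
    edges, so the resulting graph may be disconnected.
    """
    lowered = text.lower()
    nodes = {concept for term, concept in VOCAB.items() if term in lowered}
    edges = {pair for pair in related if pair <= nodes}
    return nodes, edges


def set_kernel(g1, g2):
    """Normalized count of shared nodes plus shared edges."""
    def raw(a, b):
        return len(a[0] & b[0]) + len(a[1] & b[1])

    denom = sqrt(raw(g1, g1) * raw(g2, g2))
    return raw(g1, g2) / denom if denom else 0.0


doc_a = concept_graph("Aspirin after a heart attack")
doc_b = concept_graph("Aspirin use following myocardial infarction")
doc_c = concept_graph("Hypertension raises stroke risk")

print(set_kernel(doc_a, doc_b))  # 1.0: same concepts despite different wording
print(set_kernel(doc_a, doc_c))  # 0.0: no shared concepts or links
```

Mapping "heart attack" and "myocardial infarction" to one concept label is what lets the first two documents match exactly, which illustrates the "regular vocabulary" point in the abstract. A Gram matrix built from `set_kernel` could in principle be passed to a kernel classifier, though which classifier and kernels the paper uses is not stated here.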
Year: 2012
DOI: 10.1145/2350176.2350181
Venue: BIOKDD
Keywords: biomedical text categorization, controlled vocabulary, set-based graph kernel, text document, document graph, text categorization, concept graph representation, graph representation, conventional bag-of-words representation, rich graph structure, biomedical term, concept graph, different kernel, bag of words, biomedical informatics
Field: Graph kernel, Text graph, Graph database, Computer science, Controlled vocabulary, Artificial intelligence, Natural language processing, Graph rewriting, Text categorization, Vocabulary, Graph (abstract data type)
DocType: Conference
Citations: 6
PageRank: 0.52
References: 16
Authors: 4
Name | Order | Citations | PageRank
Meenakshi Mishra | 1 | 36 | 4.18
Jun Huan | 2 | 1211 | 81.09
Said Bleik | 3 | 7 | 1.23
Min Song | 4 | 17 | 3.46