Title |
---|
Biomedical text categorization with concept graph representations using a controlled vocabulary |
Abstract |
---|
Recent work using graph representations for text categorization has shown promising performance over conventional bag-of-words representation of text documents. In this paper we investigate a graph representation of texts for the task of text categorization. In our representation we identify high level concepts extracted from a database of controlled biomedical terms and build a rich graph structure that contains important concepts and relationships. This procedure ensures that graphs are described with a regular vocabulary, leading to increased ease of comparison. We then classify document graphs by applying a set-based graph kernel that is intuitively sensible and able to deal with the disconnectedness of the constructed concept graphs. We compare this approach to standard approaches using non-graph, text-based features. We also do a comparison amongst different kernels that can be used to see which performs better. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1145/2350176.2350181 | BIOKDD |
Keywords | Field | DocType |
biomedical text categorization,controlled vocabulary,set-based graph kernel,text document,document graph,text categorization,concept graph representation,graph representation,conventional bag-of-words representation,rich graph structure,biomedical term,concept graph,different kernel,bag of words,biomedical informatics | Graph kernel,Text graph,Graph database,Computer science,Controlled vocabulary,Artificial intelligence,Natural language processing,Graph rewriting,Text categorization,Vocabulary,Graph (abstract data type) | Conference |
Citations | PageRank | References |
6 | 0.52 | 16 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Meenakshi Mishra | 1 | 36 | 4.18 |
Jun Huan | 2 | 1211 | 81.09 |
Said Bleik | 3 | 7 | 1.23 |
Min Song | 4 | 17 | 3.46 |