Using Graphs and Semantic Information to Improve Text Classifiers. - Citegraph

Paper Info

Title
Using Graphs and Semantic Information to Improve Text Classifiers.

Abstract
Text classification using semantic information is a recent research trend because it can represent text content more accurately than bag-of-words (BOW) approaches. In addition, representing semantics through graphs has several advantages over traditional feature-vector representations, and error-tolerant graph matching techniques can therefore be used for text classification. Nevertheless, very few methodologies in the literature use semantic representation through graphs. The present work proposes a methodology for representing the semantic information of a summarized text as a graph. The discourse representation structure of the text is used to capture its semantic content and is then transformed into a graph. Five graph matching techniques based on Maximum Common Subgraphs (mcs) and Minimum Common Supergraphs (MCS) are evaluated with a k-NN classifier on 20 classes from the Reuters dataset, taking 10 documents of each class for both training and testing. The results show that the technique has the potential to perform text classification as well as traditional BOW approaches. Moreover, a majority-voting-based combination of the semantic representation and a traditional BOW approach improved recognition accuracy on the same dataset.
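
As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes one well-known mcs-based graph distance, d(G1, G2) = 1 - |mcs(G1, G2)| / max(|G1|, |G2|), and uses it in a k-NN classifier. It is not the authors' implementation, and the abstract does not say which five distances were evaluated. Several things here are assumptions made for illustration: node labels are unique within a graph (so the maximum common subgraph reduces to the intersection of node and edge sets), graph size is counted as nodes plus edges, and the names `mcs_distance`, `knn_classify`, `TRAIN`, and `QUERY` as well as the toy graphs are hypothetical.

```python
from collections import Counter

# A graph is a pair (nodes, edges): a set of node labels and a set of
# directed (source, target) label pairs. Assumption: labels are unique
# within a graph, so matching nodes across graphs is done by label.


def graph_size(graph):
    """Size of a graph, counted here as nodes plus edges (an assumption)."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def mcs_distance(g1, g2):
    """Distance 1 - |mcs| / max(|g1|, |g2|) for uniquely labeled graphs."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    # With unique labels, the maximum common subgraph is the intersection.
    mcs_size = len(nodes1 & nodes2) + len(edges1 & edges2)
    largest = max(graph_size(g1), graph_size(g2))
    if largest == 0:
        return 0.0  # two empty graphs are treated as identical
    return 1.0 - mcs_size / largest


def knn_classify(query, training, k=3):
    """Label a query graph by majority vote among its k nearest graphs."""
    ranked = sorted(training, key=lambda item: mcs_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two labeled training graphs and one query graph.
TRAIN = [
    (({"bank", "raise", "rate"}, {("bank", "raise"), ("raise", "rate")}), "interest"),
    (({"ship", "carry", "grain"}, {("ship", "carry"), ("carry", "grain")}), "grain"),
]
QUERY = ({"bank", "cut", "rate"}, {("bank", "cut"), ("cut", "rate")})

if __name__ == "__main__":
    print(knn_classify(QUERY, TRAIN, k=1))
```

For general graphs without unique node labels, computing the maximum common subgraph is much more expensive than a set intersection, so this shortcut only applies under the stated assumption.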

Year	DOI	Venue
2014	10.1007/978-3-319-10888-9_33	ADVANCES IN NATURAL LANGUAGE PROCESSING
Field	DocType	Volume
Graph kernel,Text graph,Feature vector,Computer science,Distance,Explicit semantic analysis,Matching (graph theory),Artificial intelligence,Classifier (linguistics),Semantics,Machine learning	Conference	8686
ISSN	Citations	PageRank
0302-9743	1	0.35
References	Authors
8	4

Authors (4 rows)

Name	Order	Citations	PageRank
Nibaran Das	1	391	40.72
Swarnendu Ghosh	2	20	5.37
Teresa Gonçalves	3	14	1.24
Paulo Quaresma	4	18	2.73
