Title
Exploiting unlabeled data for question classification
Abstract
In this paper, we introduce a kernel-based approach to question classification. We employ a kernel function based on latent semantic information acquired from Wikipedia. This kernel allows external semantic knowledge to be incorporated into the supervised learning process. By combining this knowledge with a bag-of-words approach by means of composite kernels, we obtained a highly effective question classifier. Because the semantic information is acquired from unlabeled text, our system can be easily adapted to different languages and domains. We tested it on a parallel corpus of English and Spanish questions.
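The composite-kernel idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the two kernels are combined as a sum of cosine-normalized kernels, which the abstract does not specify. The vocabulary `VOCAB`, the weights in `PROJECTION`, and all function names are hypothetical. In a real latent semantic setup the projection would be derived from unlabeled text (for example, by a truncated SVD of a term-by-document matrix) rather than written by hand.

```python
import math

# Hypothetical toy vocabulary; a real system would build it from the questions.
VOCAB = ["city", "town", "president", "leader"]

# Hypothetical term-to-latent-dimension weights (rows follow VOCAB).
# Assumption: in practice these would be learned from unlabeled text;
# here they are hand-picked so that related terms share a dimension.
PROJECTION = [
    [0.9, 0.0],  # city
    [0.8, 0.1],  # town
    [0.0, 0.9],  # president
    [0.1, 0.8],  # leader
]


def bow_vector(question):
    """Map a question to term counts over VOCAB (out-of-vocabulary words are ignored)."""
    tokens = question.lower().split()
    return [float(tokens.count(term)) for term in VOCAB]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(x):
    """Map a bag-of-words vector into the latent space."""
    dims = len(PROJECTION[0])
    return [sum(x[i] * PROJECTION[i][j] for i in range(len(VOCAB))) for j in range(dims)]


def k_bow(x, y):
    """Bag-of-words kernel: inner product of term-count vectors."""
    return dot(x, y)


def k_latent(x, y):
    """Latent semantic kernel: inner product after projection."""
    return dot(project(x), project(y))


def normalized(kernel):
    """Wrap a kernel so that k(x, x) == 1 for any x with non-zero norm."""
    def wrapped(x, y):
        denom = math.sqrt(kernel(x, x) * kernel(y, y))
        return kernel(x, y) / denom if denom > 0 else 0.0
    return wrapped


def k_composite(x, y):
    """Assumed combination: sum of the two normalized kernels."""
    return normalized(k_bow)(x, y) + normalized(k_latent)(x, y)


q_city = bow_vector("which city")
q_town = bow_vector("which town")
q_pres = bow_vector("which president")
```

In this toy setup, "which city" and "which town" share no vocabulary term, so the bag-of-words kernel alone scores them as unrelated, while the latent kernel gives them a high similarity. A sum of valid kernels is itself a valid kernel, so the resulting matrix could be passed to any kernel-based learner that accepts precomputed kernels.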
Year
2011
DOI
10.1007/978-3-642-22327-3_13
Venue
NLDB
Keywords
question classification, external semantic knowledge, effective question classifier, bag-of-words approach, parallel corpus, composite kernel, different language, spanish question, semantic information, unlabeled data, kernel-based approach, latent semantic information, semi supervised learning, kernel methods
Field
Semantic memory, Kernel (linear algebra), Semi-supervised learning, Computer science, Supervised learning, Tree kernel, Natural language processing, Artificial intelligence, Kernel method, Classifier (linguistics), Machine learning, Kernel (statistics)
DocType
Conference
Volume
6716
ISSN
0302-9743
Citations
0
PageRank
0.34
References
11
Authors
2
Name, Order, Citations, PageRank
David Tomás, 1, 12, 3.82
Claudio Giuliano, 2, 488, 33.00