Title
Document Type Classification in Online Digital Libraries.
Abstract
Online digital libraries make it easier for researchers to search for scientific information. They have proven to be powerful resources in many data mining, machine learning, and information retrieval applications that require high-quality data. The quality of the data depends heavily on the accuracy of the classifiers that identify the types of documents crawled from the Web (e.g., research papers, slides, books) for appropriate indexing. These classifiers in turn depend on the choice of feature representation. We propose novel features that result in high-accuracy classifiers for document type classification. Experimental results on several datasets show that our classifiers outperform models that are employed in current systems.
Year: 2016
Venue: THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE
Field: Information retrieval, Computer science, Search engine indexing, Supervised learning, Artificial intelligence, Information retrieval applications, Digital library, Machine learning, Document type definition
DocType: Conference
Citations: 1
PageRank: 0.35
References: 21
Authors: 4
Name | Order | Citations | PageRank
Cornelia Caragea | 1 | 520 | 53.61
Jian Wu | 2 | 45 | 2.92
Sujatha Das Gollapalli | 3 | 6 | 0.78
C. Lee Giles | 4 | 11154 | 1549.48