Title
Text Classification Based On Limited Bibliographic Metadata
Abstract
In this paper, we introduce a method for categorizing digital items according to their topic, relying only on the document's metadata, such as author name and title information. The proposed approach is based on a set of lexical resources constructed for our purposes (e.g., journal titles, conference names) and on a traditional machine-learning classifier that assigns one category to each document based on identified core features. The system is evaluated on a real-world data set, and the influence of different feature combinations and settings is studied. Although the available information is limited, the results show that the approach can efficiently classify data items representing documents.
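The pipeline outlined in the abstract (metadata features, lexical resources, one category per document) might be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the classifier, so a multinomial Naive Bayes is swapped in here, and the lexicon entries, training records, category labels, and all function and class names are invented for the example rather than taken from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical lexical resource: venue phrases mapped to a topic hint.
VENUE_LEXICON = {
    "information management": "computer_science",
    "chemical": "chemistry",
}

def extract_features(record):
    """Turn a metadata record (title, authors, venue) into string features."""
    feats = [f"title={tok}" for tok in record["title"].lower().split()]
    feats += [f"author={name.lower()}" for name in record["authors"]]
    venue = record.get("venue", "").lower()
    for phrase, hint in VENUE_LEXICON.items():
        if phrase in venue:
            feats.append(f"venue_hint={hint}")
    return feats

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in)."""

    def __init__(self):
        self.class_counts = Counter()
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, records, labels):
        for rec, label in zip(records, labels):
            self.class_counts[label] += 1
            for feat in extract_features(rec):
                self.feat_counts[label][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, record):
        """Return the single most probable category for one record."""
        total = sum(self.class_counts.values())
        feats = [f for f in extract_features(record) if f in self.vocab]
        best, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for feat in feats:
                score += math.log((self.feat_counts[label][feat] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best

# Toy training data, invented for illustration only.
TRAIN = [
    ({"title": "Text classification with metadata", "authors": ["A. Example"],
      "venue": "Conference on Digital Information Management"}, "computer_science"),
    ({"title": "Mining text collections", "authors": ["B. Sample"],
      "venue": "Journal of Information Management"}, "computer_science"),
    ({"title": "Argon adsorption on surfaces", "authors": ["C. Placeholder"],
      "venue": "Chemical Physics Letters"}, "chemistry"),
    ({"title": "Catalysis of organic reactions", "authors": ["D. Placeholder"],
      "venue": "Journal of Chemical Research"}, "chemistry"),
]

clf = NaiveBayes().fit([r for r, _ in TRAIN], [y for _, y in TRAIN])
query = {"title": "Classification of text documents", "authors": ["E. New"],
         "venue": "Workshop on Information Management"}
print(clf.predict(query))
```

Features unseen during training are simply ignored at prediction time, which keeps the sketch short; a real system would need a more careful treatment of sparse metadata.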
Year
2009
DOI
10.1109/ICDIM.2009.5356767
Venue
2009 Fourth International Conference on Digital Information Management
Keywords
feature extraction, machine learning classifier, accuracy, chemistry, classification, argon, machine learning, meta data, data mining, pipelines, learning artificial intelligence, text analysis
Field
Data mining, Metadata, Metadata repository, Information retrieval, Author name, Computer science, Feature extraction, Artificial intelligence, Natural language processing, Classifier (linguistics), Learning classifier system
DocType
Conference
Citations
0
PageRank
0.34
References
11
Authors
3
Name              Order   Citations   PageRank
Kerstin Denecke   1       140         23.57
Thomas Risse      2       852         40.30
Thomas Baehr      3       0           0.34