Title
A new document author representation for authorship attribution
Abstract
This paper proposes a novel representation for Authorship Attribution (AA) based on Concise Semantic Analysis (CSA), a technique that has been successfully used in Text Categorization (TC). Our approach, called Document Author Representation (DAR), builds document vectors in a space of authors by calculating the relationship between textual features and authors. To evaluate it, we compare the proposed representation with conventional approaches and previous work using the c50 corpus. We found that DAR can be very useful in AA tasks because it performs well on imbalanced data, achieving comparable or better accuracy.
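The idea of representing a document as a vector over authors can be illustrated with a minimal sketch. The abstract does not give the weighting formulas, so everything below is an assumption made for illustration: the function names `term_author_weights` and `document_author_vector` are hypothetical, and the relative-frequency weighting is a simple stand-in rather than the paper's actual scheme.

```python
from collections import Counter, defaultdict


def term_author_weights(docs, authors):
    """Map each term to {author: weight} from tokenized training documents.

    Assumption (not from the paper): a term's raw weight for an author is the
    sum of its relative frequencies over that author's documents; each term's
    weights are then normalized to sum to 1 across authors.
    """
    weights = defaultdict(lambda: defaultdict(float))
    for tokens, author in zip(docs, authors):
        counts = Counter(tokens)
        total = sum(counts.values())
        for term, count in counts.items():
            weights[term][author] += count / total
    for per_author in weights.values():
        norm = sum(per_author.values())
        for author in per_author:
            per_author[author] /= norm
    return weights


def document_author_vector(tokens, weights, author_list):
    """Project a tokenized document into the author space.

    Each dimension is the sum, over the document's known terms, of the term's
    relative frequency times its weight for that author. Unseen terms are skipped.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    vector = [0.0] * len(author_list)
    for term, count in counts.items():
        per_author = weights.get(term)
        if per_author is None:
            continue
        for i, author in enumerate(author_list):
            vector[i] += (count / total) * per_author.get(author, 0.0)
    return vector


# Toy data, invented purely for illustration.
train_docs = [
    ["the", "sea", "was", "calm"],
    ["the", "market", "fell"],
    ["calm", "sea", "calm", "wind"],
]
train_authors = ["alice", "bob", "alice"]

weights = term_author_weights(train_docs, train_authors)
author_list = sorted(set(train_authors))
vec = document_author_vector(["calm", "sea", "market"], weights, author_list)
print(dict(zip(author_list, vec)))
```

In this sketch the vector has one dimension per candidate author, so its length does not grow with the vocabulary. The resulting vectors could then be passed to any standard classifier.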
Year
2012
DOI
10.1007/978-3-642-31149-9_29
Venue
MCPR
Keywords
c50 corpus, novel representation, authorship attribution, better accuracy result, text categorization, concise semantic analysis, conventional approach, aa task, new document author representation, document author, proposed representation
Field
Information retrieval, Computer science, Document representation, Attribution, Natural language processing, Artificial intelligence, Text categorization
DocType
Conference
Citations
4
PageRank
0.49
References
16
Authors
5