Title
An Empirical Evaluation of SVM on Meta Features for Authorship Attribution of Online Texts.
Abstract
Authorship attribution (AA) has been studied by many researchers. Recently, with the spread of online texts, authorship attribution of online texts has started to receive a great deal of attention. The essence of this problem is to identify a set of features that can capture the writing style of an author. However, previous studies on feature identification mainly used statistical methods and conducted experiments on small data sets, i.e., fewer than 10. This scale is far from the real application of AA of online texts. In addition, due to the special characteristics of online texts, statistical approaches are rarely used for this problem. As the performance of authorship identification depends highly on the combination of the features used and the classification methods, the feature sets for traditional authorship attribution need to be re-examined using machine learning approaches. In this paper, we evaluate the effectiveness of six types of meta features on two public data sets with SVM, a well-established machine learning technique. The experimental results show that lexical and syntactic features are the most promising features for AA of online texts. Furthermore, our experiments reveal a number of interesting findings regarding the impact of different types of features on authorship attribution. © 2013 Springer International Publishing.
Year
2013
DOI
10.1007/978-3-319-03844-5_4
Venue
MIKE
Keywords
authorship attribution of online texts, comparative evaluation, meta features
Field
Data set, Small data, Computer science, Support vector machine, Writing style, Attribution, Artificial intelligence, Natural language processing, Syntax
DocType
Conference
Volume
8284 LNAI
Issue
null
ISSN
1611-3349
Citations
0
PageRank
0.34
References
21
Authors
5
Name          Order  Citations  PageRank
Hongwei Yao   1      0          0.68
Tieyun Qian   2      177        28.81
Li Chen       3      94         19.59
Manyun Qian   4      0          0.68
Xueyu Mo      5      0          0.34