Title
Authorship Identification for Online Text
Abstract
Authorship identification for online text such as blogs and e-books is a challenging problem because these documents do not contain a considerable amount of content. Identification is therefore much harder than for other documents such as books and reports. The paper investigates which features are suitable for such texts and the classifier accuracy they yield. Syntactic features are found to work well for large data sets, whereas lexical features work well for small data sets. The results can be used to customize and further improve authorship detection techniques according to the characteristics of the writing samples.
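The distinction between lexical and syntactic features can be illustrated with a minimal Python sketch. This is not the feature set used in the paper, which the abstract does not specify. It assumes two common lexical measures (average word length and type-token ratio) and uses function-word frequencies as a rough stand-in for syntactic style. The names `FUNCTION_WORDS`, `tokenize`, `lexical_features`, and `function_word_features` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, illustrative function-word list; not taken from the paper.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in", "that", "is", "it", "for")


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text):
    """Return average word length and type-token ratio (assumed lexical features)."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        return {"avg_word_len": 0.0, "type_token_ratio": 0.0}
    return {
        "avg_word_len": sum(len(t) for t in tokens) / n,
        "type_token_ratio": len(set(tokens)) / n,
    }


def function_word_features(text):
    """Return the relative frequency of each function word (a rough syntactic proxy)."""
    tokens = tokenize(text)
    n = len(tokens)
    counts = Counter(tokens)
    return {w: (counts[w] / n if n else 0.0) for w in FUNCTION_WORDS}
```

Feature dictionaries like these could be computed per writing sample and passed to any standard classifier. Which feature family performs better would then depend on how much text is available per author, as the abstract reports.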
Year
2010
DOI
10.1109/CW.2010.50
Venue
Cyberworlds
Keywords
lexical feature, online text, authorship identification, large data set, authorship detection technique, considerable amount, challenging problem, small data set, syntactic feature, classifier accuracy, databases, feature extraction, classification, writing, data mining, text analysis, accuracy
Field
Text mining, Data set, Small data, Information retrieval, Computer science, Feature extraction, Classifier (linguistics), Syntax, Vocabulary
DocType
Conference
ISBN
978-0-7695-4215-7
Citations
11
PageRank
0.69
References
7
Authors
2
Name                     Order    Citations    PageRank
Richmond Hong Rui Tan    1        11           0.69
Flora S. Tsai            2        352          3.96