Title
Using Complex Networks to Quantify Consistency in the Use of Words
Abstract
In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
Year
2013
DOI
10.1088/1742-5468/2012/01/P01004
Venue
Journal of Statistical Mechanics: Theory and Experiment
Keywords
data mining (experiment), pattern formation (experiment), random graphs, networks, communication, supply and information networks
Field
Word usage, Data mining, Zipf's law, Random graph, Quantum mechanics, Emotion recognition, Complex network, Artificial intelligence, Natural language processing, Ambiguity, Ranking, Age of Acquisition, Mathematics
DocType
Journal
Volume
abs/1302.4107
Issue
01
ISSN
1742-5468
Citations
5
PageRank
0.82
References
14
Authors
3
Name, Order, Citations, PageRank
Diego R. Amancio, 1, 352, 29.53
Osvaldo N. Oliveira Jr., 2, 247, 17.25
Luciano da Fontoura Costa, 3, 542, 63.09