Title
The aboutness of words.
Abstract
Word aboutness is defined as the relationship between words and the subjects associated with them. An aboutness coefficient is developed to estimate the strength of this relationship. Words that are randomly distributed across subjects are assumed to lack aboutness, and the degree to which a word's usage deviates from a random pattern indicates the strength of its aboutness. To estimate aboutness, title words and their associated subjects are extracted from the titles of nonfiction English-language books in the OCLC WorldCat database. The usage patterns of the title words are analyzed and used to compute an aboutness coefficient for each common title word. Words with low aboutness coefficients (An and In) are commonly found in stop word lists, whereas words with high aboutness coefficients (Carbonate, Autism) are unambiguous and have a strong subject association. The aboutness coefficient can potentially enhance indexing, advance authority control, and improve retrieval.
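The abstract does not state the formula for the aboutness coefficient, so the following is only a minimal sketch of the general idea: score each word by how far its distribution across subjects departs from the overall distribution. The function name `aboutness_scores`, the `min_count` parameter, the toy records, and the choice of total variation distance as the deviation measure are all assumptions made for illustration; none of them is taken from the paper.

```python
from collections import Counter, defaultdict


def aboutness_scores(records, min_count=2):
    """Score words by deviation from the baseline subject distribution.

    records: iterable of (title_words, subject) pairs.
    Returns {word: score}, where 0.0 means the word's subject distribution
    matches the baseline and larger values mean it is more concentrated.
    NOTE: total variation distance is an illustrative stand-in, not the
    coefficient defined in the paper.
    """
    subject_totals = Counter()           # word occurrences per subject
    word_subject = defaultdict(Counter)  # per-word occurrences per subject

    for words, subject in records:
        for word in words:
            word = word.lower()
            word_subject[word][subject] += 1
            subject_totals[subject] += 1

    total = sum(subject_totals.values())
    scores = {}
    for word, counts in word_subject.items():
        n = sum(counts.values())
        if n < min_count:
            # Rare words give unreliable estimates; skip them.
            continue
        # Total variation distance between the word's subject distribution
        # and the baseline distribution over all word occurrences.
        scores[word] = 0.5 * sum(
            abs(counts.get(s, 0) / n - subject_totals[s] / total)
            for s in subject_totals
        )
    return scores


# Hypothetical toy data, invented for illustration (not WorldCat records).
toy_records = [
    (["an", "autism"], "Psychology"),
    (["an", "carbonate"], "Geology"),
    (["an", "autism"], "Psychology"),
    (["an", "carbonate"], "Geology"),
]
scores = aboutness_scores(toy_records)
```

In this toy data, "an" appears evenly under both subjects and receives the lowest score, while "autism" and "carbonate" each appear under a single subject and score higher, which mirrors the contrast the abstract draws between stop words and subject-specific words.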
Year
2017
DOI
10.1002/asi.23856
Venue
Journal of the Association for Information Science and Technology
Field
English language, Random pattern, Information retrieval, Computer science, Search engine indexing, Aboutness, Authority control, Natural language processing, Artificial intelligence, Stop words
DocType
Journal
Volume
68
Issue
10
ISSN
2330-1635
Citations
1
PageRank
0.36
References
7
Authors
3
Name                 Order  Citations  PageRank
Edward T. O'Neill    1      69         11.99
Kerre A. Kammerer    2      1          0.36
Rick Bennett         3      1          0.36