Title
Ranking concrete and abstract words using Google Books Ngram data
Abstract
Creating dictionaries of abstract and concrete words is a well-known task. Such dictionaries are important in several applications of text analysis and computational linguistics. Usually, assembling concreteness scores for words begins with a large amount of manual work. However, the process can be automated to a significant degree using information from large corpora. In this paper we combine two datasets, a dictionary with concreteness scores for 40,000 English words and the Google Books Ngram dataset, in order to test the following hypothesis: in text, concrete words tend to occur with concrete words more often than with abstract words (and, conversely, abstract words tend to occur with abstract words more often than with concrete words). Based on this hypothesis, we propose a method for automatically estimating the concreteness scores of words from a small amount of initial annotation.
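The hypothesis in the abstract can be illustrated with a minimal sketch. This is not the authors' method, whose details the abstract does not give. It assumes that a word's score can be approximated by the count-weighted mean of the scores of its already-scored bigram neighbours. The function name `propagate_concreteness`, the toy bigram counts, and the seed scores are all hypothetical.

```python
from collections import defaultdict


def propagate_concreteness(bigram_counts, seed_scores):
    """Estimate scores for unscored words from their scored bigram neighbours.

    bigram_counts: dict mapping (left_word, right_word) -> corpus frequency.
    seed_scores:   dict mapping word -> known concreteness score.
    Returns a dict mapping each unscored word that has at least one scored
    neighbour to the count-weighted mean of those neighbours' scores.
    (Assumed aggregation rule, chosen only for illustration.)
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for (left, right), count in bigram_counts.items():
        # Treat the bigram symmetrically: each word is a neighbour of the other.
        for word, neighbour in ((left, right), (right, left)):
            if word not in seed_scores and neighbour in seed_scores:
                weighted_sum[word] += count * seed_scores[neighbour]
                total_weight[word] += count
    return {w: weighted_sum[w] / total_weight[w]
            for w in weighted_sum if total_weight[w] > 0}


# Hypothetical toy data: a few seed words with made-up scores
# (higher = more concrete) and made-up bigram frequencies.
seed_scores = {"table": 4.9, "chair": 4.6, "idea": 1.6, "freedom": 1.5}
bigram_counts = {
    ("wooden", "table"): 30,
    ("wooden", "chair"): 10,
    ("abstract", "idea"): 20,
    ("abstract", "freedom"): 5,
    ("abstract", "table"): 5,
}

estimates = propagate_concreteness(bigram_counts, seed_scores)
for word, score in sorted(estimates.items()):
    print(f"{word}: {score:.3f}")
```

In this toy setting "wooden" receives a higher estimate than "abstract" because its neighbours are mostly concrete seed words. A real experiment would read bigram frequencies from the Google Books Ngram files and compare the estimates against held-out dictionary scores.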
Year
2020
DOI
10.3233/JIFS-179886
Venue
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS
Keywords
Concreteness of words, bigrams, dictionary
DocType
Journal
Volume
39
Issue
2
ISSN
1064-1246
Citations
0
PageRank
0.34
References
0
Authors
2
Name
Order
Citations
PageRank
Vladimir Ivanov13011.48
Valery Solovyev23810.57