Title
Brief Communication: Adjacency and proximity searching in the Science Citation Index and Google
Abstract
We have developed simple algorithms that allow adjacency and proximity searching in Google and the Science Citation Index (SCI). The SCI algorithm exploits the fact that SCI stopwords in a search phrase function as placeholders. Such a phrase serves effectively as a fixed adjacency condition determined by the number n of adjacent stopwords (i.e. retrieve all records where word A and word B are separated by n words in at least one location). The algorithm integrates over search phrases with different numbers of adjacent stopwords to provide a flexible adjacency or proximity capability (i.e. retrieve all records where word A and word B are separated by n or fewer words in at least one location, where n is the maximum separation desired between A and B). The Google algorithm exploits the fact that asterisks separating words in a Google phrase function like word wildcards. The difference between two such phrases (the first containing one fewer asterisk than the second) serves effectively as a fixed adjacency or proximity condition, with the number of separating words equal to the number of asterisks in the first phrase. The algorithm integrates over these phrase differentials to provide the same flexible adjacency or proximity capability (i.e. retrieve all records where word A and word B are separated by n or fewer words in at least one location).
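The query construction described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, "the" is assumed to be an SCI stopword, the Google exclusion is assumed to be written with a leading minus sign, and word order is fixed (A before B). The abstract does not say how the individual differentials are submitted, so the Google helper returns one query string per separation.

```python
def sci_proximity_query(a, b, n, stopword="the"):
    """Build one SCI query: OR over phrases with 0..n stopword placeholders.

    Assumption: `stopword` is treated by SCI as a one-word placeholder.
    """
    phrases = []
    for k in range(n + 1):
        words = [a] + [stopword] * k + [b]
        phrases.append('"' + " ".join(words) + '"')
    return " OR ".join(phrases)


def google_phrase(a, b, k):
    """Quoted phrase with k asterisks (word wildcards) between a and b."""
    return '"' + " ".join([a] + ["*"] * k + [b]) + '"'


def google_proximity_queries(a, b, n):
    """Build one phrase differential per separation k = 0..n.

    Each differential is the phrase with k asterisks minus the phrase
    with k + 1 asterisks (assumed exclusion syntax: leading '-').
    """
    return [
        f"{google_phrase(a, b, k)} -{google_phrase(a, b, k + 1)}"
        for k in range(n + 1)
    ]


if __name__ == "__main__":
    print(sci_proximity_query("carbon", "nanotube", 2))
    for query in google_proximity_queries("carbon", "nanotube", 1):
        print(query)
```

Running each generated Google query and merging the results corresponds to the "integration" over phrase differentials; for SCI, the single OR query plays the same role.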
Year
2006
DOI
10.1177/0165551506067126
Venue
J. Information Science
Keywords
word A, adjacent stopwords, Brief Communication Adjacency, fewer word, proximity capability, word B, by n word, separated by n, maximum separation, Science Citation Index, phrase differential, flexible adjacency
DocType
Journal
Volume
32
Issue
6
ISSN
0165-5515
Citations
2
PageRank
0.41
References
1
Authors
3
Name                 Order   Citations   PageRank
Ronald N. Kostoff    1       338         38.42
John T. Rigsby       2       25          2.98
Ryan B. Barth        3       6           0.85