Title
Node similarity in networked information spaces
Abstract
Networked information spaces contain information entities, corresponding to nodes, which are connected by associations, corresponding to links in the network. Examples of networked information spaces are the World Wide Web, where information entities are web pages and associations are hyperlinks, and the scientific literature, where information entities are articles and associations are references to other articles. Similarity between information entities in a networked information space can be defined not only based on the content of the information entities, but also based on the connectivity established by the associations present. This paper explores the definition of similarity based on connectivity only, and proposes several algorithms for this purpose. Our metrics take advantage of the local neighbourhoods of the nodes in the networked information space. Therefore, explicit availability of the networked information space is not required, as long as a query engine is available for following links and extracting the necessary local neighbourhoods for similarity estimation. Two variations of similarity estimation between two nodes are described, one based on the separate local neighbourhoods of the nodes, and another based on the joint local neighbourhood expanded from both nodes at the same time. The algorithms are implemented and evaluated on the citation graph of computer science. The immediate application of this work is in finding papers similar to a given paper in a digital library, but the algorithms are also applicable to other networked information spaces, such as the Web.
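The abstract does not state the similarity formula itself, so the following is only a minimal sketch of the separate-local-neighbourhood idea under stated assumptions: links are followed through a hypothetical `get_links` callback (standing in for the query engine), neighbourhoods are grown breadth-first to a fixed radius, and the overlap is scored with a Jaccard ratio, which is a stand-in metric and not necessarily the one used in the paper. The function names and the toy graph are illustrative.

```python
from collections import deque


def local_neighbourhood(get_links, start, radius):
    """Return all nodes within `radius` link-hops of `start` (breadth-first)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == radius:
            continue  # do not expand beyond the chosen radius
        for nxt in get_links(node):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return seen


def separate_similarity(get_links, a, b, radius=1):
    """Jaccard overlap of two separately grown neighbourhoods (assumed metric)."""
    na = local_neighbourhood(get_links, a, radius) - {a, b}
    nb = local_neighbourhood(get_links, b, radius) - {a, b}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Hypothetical toy citation graph; links are listed in both directions,
# i.e. the sketch treats associations as undirected (an assumption).
toy_graph = {
    "p1": ["r1", "r2", "r3"],
    "p2": ["r2", "r3", "r4"],
    "p3": ["r5"],
    "r1": ["p1"], "r2": ["p1", "p2"], "r3": ["p1", "p2"],
    "r4": ["p2"], "r5": ["p3"],
}


def get_links(node):
    """Stand-in for a query engine that returns the links of one node."""
    return toy_graph.get(node, [])


sim_12 = separate_similarity(get_links, "p1", "p2")  # share r2 and r3
sim_13 = separate_similarity(get_links, "p1", "p3")  # share nothing
```

Because only `get_links` touches the graph, the sketch never needs the whole network in memory, which mirrors the abstract's point about working from a query engine. A joint-neighbourhood variant could presumably seed the frontier with both nodes at once, but the abstract gives too little detail to sketch it faithfully.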
Year
2001
Venue
CASCON
Keywords
citation graph,local neighborhood,joint local neighbourhood,networked information space,separate local neighbourhood,information entity,computer science,node similarity,world wide web,necessary local neighbourhood,similarity estimation,digital libraries,web pages,digital library
Field
Scientific literature,Web page,Information retrieval,Computer science,Neighbourhood (mathematics),Information space,Hyperlink,Citation graph,Digital library
DocType
Conference
Citations
22
PageRank
1.87
References
7
Authors
4
Name, Order, Citations, PageRank
Wangzhong Lu, 1, 54, 3.17
Jeannette Janssen, 2, 295, 32.23
Evangelos Milios, 3, 3073, 360.46
Nathalie Japkowicz, 4, 2581, 182.43