Title
WordRank: A Method for Ranking Web Pages Based on Content Similarity
Abstract
This paper presents WordRank, a new page ranking system that exploits the similarity between interconnected pages. WordRank introduces the model of the 'biased surfer', which is based on the following assumption: "the visitor of a Web page tends to visit Web pages with similar content rather than content irrelevant pages". The algorithm modifies the random surfer model by biasing the probability that a user follows a link toward links that lead to pages with similar content. Our intuition is that WordRank is most appropriate for topic-based searches, since it prioritizes strongly interconnected pages and at the same time is more robust to the multitude of topics and to the noise produced by navigation links. This paper presents preliminary experimental evidence from a search engine we developed for the Greek fragment of the World Wide Web. For evaluation purposes, we introduce a new metric (SI score), which is based on implicit user feedback, and we also employ explicit evaluation where available.
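The abstract does not state the transition formula, so the following is only a minimal sketch of how a 'biased surfer' could be realized, not the authors' implementation. It assumes bag-of-words cosine similarity as the content-similarity measure and makes the probability of following a link proportional to that similarity plus a small constant. The function names, the `smoothing` parameter, and the example pages are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def biased_surfer_rank(links, texts, damping=0.85, smoothing=0.1, iterations=50):
    """Power iteration in which similar pages are more likely link targets.

    links: dict mapping a page to the list of pages it links to
    texts: dict mapping a page to its text
    smoothing: assumed constant that keeps dissimilar links reachable
    """
    pages = sorted(texts)
    bags = {p: Counter(texts[p].lower().split()) for p in pages}

    # Weight each outlink by (smoothing + similarity), then normalize so the
    # outgoing probabilities of every page sum to 1.
    transition = {}
    for src in pages:
        targets = [t for t in links.get(src, []) if t in bags]
        weights = {t: smoothing + cosine_similarity(bags[src], bags[t]) for t in targets}
        total = sum(weights.values())
        transition[src] = {t: w / total for t, w in weights.items()} if total > 0 else {}

    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread uniformly.
        dangling = sum(rank[p] for p in pages if not transition[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for src in pages:
            for dst, prob in transition[src].items():
                new_rank[dst] += damping * rank[src] * prob
        rank = new_rank
    return rank


# Hypothetical three-page example: "b" resembles "a", "c" is a navigation page.
links = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
texts = {
    "a": "greek islands travel guide",
    "b": "travel guide greek islands ferry",
    "c": "site map login contact",
}
ranks = biased_surfer_rank(links, texts)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

With uniform weights on every outlink this reduces to the ordinary random surfer model. Under the assumptions above, the similarity weighting shifts rank from the navigation-style page "c" to the topically related page "b".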
Year
2007
DOI
10.1109/BNCOD.2007.24
Venue
BNCOD Workshops
Keywords
web page, similar content, evaluation purpose, content irrelevant page, ranking web pages, si score, new page, greek fragment, implicit user, explicit evaluation, content similarity, random surfer model, search engine, navigation, information retrieval, probability, databases, search engines, web pages, random processes, feedback
Field
Static web page, Data mining, HITS algorithm, Web page, Computer science, Page view, Printer-friendly, World Wide Web, Ranking, Information retrieval, Exploit, Visitor pattern, Database
DocType
Conference
ISBN
0-7695-2912-7
Citations
3
PageRank
0.43
References
16
Authors
3
Name | Order | Citations | PageRank
Apostolos Kritikopoulos | 1 | 73 | 5.84
Martha Sideri | 2 | 409 | 46.17
Iraklis Varlamis | 3 | 503 | 52.08