Title
A framework to compute page importance based on user behaviors
Abstract
This paper presents a framework for computing the importance of webpages from the real browsing behaviors of Web users. In contrast, many previous approaches, such as PageRank, compute page importance from the hyperlink graph of the Web. Recently, researchers have realized that the hyperlink graph is incomplete and inaccurate as a data source for determining page importance, and have proposed using the real behaviors of Web users instead. In this paper, we propose a formal framework for computing page importance from user behavior data, which covers some previous work as special cases. First, we use a stochastic process to model the browsing behaviors of Web users. Based on an analysis of hundreds of millions of real user-behavior records, we justify treating the process as a continuous-time time-homogeneous Markov process, whose stationary probability distribution can be used as the measure of page importance. Second, we propose a number of ways to estimate the parameters of the stochastic process from real data, which yield a group of algorithms for page importance computation (all referred to as BrowseRank). Our experimental results show that the proposed algorithms can outperform baseline methods such as PageRank and TrustRank in several tasks, demonstrating the advantage of the proposed framework.
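To make the central idea of the abstract concrete, the following is a minimal sketch of how a stationary distribution of a continuous-time Markov process can serve as an importance score. It is not the paper's BrowseRank implementation. It relies on a standard result for irreducible continuous-time Markov chains: the stationary probability of a state is proportional to the stationary probability of the embedded jump chain multiplied by the mean staying time in that state. The function names, the toy transition matrix `P`, and the staying times `mean_stay` are all hypothetical, and the parameter estimation from real browsing logs that the abstract mentions is not shown.

```python
def embedded_stationary(P, iters=1000, tol=1e-12):
    """Power iteration for the stationary distribution of a
    row-stochastic transition matrix P (list of lists)."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of pi <- pi * P
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def ctmc_importance(P, mean_stay):
    """Stationary distribution of a continuous-time Markov chain,
    given its embedded jump chain P and the mean staying time per state.
    Each state's jump-chain probability is weighted by its staying time,
    then the result is normalized to sum to 1."""
    pi_jump = embedded_stationary(P)
    weighted = [p * t for p, t in zip(pi_jump, mean_stay)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical toy data: three pages, transition probabilities between
# them, and average staying time (in seconds) on each page.
P = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [1.0, 0.0, 0.0],
]
mean_stay = [10.0, 5.0, 20.0]

scores = ctmc_importance(P, mean_stay)
for page, score in enumerate(scores):
    print(f"page {page}: {score:.3f}")
```

In this toy example, page 2 is visited less often than page 1 in the jump chain, but its longer staying time gives it a higher final score, which illustrates why a continuous-time model can rank pages differently from a purely transition-based one.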
Year: 2010
DOI: 10.1007/s10791-009-9098-8
Venue: Inf. Retr.
Keywords: User browsing process, Continuous-time time-homogeneous Markov process, Staying time, BrowseRank
Field: Data mining, PageRank, Markov process, Web mining, Information retrieval, Web page, TrustRank, Computer science, Stochastic process, Hyperlink, Computation
DocType: Journal
Volume: 13
Issue: 1
ISSN: 1386-4564
Citations: 14
PageRank: 0.75
References: 17
Authors: 5
Name | Order | Citations | PageRank
Yu-ting Liu | 1 | 188 | 8.95
Tie-yan Liu | 2 | 4662 | 256.32
Bin Gao | 3 | 664 | 33.95
Zhi-Ming Ma | 4 | 227 | 18.26
Hang Li | 5 | 6294 | 317.05