Title
Exploiting PageRank at Different Block Level
Abstract
In recent years, information retrieval methods based on link analysis have been developed; PageRank and HITS are two typical ones. According to the hierarchical organization of Web pages, we can partition the Web graph into blocks at different levels, such as page level, directory level, host level and domain level. On the basis of these blocks, we can analyze the different hyperlinks among pages. Several approaches have proposed that intra-hyperlinks within a host may be less useful in computing PageRank. However, there are no reports on how concretely intra- or inter-hyperlinks affect PageRank. Furthermore, depending on the block level, inter-hyperlink and intra-hyperlink are relative concepts. Thus, which level is optimal for distinguishing intra- from inter-hyperlinks? And how should the ratio between the weights of intra-hyperlinks and inter-hyperlinks be set to ultimately improve the performance of the PageRank algorithm? In this paper, we analyze the link distribution at the different block levels and evaluate the importance of intra- and inter-hyperlinks to PageRank on the TREC Web Track data set. Experiments show that retrieval achieves the best performance when the block is set at host level and the ratio of the weights of intra-hyperlinks to inter-hyperlinks is 1:4.
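The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the weighting scheme here (each out-link receives a share of its source's rank proportional to its weight) is an assumption, as are the function names `block_key` and `weighted_pagerank`, the damping factor 0.85, and the naive "last two labels" rule for the domain level. Only the four block levels and the 1:4 intra-to-inter weight ratio at host level come from the abstract.

```python
from urllib.parse import urlsplit


def block_key(url, level="host"):
    """Map a URL to the block it belongs to at the given level (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if level == "page":
        return host + parts.path
    if level == "directory":
        # Everything up to and including the last "/" of the path.
        return host + parts.path.rsplit("/", 1)[0] + "/"
    if level == "host":
        return host
    if level == "domain":
        # Naive assumption: the domain is the last two labels of the host name.
        return ".".join(host.split(".")[-2:])
    raise ValueError(f"unknown level: {level}")


def weighted_pagerank(links, level="host", w_intra=1.0, w_inter=4.0,
                      damping=0.85, iters=50):
    """Power-iteration PageRank with different weights for intra- and inter-block links."""
    pages = sorted({page for edge in links for page in edge})
    n = len(pages)

    # Weighted out-links: a link is "intra" when both ends fall in the same block.
    out = {page: {} for page in pages}
    for src, dst in links:
        same_block = block_key(src, level) == block_key(dst, level)
        weight = w_intra if same_block else w_inter
        out[src][dst] = out[src].get(dst, 0.0) + weight

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iters):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[page] for page in pages if not out[page])
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n
                    for page in pages}
        for src, targets in out.items():
            total = sum(targets.values())
            for dst, weight in targets.items():
                new_rank[dst] += damping * rank[src] * weight / total
        rank = new_rank
    return rank


# Toy graph with made-up URLs: two pages on one host, one page on another.
links = [
    ("http://a.example.com/1", "http://a.example.com/2"),
    ("http://a.example.com/1", "http://b.example.org/1"),
    ("http://a.example.com/2", "http://a.example.com/1"),
    ("http://b.example.org/1", "http://a.example.com/1"),
]
ranks = weighted_pagerank(links, level="host", w_intra=1.0, w_inter=4.0)
for page, score in sorted(ranks.items()):
    print(page, round(score, 4))
```

With the 1:4 setting, the inter-host link from `a.example.com/1` passes on four times as much rank as its intra-host link, so the page on the other host ends up ranked above the sibling page on the same host. Setting both weights to the same value reduces the sketch to ordinary PageRank on this graph.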
Year: 2004
DOI: 10.1007/978-3-540-30480-7_26
Venue: Lecture Notes in Computer Science
Keywords: web pages, information retrieval, link analysis
Field: Information system, Data mining, PageRank, Web page, Link analysis, Directory, Computer science, Hyperlink, Database, The Internet, Hierarchical organization
DocType: Conference
Volume: 3306
ISSN: 0302-9743
Citations: 11
PageRank: 0.66
References: 14
Authors: 6
Name            Order  Citations  PageRank
Xue-Mei Jiang   1      25         4.03
Gui-rong Xue    2      2728       126.58
Wen-Guan Song   3      18         1.80
Hua-Jun Zeng    4      1999       100.54
Zheng Chen      5      5019       256.89
Wei-ying Ma     6      14587      1003.11