Title
Constructing good quality web page communities
Abstract
The World Wide Web is a rich source of information and continues to expand in size and complexity. Capturing the features of the Web at a higher level, so as to support information classification and efficient retrieval, is becoming a challenging task. One natural approach is to exploit the linkage information among Web pages. Previous work in this area, such as HITS, starts from a set of retrieved pages and derives a Web community, that is, a group of pages related to the query topics. Since the set of retrieved pages may contain many pages unrelated to the query topics (noise pages), the resulting Web community is sometimes unsatisfactory. In this paper, we propose an innovative algorithm that eliminates noise pages from the set of retrieved pages and improves its quality. This improvement enables existing community construction algorithms to construct good quality Web page communities. The proposed algorithm reveals and takes advantage of the relationships among the concerned Web pages at a deeper level. Numerical experiment results show the effectiveness and feasibility of the algorithm. The algorithm could also be used on its own to filter unnecessary Web pages, reducing the management cost and burden of Web-based data management systems. The ideas in the algorithm can also be applied to other hyperlink analysis.
Year: 2002
Venue: Australasian Database Conference
Keywords: good quality web page, unnecessary web page, existing community construction algorithm, web community, query topic, concerned web page, noise page, web page, innovative algorithm, world wide web
DocType: Conference
ISBN: 0-909925-83-6
Citations: 14
PageRank: 0.76
References: 7
Authors: 2
Name            Order   Citations   PageRank
Jingyu Hou      1       181         16.93
Yanchun Zhang   2       3059        284.90