Title | ||
---|---|---|
An Advanced Partitioning Approach of Web Page Clustering utilizing Content & Link Structure |
Abstract | ||
---|---|---|
Clustering non-homogeneous documents has become both an increasing challenge and an opportunity with the rapid growth of the World Wide Web. As the volume of information on the WWW increases, it has become difficult to retrieve the desired information without proper clustering of Web pages. Several new ideas have been proposed in recent years; among them, the partitioning approach is still widely used for its simplicity. This paper proposes a partitioning approach that clusters Web pages based on both the hyperlink structure of the pages and their content. Partitioning approaches such as K-medoid and K-means require the number of clusters a priori, and their performance has been observed to depend on the initial selection of the cluster centroids. Both problems are solved by the approach proposed in this paper. Because the centroids are chosen by the HITS algorithm, the proposed approach yields better results than the K-medoid partitioning approach, and the experimental results support this. |
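
The idea in the abstract, seeding a partitioning step with pages ranked by HITS and then grouping pages by content, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`hits`, `cosine`, `cluster`), the plain-dictionary term vectors, the cosine similarity measure, and the single assignment pass are all assumptions made for illustration. The abstract also does not say how the number of clusters is determined, so the sketch simply takes `k` as a parameter.

```python
import math

def hits(links, iterations=50):
    """Iterative HITS. `links` maps a page to the list of pages it links to.
    Returns (authority, hub) score dictionaries, L2-normalised each round."""
    pages = set(links) | {q for targets in links.values() for q in targets}
    auth = {p: 1.0 for p in pages}
    hub = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                new_auth[q] += hub[p]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {p: v / norm for p, v in new_auth.items()}
        # Hub update: sum the authority scores of pages linked to.
        new_hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {p: v / norm for p, v in new_hub.items()}
    return auth, hub

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dictionaries."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(links, term_vectors, k):
    """Assumed scheme: take the k highest-authority pages as seeds, then
    assign every page to the seed whose content is most similar."""
    auth, _ = hits(links)
    candidates = [p for p in auth if p in term_vectors]
    seeds = sorted(candidates, key=lambda p: (-auth[p], p))[:k]
    clusters = {s: [] for s in seeds}
    for page, vec in term_vectors.items():
        best = max(seeds, key=lambda s: cosine(vec, term_vectors[s]))
        clusters[best].append(page)
    return clusters
```

Seeding from link-based authority scores rather than from random picks makes the starting centroids deterministic for a given link graph, which is the sensitivity to initial selection that the abstract highlights.
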
Year | DOI | Venue |
---|---|---|
2009 | 10.4156/jcit.vol4.issue3.9 | JCIT |
Keywords | DocType | Volume |
---|---|---|
hits algorithm,k-medoid,clustering | Journal | 4 |
Issue | Citations | PageRank |
---|---|---|
3 | 9 | 0.69 |
References | Authors | |
---|---|---|
5 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ruma Dutta | 1 | 38 | 4.87 |
Indranil Ghosh | 2 | 12 | 1.83 |
Anirban Kundu | 3 | 75 | 15.44 |
Debajyoti Mukhopadhyay | 4 | 172 | 38.42 |