Title
A Novel Approach to Cluster Web Traversal Patterns Based on Edit Distance
Abstract
Edit distance is well suited as a similarity measure between user traversal patterns: it can be computed between symbol strings of different lengths, so it handles variable-length traversal sequences directly, and it does so at low time and storage cost. Moreover, web topology is used to compute the relationship between pages, and this relationship serves as the cost of an edit operation. Finally, a two-threshold sequential clustering method (TTSCM) is used to cluster user traversal patterns, which avoids specifying the number of clusters in advance and reduces the dependence of the clustering results on the order in which traversal patterns are processed. Experimental results verify the effectiveness and flexibility of the proposed methods.
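The abstract does not give the cost function or the details of TTSCM, so the following Python sketch only illustrates the general idea under stated assumptions. The substitution cost (link-hop distance between two pages, capped and normalized), the use of an undirected link graph, the nearest-member distance to a cluster, and all function and parameter names are hypothetical choices made for illustration. The clustering loop follows the generic textbook two-threshold sequential scheme, not necessarily the authors' exact method.

```python
from collections import deque


def hop_distances(graph, start):
    """Breadth-first search: link hops from start to every reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist


def make_substitution_cost(links, max_hops=4):
    """Assumed cost: pages closer in the site graph are cheaper to substitute."""
    # Treat links as undirected so the cost is symmetric (an assumption).
    graph = {}
    for page, targets in links.items():
        for target in targets:
            graph.setdefault(page, set()).add(target)
            graph.setdefault(target, set()).add(page)
    cache = {}

    def cost(a, b):
        if a == b:
            return 0.0
        if a not in cache:
            cache[a] = hop_distances(graph, a)
        # Unreachable or distant pages are capped at max_hops, giving cost 1.0.
        return min(cache[a].get(b, max_hops), max_hops) / max_hops

    return cost


def edit_distance(s, t, sub_cost, indel_cost=1.0):
    """Weighted edit distance between two page sequences of any lengths."""
    prev = [j * indel_cost for j in range(len(t) + 1)]
    for i, a in enumerate(s, start=1):
        curr = [i * indel_cost]
        for j, b in enumerate(t, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete a
                curr[j - 1] + indel_cost,      # insert b
                prev[j - 1] + sub_cost(a, b),  # substitute a with b
            ))
        prev = curr
    return prev[-1]


def two_threshold_clustering(patterns, dist, low, high):
    """Generic two-threshold sequential scheme; returns clusters of indices."""
    clusters = []
    pending = list(range(len(patterns)))
    while pending:
        deferred, progressed = [], False
        for idx in pending:
            if not clusters:
                clusters.append([idx])
                progressed = True
                continue
            # Distance to a cluster = distance to its nearest member (assumption).
            d, best = min(
                (min(dist(patterns[idx], patterns[m]) for m in members), k)
                for k, members in enumerate(clusters)
            )
            if d <= low:
                clusters[best].append(idx)   # clearly close: join the cluster
                progressed = True
            elif d > high:
                clusters.append([idx])       # clearly far: start a new cluster
                progressed = True
            else:
                deferred.append(idx)         # ambiguous: decide in a later pass
        if not progressed:
            # Nothing could be decided; force one new cluster to guarantee progress.
            clusters.append([deferred.pop(0)])
        pending = deferred
    return clusters


# Hypothetical site graph and traversal patterns, for illustration only.
links = {"home": ["products", "about"], "products": ["p1", "p2"], "about": ["contact"]}
patterns = [
    ["home", "products", "p1"],
    ["home", "products", "p2"],
    ["home", "about", "contact"],
    ["home", "about"],
]
cost = make_substitution_cost(links)
clusters = two_threshold_clustering(
    patterns, lambda s, t: edit_distance(s, t, cost), low=1.0, high=1.4
)
print(clusters)
```

Deferring patterns that fall between the two thresholds is what weakens the dependence on presentation order: an ambiguous pattern is assigned only after more clusters have formed. The number of clusters is a result of the thresholds rather than an input.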
Year: 2011
DOI: 10.1007/978-3-642-24273-1_60
Venue: Communications in Computer and Information Science
Keywords: Edit distance, Clustering, Traversal Pattern, Web Topology
Field: Edit distance, Data mining, Tree traversal, Graph traversal, Similarity measure, Computer science, Theoretical computer science, Cluster analysis
DocType: Conference
Volume: 238
Issue: none listed
ISSN: 1865-0929
Citations: 0
PageRank: 0.34
References: 5
Authors: 2
Name          Order  Citations  PageRank
Xiaoqiu Tan   1      11         1.65
Miaojun Xu    2      5          0.78