Title
Top-k Similarity Join in Heterogeneous Information Networks
Abstract
As a newly emerging network model, heterogeneous information networks (HINs) have received growing attention. Many data mining tasks have been explored in HINs, including clustering, classification, and similarity search. Similarity join is a fundamental operation required by many problems, and it is attracting attention from various applications on network data, such as friend recommendation, link prediction, and online advertising. Although similarity join has been well studied in homogeneous networks, it has not yet been studied in heterogeneous networks. In particular, none of the existing research on similarity join takes the different semantic meanings behind paths into consideration, and almost all of it ignores the heterogeneity and diversity of HINs. In this paper, we propose a path-based similarity join (PS-join) method that returns the top-k most similar pairs of objects based on any user-specified join path in a heterogeneous information network. We study how to prune expensive similarity computations by introducing bucket-pruning-based locality-sensitive hashing (BPLSH) indexing. Compared with the existing link-based similarity join (LS-join) method, PS-join can derive various similarity semantics. Experimental results on real data sets show the efficiency and effectiveness of the proposed approach.
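To make the problem statement concrete, the following is a minimal brute-force sketch of a top-k similarity join along one join path. It is not the PS-join algorithm and has no BPLSH pruning; the abstract does not give those details. The join path (author-paper-author), the PathSim-style score 2·M[x,y] / (M[x,x] + M[y,y]), the function name `top_k_join`, and the toy data are all assumptions chosen for illustration.

```python
import heapq
from itertools import combinations


def top_k_join(links, k):
    """Naive top-k similarity join along a symmetric two-hop join path.

    links maps each object (e.g. an author) to the set of intermediate
    objects it connects to (e.g. papers). The score is a PathSim-style
    normalized path count (an assumed measure, not taken from the paper).
    """
    scored = []
    for x, y in combinations(sorted(links), 2):
        shared = len(links[x] & links[y])      # path instances x -> p -> y
        denom = len(links[x]) + len(links[y])  # paths x -> p -> x plus y -> p -> y
        if shared == 0 or denom == 0:
            continue                           # pairs with no connecting path score 0
        scored.append((2 * shared / denom, x, y))
    # Keep only the k highest-scoring pairs.
    return heapq.nlargest(k, scored)


# Hypothetical toy network: authors and the papers they wrote.
authors = {
    "ann": {"p1", "p2"},
    "bob": {"p1", "p2", "p3"},
    "cat": {"p3"},
    "dan": {"p4"},
}

print(top_k_join(authors, 2))
```

This baseline compares every pair of objects, which is the quadratic cost that an index-based pruning scheme such as the one named in the abstract is meant to avoid. Choosing a different join path changes `links` and therefore the similarity semantics.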
Year
2015
DOI
10.1109/TKDE.2014.2373385
Venue
Knowledge and Data Engineering, IEEE Transactions
Keywords
heterogeneous network, graph, similarity join, graph theory, vectors, knowledge engineering, data engineering, indexing, similarity search, classification, data mining, database indexing, semantics, clustering
Field
Locality-sensitive hashing, Data mining, Computer science, Search engine indexing, Information engineering, Heterogeneous network, Cluster analysis, Semantics, Nearest neighbor search, Network model
DocType
Journal
Volume
27
Issue
6
ISSN
1041-4347
Citations
20
PageRank
0.69
References
31
Authors
3
Name | Order | Citations | PageRank
Yun Xiong | 1 | 136 | 26.42
Yangyong Zhu | 2 | 243 | 31.66
Philip S. Yu | 3 | 30670 | 3474.16