Title
Effective and Efficient Discovery of Top-k Meta Paths in Heterogeneous Information Networks
Abstract
<i>Heterogeneous information networks (HINs)</i>, which are typed graphs with labeled nodes and edges, have attracted tremendous interest from academia and industry. Given two HIN nodes $s$ and $t$, and a natural number $k$, we study the discovery of the $k$ most important meta paths in real time, which can be used to support friend search, product recommendation, anomaly detection, and graph clustering. In this work, we argue that the shortest path between $s$ and $t$ is not necessarily the most important path. As such, we combine several ranking functions, which are based on <i>frequency</i> and <i>rarity</i>, to redefine the unified importance function of the meta paths between $s$ and $t$. Although this importance function can capture more information, finding the top-$k$ meta paths with it is very time-consuming. Therefore, we integrate this importance function into a multi-step framework, which can efficiently filter out some impossible meta paths between $s$ and $t$. In addition, we combine a bidirectional search algorithm with this framework to further improve efficiency.
Experiments on different datasets show that our proposed method outperforms state-of-the-art algorithms in terms of effectiveness, with reasonable response time.
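To make the notion of a meta path concrete, the following is a minimal sketch and not the paper's method. It assumes a tiny hypothetical HIN (the node names, node types, and edges are invented for illustration), enumerates simple paths between $s$ and $t$ up to a length bound, groups them by their sequence of node types (the meta path), and ranks meta paths by instance count. The count is only a crude stand-in for a frequency-based score; the paper's unified importance function, rarity component, multi-step filtering, and bidirectional search are not reproduced here.

```python
from collections import Counter

# Hypothetical toy HIN: every name below is an assumption for illustration.
NODE_TYPE = {
    "alice": "Author", "bob": "Author",
    "p1": "Paper", "p2": "Paper",
    "kdd": "Venue",
}
EDGES = [
    ("alice", "p1"), ("bob", "p1"),
    ("alice", "p2"), ("bob", "p2"),
    ("p1", "kdd"), ("p2", "kdd"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def top_k_meta_paths(s, t, k, max_len=4):
    """Return the k meta paths between s and t with the most path instances.

    A meta path is represented as a tuple of node types. Instance count is
    a simple frequency proxy, not the importance function of the paper.
    """
    adj = build_adjacency(EDGES)
    counts = Counter()

    def dfs(node, path):
        if node == t and len(path) > 1:
            # Record the type sequence of this path instance.
            counts[tuple(NODE_TYPE[n] for n in path)] += 1
            return
        if len(path) - 1 >= max_len:  # path length is measured in edges
            return
        for nxt in adj.get(node, ()):
            if nxt not in path:  # simple paths only
                dfs(nxt, path + [nxt])

    dfs(s, [s])
    # Sort by descending count; ties are broken by the type tuple itself.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    for meta_path, freq in top_k_meta_paths("alice", "bob", k=2):
        print(" -> ".join(meta_path), freq)
```

In this toy graph the Author-Paper-Author and Author-Paper-Venue-Paper-Author meta paths each have two instances, so a count alone cannot separate them, which hints at why a richer importance function is useful. Exhaustive enumeration like this grows quickly with the length bound, which is the cost that pruning and bidirectional search are meant to reduce.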
Year: 2022
DOI: 10.1109/TKDE.2020.3037218
Venue: IEEE Transactions on Knowledge and Data Engineering
Keywords: Heterogeneous information networks, top-$k$, meta path
DocType: Journal
Volume: 34
Issue: 9
ISSN: 1041-4347
Citations: 0
PageRank: 0.34
References: 18
Authors: 6
Name            Order  Citations  PageRank
Zichen Zhu      1      0          0.34
Tsz Nam Chan    2      22         5.40
Reynold Cheng   3      3069       154.13
Loc Do          4      0          0.34
Zhipeng Huang   5      88         6.16
Haoci Zhang     6      7          1.46