Title
Meta-Graph Based HIN Spectral Embedding: Methods, Analyses, and Insights
Abstract
Heterogeneous information networks (HINs) have drawn significant research attention recently, owing to their power to model multi-typed, multi-relational data and to facilitate various downstream applications. Over the past decade, many algorithms have been developed for HIN modeling, including traditional similarity measures and recent embedding techniques. Most algorithms on HINs leverage meta-graphs or meta-paths (special cases of meta-graphs) to capture various semantics. Given an arbitrary set of meta-graphs, existing algorithms either treat them as equally important or learn their relative importance through supervised learning, so their performance largely relies on prior knowledge and labeled data. While unsupervised embedding has been shown to be a fundamental solution for various homogeneous network mining tasks, it is a much harder problem for HINs due to the presence of various meta-graphs. In this work, we propose to study the utility of different meta-graphs, as well as how to simultaneously leverage multiple meta-graphs for HIN embedding in an unsupervised manner. Motivated by prolific research on homogeneous networks, especially spectral graph theory, we first conduct a systematic empirical study on the spectrum and embedding quality of different meta-graphs on multiple HINs, which leads to an efficient method of meta-graph assessment. It also helps us gain valuable insight into the higher-order organization of HINs and indicates a practical way of selecting useful embedding dimensions. Further, we explore the challenges of combining multiple meta-graphs to capture the multi-dimensional semantics in HINs through reasoning from mathematical geometry, and arrive at an embedding compression method based on an autoencoder with ℓ2,1-loss, which finds the most informative meta-graphs and embeddings in an end-to-end unsupervised manner. Finally, empirical analysis suggests a unified workflow to close the gap between our meta-graph assessment and combination methods.
To the best of our knowledge, this is the first research effort to provide rich theoretical and empirical analyses on the utility of meta-graphs and their combinations, especially regarding HIN embedding. Extensive experimental comparisons with various state-of-the-art neural network based embedding methods on multiple real-world HINs demonstrate the effectiveness and efficiency of our framework in finding useful meta-graphs and generating high-quality HIN embeddings.
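As a rough illustration of the ideas named in the abstract, the following minimal sketch builds a weighted graph from one meta-path, computes a spectral embedding from its normalized Laplacian, and evaluates an ℓ2,1 norm. It is not the authors' implementation: the author-paper-author meta-path, the toy incidence matrix, the function names, and the row-wise definition of the ℓ2,1 norm are all assumptions made here for illustration.

```python
import numpy as np

def metapath_adjacency(incidence):
    """Author-author weights for a hypothetical A-P-A meta-path (shared-paper counts)."""
    adj = incidence @ incidence.T
    np.fill_diagonal(adj, 0)  # drop self-loops
    return adj.astype(float)

def spectral_embedding(adj, dim):
    """Eigenpairs of the symmetric normalized Laplacian for the `dim` smallest eigenvalues."""
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(deg[nonzero])
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    return eigvals[:dim], eigvecs[:, :dim]

def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows (one common l2,1 definition; an assumption here)."""
    return float(np.linalg.norm(matrix, axis=1).sum())

# Toy data (assumed): 4 authors x 3 papers, 1 means the author wrote the paper.
incidence = np.array([[1, 1, 0],
                      [1, 0, 0],
                      [0, 1, 1],
                      [0, 0, 1]])
adj = metapath_adjacency(incidence)
eigvals, emb = spectral_embedding(adj, dim=2)
```

In a sketch like this, inspecting `eigvals` for each candidate meta-graph is one way to look at its spectrum, and a row-wise ℓ2,1 penalty tends to push entire rows toward zero, which is why it is often used for group-level selection.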
Year
2018
DOI
10.1109/ICDM.2018.00081
Venue
2018 IEEE International Conference on Data Mining (ICDM)
Keywords
heterogeneous data, spectral analysis, network embedding
Field
Graph theory, Data modeling, Spectral graph theory, Autoencoder, Embedding, Computer science, Supervised learning, Artificial intelligence, Artificial neural network, Empirical research, Machine learning
DocType
Conference
ISSN
1550-4786
ISBN
978-1-5386-9160-1
Citations
8
PageRank
0.43
References
25
Authors
5
Name, Order, Citations, PageRank
Carl Yang, 1, 19617.33
Yichen Feng, 2, 80.43
Pan Li, 3, 4111.95
Yu Shi, 4, 685.37
Jiawei Han, 5, 430853824.48