Title
Mining Summaries of Propagations
Abstract
Analyzing the traces left by a meme of information propagating through a social network or by a user browsing a website can help to unveil the structure and dynamics of such complex networks. This may in turn open the door to concrete applications, such as finding influential users for a topic in a social network, or detecting the typical structure of a web browsing session that leads to a product purchase. In this paper we define the problem of mining summaries of propagations as a constrained pattern-mining problem. A propagation is a DAG where an entity (e.g., information exchanged in a social network, or a user browsing a website) flows following the underlying hierarchical structure of the nodes. A summary is a set of propagations that (i) involve a similar population of nodes, and (ii) exhibit a coherent hierarchical structure when merged altogether to form a single graph. The first constraint is defined based on the Jaccard coefficient, while the definition of the second one relies on the graph-theoretic concept of "agony" of a graph. It turns out that both constraints satisfy the downward closure property, thus enabling Apriori-like algorithms. However, motivated by the fact that computing agony is much more expensive than computing Jaccard, we devise two algorithms that explore the search space differently. The first algorithm is an Apriori-like, bottom-up method that checks both the constraints level-by-level. The second algorithm consists of a first phase where the search space is pruned as much as possible by exploiting the Jaccard constraint only, while involving the second constraint only afterwards, in a subsequent phase. We test our algorithms on four real-world datasets. Quantitative results reveal that the choice of the most efficient algorithm depends on the selectivity of the two constraints. Qualitative results show the relevance of the extracted summaries in a number of real-world scenarios.
Year
DOI
Venue
2013
10.1109/ICDM.2013.163
Data Mining
Keywords
Field
DocType
data mining,directed graphs,social networking (online),Apriori-like algorithms,Apriori-like bottom-up method,DAG,Jaccard coefficient,Jaccard constraint,Web browsing session,Web site,agony,complex networks,constrained pattern mining problem,downward-closure property,entity,graph-theoretic concept,hierarchical structure,information propagation,search space,social network,summary mining,Apriori,graphs,influence maximization,information propagation,pattern mining,social networks,web browsing
Graph,Data mining,Population,Social network,Computer science,A priori and a posteriori,Directed graph,Artificial intelligence,Jaccard index,Web navigation,Complex network,Machine learning
Conference
ISSN
Citations 
PageRank 
1550-4786
5
0.49
References 
Authors
23
4
Name
Order
Citations
PageRank
Lucrezia Macchia150.49
Francesco Bonchi24173200.47
Francesco Gullo348332.63
Luca Chiarandini450.49