Title
Community detection in content-sharing social networks
Abstract
Network structure and content in microblogging sites like Twitter influence each other: user A on Twitter follows user B for the tweets that B posts on the network, and A may then retweet the content shared by B to his/her own followers. In this paper, we propose a probabilistic model to jointly model link communities and content topics by leveraging both the social graph and the content shared by users. We model a community as a distribution over users, use it as a source for topics of interest, and jointly infer both communities and topics using Gibbs sampling. While modeling communities using the social graph and modeling topics using content have each received a great deal of attention, a few recent approaches try to model topics in content-sharing platforms using both content and the social graph. Our work differs from the existing generative models in that we explicitly model the social graph of users along with the user-generated content, mimicking how the two entities co-evolve in content-sharing platforms. Recent studies have found Twitter to be more of a content-sharing network and less of a social network, and it seems hard to detect tightly knit communities from the follower-followee links. Still, the question of whether we can extract Twitter communities using both links and content is open. In this paper, we answer this question in the affirmative. Our model discovers coherent communities and topics, as evinced by qualitative results on sub-graphs of Twitter users. Furthermore, we evaluate our model on the task of predicting follower-followee links. We show that joint modeling of links and content significantly improves link prediction performance on a sub-graph of Twitter (consisting of about 0.7 million users and over 27 million tweets), compared to generative models based on only structure or only content, and to path-based methods such as Katz.
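The abstract names Katz as a path-based baseline for follower-followee link prediction. As a rough illustration of what such a baseline computes (not the authors' implementation or experimental setup), the sketch below scores node pairs with a truncated Katz index, the sum over path lengths l of beta^l times the number of length-l paths. The function name `katz_scores`, the toy graph, the damping value `beta=0.1`, and the cutoff `max_len=4` are all assumptions made for this example.

```python
def katz_scores(adj, beta=0.1, max_len=4):
    """Truncated Katz index: sum over l = 1..max_len of beta**l * (A**l).

    adj[i][j] == 1 means user i follows user j (hypothetical toy encoding).
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds A**l, starting at l = 1
    weight = beta                    # holds beta**l
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        # Advance to the next path length: A**(l+1) = A**l times A.
        power = [
            [sum(power[i][k] * adj[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        weight *= beta
    return scores


# Toy follow graph (an assumption for illustration): 0->1, 1->2, 0->3.
adj = [
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
scores = katz_scores(adj)

# Candidate new followees for user 0: anyone not already followed, by score.
candidates = sorted(
    (j for j in range(len(adj)) if j != 0 and adj[0][j] == 0),
    key=lambda j: scores[0][j],
    reverse=True,
)
```

Here user 2 is the only candidate for user 0, reached through the two-step path 0->1->2. A baseline of this kind uses only the link structure, which is the contrast the abstract draws with jointly modeling links and content.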
Year
2013
DOI
10.1145/2492517.2492546
Venue
ASONAM
Keywords
gibbs sampling,existing generative model,social network,follower-followee links,user-generated content,twitter,follower-followee link,content management,model topic,model link community,content-sharing social networks,content topic,graph theory,sampling methods,social networking (online),content-sharing platform,twitter community,social graph,community detection,microblogging sites,probabilistic model,graph clustering,distributed algorithms
Field
Data mining,Social graph,Social network,Computer science,Artificial intelligence,Content management,Clustering coefficient,Graph theory,World Wide Web,Social media,Microblogging,Distributed algorithm,Statistical model,Machine learning
DocType
Conference
Citations
15
PageRank
0.91
References
20
Authors
3
Name                 Order  Citations  PageRank
Nagarajan Natarajan  1      403        17.82
Prithviraj Sen       2      837        38.24
Vineet Chaoji        3      428        19.50