Title
Document-topic hierarchies from document graphs
Abstract
Topic taxonomies present a multi-level view of a document collection, where general topics live towards the top of the taxonomy and more specific topics live towards the bottom. Topic taxonomies allow users to quickly drill down into their topic of interest to find documents. We show that hierarchies of documents, where documents live at the inner nodes of the hierarchy tree, can also be inferred by combining document text with inter-document links. We present a Bayesian generative model by which an explicit hierarchy of documents is created. Experiments on three document-graph data sets show that the generated document hierarchies are able to fit the observed data, and that the levels in the constructed document hierarchy represent practical groupings.
Year
2012
DOI
10.1145/2396761.2396843
Venue
CIKM
Keywords
specific topic, observed data, explicit hierarchy, document hierarchy, document collection, topic taxonomy, document graph, document text, document-topic hierarchy, bayesian generative model, document-graph data, general topic, hierarchical clustering, topic models
Field
Hierarchical clustering, Data mining, Graph, Information retrieval, Document clustering, Computer science, Drill down, Topic model, Hierarchy, Bayesian probability, Generative model
DocType
Conference
Citations
7
PageRank
0.45
References
27
Authors
3
Name            Order  Citations  PageRank
Tim Weninger    1      576        46.14
Yonatan Bisk    2      196        17.54
Jiawei Han      3      43085      3824.48