Title
HDPauthor: A New Hybrid Author-Topic Model using Latent Dirichlet Allocation and Hierarchical Dirichlet Processes.
Abstract
We present a new approach to capturing topic interests corresponding to all the observed latent topics generated by an author in documents to which he or she has contributed. Topic models based on Latent Dirichlet Allocation (LDA) have been built for this purpose but are brittle with respect to the number of topics allowed for a collection and for each author of documents within the collection. Meanwhile, topic models based upon Hierarchical Dirichlet Processes (HDPs) allow an arbitrary number of topics to be discovered and generative distributions of interest to be inferred from text corpora, but this approach is not directly extensible to generative models of authors as contributors to documents with variable topical expertise. Our approach combines an existing HDP framework for learning topics from free text with latent authorship learning within a generative model using author list information. This model adds another layer to the current hierarchy of HDPs to represent topic groups shared by authors, and the document topic distribution is represented as a mixture of the topic distributions of its authors. Our model automatically learns author contribution partitions for documents in addition to topics.
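The mixture idea in the abstract can be illustrated with a minimal sketch. This is not the HDPauthor inference procedure: it assumes a fixed, finite number of topics (whereas the HDP lets the number grow), and the function name, author keys, and numbers are all hypothetical. It only shows how a document's topic distribution could be formed as a weighted mixture of its authors' topic distributions, with the weights playing the role of author contribution partitions.

```python
from typing import Dict, List


def document_topic_distribution(
    author_topics: Dict[str, List[float]],
    contributions: Dict[str, float],
) -> List[float]:
    """Mix per-author topic distributions using per-document author weights.

    author_topics: hypothetical mapping from author to a topic distribution
                   over a fixed number of topics (an assumption of this sketch).
    contributions: hypothetical per-document author weights; they are
                   normalized here so they sum to 1.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")

    num_topics = len(next(iter(author_topics.values())))
    mixed = [0.0] * num_topics
    for author, weight in contributions.items():
        topics = author_topics[author]
        if len(topics) != num_topics:
            raise ValueError("all authors must share the same topic set")
        # Each author adds their topic distribution, scaled by their share.
        for k, p in enumerate(topics):
            mixed[k] += (weight / total) * p
    return mixed


# Illustrative numbers only; not taken from the paper.
author_topics = {
    "author_a": [0.7, 0.2, 0.1],
    "author_b": [0.1, 0.3, 0.6],
}
contributions = {"author_a": 0.75, "author_b": 0.25}
theta = document_topic_distribution(author_topics, contributions)
```

In the model described above, both the author topic distributions and the contribution weights are learned from the corpus rather than supplied by hand.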
Year: 2016
DOI: 10.1145/2872518.2890561
Venue: WWW (Companion Volume)
Field: Dynamic topic model, Hierarchical Dirichlet process, Data mining, Latent Dirichlet allocation, Computer science, Pachinko allocation, Artificial intelligence, Dirichlet distribution, Information retrieval, Text corpus, Topic model, Machine learning, Generative model
DocType: Conference
Citations: 3
PageRank: 0.39
References: 6
Authors: 2
Name              Order   Citations   PageRank
Ming Yang         1       3           1.07
William H. Hsu    2       15          5.06