Abstract | ||
---|---|---|
We propose a method for automatically labelling topics learned via LDA topic models. We generate our label candidate set from the top-ranking topic terms, titles of Wikipedia articles containing the top-ranking topic terms, and sub-phrases extracted from the Wikipedia article titles. We rank the label candidates using a combination of association measures and lexical features, optionally fed into a supervised ranking model. Our method is shown to perform strongly over four independent sets of topics, significantly better than a benchmark method. |
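
The ranking step described in the abstract can be illustrated with a small sketch. It assumes pointwise mutual information (PMI) as the association measure, which this record does not confirm, and it scores each label candidate by its average PMI with the top topic terms over a toy document collection. The function names `pmi` and `rank_labels`, the toy data, and the choice to score never-co-occurring pairs as 0.0 are all hypothetical; the lexical features and the supervised ranking model are not shown.

```python
import math

def pmi(x, y, docs):
    """Document-level PMI of x and y over docs (a list of term sets).

    Assumption for this sketch: pairs that never co-occur score 0.0
    instead of negative infinity.
    """
    n = len(docs)
    px = sum(1 for d in docs if x in d) / n
    py = sum(1 for d in docs if y in d) / n
    pxy = sum(1 for d in docs if x in d and y in d) / n
    if pxy == 0:
        return 0.0
    return math.log(pxy / (px * py))

def rank_labels(candidates, topic_terms, docs):
    """Order candidates by their mean PMI with the topic terms, best first."""
    scored = [
        (sum(pmi(c, t, docs) for t in topic_terms) / len(topic_terms), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored, key=lambda pair: pair[0], reverse=True)]

# Toy collection: each document is the set of terms and phrases it contains.
docs = [
    {"stock market", "stock", "trade", "investor"},
    {"stock market", "stock", "market", "trade"},
    {"football", "team", "season"},
    {"stock", "investor", "football"},
]
topic_terms = ["stock", "trade", "investor"]   # top-ranking topic terms
candidates = ["stock market", "football"]      # hypothetical label candidates

ranked = rank_labels(candidates, topic_terms, docs)
print(ranked)
```

On this toy data, "stock market" co-occurs with the topic terms more often than chance and so ranks ahead of "football". A fuller version would take the candidates from Wikipedia article titles and estimate the counts from a large reference corpus.
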
Year | Venue | Keywords |
---|---|---|
2011 | Meeting of the Association for Computational Linguistics | lexical feature,automatic labelling,labelling topic,top-ranking topic term,Wikipedia article,LDA topic model,Wikipedia article title,label candidate,independent set,association measure,benchmark method |
Field | DocType | Volume |
---|---|---|
Ranking,Information retrieval,Computer science,Labelling,Artificial intelligence,Natural language processing,Topic model | Conference | P11-1 |
Citations | PageRank | References |
---|---|---|
58 | 2.12 | 24 |
Authors | ||
---|---|---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jey Han Lau | 1 | 660 | 36.88 |
Karl Grieser | 2 | 295 | 11.68 |
David Newman | 3 | 1319 | 73.72 |
Timothy Baldwin | 4 | 1767 | 116.85 |