Title
Short text similarity based on probabilistic topics
Abstract
In this paper, we propose a new method for measuring the similarity between two short text snippets by comparing each of them with a set of probabilistic topics. Specifically, our method first finds the distinguishing terms between the two short text snippets and compares them with a series of probabilistic topics extracted by the Gibbs sampling algorithm. The relationship between the distinguishing terms of the two snippets is then discovered by examining their probabilities under each topic. The similarity between the two snippets is calculated from their common terms and the relationship between their distinguishing terms. Extensive experiments on paraphrasing and question categorization show that the proposed method calculates the similarity of short text snippets more accurately than other methods, including the pure TF-IDF measure.
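The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' formula: the abstract does not give one, so the combination rule below (term overlap plus a topic-space cosine for the distinguishing terms) is an assumption, as are the names `TOPIC_TERM_PROB`, `topic_vector`, and `snippet_similarity`. The topic probabilities are invented toy values standing in for topics that would be extracted by Gibbs sampling.

```python
from math import sqrt

# Hypothetical P(term | topic) for two toy topics. These numbers are invented
# for illustration; in practice they would come from a trained topic model.
TOPIC_TERM_PROB = {
    "car":   [0.30, 0.01],
    "auto":  [0.25, 0.02],
    "bank":  [0.01, 0.35],
    "money": [0.02, 0.30],
}
NUM_TOPICS = 2


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def topic_vector(terms):
    """Sum the per-topic probabilities of the given terms (unknown terms add 0)."""
    vec = [0.0] * NUM_TOPICS
    for term in terms:
        for k, p in enumerate(TOPIC_TERM_PROB.get(term, [0.0] * NUM_TOPICS)):
            vec[k] += p
    return vec


def snippet_similarity(a, b):
    """Assumed scoring rule: share of common terms, plus the share of
    distinguishing terms weighted by how close they are in topic space."""
    terms_a, terms_b = set(a.lower().split()), set(b.lower().split())
    union = terms_a | terms_b
    if not union:
        return 0.0
    common = terms_a & terms_b
    only_a, only_b = terms_a - terms_b, terms_b - terms_a
    common_part = len(common) / len(union)
    distinct_weight = (len(union) - len(common)) / len(union)
    relatedness = cosine(topic_vector(only_a), topic_vector(only_b))
    return common_part + distinct_weight * relatedness
```

Under these toy values, `snippet_similarity("buy car", "buy auto")` scores higher than `snippet_similarity("buy car", "buy money")`, because "car" and "auto" have similar probabilities under the same topic while "car" and "money" do not. A pure term-overlap measure would rate both pairs the same.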
Year
2010
DOI
10.1007/s10115-009-0250-y
Venue
Knowl. Inf. Syst.
Keywords
probabilistic topic, short text similarity, distinguishing term, pure tf-idf measure, question categorization, common term, short text snippet, gibbs sampling algorithm, new method, extensive experiment, text similarity measures, information retrieval, query expansion, text mining, question answering, gibbs sampling
Field
Categorization, Similitude, Data mining, Question answering, Information retrieval, Query expansion, Computer science, Probabilistic logic, Case-based reasoning, Gibbs sampling
DocType
Journal
Volume
25
Issue
3
ISSN
0219-3116
Citations
31
PageRank
1.13
References
19
Authors
5
Name            Order  Citations  PageRank
Xiaojun Quan    1      260        20.64
Gang Liu        2      83         4.93
Zhi Lu          3      257        11.74
Xingliang Ni    4      75         3.71
Liu Wenyin      5      2531       215.13