Paper Info

Title
Sentence Similarity Based on Contexts

Abstract
Existing methods for measuring sentence similarity face two challenges: (1) labeled datasets are usually limited in size, making them insufficient to train supervised neural models; (2) there is a train-test gap when unsupervised language modeling (LM) based models are used to compute semantic scores between sentences, since sentence-level semantics are not explicitly modeled at training time. This results in inferior performance on this task. In this work, we propose a new framework to address these two issues. The proposed framework is based on the core idea that the meaning of a sentence should be defined by its contexts, and that sentence similarity can be measured by comparing the probabilities of generating two sentences given the same context. The proposed framework is able to generate a high-quality, large-scale dataset with semantic similarity scores between two sentences in an unsupervised manner, with which the train-test gap can be largely bridged. Extensive experiments show that the proposed framework achieves significant performance boosts over existing baselines under both the supervised and unsupervised settings across different datasets.
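
The core idea in the abstract, comparing how likely two sentences are under the same contexts, can be illustrated with a small sketch. The abstract does not give the actual scoring formula, so everything here is an assumption made for illustration: the function name `context_similarity`, the use of the mean absolute log-probability gap, the mapping to a (0, 1] score with `exp`, and the hand-written log-probabilities, which stand in for the output of a real language model.

```python
import math


def context_similarity(logp_a, logp_b):
    """Toy context-based similarity between two sentences.

    logp_a[i] and logp_b[i] are hypothetical log-probabilities of
    sentence A and sentence B given the same i-th context. The score
    is exp(-mean |logp_a[i] - logp_b[i]|): it equals 1.0 when both
    sentences are equally likely under every context and approaches 0
    as their likelihoods diverge. This formula is an illustrative
    assumption, not the scoring function from the paper.
    """
    if not logp_a or len(logp_a) != len(logp_b):
        raise ValueError("need one log-probability per shared context")
    mean_gap = sum(abs(a - b) for a, b in zip(logp_a, logp_b)) / len(logp_a)
    return math.exp(-mean_gap)


# Made-up log-probabilities under three shared contexts.
sentence_1 = [-2.0, -5.0, -9.0]
sentence_2 = [-2.5, -5.5, -8.5]  # behaves like sentence_1 across contexts
sentence_3 = [-9.0, -4.0, -2.0]  # likely in very different contexts

sim_close = context_similarity(sentence_1, sentence_2)
sim_far = context_similarity(sentence_1, sentence_3)
print(f"close pair: {sim_close:.4f}, distant pair: {sim_far:.4f}")
```

In this sketch, two sentences that are probable in the same contexts score higher than two sentences that are probable in different contexts, which is the intuition the abstract describes.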

Year	DOI	Venue
2022	10.1162/TACL_A_00477	Transactions of the Association for Computational Linguistics
DocType	Volume	Citations
Journal	10	0
PageRank	References	Authors
0.34	0	7

Authors (7 rows)

Name	Order	Citations	PageRank
Xiaofei Sun	1	0	3.38
Yuxian Meng	2	0	6.08
Xiang Ao	3	34	8.49
Fei Wu	4	2209	153.88
Tianwei Zhang	5	0	2.37
Jiwei Li	6	1028	48.05
Chun Fan	7	4	4.38
