Robust semantic text similarity using LSA, machine learning, and linguistic resources. - Citegraph

Paper Info

Title
Robust semantic text similarity using LSA, machine learning, and linguistic resources.

Abstract
Semantic textual similarity is a measure of the degree of semantic equivalence between two pieces of text. We describe the SemSim system and its performance in the *SEM 2013 and SemEval-2014 tasks on semantic textual similarity. At the core of our system lies a robust distributional word similarity component that combines latent semantic analysis and machine learning augmented with data from several linguistic resources. We used a simple term alignment algorithm to handle longer pieces of text. Additional wrappers and resources were used to handle task-specific challenges that include processing Spanish text, comparing text sequences of different lengths, handling informal words and phrases, and matching words with sense definitions. In the *SEM 2013 task on Semantic Textual Similarity, our best-performing system ranked first among the 89 submitted runs. In the SemEval-2014 task on Multilingual Semantic Textual Similarity, we ranked a close second in both the English and Spanish subtasks. In the SemEval-2014 task on Cross-Level Semantic Similarity, we ranked first in Sentence–Phrase, Phrase–Word, and Word–Sense subtasks and second in the Paragraph–Sentence subtask.

Year	DOI	Venue
2016	https://doi.org/10.1007/s10579-015-9319-2	Language Resources and Evaluation
Keywords	Field	DocType
Latent semantic analysis,WordNet,Term alignment,Semantic similarity	Computer science,Explicit semantic analysis,Natural language processing,Probabilistic latent semantic analysis,Artificial intelligence,Semantic computing,Semantic compression,Semantic similarity,SemEval,Information retrieval,Semantic equivalence,Latent semantic analysis,Linguistics,Machine learning	Journal
Volume	Issue	ISSN
50	1	1574-020X
Citations	PageRank	References
7	0.53	25
Authors
7

Authors (7 rows)

Name	Order	Citations	PageRank
Abhay Kashyap	1	122	4.65
Lushan Han	2	246	15.41
Roberto Yus	3	115	18.18
Jennifer Sleeman	4	49	7.99
Taneeya Satyapanich	5	7	0.53
Sunil Gandhi	6	36	4.85
Timothy W. Finin	7	7345	821.22
