Probabilistic correlation-based similarity measure on text records. - Citegraph

Paper Info

Title
Probabilistic correlation-based similarity measure on text records.

Abstract
Large-scale unstructured text records, such as scientific citation records or news highlights, are stored in text attributes in databases and information systems. Approximate string matching techniques for full-text retrieval, e.g., edit distance and cosine similarity, can be adopted to evaluate the similarity of unstructured text records. However, these techniques do not perform best when applied directly, owing to the differences between unstructured text records and full text. In particular, the information in short text records is limited, and variations in information format, such as abbreviations and missing data, greatly affect record similarity evaluation.
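
To illustrate the two baseline techniques the abstract names, the following minimal sketch computes edit distance and token-based cosine similarity on a pair of short records. It is not the probabilistic correlation measure proposed in the paper, which the abstract does not describe; the function names and the sample records are hypothetical and chosen only to show how an abbreviation can defeat both baselines.

```python
from collections import Counter
from math import sqrt


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity over lowercase, whitespace-separated token counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (sqrt(sum(c * c for c in va.values()))
            * sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


# Hypothetical records: the same venue, written in full and abbreviated.
full = "Very Large Data Bases"
abbr = "VLDB"

print(edit_distance(full, abbr))      # large relative to the record length
print(cosine_similarity(full, abbr))  # no shared tokens
```

In this sketch the two strings share no tokens, so the cosine similarity is zero, and almost every character of the longer string must be deleted to reach the abbreviation. Both baselines therefore treat the records as dissimilar even though they refer to the same venue, which is the kind of failure the abstract attributes to short, abbreviated records.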

Year	2014
DOI	10.1016/j.ins.2014.08.007
Venue	Information Sciences
Volume	289
ISSN	0020-0255
DocType	Journal
Keywords	Similarity measure, Probabilistic correlation, Text record
Field	Edit distance, Information system, Cosine similarity, Similarity measure, Information retrieval, Computer science, Correlation, Approximate string matching, Probabilistic logic, Document retrieval
Citations	6
References	27
Authors	3
PageRank	0.55

Authors (3 rows)

Name	Order	Citations	PageRank
Shaoxu Song	1	259	31.50
Han Zhu	2	215	8.48
Lei Chen	3	6239	395.84
