Title
Blended metrics for novel sentence mining
Abstract
With the abundance of raw text documents available on the internet, many articles contain redundant information. Novel sentence mining can discover novel, yet relevant, sentences given a specific topic defined by a user. In real-time novelty mining, an important issue is how to select a suitable novelty metric that quantitatively measures the novelty of a particular sentence. To utilize the merits of different metrics, a blended metric is proposed that combines the cosine similarity and new word count metrics. The blended metric has been tested on TREC 2003 and TREC 2004 Novelty Track data. The experimental results show that the blended metric generally performs better on topics with different ratios of novelty, which is useful for real-time novelty mining in topics with varying degrees of novelty.
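The abstract does not state how the two metrics are combined, so the following is only a minimal sketch of one plausible blend, not the authors' formulation. It assumes a weighted sum with a hypothetical weight `alpha`, whitespace tokenization, a cosine-based novelty defined as one minus the maximum similarity to any earlier sentence, and a new word count normalized by the number of distinct words in the sentence. All function names are illustrative.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase whitespace tokenization, no stop-word removal.
    return sentence.lower().split()


def cosine_novelty(sentence, history):
    """1 minus the highest cosine similarity to any earlier sentence."""
    cur = Counter(tokenize(sentence))
    cur_norm = math.sqrt(sum(v * v for v in cur.values()))
    best = 0.0
    for prev in history:
        old = Counter(tokenize(prev))
        old_norm = math.sqrt(sum(v * v for v in old.values()))
        if cur_norm == 0 or old_norm == 0:
            continue
        dot = sum(cur[w] * old[w] for w in cur)
        best = max(best, dot / (cur_norm * old_norm))
    return 1.0 - best


def new_word_novelty(sentence, history):
    """Fraction of distinct words in the sentence not seen in the history."""
    words = set(tokenize(sentence))
    if not words:
        return 0.0
    seen = set()
    for prev in history:
        seen.update(tokenize(prev))
    return len(words - seen) / len(words)


def blended_novelty(sentence, history, alpha=0.5):
    # Assumption: a simple convex combination; alpha is a hypothetical weight.
    return (alpha * cosine_novelty(sentence, history)
            + (1 - alpha) * new_word_novelty(sentence, history))


history = ["the cat sat"]
score = blended_novelty("the dog sat", history)  # higher means more novel
```

Under these assumptions a sentence would be flagged as novel when its score exceeds a user-chosen threshold, and `alpha` shifts the balance between the two metrics.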
Year: 2010
DOI: 10.1016/j.eswa.2009.12.075
Venue: Expert Syst. Appl.
Keywords: text mining, different metrics, novelty detection, different ratio, blended metric, new word count metrics, real-time novelty mining, cosine similarity, new word count, novelty track data, blended metrics, novel sentence mining, particular sentence, real time
Field: Data mining, Novelty detection, Text mining, Information retrieval, Cosine similarity, Computer science, Word count, Artificial intelligence, Novelty, Sentence, Machine learning, The Internet
DocType: Journal
Volume: 37
Issue: 7
ISSN: Expert Systems With Applications
Citations: 9
PageRank: 0.55
References: 14
Authors: 3
Name             Order   Citations   PageRank
Wenyin Tang      1       90          7.19
Flora S. Tsai    2       352         23.96
Lihui Chen       3       380         27.30