Title
Feature identification for topical relevance assessment in feed search engines
Abstract
Feeds have become a popular way to distribute and acquire information on the web. The explosive growth of feeds demands a search engine that helps users quickly discover feeds matching their interests. The retrieval effectiveness of a feed search engine depends heavily on the relevance assessment method that determines candidates for ranking query results. However, existing relevance assessment approaches proposed for web page retrieval may produce unsatisfactory results because feeds differ in character from traditional web pages. Unlike a web page, a feed is a dynamic document: it continually generates information on some specific topics. It is also a structured document consisting of several data elements, such as title and description. Accordingly, a relevance assessment method for feed retrieval needs to address these unique characteristics effectively. This paper considers the problem of identifying significant features, that is, a feature set created from feed data elements, with the aim of improving the effectiveness of feed retrieval while reducing computational cost. We conducted extensive experiments on real-world data sets using a support vector machine to investigate the problem, and identified significant features that can be employed in feed search services.
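To make the idea of a feature set created from feed data elements concrete, the following is a minimal sketch, not the paper's method. The abstract does not list the actual features, so the element names (`title`, `description`, `entries`), the cosine similarity measure, and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def feed_features(query, feed):
    """Return one query-similarity feature per feed data element.

    The element names are hypothetical; a real feed parser may expose
    different fields.
    """
    q = tokenize(query)
    entries = feed.get("entries", [])
    entry_text = " ".join(
        e.get("title", "") + " " + e.get("description", "") for e in entries
    )
    return {
        "feed_title": cosine(q, tokenize(feed.get("title", ""))),
        "feed_description": cosine(q, tokenize(feed.get("description", ""))),
        "entries": cosine(q, tokenize(entry_text)),
    }


# Hypothetical feed record used only to exercise the functions above.
example_feed = {
    "title": "Python Weekly",
    "description": "News and tips for developers",
    "entries": [{"title": "Packaging guide", "description": "Build wheels"}],
}
example_features = feed_features("python tips", example_feed)
```

Feature vectors of this kind could be labeled as relevant or not relevant and passed to a support vector machine, and subsets of the features could then be compared to see which elements carry the most signal. That comparison is only outlined here, since the abstract does not describe the experimental setup in detail.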
Year: 2013
DOI: 10.3233/IDA-130602
Venue: Intell. Data Anal.
Keywords: web page retrieval, feed data element, web page, feed retrieval, topical relevance assessment, feature identification, significant feature, traditional web page, feed search engine, retrieval effectiveness, relevance assessment method, feed search service
Field: Structured document, Data mining, Data set, Search engine, Web page, Ranking, Information retrieval, Computer science, Support vector machine, Feature set, Artificial intelligence, Machine learning
DocType: Journal
Volume: 17
Issue: 4
ISSN: 1088-467X
Citations: 0
PageRank: 0.34
References: 21
Authors: 2
Name, Order, Citations, PageRank
Yongwook Shin, 1, 75, 7.30
Jonghun Park, 2, 491, 37.86