Title
Improving relevance in a content pipeline via syntactic generalization.
Abstract
This is a report from the field on a linguistic relevance technology based on learning parse trees for processing, classifying, and delivering a stream of texts. We describe the content pipeline for the eBay entertainment domain that employs this technology, and show that text processing relevance is the main bottleneck for its performance. A number of components of the content pipeline, such as content mining, aggregation, deduplication, opinion mining, and integrity enforcement, need to rely on domain-independent, efficient text classification, entity extraction, and relevance assessment operations.
Year: 2017
DOI: 10.1016/j.engappai.2016.11.001
Venue: Engineering Applications of Artificial Intelligence
Keywords: Content pipeline, Relevance of text classification, Machine learning of syntactic parse trees, Personalized recommendation
Field: Data deduplication, Bottleneck, Web mining, Information retrieval, Computer science, Sentiment analysis, Cardinality, Artificial intelligence, Parsing, Machine learning, Personalization, Text processing
DocType: Journal
Volume: 58
ISSN: 0952-1976
Citations: 2
PageRank: 0.36
References: 37
Authors: 1
Name: Boris Galitsky; Order: 1; Citations: 248; PageRank: 37.81