Title
Scalable Query N-Gram Embedding for Improving Matching and Relevance in Sponsored Search.
Abstract
Sponsored search has been the major source of revenue for commercial web search engines. It is crucial for a sponsored search engine to retrieve ads that are relevant to user queries in order to attract clicks, as advertisers pay only when their ads are clicked. Retrieving relevant ads for a query typically involves first matching related ads to the query and then filtering out irrelevant ones. Both steps require understanding the semantic relationship between a query and an ad. In this work, we propose a novel embedding of queries and ads in sponsored search. The query embeddings are generated from constituent word n-gram embeddings that are trained to optimize an event-level word2vec objective over a large volume of search data. We show through a query rewriting task that the proposed query n-gram embedding model outperforms state-of-the-art word embedding models at capturing query semantics. This allows us to apply the proposed query n-gram embedding model to improve query-ad matching and relevance in sponsored search. First, we use the similarity between a query and an ad, derived from the query n-gram embeddings, as an additional feature in the query-ad relevance model used in Yahoo Search. We show through an online A/B test that using the new relevance model to filter irrelevant ads offline leads to a 0.47% increase in CTR and a 0.32% increase in revenue. Second, we propose a novel online query-to-ads matching system, built on an open-source big-data serving engine [30], using the learned query n-gram embeddings. An online A/B test shows that the new matching technique increases search revenue by 2.32%, as it significantly increases the ad coverage for tail queries.
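As a rough illustration of the idea described in the abstract, the sketch below builds a query vector from the embeddings of its word n-grams and scores it against an ad vector. The abstract does not say how the n-gram vectors are combined or how ads are embedded, so several things here are assumptions made only for illustration: averaging unigram and bigram vectors, cosine similarity as the score, embedding the ad text the same way as the query, and the hypothetical names `word_ngrams`, `embed`, `cosine`, and `TOY_TABLE`. The toy vectors are made up and are not trained embeddings.

```python
import math

def word_ngrams(text, max_n=2):
    """Return all word n-grams of the text for n = 1..max_n."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]

def embed(text, table, max_n=2):
    """Average the vectors of the n-grams found in the table (assumed
    combination rule). Returns None if no n-gram is known."""
    vectors = [table[g] for g in word_ngrams(text, max_n) if g in table]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical 2-dimensional n-gram embedding table (illustrative values only).
TOY_TABLE = {
    "cheap": [1.0, 0.0],
    "flights": [0.0, 1.0],
    "cheap flights": [1.0, 1.0],
    "hotel": [-1.0, 0.0],
}

query_vec = embed("cheap flights", TOY_TABLE)
ad_vec = embed("flights", TOY_TABLE)  # assumption: ad text embedded like a query
similarity = cosine(query_vec, ad_vec)
```

In a relevance model, a score such as `similarity` would be one input feature among many; in a matching system, it would rank candidate ads for a query.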
Year
2018
DOI
10.1145/3219819.3219897
Venue
KDD
Keywords
N-gram embedding, sponsored search, query-ad matching, query-ad relevance, distributed training, Apache Spark, Vespa
Field
Data mining, Embedding, Search engine, Computer science, Filter (signal processing), n-gram, Word embedding, Word2vec, Semantics, Scalability
DocType
Conference
Volume
14
ISBN
978-1-4503-5552-0
Citations
0
PageRank
0.34
References
17
Authors
7
Name                 Order  Citations  PageRank
Xiao Bai             1      106        8.91
Erik Ordentlich      2      676        97.68
Yuanyuan Zhang       3      121        11.56
Andy Feng            4      7          0.80
Adwait Ratnaparkhi   5      1359       292.99
Reena Somvanshi      6      0          0.34
Aldi Tjahjadi        7      0          0.34