Title | ||
---|---|---|
Scalable Query N-Gram Embedding for Improving Matching and Relevance in Sponsored Search. |
Abstract | ||
---|---|---|
Sponsored search has been the major source of revenue for commercial web search engines. It is crucial for a sponsored search engine to retrieve ads that are relevant to user queries in order to attract clicks, as advertisers only pay when their ads get clicked. Retrieving relevant ads for a query typically involves first matching related ads to the query and then filtering out irrelevant ones. Both steps require understanding the semantic relationship between a query and an ad. In this work, we propose a novel embedding of queries and ads in sponsored search. The query embeddings are generated from constituent word n-gram embeddings that are trained to optimize an event-level word2vec objective over a large volume of search data. We show through a query rewriting task that the proposed query n-gram embedding model outperforms state-of-the-art word embedding models at capturing query semantics. This allows us to apply the proposed query n-gram embedding model to improve query-ad matching and relevance in sponsored search. First, we use the similarity between a query and an ad derived from the query n-gram embeddings as an additional feature in the query-ad relevance model used in Yahoo Search. We show through an online A/B test that using the new relevance model to filter irrelevant ads offline leads to a 0.47% increase in CTR and a 0.32% increase in revenue. Second, we propose a novel online query-to-ads matching system, built on an open-source big-data serving engine [30], using the learned query n-gram embeddings. An online A/B test shows that the new matching technique increases search revenue by 2.32%, as it significantly increases the ad coverage for tail queries.
|
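
The abstract describes building a query embedding from its constituent word n-gram embeddings and using a query-ad similarity as a relevance feature. The following is a minimal sketch of that idea, not the authors' implementation: the abstract does not say how n-gram vectors are combined, so averaging is an assumption here, as are cosine similarity as the similarity measure, the restriction to unigrams and bigrams, and the helper names `word_ngrams`, `embed_query`, and `cosine`.

```python
import math
from typing import Dict, List, Sequence


def word_ngrams(query: str, max_n: int = 2) -> List[str]:
    """Return all word n-grams of the query for n = 1..max_n."""
    tokens = query.lower().split()
    grams: List[str] = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def embed_query(query: str,
                table: Dict[str, Sequence[float]],
                max_n: int = 2) -> List[float]:
    """Average the vectors of the query's n-grams found in `table`.

    Averaging is an assumption made for illustration; the abstract only
    states that query embeddings are generated from n-gram embeddings.
    Returns an empty list when no n-gram of the query is in the table.
    """
    vectors = [table[g] for g in word_ngrams(query, max_n) if g in table]
    if not vectors:
        return []
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; 0.0 if either vector is empty or all zeros."""
    if not a or not b:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With a hypothetical embedding table, `cosine(embed_query(query, table), ad_vector)` would yield a single score that could be fed to a relevance model as one extra feature, alongside whatever features that model already uses.
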
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3219819.3219897 | KDD |
Keywords | Field | DocType |
---|---|---|
N-gram embedding, sponsored search, query-ad matching, query-ad relevance, distributed training, Apache Spark, Vespa | Data mining, Embedding, Search engine, Computer science, Filter (signal processing), n-gram, Word embedding, Word2vec, Semantics, Scalability | Conference |
Volume | ISBN | Citations |
---|---|---|
14 | 978-1-4503-5552-0 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 17 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiao Bai | 1 | 106 | 8.91 |
Erik Ordentlich | 2 | 676 | 97.68 |
Yuanyuan Zhang | 3 | 121 | 11.56 |
Andy Feng | 4 | 7 | 0.80 |
Adwait Ratnaparkhi | 5 | 1359 | 292.99 |
Reena Somvanshi | 6 | 0 | 0.34 |
Aldi Tjahjadi | 7 | 0 | 0.34 |