Abstract | ||
---|---|---|
We propose a novel training scheme for fast matching models in Search Ads, motivated by two practical challenges. The first stems from the pursuit of high throughput, which prohibits the deployment of inseparable architectures and hence greatly limits model accuracy. The second arises from heavy dependence on human-provided labels, which are expensive and time-consuming to collect, while the use of unlabeled search log data has rarely been studied. The proposed training framework targets both issues by treating the stronger but undeployable models as annotators, and learning a deployable model from both human-provided relevance labels and weakly annotated search log data. Specifically, we first construct multiple auxiliary tasks from the enumerated relevance labels, and train the annotators by jointly learning from those related tasks. The annotation models are then used to assign scores to both labeled and unlabeled training samples. The deployable model is first trained on the scored unlabeled data and then fine-tuned on the scored labeled data, leveraging both labels and scores by minimizing the proposed label-aware weighted loss. According to our experiments, training with the proposed framework outperforms the baseline that learns directly from relevance labels by a large margin, and substantially improves data efficiency by dispensing with 80% of the labeled samples. The proposed framework allows us to improve the fast matching model by learning from stronger annotators while keeping its architecture unchanged. It also offers a principled way to leverage search log data in the training phase, which could effectively reduce our dependence on human-provided labels.
|
Year | DOI | Venue |
---|---|---|
2019 | 10.1145/3308558.3313466 | WWW '19: The Web Conference, San Francisco, CA, USA, May 2019 |

Keywords | DocType | Volume |
---|---|---|
Search Ads, relevance matching, teacher-student, weak annotations | Journal | abs/1901.10710 |

ISBN | Citations | PageRank |
---|---|---|
978-1-4503-6674-8 | 2 | 0.45 |

References | Authors | |
---|---|---|
0 | 8 | |

Name | Order | Citations | PageRank |
---|---|---|---|
Xue Li | 1 | 3 | 0.80 |
Zhipeng Luo | 2 | 2 | 0.45 |
Hao Sun | 3 | 3 | 0.80 |
Jianjin Zhang | 4 | 6 | 1.88 |
Weihao Han | 5 | 2 | 1.46 |
Xianqi Chu | 6 | 2 | 0.45 |
Liangjie Zhang | 7 | 2 | 1.13 |
Qi Zhang | 8 | 414 | 22.77 |