Title
Semantic Query Labeling Through Synthetic Query Generation
Abstract
Searching in a domain-specific corpus of structured documents (e.g., e-commerce, media streaming services, job-seeking platforms) is often managed as a traditional retrieval task or through faceted search. Semantic Query Labeling, the task of locating the constituent parts of a query and assigning a domain-specific predefined semantic label to each of them, makes it possible to leverage the structure of documents during retrieval while leaving the keyword-based query formulation unaltered. Because no dataset is publicly available and producing one is costly, few works have been published on this task. In this paper, based on the assumption that a corpus already contains the information users search for, we propose a method for the automatic generation of semantically labeled queries and show that a semantic tagger (based on BERT, gazetteer-based features, and Conditional Random Fields) trained on our synthetic queries achieves results comparable to those obtained by the same model trained on real-world data. We also provide a large dataset of manually annotated queries in the movie domain suitable for studying Semantic Query Labeling. We hope that the public availability of this dataset will stimulate future research in this area.
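The idea of deriving labeled queries from a structured corpus can be illustrated with a minimal sketch. It is not the authors' method: the toy corpus, the field names, the BIO tagging scheme, and the helper names `build_labeled_query` and `generate_queries` are all assumptions made for illustration. The sketch only shows that, when a query is assembled from document fields, the semantic label of every token is known by construction.

```python
import random

# Hypothetical toy corpus; field names and values are illustrative only.
CORPUS = [
    {"title": "Blade Runner", "actor": "Harrison Ford", "genre": "sci-fi", "year": "1982"},
    {"title": "Alien", "actor": "Sigourney Weaver", "genre": "horror", "year": "1979"},
]


def build_labeled_query(doc, fields):
    """Concatenate the chosen field values and tag each token.

    Tags use a BIO scheme (an assumption, not taken from the abstract):
    the first token of a field value gets B-<FIELD>, the rest I-<FIELD>.
    """
    labeled = []
    for field in fields:
        tokens = doc[field].lower().split()
        for i, token in enumerate(tokens):
            prefix = "B" if i == 0 else "I"
            labeled.append((token, f"{prefix}-{field.upper()}"))
    return labeled


def generate_queries(corpus, n, seed=0, max_fields=2):
    """Sample n synthetic labeled queries from randomly chosen documents."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    queries = []
    for _ in range(n):
        doc = rng.choice(corpus)
        k = rng.randint(1, max_fields)
        # Pick k distinct fields; sorted() gives a stable list of field names.
        fields = rng.sample(sorted(doc), k)
        queries.append(build_labeled_query(doc, fields))
    return queries


if __name__ == "__main__":
    for query in generate_queries(CORPUS, 3):
        print(query)
```

Token/tag pairs of this kind are the usual training input for a sequence tagger; a realistic generator would also need to model how users actually phrase and order query terms, which this sketch ignores.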
Year
2021
DOI
10.1145/3404835.3463071
Venue
Research and Development in Information Retrieval
Keywords
Semantic Query Labeling, Query generation, Vertical Search
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
2
Name             Order  Citations  PageRank
Elias Bassani    1      1          2.08
Gabriella Pasi   2      1673       169.31