Title
GEDIT: Geographic-Enhanced and Dependency-Guided Tagging for Joint POI and Accessibility Extraction at Baidu Maps
Abstract
Providing timely accessibility reminders (such as "closed" or "relocated") for a point-of-interest (POI) plays a vital role in improving user satisfaction when finding places and making visiting decisions. However, it is difficult to keep the POI database in sync with its real-world counterparts because businesses change and innovate continuously. To alleviate this problem, we formulate and present a practical solution that jointly extracts POI mentions and identifies their coupled accessibility labels from unstructured text (hereafter referred to as joint POI and accessibility extraction). We approach this task as a sequence tagging problem, where the goal is to produce (POI name, accessibility label) pairs from unstructured text. This task is challenging because of two main issues: (1) POI names are often newly coined words, chosen so that new entities or brands can be registered successfully, and (2) the text may contain multiple pairs, which necessitates handling one-to-many or many-to-one mappings so that each POI is coupled with its matching accessibility label. To this end, we propose a Geographic-Enhanced and Dependency-guIded sequence Tagging (GEDIT) model to address the two challenges concurrently. First, to alleviate challenge #1, we develop a geographic-enhanced pre-trained model to learn the text representations, which significantly relieves the problem of newly coined words. Second, to mitigate challenge #2, we apply a relational graph convolutional network to learn the tree node representations from the parsed dependency tree, which enables us to establish a correlation between a POI and its accessibility label. Finally, we construct a neural sequence tagging model by integrating these pre-learned representations and feeding them into a CRF layer. Extensive experiments conducted on a real-world dataset demonstrate the superiority and effectiveness of GEDIT. In addition, it has already been deployed in production at Baidu Maps, where it processes hundreds of thousands of Web documents every week. Statistics show that the proposed solution saves significant human effort and labor costs compared with handling the same volume of documents manually, which confirms that it is a practical way to maintain POI accessibility information.
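The abstract frames the task as sequence tagging whose output is a set of (POI name, accessibility label) pairs. The sketch below illustrates one way such pairs could be read off a tagged sequence. It assumes a BIO-style scheme in which each tag carries the accessibility label (for example `B-CLOSED`, `I-CLOSED`, `O`); the abstract does not specify the tag scheme, so the tag names, the `decode_pairs` helper, and the sample sentence are all hypothetical and do not reproduce the GEDIT model or its CRF decoding.

```python
from typing import List, Optional, Tuple


def decode_pairs(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (POI name, accessibility label) pairs from BIO-style tags.

    Assumed (hypothetical) tag format: "B-<LABEL>", "I-<LABEL>", or "O".
    """
    pairs: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the span collected so far, if any, and reset the state.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            pairs.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            # A new POI mention starts; close any open one first.
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I" and current_tokens and label == current_label:
            # Continue the open mention with the same label.
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag with no matching open span: close and skip.
            flush()
    flush()
    return pairs


# Hypothetical example with two POIs carrying different labels.
tokens = ["Blue", "Door", "Cafe", "has", "closed", ";",
          "Sun", "Books", "moved", "downtown"]
tags = ["B-CLOSED", "I-CLOSED", "I-CLOSED", "O", "O", "O",
        "B-RELOCATED", "I-RELOCATED", "O", "O"]
pairs = decode_pairs(tokens, tags)
print(pairs)
```

Encoding the accessibility label in the tag itself is one simple way to keep each POI coupled with its label when a text mentions several POIs; whether GEDIT uses exactly this encoding is not stated in the abstract.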
Year
2021
DOI
10.1145/3459637.3481924
Venue
Conference on Information and Knowledge Management
DocType
Conference
Citations
1
PageRank
0.41
References
0
Authors
7
Name | Order | Citations | PageRank
Yibo Sun | 1 | 4 | 1.81
Jizhou Huang | 2 | 58 | 7.65
Chunyuan Yuan | 3 | 11 | 4.26
Miao Fan | 4 | 140 | 16.04
Haifeng Wang | 5 | 806 | 94.25
Ming Liu | 6 | 13 | 2.98
Bing Qin | 7 | 1076 | 72.82