Title | ||
---|---|---|
GEDIT: Geographic-Enhanced and Dependency-Guided Tagging for Joint POI and Accessibility Extraction at Baidu Maps |
Abstract | ||
---|---|---|
Providing timely accessibility reminders (such as closed and relocated) of a point-of-interest (POI) plays a vital role in improving user satisfaction of finding places and making visiting decisions. However, it is difficult to keep the POI database in sync with the real-world counterparts due to the dynamic nature of business changes and innovations. To alleviate this problem, we formulate and present a practical solution that jointly extracts POI mentions and identifies their coupled accessibility labels from unstructured text (hereafter referred to as joint POI and accessibility extraction). We approach this task as a sequence tagging problem, where the goal is to produce (POI name, accessibility label) pairs from unstructured text. This task is challenging because of two main issues: (1) POI names are often newly-coined words so as to successfully register new entities or brands and (2) there may exist multiple pairs in the text, which necessitates dealing with one-to-many or many-to-one mapping to make each POI coupled with its matching accessibility label. To this end, we propose a Geographic-Enhanced and Dependency-guIded sequence Tagging (GEDIT) model to concurrently address the two challenges. First, to alleviate challenge #1, we develop a geographic-enhanced pre-trained model to learn the text representations, which is able to significantly relieve the problem of newly-coined words. Second, to mitigate challenge #2, we apply a relational graph convolutional network to learn the tree node representations from the parsed dependency tree, which enables us to establish a correlation between a POI and its accessibility label. Finally, we construct a neural sequence tagging model by integrating and feeding the previously pre-learned representations into a CRF layer. Extensive experiments conducted on a real-world dataset demonstrate the superiority and effectiveness of GEDIT.
In addition, it has already been deployed in production at Baidu Maps, and it successfully keeps processing hundreds of thousands of Web documents every week. Statistics show that the proposed solution can save significant human effort and labor costs to deal with the same amount of documents, which confirms that it is a practical way for POI accessibility maintenance. |
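
The abstract frames the task as sequence tagging that yields (POI name, accessibility label) pairs, but it does not state the tag scheme. As a rough illustration only, the sketch below assumes a BIO scheme in which the tag type carries the accessibility label (for example `B-CLOSED`, `I-CLOSED`, `O`). The function name `decode_pairs`, the label names, and the sample sentence are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def decode_pairs(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (POI name, accessibility label) pairs from BIO tags.

    Assumed scheme (not from the paper): "B-<LABEL>" opens a POI span,
    "I-<LABEL>" continues a span with the same label, "O" is outside.
    """
    pairs: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Emit the open span, if any, and reset the state.
        nonlocal current_tokens, current_label
        if current_tokens and current_label is not None:
            pairs.append((" ".join(current_tokens), current_label))
        current_tokens, current_label = [], None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an "I-" tag that does not continue the open span:
            # close the span and skip this token.
            flush()
    flush()
    return pairs


tokens = ["Blue", "Cafe", "has", "closed", "and", "Star", "Books", "relocated"]
tags = ["B-CLOSED", "I-CLOSED", "O", "O", "O", "B-RELOCATED", "I-RELOCATED", "O"]
print(decode_pairs(tokens, tags))
```

Encoding the label in the tag type is one simple way to keep each POI coupled with its own label when a text mentions several POIs; the model described in the abstract may well use a different encoding.
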
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3459637.3481924 | Conference on Information and Knowledge Management |
DocType | Citations | PageRank |
---|---|---|
Conference | 1 | 0.41 |
References | Authors |
---|---|
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yibo Sun | 1 | 4 | 1.81 |
Jizhou Huang | 2 | 58 | 7.65 |
Chunyuan Yuan | 3 | 11 | 4.26 |
Miao Fan | 4 | 140 | 16.04 |
Haifeng Wang | 5 | 806 | 94.25 |
Ming Liu | 6 | 13 | 2.98 |
Bing Qin | 7 | 1076 | 72.82 |