Abstract |
---|
A cancer registry is a critical and massive database for which various types of domain knowledge are needed and whose maintenance requires labor-intensive data curation. In order to facilitate the curation process for building a high-quality and integrated cancer registry database, we compiled a cross-hospital corpus and applied neural network methods to develop a natural language processing system for extracting cancer registry variables buried in unstructured pathology reports. The performance of the developed networks was compared with various baselines using standard micro-precision, recall and F-measure. Furthermore, we conducted experiments to study the feasibility of applying transfer learning to rapidly develop a well-performing system for processing reports from different sources that might be presented in different writing styles and formats. The results demonstrate that the transfer learning method enables us to develop a satisfactory system for a new hospital with only a few annotations and suggest more opportunities to reduce the burden of cancer registry curation. |

Year | DOI | Venue |
---|---|---|
2020 | 10.18653/v1/2020.clinicalnlp-1.22 | ClinicalNLP@EMNLP |

DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |

References | Authors |
---|---|
6 | 17 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yan-Jie Lin | 1 | 0 | 0.34 |
Hong-Jie Dai | 2 | 288 | 21.58 |
You-Chen Zhang | 3 | 0 | 0.68 |
Chung-Yang Wu | 4 | 0 | 0.34 |
Yu-Cheng Chang | 5 | 83 | 8.02 |
Pin-Jou Lu | 6 | 0 | 0.34 |
Chih-Jen Huang | 7 | 0 | 0.34 |
Yu-Tsang Wang | 8 | 0 | 0.34 |
Hui-Min Hsieh | 9 | 0 | 0.34 |
Kun-San Chao | 10 | 0 | 0.34 |
Tsang-Wu Liu | 11 | 0 | 0.34 |
I-Shou Chang | 12 | 0 | 0.34 |
Yi-Hsin Connie Yang | 13 | 0 | 0.34 |
Ti-Hao Wang | 14 | 0 | 0.34 |
Ko-Jiunn Liu | 15 | 0 | 0.34 |
Li-Tzong Chen | 16 | 0 | 0.34 |
Sheau-Fang Yang | 17 | 0 | 0.34 |