Abstract | ||
---|---|---|
Identifying sentence boundaries is an indispensable task for most natural language processing (NLP) systems. While extensive effort has been devoted to mining biomedical text with NLP techniques, few attempts specifically target sentence boundary disambiguation in biomedical literature, which has a number of unique features that can significantly reduce the accuracy of algorithms designed for general English text. To increase the accuracy of sentence boundary identification for biomedical literature, we developed a method that combines heuristic and statistical strategies. Our approach does not require part-of-speech taggers or training procedures. Experiments with biomedical test corpora show that our system significantly outperforms existing sentence boundary determination algorithms, particularly for full-text biomedical literature. Our system is very fast and should also be easily adaptable for sentence boundary determination in scientific literature from non-biomedical fields. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1007/978-3-540-70939-8_17 | CICLing '07 Proceedings of the 8th International Conference on Computational Linguistics and Intelligent Text Processing |
Keywords | Field | DocType |
NLP technique,sentence boundary identification,scientific literature,biomedical text,biomedical test,biomedical literature,identifying sentence boundary,sentence boundary determination algorithm,tagging sentence boundaries,sentence boundary determination,disambiguating sentence boundary,part of speech,algorithm design,natural language processing | Scientific literature,General English,Sentence boundary disambiguation,Heuristic,Computer science,Maximum entropy method,Speech recognition,Natural language processing,Artificial intelligence,Sentence | Conference |
Volume | ISSN | Citations |
4394 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 14 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Weijian Xuan | 1 | 57 | 4.23 |
Stanley J. Watson | 2 | 41 | 2.40 |
Fan Meng | 3 | 114 | 10.82 |