Title
Towards a terminological resource for biomedical text mining
Abstract
One of the main challenges in biomedical text mining is the identification of terminology, which is a key factor for accessing and integrating the information stored in the literature. Manual creation of biomedical terminologies cannot keep pace with the data that becomes available. Many manually created terminologies have nevertheless been used in attempts to recognise terms in the literature, but their suitability for text mining has been questioned, as substantial re-engineering is needed to tailor the resources for automatic processing. Several approaches have been suggested to automatically integrate and map between resources, but they have revealed problems of ambiguity and of extensive variability in lexical representations. In this paper we present a methodology to automatically maintain a biomedical terminological database, which contains automatically extracted terms, their mutual relationships, features and possible annotations that can be useful in text processing. In addition to TermDB, a database used for terminology management and storage, we present the following modules that are used to populate the database: TerMine (recognition, extraction and normalisation of terms from literature), AcroTerMine (extraction and clustering of acronyms and their long forms), AnnoTerm (annotation and classification of terms), and ClusTerm (extraction of term associations and clustering of terms).
Year: 2006
Venue: LREC
Field: Text mining, Pace, Annotation, Terminology, Computer science, Biomedical text mining, Artificial intelligence, Natural language processing, Cluster analysis, Ambiguity, Text processing
DocType: Conference
Citations: 2
PageRank: 0.40
References: 15
Authors: 3
Name              Order  Citations  PageRank
Goran Nenadic     1      228        13.18
Naoaki Okazaki    2      649        65.25
Sophia Ananiadou  3      2658       183.08