A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature - Citegraph

Paper Info

Title
A framework for semi-automatic identification, disambiguation and storage of protein-related abbreviations in scientific literature

Abstract
We propose a framework for identifying, disambiguating and storing protein-related abbreviations as found in the full texts of scientific papers, in order to build and maintain a publicly available abbreviation repository via a semi-automatic process. This process involves information extraction methods and techniques for acronym identification and resolution, based on lexical clues and syntactical, largely domain-independent criteria. A dictionary and an ontology for proteins provide the means for matching and disambiguating the biological entities. User feedback is gathered at the end of the process and the confirmed entries are then stored and made available to the scientific community for further reviewing.
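
As a rough illustration of what identification "based on lexical clues" can look like, the sketch below pairs a parenthesised short form with the words that precede it and keeps the pair only when the initials line up. This is a simplified, generic heuristic and not the authors' method; the function name `find_abbreviations`, the regular expression, and the sample sentence are all assumptions made for this example.

```python
import re

# Assumed lexical clue: a short token in parentheses, e.g. "(EGFR)".
ABBR_PATTERN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def find_abbreviations(text):
    """Return (short_form, long_form) pairs found via the 'long form (SF)' clue.

    Hypothetical helper: a pair is accepted only when each letter or digit of
    the short form equals the first character of one preceding word, in order.
    """
    pairs = []
    for match in ABBR_PATTERN.finditer(text):
        short = match.group(1)
        # Words that appear before the opening parenthesis
        # (hyphenated words are split into their parts).
        words = re.findall(r"[A-Za-z0-9]+", text[:match.start()])
        letters = [c.lower() for c in short if c.isalnum()]
        n = len(letters)
        if n == 0 or len(words) < n:
            continue
        candidate = words[-n:]
        # Initial-letter check between the short form and the candidate words.
        if all(w[0].lower() == c for w, c in zip(candidate, letters)):
            pairs.append((short, " ".join(candidate)))
    return pairs


sentence = "The epidermal growth factor receptor (EGFR) binds its ligand."
result = find_abbreviations(sentence)
print(result)
```

A real pipeline would need looser matching (letters inside words, skipped stop words) and would then check each candidate long form against a protein dictionary or ontology, followed by user confirmation, as the abstract describes.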

Year	DOI	Venue
2011	10.1109/ICDEW.2011.5767646	Data Engineering Workshops
Keywords	DocType	ISBN
protein-related abbreviation,full text,domain-independent criterion,scientific community,acronym identification,information extraction method,scientific literature,biological entity,scientific paper,semi-automatic process,semi-automatic identification,available abbreviation repository,lexical clue,information retrieval,data mining,natural language processing,ontology,dictionary,dictionaries,bioinformatics,proteins,information extraction	Conference	978-1-4244-9194-0
Citations	PageRank	References
9	0.61	12
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Paolo Atzeni	1	1833	742.95
Fabio Polticelli	2	68	8.97
Daniele Toti	3	105	13.86
