Building trainable taggers in a web-based, UIMA-supported NLP workbench - Citegraph

Paper Info

Title
Building trainable taggers in a web-based, UIMA-supported NLP workbench

Abstract
Argo is a web-based NLP and text mining workbench with a convenient graphical user interface for designing and executing processing workflows of varying complexity. The workbench is intended for specialists and non-technical audiences alike, and provides an ever-expanding library of analytics compliant with the Unstructured Information Management Architecture (UIMA), a widely adopted interoperability framework. We explore the flexibility of this framework by demonstrating workflows involving three processing components capable of performing self-contained machine learning-based tagging. The three components are responsible for three distinct tasks: 1) generating observations or features, 2) training a statistical model based on the generated features, and 3) tagging unlabelled data with the model. The learning and tagging components are based on an implementation of conditional random fields (CRF), whereas the feature generation component is an analytic capable of extending basic token information to a comprehensive set of features. Users define the features of their choice directly from Argo's graphical interface, without resorting to programming (a commonly used approach to feature engineering). Experiments on two tagging tasks, chunking and named entity recognition, showed that a tagger with a generic set of features built in Argo is capable of competing with task-specific solutions.
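
The three-stage split described in the abstract (feature generation, training, tagging) can be illustrated with a small sketch. Everything here is hypothetical and is not Argo's or UIMA's API: `FEATURE_SPEC` stands in for the feature choices a user would make in the graphical interface, and the count-based `train`/`tag` pair is a deliberately simple stand-in for the CRF used in the paper, chosen only so the example runs with the Python standard library.

```python
from collections import Counter, defaultdict

# Hypothetical declarative feature spec: (feature name, token offset, extractor).
# It stands in for feature choices made in a GUI; it is not Argo's format.
FEATURE_SPEC = [
    ("word", 0, str.lower),
    ("suffix2", 0, lambda w: w[-2:]),
    ("is_cap", 0, lambda w: w[:1].isupper()),
    ("word", -1, str.lower),   # previous token
    ("word", 1, str.lower),    # next token
]


def generate_features(tokens, spec=FEATURE_SPEC):
    """Stage 1: expand basic token information into one feature dict per token."""
    rows = []
    for i in range(len(tokens)):
        feats = {}
        for name, offset, extractor in spec:
            j = i + offset
            inside = 0 <= j < len(tokens)
            # Positions outside the sentence get a padding value.
            feats[f"{name}[{offset}]"] = extractor(tokens[j]) if inside else "<pad>"
        rows.append(feats)
    return rows


def train(sentences, tag_sequences):
    """Stage 2 (stand-in, NOT a CRF): count feature-value/tag co-occurrences."""
    model = defaultdict(Counter)
    for tokens, tags in zip(sentences, tag_sequences):
        for feats, tag_label in zip(generate_features(tokens), tags):
            for item in feats.items():
                model[item][tag_label] += 1
    return model


def tag(model, tokens, default="O"):
    """Stage 3: label each token with the tag whose summed counts are highest."""
    labels = []
    for feats in generate_features(tokens):
        scores = Counter()
        for item in feats.items():
            scores.update(model.get(item, {}))
        labels.append(scores.most_common(1)[0][0] if scores else default)
    return labels


if __name__ == "__main__":
    model = train([["Argo", "runs", "in", "London"]],
                  [["B-ORG", "O", "O", "B-LOC"]])
    print(tag(model, ["Argo", "runs"]))
```

Keeping the feature definitions as data rather than code mirrors the idea that features can be chosen without programming. Unlike a CRF, the stand-in model scores each token independently and does not model transitions between neighbouring tags.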

Year	2012
Venue	ACL (System Demonstrations)
Keywords	interoperability framework, processing workflows, tagging component, tagging task, uima-supported nlp workbench, comprehensive set, convenient graphical user interface, feature generation component, processing component, graphical interface, trainable taggers, generic set
Field	Conditional random field, Workbench, Computer science, Interoperability, Feature engineering, Graphical user interface, Artificial intelligence, Natural language processing, Web application, Analytics, Named-entity recognition, Machine learning
DocType	Conference
Volume	P12-3
Citations	0
PageRank	0.34
References	8
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Rafal Rak	1	382	18.30
BalaKrishna Kolluru	2	48	5.22
Sophia Ananiadou	3	2658	183.08
