Reference Extraction from Vietnamese Legal Documents - Citegraph

Paper Info

Title
Reference Extraction from Vietnamese Legal Documents

Abstract
Legal and regulatory texts are ubiquitous and play an important role in everyday life, so automated processing of such documents with natural language processing and information retrieval techniques is desirable. Many legal text processing problems require information extraction as a base component. In this paper, we address the task of extracting references from law and regulatory documents, a step needed to recognize relations between documents and document parts, among other problems. We formulate the task as a sequence labeling problem and introduce several extraction models, covering both traditional methods (conditional random fields) and more advanced ones (deep neural networks). In addition to features learned by deep networks, we investigate various types of manually engineered features that reflect the characteristics of legal documents. Our best model, which combines bidirectional long short-term memory networks with conditional random fields, achieves an F1 score of 95.35% on a corpus of more than 11 thousand sentences from Vietnamese law and regulatory documents.
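
To make the sequence labeling formulation concrete, the following is a minimal sketch of how reference spans might be encoded as per-token tags and scored with F1. The BIO tag scheme, the `REF` label, the exact-match span-level scoring, and the English stand-in sentence are all assumptions made for illustration; the abstract does not specify the tag set or how F1 is computed, and this is not the authors' model.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def spans_to_bio(n_tokens: int, spans: List[Span], label: str = "REF") -> List[str]:
    """Encode non-overlapping token spans as BIO tags (assumed scheme)."""
    tags = ["O"] * n_tokens
    for start, end in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags back into (start, end) spans.

    An "I-" tag with no open span is ignored rather than starting a new span.
    """
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "O" or tag.startswith("B-"):
            if start is not None:  # close the span that was open
                spans.append((start, i))
                start = None
            if tag.startswith("B-"):
                start = i
    if start is not None:  # span running to the end of the sentence
        spans.append((start, len(tags)))
    return spans


def span_f1(gold: List[Span], pred: List[Span]) -> float:
    """Exact-match span-level F1 (an assumed evaluation protocol)."""
    true_pos = len(set(gold) & set(pred))
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(pred)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentence (English stand-in, not from the corpus).
tokens = ["as", "prescribed", "in", "Article", "5", "of", "this", "Decree"]
gold_spans = [(3, 8)]  # "Article 5 of this Decree"
tags = spans_to_bio(len(tokens), gold_spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Under this encoding, a tagger such as a CRF or a BiLSTM-CRF predicts one tag per token, and `bio_to_spans` recovers the reference spans from its output.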

Year	DOI	Venue
2019	10.1145/3368926.3369731	Proceedings of the Tenth International Symposium on Information and Communication Technology
Keywords	Field	DocType
Bidirectional Long Short-Term Memory Networks, Conditional Random Fields, Legal Text, Reference Extraction	Conditional random field, F1 score, Sequence labeling, Computer science, Information extraction, Natural language processing, Artificial intelligence, Vietnamese, Deep neural networks, Text processing	Conference
ISBN	Citations	PageRank
978-1-4503-7245-9	0	0.34
References	Authors
0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Ngo Xuan Bach	1	0	0.34
Nguyen Thi Thanh Thuy	2	0	0.34
Dang Bao Chien	3	0	0.34
Trieu Khuong Duy	4	0	0.34
To Minh Hien	5	0	0.34
Tu Minh Phuong	6	137	19.47
