Paper Info

Title
Effective Document-Level Features For Chinese Patent Word Segmentation

Abstract
A patent is a property right for an invention granted by the government to the inventor. Patents often have a high concentration of scientific and technical terms that are rare in everyday language. However, some scientific and technical terms usually appear with high frequency only in one specific patent. In this paper, we propose a pragmatic approach to Chinese word segmentation on patents where we train a sequence labeling model based on a group of novel document-level features. Experiments show that the accuracy of our model reached 96.3% (F-1 score) on the development set and 95.0% on a held-out test set.
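As an illustration of the general setup the abstract describes, the sketch below frames word segmentation as character-level sequence labeling and attaches one document-level signal to each character. It does not reproduce the paper's feature set, which the abstract does not spell out: the B/M/E/S tag scheme, the capped in-document bigram count, and all function names (`words_to_tags`, `doc_bigram_counts`, `char_features`, `segmentation_f1`) are assumptions chosen for illustration. The last function shows how a word-level F-1 score of the kind reported above is commonly computed.

```python
from collections import Counter


def words_to_tags(words):
    """Convert a segmented sentence into per-character B/M/E/S tags."""
    tags = []
    for w in words:
        if len(w) == 1:
            tags.append("S")  # single-character word
        else:
            tags.extend(["B"] + ["M"] * (len(w) - 2) + ["E"])
    return tags


def doc_bigram_counts(sentences):
    """Count character bigrams over all sentences of ONE document."""
    counts = Counter()
    for s in sentences:
        counts.update(s[i:i + 2] for i in range(len(s) - 1))
    return counts


def char_features(sentence, i, doc_counts):
    """Features for character i: local context plus a document-level count.

    The document-level feature here is hypothetical: how often the bigram
    starting at position i occurs in the same document, capped at 5.
    """
    nxt = sentence[i + 1] if i + 1 < len(sentence) else "</s>"
    bigram = sentence[i:i + 2]
    return {
        "char": sentence[i],
        "next": nxt,
        "doc_bigram_freq": min(doc_counts[bigram], 5),
    }


def segmentation_f1(gold_words, pred_words):
    """Word-level F-1: a predicted word counts only if both boundaries match."""
    def spans(words):
        out, start = set(), 0
        for w in words:
            out.add((start, start + len(w)))
            start += len(w)
        return out

    gold, pred = spans(gold_words), spans(pred_words)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

A term that recurs within one patent receives a higher `doc_bigram_freq` than it would from general-corpus statistics, which is the kind of evidence a document-level feature can supply to a sequence labeler such as a CRF.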

Year	2014
Venue	Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Vol 2
Field	F1 score, Property rights, Sequence labeling, Computer science, Text segmentation, Speech recognition, Natural language processing, Artificial intelligence, Invention, Government, Test set
DocType	Conference
Volume	P14-2
Citations	2
PageRank	0.40
References	17
Authors	2

Authors (2 rows)

Name	Order	Citations	PageRank
Si Li	1	14	7.29
Nianwen Xue	2	1654	117.65
