Description of the NCU Chinese Word Segmentation and Named Entity Recognition System for SIGHAN Bakeoff 2006 - Citegraph

Paper Info

Title
Description of the NCU Chinese Word Segmentation and Named Entity Recognition System for SIGHAN Bakeoff 2006

Abstract
Asian languages are far from most western-style in their non-separate word sequence especially Chinese. The preliminary step of Asian-like language processing is to find the word boundaries between words. In this paper, we present a general purpose model for both Chinese word segmentation and named entity recognition. This model was built on the word sequence classification with probability model, i.e., conditional random fields (CRF). We used a simple feature set for CRF which achieves satisfactory classification result on the two tasks. Our model achieved 91.00 in F rate in UPUC-Treebank data, and 78.71 for NER task.

Year	Venue	Field
2006	SIGHAN@COLING/ACL	Conditional random field,Tokenization (data security),Word lists by frequency,Phrase chunking,Computer science,Word error rate,Text segmentation,Speech recognition,Artificial intelligence,Natural language processing,Classifier (linguistics),Named-entity recognition
DocType	Citations	PageRank
Conference	9	0.52
References	Authors
7	3

Authors (3 rows)

Name	Order	Citations	PageRank
Yu-Chieh Wu	1	247	23.16
Jie-Chi Yang	2	350	43.91
Qian-Xiang Lin	3	10	0.90
