Title
Description of the NCU Chinese Word Segmentation and Named Entity Recognition System for SIGHAN Bakeoff 2006
Abstract
Asian languages are far from most western-style in their non-separate word sequence especially Chinese. The preliminary step of Asian-like language processing is to find the word boundaries between words. In this paper, we present a general purpose model for both Chinese word segmentation and named entity recognition. This model was built on the word sequence classification with probability model, i.e., conditional random fields (CRF). We used a simple feature set for CRF which achieves satisfactory classification result on the two tasks. Our model achieved 91.00 in F rate in UPUC-Treebank data, and 78.71 for NER task.
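The abstract frames word segmentation as sequence classification scored by an F rate. The sketch below illustrates that framing only; it is not the authors' system. It assumes a two-tag scheme ("B" for the first character of a word, "I" for any other), which the abstract does not specify, and it assumes the F rate is the usual harmonic mean of word-level precision and recall. All function names are hypothetical, and the tagger itself (the CRF) is left out.

```python
from typing import List, Set, Tuple


def words_to_tags(words: List[str]) -> List[Tuple[str, str]]:
    """Turn a segmented sentence into (character, tag) pairs.

    Assumed scheme: "B" marks the first character of a word, "I" any other.
    """
    pairs = []
    for word in words:
        for i, ch in enumerate(word):
            pairs.append((ch, "B" if i == 0 else "I"))
    return pairs


def tags_to_words(chars: List[str], tags: List[str]) -> List[str]:
    """Rebuild words from characters and their predicted B/I tags."""
    words: List[str] = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:
            words.append(ch)      # start a new word
        else:
            words[-1] += ch       # extend the current word
    return words


def word_spans(words: List[str]) -> Set[Tuple[int, int]]:
    """Represent each word as a (start, end) character span."""
    spans = set()
    start = 0
    for word in words:
        spans.add((start, start + len(word)))
        start += len(word)
    return spans


def segmentation_f(gold: List[str], predicted: List[str]) -> float:
    """Word-level F-measure: a word counts only if both boundaries match."""
    gold_spans = word_spans(gold)
    pred_spans = word_spans(predicted)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["中央", "大学", "在", "中坜"]
    predicted = ["中央大学", "在", "中坜"]   # one boundary missed
    print(words_to_tags(gold))
    print(round(segmentation_f(gold, predicted), 4))
```

The function returns a fraction between 0 and 1; the scores quoted in the abstract (91.00, 78.71) appear to be on a 0 to 100 scale.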
Year: 2006
Venue: SIGHAN@COLING/ACL
Field: Conditional random field, Tokenization (data security), Word lists by frequency, Phrase chunking, Computer science, Word error rate, Text segmentation, Speech recognition, Artificial intelligence, Natural language processing, Classifier (linguistics), Named-entity recognition
DocType: Conference
Citations: 9
PageRank: 0.52
References: 7
Authors: 3
Name             Order  Citations  PageRank
Yu-Chieh Wu      1      247        23.16
Jie-Chi Yang     2      350        43.91
Qian-Xiang Lin   3      10         0.90