Title
Building the Vietnamese Phrase Treebank by Improved Probabilistic Context-Free Grammars.
Abstract
A phrase treebank is an important resource for natural language processing research and practical applications, but no such resource is available for Vietnamese. This paper presents a method for constructing a Vietnamese phrase treebank by fusing Vietnamese grammatical features with an improved probabilistic context-free grammar (PCFG). The method automatically analyzes Vietnamese phrase structure trees and thereby addresses the problem of constructing the treebank. First, a Vietnamese grammatical feature set is established by analyzing Vietnamese grammatical features. Then, the grammar rule set of the PCFG model is obtained from manually annotated Vietnamese phrase trees. Finally, the Vietnamese grammatical feature set is fused into the improved PCFG model as a supplement, completing the construction of the Vietnamese phrase treebank. Experimental results show that the proposed PCFG model reaches an accuracy of 89.12% on Vietnamese phrase treebank construction, a clear improvement over both the conventional PCFG model and the maximum entropy method.
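The second step of the abstract, obtaining a PCFG rule set from manually annotated phrase trees, can be illustrated with a minimal sketch. It uses the standard relative-frequency (maximum likelihood) estimate for a conventional PCFG, not the paper's improved model or its grammatical feature set, which the abstract does not specify. The bracketed tree format, the category labels, the two toy sentences, and all function names are assumptions made for illustration only.

```python
from collections import Counter

def parse_tree(text):
    """Parse a bracketed tree such as '(S (NP (N a)) (VP (V b)))' into (label, children)."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read_node():
        nonlocal pos
        assert tokens[pos] == "(", "expected an opening bracket"
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # terminal word
                pos += 1
        pos += 1                               # consume the closing bracket
        return (label, children)

    return read_node()

def extract_rules(tree):
    """Yield one (lhs, rhs) rule for every internal node of the tree."""
    label, children = tree
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    yield (label, rhs)
    for child in children:
        if isinstance(child, tuple):
            yield from extract_rules(child)

def estimate_pcfg(trees):
    """Relative-frequency estimate: P(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for tree in trees:
        for lhs, rhs in extract_rules(tree):
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Hypothetical hand-annotated trees; labels and sentences are illustrative only.
TOY_TREES = [
    "(S (NP (N tôi)) (VP (V ăn) (NP (N cơm))))",
    "(S (NP (N mẹ)) (VP (V ngủ)))",
]

grammar = estimate_pcfg([parse_tree(t) for t in TOY_TREES])
for (lhs, rhs), prob in sorted(grammar.items()):
    print(f"{lhs} -> {' '.join(rhs)}  [{prob:.3f}]")
```

In this toy grammar the two VP expansions each receive probability 0.5, and the probabilities of all rules sharing a left-hand side sum to 1. A parser would then score candidate trees by multiplying the probabilities of the rules they use.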
Year: 2016
DOI: 10.1007/978-981-10-3635-4_7
Venue: Communications in Computer and Information Science
Keywords: Vietnamese, Phrase structure tree, Probabilistic context-free grammar, Grammatical rule set, Treebank introduction
DocType: Conference
Volume: 668
ISSN: 1865-0929
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name; Order; Citations; PageRank
Ying Li; 1; 4; 0.77
Jianyi Guo; 2; 20; 10.99
Zhengtao Yu; 3; 460; 69.08
Yantuan Xian; 4; 0; 0.34
Yonghua Wen; 5; 0; 1.01