Abstract |
---|
This paper proposes a novel method for distributed training of Conditional Random Fields (CRFs) on clusters built from commodity computers. The method employs the Message Passing Interface (MPI) to handle large-scale data in two steps. First, the entire training set is divided into several small pieces, each of which can be handled by one node. Second, instead of adopting a root node to collect all features, a new criterion is used to split the whole feature set into non-overlapping subsets, ensuring that each node maintains the global information for one feature subset. Experiments on the task of Chinese word segmentation (WS) with large-scale data show significant reductions in both training time and memory usage while preserving performance. |
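The two partitioning steps described in the abstract can be sketched in plain Python. This is a minimal single-process illustration: the hash-based feature split below is an illustrative assumption (the paper defines its own criterion), and the MPI communication layer is omitted entirely.

```python
def shard_data(samples, n_nodes):
    """Step 1: divide the training data into n_nodes near-equal pieces,
    each small enough to be handled by one node."""
    size, rem = divmod(len(samples), n_nodes)
    shards, start = [], 0
    for rank in range(n_nodes):
        # The first `rem` nodes take one extra sample each.
        end = start + size + (1 if rank < rem else 0)
        shards.append(samples[start:end])
        start = end
    return shards

def partition_features(features, n_nodes):
    """Step 2: assign each feature to exactly one node, so the subsets are
    non-overlapping and no root node has to gather the full feature set.
    The hash-based assignment here is an assumption, not the paper's criterion."""
    subsets = [[] for _ in range(n_nodes)]
    for f in features:
        subsets[hash(f) % n_nodes].append(f)
    return subsets
```

Because each feature lives on exactly one node, per-feature statistics can be aggregated locally instead of being funneled through a single root, which is the source of the reported time and memory savings.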
Year | DOI | Venue |
---|---|---|
2010 | 10.1109/NLPKE.2010.5587803 | NLPKE |
Keywords | Field | DocType
---|---|---|
distributed strategy,chinese word segmentation,large-scale data,distributed training method,message passing interface,natural language processing,message passing,conditional random fields,accuracy,conditional random field | Training set,Conditional random field,Data mining,Computer science,Spacetime,Global information,Text segmentation,Theoretical computer science,Message Passing Interface,CRFS,Message passing | Conference |
Volume | Issue | ISSN
---|---|---|
null | null | null |
ISBN | Citations | PageRank
---|---|---|
978-1-4244-6896-6 | 4 | 0.47 |
References | Authors
---|---|
10 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiaojun Lin | 1 | 4 | 0.47 |
Liang Zhao | 2 | 4 | 0.47 |
Dianhai Yu | 3 | 99 | 7.22 |
Xihong Wu | 4 | 4 | 1.49 |