Title
Ensemble Statistical and Heuristic Models for Unsupervised Word Alignment
Abstract
Statistical word alignment models need large amounts of training data and perform poorly on small corpora. This paper proposes an unsupervised hybrid word alignment technique based on ensemble learning. The algorithm runs three base alignment models over several rounds to generate alignments. It uses a weighted scheme for resampling the training data and a voting score to aggregate the resulting alignments. The base alignment algorithms used in this study are IBM Model 1, IBM Model 2, and a heuristic method based on the Dice measure. Our experimental results show that this approach improves the alignment error rate of the base alignment models by at least 15%.
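The abstract names two ingredients that are easy to illustrate: a Dice-based heuristic aligner and a weighted vote over the alignments produced by several base models. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`dice_scores`, `dice_align`, `vote`), the sentence-level co-occurrence counting, the one-link-per-source-word rule, and the 0.5 vote threshold are all hypothetical choices made for illustration. The weighted resampling rounds and the IBM models are not shown.

```python
from collections import Counter
from itertools import product


def dice_scores(bitext):
    """Dice score 2*C(s,t) / (C(s) + C(t)) from sentence-level co-occurrence counts."""
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        src_set, tgt_set = set(src), set(tgt)
        src_count.update(src_set)
        tgt_count.update(tgt_set)
        pair_count.update(product(src_set, tgt_set))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def dice_align(src, tgt, scores):
    """Link each source position to the target position with the highest Dice score."""
    if not tgt:
        return set()
    links = set()
    for i, s in enumerate(src):
        best_j = max(range(len(tgt)), key=lambda j: scores.get((s, tgt[j]), 0.0))
        links.add((i, best_j))
    return links


def vote(alignments, weights, threshold=0.5):
    """Keep the links whose weighted share of votes exceeds the threshold.

    `alignments` holds one set of (source_index, target_index) links per
    base model; `weights` holds one assumed weight per model.
    """
    total = sum(weights)
    tally = Counter()
    for links, weight in zip(alignments, weights):
        for link in links:
            tally[link] += weight
    return {link for link, score in tally.items() if score / total > threshold}


# Toy German-English bitext, invented for this example.
bitext = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
scores = dice_scores(bitext)
dice_links = dice_align(["das", "haus"], ["the", "house"], scores)

# Two further link sets stand in for the output of other base models.
combined = vote([dice_links, {(0, 0), (1, 0)}, {(0, 0), (1, 1)}], [1.0, 1.0, 1.0])
```

In this sketch the vote keeps a link only when models holding more than half of the total weight propose it, so raising one model's weight can overturn a link that the other two agree on.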
Year
2014
DOI
10.1109/ICMLA.2014.15
Venue
ICMLA
Keywords
ensemble learning method, heuristic word alignment, word processing, statistical analysis, underlying alignment algorithms, dice measurement, unsupervised hybrid word alignment technique, statistical word alignment, ibm model, ensemble learning, statistical word alignment model, text analysis, small-sized corpora, unsupervised learning, boosting, mathematical model, computational modeling, hidden markov models, training data
Field
Hybrid word, Heuristic, Pattern recognition, Computer science, Word error rate, Boosting (machine learning), Artificial intelligence, Dice, Hidden Markov model, Resampling, Ensemble learning, Machine learning
DocType
Conference
Citations
0
PageRank
0.34
References
19
Authors
3
Name | Order | Citations | PageRank
Mahsa Mohaghegh | 1 | 2 | 3.75
Hossein Sarrafzadeh | 2 | 0 | 0.34
Mehdi Mohammadi | 3 | 1091 | 50.02