Abstract | ||
---|---|---|
This paper presents a hybrid approach that improves word alignment for the English-Hindi language pair by combining statistical modeling with chunking. We first apply a standard word alignment technique to obtain an approximate alignment. The source and target language sentences are then divided into chunks, and the approximate word alignment is used to align the chunks. The aligned chunks are in turn used to improve the original word alignment. The statistical model used here is IBM Model 1. A CRF chunker is used to break the English sentences into chunks, and a shallow parser is used to break the Hindi sentences into chunks. The approach increases F-measure by approximately 7% and reduces Alignment Error Rate (AER) by approximately 7% compared with IBM Model 1 alone for word alignment. The experiments are based on a TDIL corpus of 1000 sentences. |
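
The baseline named in the abstract, IBM Model 1, can be illustrated with a minimal expectation-maximization sketch. This is not the authors' implementation: the function names `train_ibm_model1` and `align_words`, the `<NULL>` token, and the three-sentence romanized toy corpus are all assumptions made for illustration, and the chunk alignment step is not shown.

```python
from collections import defaultdict


def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(f, e)] = P(target word f | source word e) with EM.

    `pairs` is a list of (source_tokens, target_tokens) sentence pairs.
    """
    target_vocab = {f for _, fs in pairs for f in fs}
    # Start from a uniform translation table.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts for (f, e)
        total = defaultdict(float)  # expected counts for e
        for es, fs in pairs:
            es_null = ["<NULL>"] + es  # hypothetical NULL source token
            for f in fs:
                # E-step: spread one unit of count for f over all source words.
                norm = sum(t[(f, e)] for e in es_null)
                for e in es_null:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def align_words(es, fs, t):
    """Link each target position j to its most probable source position i."""
    return [(max(range(len(es)), key=lambda i: t[(f, es[i])]), j)
            for j, f in enumerate(fs)]


# Hypothetical toy corpus (English source, romanized Hindi target).
toy_corpus = [
    (["red", "house"], ["lal", "ghar"]),
    (["red", "book"], ["lal", "kitab"]),
    (["big", "book"], ["badi", "kitab"]),
]
table = train_ibm_model1(toy_corpus)
print(align_words(["red", "house"], ["lal", "ghar"], table))
```

In the pipeline described in the abstract, links of this kind would serve only as the approximate alignment that is later refined using the aligned chunks.
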
Year | DOI | Venue |
---|---|---|
2015 | 10.1007/978-3-319-18111-0_43 | COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT I |
Keywords | Field | DocType |
Word alignment, Chunk alignment, Natural language processing, Artificial intelligence | Computer science, Speech recognition, Natural language processing, Artificial intelligence, Statistical model | Conference |
Volume | ISSN | Citations |
9041 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 15 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jyoti Srivastava | 1 | 0 | 1.01 |
S Sanyal | 2 | 0 | 1.01 |