Title
Lexica and corpora for speech-to-speech translation: a trilingual approach
Abstract
Creation of lexica and corpora for Catalan, Spanish and US-English is described. A lexicon is being created for speech recognition and synthesis including relevant information. The lexicon contains 50K common words selected to achieve a wide coverage on the chosen domains, and 50K additional entries including special application words, and proper nouns. Furthermore, a large trilingual spontaneous speech corpus has been created. These corpora, together with other available US-English data, have been translated into their counterpart languages. This is being used to investigate the language resources requirements for statistical machine translation.
Year
2003
Venue
INTERSPEECH
Keywords
speech recognition, noun
Field
Speech corpus, Catalan, Computer science, Machine translation, Speech recognition, Lexicon, Speech to speech translation, Artificial intelligence, Natural language processing, Proper noun
DocType
Conference
Citations
4
PageRank
0.65
References
3
Authors
7
Name               Order  Citations  PageRank
David Conejero     1      15         2.61
Jesús Giménez      2      214        15.93
Victoria Arranz    3      49         18.75
Antonio Bonafonte  4      693        64.80
Neus Pascual       5      4          0.65
Núria Castell      6      41         8.20
Asunción Moreno    7      399        44.97