Lexica and corpora for speech-to-speech translation: a trilingual approach - Citegraph

Paper Info

Title
Lexica and corpora for speech-to-speech translation: a trilingual approach

Abstract
Creation of lexica and corpora for Catalan, Spanish and US-English is described. A lexicon is being created for speech recognition and synthesis including relevant information. The lexicon contains 50K common words selected to achieve a wide coverage on the chosen domains, and 50K additional entries including special application words and proper nouns. Furthermore, a large trilingual spontaneous speech corpus has been created. These corpora, together with other available US-English data, have been translated into their counterpart languages. This is being used to investigate the language resources requirements for statistical machine translation.

Year	Venue	Keywords
2003	INTERSPEECH	speech recognition, noun
Field	DocType	Citations
Speech corpus, Catalan, Computer science, Machine translation, Speech recognition, Lexicon, Speech to speech translation, Artificial intelligence, Natural language processing, Proper noun	Conference	4
PageRank	References	Authors
0.65	3	7

Authors (7 rows)

Name	Order	Citations	PageRank
David Conejero	1	15	2.61
Jesús Giménez	2	214	15.93
Victoria Arranz	3	49	18.75
Antonio Bonafonte	4	693	64.80
Neus Pascual	5	4	0.65
Núria Castell	6	41	8.20
Asunción Moreno	7	399	44.97
