Abstract | ||
---|---|---|
The creation of lexica and corpora for Catalan, Spanish and US-English is described. A lexicon is being created for speech recognition and synthesis, including relevant information. The lexicon contains 50K common words selected to achieve wide coverage of the chosen domains, and 50K additional entries including special application words and proper nouns. Furthermore, a large trilingual spontaneous speech corpus has been created. These corpora, together with other available US-English data, have been translated into their counterpart languages. The resulting translated data is being used to investigate the language resource requirements for statistical machine translation. |
Year | Venue | Keywords |
---|---|---|
2003 | INTERSPEECH | speech recognition, noun |
Field | DocType | Citations |
Speech corpus, Catalan, Computer science, Machine translation, Speech recognition, Lexicon, Speech to speech translation, Artificial intelligence, Natural language processing, Proper noun | Conference | 4 |
PageRank | References | Authors |
0.65 | 3 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
David Conejero | 1 | 15 | 2.61 |
Jesús Giménez | 2 | 214 | 15.93 |
Victoria Arranz | 3 | 49 | 18.75 |
Antonio Bonafonte | 4 | 693 | 64.80 |
Neus Pascual | 5 | 4 | 0.65 |
Núria Castell | 6 | 41 | 8.20 |
Asunción Moreno | 7 | 399 | 44.97 |