Abstract | ||
---|---|---|
Incorrect normalization of text can be particularly damaging for applications like text-to-speech synthesis (TTS) or typing auto-correction, where the resulting normalization is presented directly to the user rather than fed to downstream applications. In this paper, we focus on abbreviation expansion for TTS, which requires a "do no harm", high-precision approach yielding few expansion errors at the cost of leaving relatively many abbreviations unexpanded. In the context of a large-scale, real-world TTS scenario, we present methods for training classifiers to establish whether a particular expansion is apt. We achieve a large increase in correct abbreviation expansion when combined with the baseline text normalization component of the TTS system, together with a substantial reduction in incorrect expansions. |
Year | Venue | Field |
---|---|---|
2014 | Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Vol. 2 | Hippocratic Oath, Normalization (statistics), Computer science, Artificial intelligence, Natural language processing, Text normalization |
DocType | Volume | Citations |
---|---|---|
Conference | P14-2 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 6 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Brian Roark | 1 | 20 | 4.62 |
Richard Sproat | 2 | 195 | 16.56 |