Abstract |
---|
Typically, data collection, transcription, language model generation, and deployment are separate phases in creating a spoken language interface. An unfortunate consequence is that the recognizer usually remains a static element of systems that are often deployed in dynamic environments. By providing an API for human intelligence, Amazon Mechanical Turk changes the way system developers can construct spoken language systems. In this work, we describe an architecture that automates and connects these four phases, effectively allowing the developer to grow a spoken language interface. In particular, we show that a human-in-the-loop programming paradigm, in which workers transcribe utterances behind the scenes, can obviate the need for expert guidance in language model construction. We demonstrate the utility of these organic language models in a voice-search interface for photographs. |
Year | Venue | Keywords |
---|---|---|
2011 | 12th Annual Conference of the International Speech Communication Association (INTERSPEECH 2011), Vols 1-5 | organic speech systems, language modeling |
Field | DocType | Citations |
---|---|---|
Architecture, Software deployment, Programming paradigm, Data control language, Human intelligence, Computer science, Interface description language, Speech recognition, Language model, Spoken language | Conference | 7 |
PageRank | References | Authors |
---|---|---|
0.65 | 10 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ian McGraw | 1 | 253 | 24.41 |
James Glass | 2 | 3123 | 413.63 |
Stephanie Seneff | 3 | 2075 | 364.35 |