Abstract | ||
---|---|---|
Zero-resource speech processing involves the automatic analysis of a collection of speech data in a completely unsupervised fashion without the benefit of any transcriptions or annotations of the data. In this paper, our zero-resource system seeks to automatically discover important words, phrases and topical themes present in an audio corpus. This system employs a segmental dynamic time warping (S-DTW) algorithm for acoustic pattern discovery in conjunction with a probabilistic model which treats the topic and pseudo-word identity of each discovered pattern as hidden variables. By applying an Expectation-Maximization (EM) algorithm, our system estimates the latent probability distributions over the pseudo-words and topics associated with the discovered patterns. Using this information, we produce acoustic summaries of the dominant topical themes of the audio document collection. |
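
The abstract names segmental dynamic time warping (S-DTW) as the pattern discovery step. As a rough illustration only, the sketch below shows the plain dynamic time warping recurrence that S-DTW builds on. It is not the segmental variant and not the authors' implementation. The function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional frames are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Plain DTW alignment cost between two sequences of feature vectors.

    Illustrative assumption: frames are compared with Euclidean distance.
    This is the basic recurrence, not segmental DTW.
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor cells.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy frames (hypothetical data): the second sequence repeats one frame.
x = [(0.0,), (1.0,), (2.0,)]
y = [(0.0,), (1.0,), (1.0,), (2.0,)]
print(dtw_distance(x, y))
```

Because the warping path may repeat a frame, the two toy sequences align at zero cost even though their lengths differ. A segmental variant would additionally restrict and subdivide the warping paths to find matching sub-regions inside longer utterances.
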
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ICASSP.2013.6639335 | 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | Field | DocType |
Zero-resource speech processing, spoken term discovery, speech summarization | Speech corpus, Speech processing, Speech coding, Dynamic time warping, Computer science, Natural language processing, Artificial intelligence, Audio signal processing, Speech analytics, Pattern recognition, Audio mining, Speech recognition, Acoustic model | Conference |
ISSN | Citations | PageRank |
1520-6149 | 11 | 0.61 |
References | Authors |
13 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
David F. Harwath | 1 | 63 | 8.34 |
Timothy J. Hazen | 2 | 880 | 81.55 |
James Glass | 3 | 3123 | 413.63 |