Title: Tackling unseen acoustic conditions in query-by-example search using time and frequency convolution for multilingual deep bottleneck features
Abstract
Standard keyword spotting based on Automatic Speech Recognition (ASR) cannot be used on low- and no-resource languages due to the lack of annotated data and/or linguistic resources. In recent years, query-by-example (QbE) has emerged as an alternative way to enroll and find spoken queries in large audio corpora, yet mismatched and unseen acoustic conditions remain a difficult challenge given the lack of enrollment data. This paper revisits two neural network architectures developed for noise- and channel-robust ASR, and applies them to building a state-of-the-art multilingual QbE system. By applying convolution in time or frequency across the spectrum, these convolutional networks learn more discriminative deep bottleneck features. In conjunction with dynamic time warping (DTW), these features enable robust QbE systems. We use the MediaEval 2014 QUESST data to evaluate robustness against language and channel mismatches, and add several levels of artificial noise to the data to evaluate performance in degraded acoustic environments. We also assess performance on an Air Traffic Control QbE task with more realistic and higher levels of distortion in the push-to-talk domain.
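The abstract describes matching spoken queries against audio documents by running dynamic time warping (DTW) over per-frame bottleneck features. The sketch below is a minimal, hypothetical illustration of that matching step, not the paper's implementation: it assumes frame-level feature matrices (frames x dims), a cosine local distance, and the standard three-move DTW recursion with length normalization.

```python
import numpy as np

def dtw_cost(query, doc):
    """Length-normalized DTW cost between two feature sequences.

    query, doc: arrays of shape (frames, dims), e.g. bottleneck features.
    Hypothetical sketch: cosine local distance, standard 3-move recursion.
    """
    # Normalize rows so the dot product gives cosine similarity.
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    d = doc / np.linalg.norm(doc, axis=1, keepdims=True)
    dist = 1.0 - q @ d.T  # local cosine distances, shape (n, m)

    n, m = dist.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Allow insertion, deletion, or diagonal match.
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Normalize by path-length bound so costs are comparable across queries.
    return acc[n, m] / (n + m)
```

In a QbE system this score (or a segmental/subsequence variant of DTW) would be computed between each enrolled query and candidate regions of the search corpus, with low costs indicating likely hits.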
Year: 2017
DOI: 10.1109/ASRU.2017.8268915
Venue: 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords: query-by-example, multilingual bottleneck, convolutional neural networks, noise robustness, channel robustness
DocType: Conference
ISBN: 978-1-5090-4789-5
Citations: 0
PageRank: 0.34
References: 0
Authors: 5

Order  Name             Citations  PageRank
1      Julien van Hout  54         6.07
2      Vikramjit Mitra  299        24.83
3      Horacio Franco   543        72.04
4      Chris Bartels    0          0.34
5      Dimitra Vergyri  373        36.97