Abstract |
---|
Human hearing and human speech are intrinsically tied together, as the properties of speech almost certainly developed in order to be heard by human ears. As a result of this connection, it has been shown that certain properties of human hearing are mimicked within data-driven systems that are trained to understand human speech. In this paper, we further explore this phenomenon by measuring the spectro-temporal responses of data-derived filters in a front-end convolutional layer of a deep network trained to classify the phonemes of clean speech. The analyses show that the filters do indeed exhibit spectro-temporal responses similar to those measured in mammals, and also that the filters exhibit an additional level of frequency selectivity, similar to the processing pipeline assumed within the Articulation Index. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICASSP.2019.8682787 | ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | Field | DocType |
Auditory system, Frequency modulation, Speech processing, Training, Sensitivity, Frequency measurement | Speech processing, Pattern recognition, Computer science, Auditory system, Articulation Index, Artificial intelligence, Frequency selectivity, Frequency modulation | Conference |
ISSN | ISBN | Citations |
1520-6149 | 978-1-4799-8131-1 | 0 |
PageRank | References | Authors |
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lucas Ondel | 1 | 35 | 7.16 |
Ruizhi Li | 2 | 51 | 12.01 |
Gregory Sell | 3 | 86 | 14.19 |
Hynek Hermansky | 4 | 3298 | 510.27 |