Abstract | ||
---|---|---|
Conventional methods for finding audio in databases typically search text labels, rather than the audio itself. This can be problematic as labels may be missing, irrelevant to the audio content, or not known by users. Query by vocal imitation lets users query using vocal imitations instead. To do so, appropriate audio feature representations and effective similarity measures of imitations and orig... |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/TASLP.2018.2868428 | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Keywords | Field | DocType |
Feature extraction, Task analysis, Neurons, Speech processing, Convolutional neural networks, Databases, Poles and towers | Architecture, Convolutional neural network, Computer science, Transfer of learning, Speech recognition, Imitation, Concatenation, Encoder, Spoken language, Environmental sound classification | Journal |
Volume | Issue | ISSN |
27 | 2 | 2329-9290 |
Citations | PageRank | References |
1 | 0.37 | 22 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yichi Zhang | 1 | 3 | 1.10 |
Bryan Pardo | 2 | 830 | 63.92 |
Zhiyao Duan | 3 | 305 | 26.86 |