Abstract |
---|
Existing audio search engines use one of two approaches: matching text-text or audio-audio pairs. In the former, text queries are matched to semantically similar words in an index of audio metadata to retrieve the corresponding audio clips or segments; in the latter, audio signals are used directly to retrieve acoustically similar recordings from an audio database. However, the independent treatment of text and audio has precluded information exchange between the two modalities. This is a problem because similarity in language does not always imply similarity in acoustics, and vice versa. Moreover, independent modeling can be error-prone, especially for ad hoc, user-generated recordings, which are noisy in both the audio and the associated textual labels. To overcome this limitation, we propose a framework that learns joint embeddings in a shared lexico-acoustic space, where vectors from either modality can be mapped together and compared directly. Thus, we improve semantic knowledge and enable the use of either text or audio queries to search and retrieve audio. Our results break new ground for a cross-modal audio search engine and for further exploration of lexico-acoustic spaces. |
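The retrieval step described in the abstract can be illustrated with a minimal sketch: once text and audio are embedded in a shared space (in the paper, via a learned model such as a Siamese network), a query from either modality is compared directly against indexed audio vectors. All embedding values and file names below are hypothetical placeholders, not the paper's data.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy 3-D "lexico-acoustic" space; in practice these vectors would be
# produced by a trained joint-embedding model. Values are made up.
audio_index = {
    "dog_bark.wav": [0.9, 0.1, 0.0],
    "rain.wav":     [0.0, 0.8, 0.2],
    "siren.wav":    [0.1, 0.0, 0.9],
}

def retrieve(query_vec, index):
    # Rank indexed clips by similarity to a query embedded in the same
    # space; the query may come from either modality (text or audio).
    return max(index, key=lambda name: cosine(query_vec, index[name]))

text_query = [0.85, 0.15, 0.05]  # hypothetical embedding of a text query
print(retrieve(text_query, audio_index))  # -> dog_bark.wav
```

Because both modalities share one space, the same `retrieve` call serves text-to-audio search and audio-to-audio (query-by-example) search without separate indexes.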
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/icassp.2019.8682632 | 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | Field | DocType |
---|---|---|
Joint Audio-Text Embedding, Cross Modal Retrieval, Audio Search Engine, Content-Based Audio Retrieval, Query by Example, Siamese Neural Network | Audio signal, Mel-frequency cepstrum, Metadata, Search engine, Pattern recognition, Computer science, Information exchange, Audio search engine, Speech recognition, Artificial intelligence, Modal, Semantics | Conference |
ISSN | Citations | PageRank |
---|---|---|
1520-6149 | 0 | 0.34 |
References | Authors |
---|---|
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Benjamin Elizalde | 1 | 359 | 22.38 |
Shuayb Zarar | 2 | 0 | 2.70 |
Bhiksha Raj | 3 | 2094 | 204.63 |