Title
Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation
Abstract
Audio source separation is often used as a preprocessing step for various tasks, and one of its ultimate goals is a single versatile preprocessor that can handle every kind of audio signal. One of the most important properties of a discrete-time audio signal is its sampling frequency. Since the sampling frequency is usually task-specific, a versatile preprocessor must handle all the sampling frequencies required by the possible downstream tasks. However, conventional models based on deep neural networks (DNNs) are not designed to handle a variety of sampling frequencies, so they may not work appropriately at unseen sampling frequencies. In this paper, we propose sampling-frequency-independent (SFI) convolutional layers capable of handling various sampling frequencies. The core idea of the proposed layers comes from our finding that a convolutional layer can be viewed as a collection of digital filters and therefore inherently depends on the sampling frequency. To overcome this dependency, we propose an SFI structure that features analog filters and generates the weights of a convolutional layer from these analog filters. By utilizing time- and frequency-domain analog-to-digital filter conversion techniques, we can adapt the convolutional layer to various sampling frequencies. As an example application, we construct an SFI version of a conventional source separation network. Through music source separation experiments, we show that the proposed layers enable separation networks to work consistently well at unseen sampling frequencies in terms of both objective and perceptual separation quality. We also demonstrate that the proposed method outperforms a conventional method based on signal resampling when the sampling frequencies of the input signals are significantly lower than the sampling frequency used for training.
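The idea of generating convolution weights from an analog filter can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the modulated-Gaussian analog filter, the function names (analog_impulse_response, sfi_kernel, gain_at), and all parameter values are assumptions chosen for illustration. The sketch uses a time-domain conversion, sampling the analog impulse response at the target rate and scaling by the sampling period, so that the same analog filter yields kernels of different lengths with a similar frequency response.

```python
import numpy as np

def analog_impulse_response(t, center_hz, bandwidth_hz):
    """Hypothetical analog filter: Gaussian envelope modulated by a cosine."""
    envelope = np.exp(-0.5 * (2.0 * np.pi * bandwidth_hz * t) ** 2)
    return envelope * np.cos(2.0 * np.pi * center_hz * t)

def sfi_kernel(fs_hz, duration_s, center_hz, bandwidth_hz):
    """Sample the analog response at rate fs_hz to obtain FIR (conv) weights."""
    n_taps = int(round(duration_s * fs_hz)) | 1  # force an odd length
    n = np.arange(n_taps) - n_taps // 2          # taps centered on t = 0
    t = n / fs_hz
    # Scaling by the sampling period 1/fs keeps the gain comparable across rates.
    return analog_impulse_response(t, center_hz, bandwidth_hz) / fs_hz

def gain_at(kernel, fs_hz, freq_hz):
    """Magnitude of the kernel's frequency response at freq_hz."""
    n = np.arange(len(kernel))
    return abs(np.sum(kernel * np.exp(-2j * np.pi * freq_hz * n / fs_hz)))

if __name__ == "__main__":
    for fs in (8000, 16000, 32000):
        k = sfi_kernel(fs, duration_s=0.004, center_hz=1000.0, bandwidth_hz=200.0)
        print(fs, len(k), gain_at(k, fs, 1000.0))
```

In this sketch the kernel length grows with the sampling frequency while the passband gain stays roughly constant, which is the behavior the abstract describes as sampling-frequency independence. In a trainable layer, the analog filter parameters (here center_hz and bandwidth_hz) would be the learned quantities rather than the kernel taps themselves.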
Year
2022
DOI
10.1109/TASLP.2022.3203907
Venue
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Keywords
Convolution, Source separation, Finite impulse response filters, Task analysis, Time-frequency analysis, Time-domain analysis, Information filters, Audio source separation, analog-to-digital filter conversion, convolutional layer, deep neural networks
DocType
Journal
Volume
30
Issue
1
ISSN
2329-9290
Citations
0
PageRank
0.34
References
0
Authors
4
Name                Order  Citations  PageRank
Koichi Saito        1      0          0.34
Tomohiko Nakamura   2      13         5.02
Kohei Yatabe        3      16         10.36
Saruwatari, H.      4      652        90.81