Title
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain
Abstract
Blind source separation exploiting multichannel information has long been a popular topic, and recently proposed methods based on the local Gaussian model have shown promising results despite their high computational cost when many microphone signals are used. The low updating speed of such a model is mainly due to the inversion of a spatial covariance matrix, for which the complexity increases with the number of microphones, $M$, and is generally of order $O(M^3)$. Several projection-based approaches that attempt to concentrate energy on the diagonal part of the spatial covariance matrix have been introduced to circumvent the matrix inversion, which can reduce the complexity to $O(M)$. In this article, we focus on the fast Fourier transform as a projection method because it concentrates energy on the diagonal more efficiently than other projection-based methods. For the case where the diagonalization is imperfect, for example, owing to discontinuities at the edge of a linear array, we also developed a more robust algorithm approximating the tri-diagonal part of the spatial covariance matrix, which requires a complexity of $O(M^2)$ for the inversion by applying the Thomas algorithm. To remove the ad hoc integration of post-clustering after the decomposition, we also examine a self-clustering algorithm. Our evaluation shows better separation quality under reverberant conditions than previously proposed methods, as well as higher efficiency than multichannel non-negative matrix factorization.
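The Thomas algorithm named in the abstract can be illustrated with a minimal sketch. This is a generic textbook tridiagonal solver, not the authors' implementation; the function name `thomas_solve` and its argument layout are assumptions made for illustration. It performs no pivoting, so it assumes a well-conditioned (for example, diagonally dominant) matrix.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve T x = rhs for a tridiagonal T with the Thomas algorithm.

    lower: the M-1 sub-diagonal entries
    diag:  the M main-diagonal entries
    upper: the M-1 super-diagonal entries
    rhs:   the M right-hand-side entries
    Entries may be real or complex. Hypothetical helper for illustration.
    """
    m = len(diag)
    if m == 0 or len(lower) != m - 1 or len(upper) != m - 1 or len(rhs) != m:
        raise ValueError("inconsistent tridiagonal system sizes")

    c_prime = [0.0] * (m - 1)  # modified super-diagonal
    d_prime = [0.0] * m        # modified right-hand side

    # Forward sweep: eliminate the sub-diagonal row by row.
    denom = diag[0]
    if m > 1:
        c_prime[0] = upper[0] / denom
    d_prime[0] = rhs[0] / denom
    for i in range(1, m):
        denom = diag[i] - lower[i - 1] * c_prime[i - 1]
        if i < m - 1:
            c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i - 1] * d_prime[i - 1]) / denom

    # Back substitution: recover the solution from the last row upward.
    x = [0.0] * m
    x[-1] = d_prime[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 4x4 system with 2 on the diagonal and -1 on both off-diagonals.
    print(thomas_solve([-1.0] * 3, [2.0] * 4, [-1.0] * 3, [0.0, 0.0, 0.0, 5.0]))
```

A single solve of this kind costs $O(M)$ operations. Solving against each of the $M$ unit vectors to form every column of the inverse would cost $O(M^2)$, which is one plausible reading of the complexity quoted in the abstract.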
Year: 2020
DOI: 10.1109/TASLP.2019.2948770
Venue: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords: Kernel, Covariance matrices, Computational efficiency, Microphones, Symmetric matrices, Source separation, Approximation algorithms
Field: Covariance function, Pattern recognition, Matrix (mathematics), Computer science, Wavenumber, Algorithm, Artificial intelligence, Non-negative matrix factorization
DocType: Journal
Volume: 28
Issue: 1
ISSN: 2329-9290
Citations: 1
PageRank: 0.36
References: 8
Authors: 6
Name                Order  Citations  PageRank
Yuki Mitsufuji      1      36         9.50
Stefan Uhlich       2      35         7.62
Norihiro Takamune   3      35         10.18
Daichi Kitamura     4      142        21.21
Shoichi Koyama      5      68         17.76
Saruwatari, H.      6      652        90.81