Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain - Citegraph

Paper Info

Title
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain

Abstract
Blind source separation exploiting multichannel information has long been a popular topic, and recently proposed methods based on the local Gaussian model have shown promising results despite their high computational cost for the case of many microphone signals. The low updating speed of such a model is mainly due to the inversion of a spatial covariance matrix, for which the complexity increases with the number of microphones, $M$, and is generally of order $O(M^3)$. Several projection-based approaches that attempt to concentrate energy on the diagonal part of the spatial covariance matrix have been introduced to circumvent the matrix inversion, which can reduce the complexity to $O(M)$. In this article, we focus on the fast Fourier transform as a projection method because the energy concentration on the diagonal can be achieved efficiently compared with other projection-based methods. For the case where the diagonalization is imperfect, for example, owing to discontinuities at the edge of a linear array, we also developed a more robust algorithm approximating the tri-diagonal part of the spatial covariance matrix, which requires a complexity of $O(M^2)$ for the inversion by applying the Thomas algorithm. To remove the ad-hoc integration of post-clustering after the decomposition, we also examine a self-clustering algorithm. Our evaluation shows better results than other previously proposed methods in terms of the separation quality under reverberant conditions as well as higher efficiency than multichannel non-negative matrix factorization.
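
The Thomas algorithm named in the abstract is the standard forward-elimination and back-substitution scheme for tridiagonal systems. The following is a minimal textbook sketch, not the authors' implementation: the function name `thomas_solve`, its argument names, and the random test matrix are assumptions made for illustration, and no pivoting is performed, so the matrix is assumed to be well conditioned.

```python
import numpy as np

def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs (hypothetical helper).

    lower: sub-diagonal of A (length M-1)
    diag:  main diagonal of A (length M, M >= 2)
    upper: super-diagonal of A (length M-1)
    No pivoting: assumes every elimination denominator is non-zero.
    """
    m = len(diag)
    c = np.zeros(m - 1, dtype=complex)  # modified super-diagonal
    d = np.zeros(m, dtype=complex)      # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, m):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < m - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom

    # Back substitution.
    x = np.zeros(m, dtype=complex)
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# Illustrative Hermitian tridiagonal matrix (made-up data, M = 8 channels).
rng = np.random.default_rng(0)
M = 8
upper = 0.2 * (rng.standard_normal(M - 1) + 1j * rng.standard_normal(M - 1))
lower = np.conj(upper)
diag = 2.0 + rng.random(M)
A = np.diag(diag) + np.diag(upper, 1) + np.diag(lower, -1)
b = rng.standard_normal(M) + 1j * rng.standard_normal(M)

x = thomas_solve(lower, diag, upper, b)
print(np.allclose(A @ x, b))
```

Each call performs a number of operations proportional to $M$ for a single right-hand side; repeating it for $M$ right-hand sides, as when forming an explicit inverse column by column, costs on the order of $M^2$ operations.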

Year	DOI	Venue
2020	10.1109/TASLP.2019.2948770	IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords	Field	DocType
Kernel,Covariance matrices,Computational efficiency,Microphones,Symmetric matrices,Source separation,Approximation algorithms	Covariance function,Pattern recognition,Matrix (mathematics),Computer science,Wavenumber,Algorithm,Artificial intelligence,Non-negative matrix factorization	Journal
Volume	Issue	ISSN
28	1	2329-9290
Citations	PageRank	References
1	0.36	8
Authors
6

Authors (6 rows)

Name	Order	Citations	PageRank
Yuki Mitsufuji	1	36	9.50
Stefan Uhlich	2	35	7.62
Norihiro Takamune	3	35	10.18
Daichi Kitamura	4	142	21.21
Shoichi Koyama	5	68	17.76
Saruwatari, H.	6	652	90.81
