Abstract | ||
---|---|---|
This paper presents a novel probabilistic framework for localizing multiple speakers with a microphone array. In this framework, the generalized cross correlation function (GCC) of each microphone pair is interpreted as a probability distribution of the time difference of arrival (TDOA) and subsequently approximated as a Gaussian mixture. The distribution parameters are estimated with a weighted expectation maximization algorithm. Then, the joint distribution of the TDOA Gaussian mixtures is mapped to a multimodal distribution in the location space, where each mode represents a potential source location. The approach taken here performs the localization by 1) reducing the search space to some regions that are likely to contain a source and then 2) extracting the actual speaker locations with a numerical optimization algorithm. The effectiveness of the proposed approach is shown using the AV16.3 corpus. |
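
The weighted expectation maximization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the GCC of one microphone pair is sampled on a grid of candidate TDOAs, that negative GCC values are clipped to zero so the curve can act as non-negative sample weights, and that the number of mixture components and their initial means are supplied by the caller. The function name `weighted_em_gmm_1d` and all parameters are hypothetical.

```python
import numpy as np


def weighted_em_gmm_1d(x, w, init_means, n_iter=100, min_var=1e-6):
    """Fit a 1-D Gaussian mixture to grid points x with non-negative weights w.

    x          : candidate TDOA values (the lag grid of the GCC)
    w          : GCC values at those lags, used as sample weights (assumption)
    init_means : initial component means, one per expected source (assumption)
    Returns (priors, means, variances).
    """
    x = np.asarray(x, dtype=float)
    # Assumption: negative correlation values carry no probability mass.
    w = np.clip(np.asarray(w, dtype=float), 0.0, None)
    w = w / w.sum()

    means = np.asarray(init_means, dtype=float).copy()
    k = means.size
    variances = np.full(k, np.var(x) / k + min_var)
    priors = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: responsibility of each component for each grid point,
        # computed in the log domain for numerical stability.
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (diff ** 2 / variances + np.log(2.0 * np.pi * variances))
        log_r = np.log(priors) + log_pdf
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: every sufficient statistic is scaled by the GCC weight.
        wr = w[:, None] * r
        nk = wr.sum(axis=0) + 1e-12
        priors = nk / nk.sum()
        means = (wr * x[:, None]).sum(axis=0) / nk
        diff = x[:, None] - means[None, :]
        variances = (wr * diff ** 2).sum(axis=0) / nk + min_var

    return priors, means, variances


# Synthetic GCC-like curve with two peaks (illustrative data, not from AV16.3).
lags = np.linspace(-1.0, 1.0, 401)
gcc = (1.0 * np.exp(-0.5 * ((lags + 0.4) / 0.05) ** 2)
       + 0.6 * np.exp(-0.5 * ((lags - 0.3) / 0.05) ** 2))
priors, means, variances = weighted_em_gmm_1d(lags, gcc, init_means=[-0.5, 0.5])
```

Each fitted component then describes one TDOA hypothesis for that microphone pair. Combining the mixtures of all pairs into a distribution over locations, as the abstract describes, is outside the scope of this sketch.
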
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ICASSP.2013.6638402 | 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | DocType | ISSN |
---|---|---|
Microphone arrays, localization, multiple speakers, Gaussian mixture, steered response power | Conference | 1520-6149 |
Citations | PageRank | References |
---|---|---|
5 | 0.44 | 15 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Youssef Oualil | 1 | 43 | 7.89 |
Mathew Magimai-Doss | 2 | 516 | 54.76 |
Friedrich Faubel | 3 | 80 | 8.89 |
Dietrich Klakow | 4 | 756 | 98.76 |