Abstract | ||
---|---|---|
This paper describes a speaker diarization system evaluated on the NIST Rich Transcription 2007 (RT-07) Meeting Recognition evaluation data set for the Multiple Distant Microphone (MDM) task. The implementation has three components: initial clustering, non-speech removal, and cluster purification. Initial clusters are generated using Direction of Arrival (DOA) information and bootstrap clustering. The non-speech removal component uses multiple GMM modeling for speech/non-speech classification. In addition, a novel system fusion strategy that uses information from the Receiver Operating Characteristic (ROC) curve is proposed for the same component. Finally, a consensus clustering approach combined with an iterative GMM clustering method is used for speaker cluster purification. The system achieves an overall diarization error rate (DER) of 10.81%. |
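
For readers unfamiliar with the DER figure quoted in the abstract, the following is a minimal sketch of how a diarization error rate is conventionally computed: the sum of missed speech, false-alarm speech, and speaker-confusion time, divided by the total reference speech time. The function name, argument names, and the durations in the usage example are hypothetical and chosen only for illustration; they are not taken from the paper, and the official NIST scoring tool applies additional rules (such as forgiveness collars) that are not modeled here.

```python
def diarization_error_rate(missed, false_alarm, speaker_error, total_speech):
    """Return DER as a fraction of total reference speech time.

    All arguments are durations in seconds (hypothetical names):
      missed        - reference speech that the system labeled as non-speech
      false_alarm   - non-speech that the system labeled as speech
      speaker_error - speech attributed to the wrong speaker
      total_speech  - total duration of reference speech
    """
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + speaker_error) / total_speech


# Illustrative durations only; these are not results from the paper.
der = diarization_error_rate(
    missed=30.0, false_alarm=20.0, speaker_error=50.0, total_speech=1000.0
)
print(f"DER = {der:.2%}")
```

Under this definition, the non-speech removal component mainly affects the missed-speech and false-alarm terms, while cluster purification mainly affects the speaker-error term.
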
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICASSP.2009.4960523 | ICASSP |
Keywords | Field | DocType |
multiple gmm modeling,cluster purification,meeting audio,non-speech removal,speaker diarization system,bootstrap clustering,novel system fusion strategy,non-speech classification,iterative gmm clustering method,non-speech removal component,initial clustering,speaker recognition,decision support systems,direction of arrival,receiver operating curve,speech,data mining,machine learning,receiver operator curve,probability density function,natural languages,speaker diarization,modeling,adaptive filters,erbium,sun,speech processing,tin | Speech processing,Pattern recognition,Computer science,Speech recognition,Speaker recognition,Consensus clustering,NIST,Artificial intelligence,Speaker diarisation,Cluster analysis,Bootstrapping (electronics),Microphone | Conference |
ISSN | Citations | PageRank |
1520-6149 | 5 | 0.52 |
References | Authors | |
7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tin Lay Nwe | 1 | 479 | 34.59 |
Hanwu Sun | 2 | 98 | 14.15 |
Haizhou Li | 3 | 3678 | 334.61 |
Susanto Rahardja | 4 | 652 | 102.05 |