Title
Improving speaker diarization using social role information
Abstract
Speaker diarization systems for meetings commonly model acoustic and spatial information, ignoring that meetings are instances of human interaction. Recent studies have shown that social roles influence the interaction patterns of speakers. This paper proposes a novel method to integrate social role information into the speaker diarization framework. First, we modify the minimum duration constraint in the baseline diarization system by using role information to model the expected duration of a speaker's turn. Second, we propose a social role n-gram model as prior information on speaker interaction patterns. The proposed method is integrated into a state-of-the-art diarization system to reduce the speaker error. Experiments are performed on the AMI corpus, which is annotated in terms of social roles. The proposed method reduces the speaker error by 16% relative to the baseline HMM-GMM system. The paper also investigates the performance of the proposed method on other meeting scenarios, such as those from the NIST Rich Transcription campaigns. Experiments on Rich Transcription meetings reveal that speaker error can be reduced by 13% relative to the baseline system, demonstrating the potential of the proposed method.
Year
2014
DOI
10.1109/ICASSP.2014.6853566
Venue
Acoustics, Speech and Signal Processing
Keywords
Gaussian processes, hidden Markov models, mixture models, speech processing, AMI corpus, Gaussian mixture modeling, NIST Rich Transcription campaigns, Rich Transcription meetings, acoustic information, baseline HMM-GMM system, baseline diarization system, hidden Markov model, minimum duration constraint, social role information, social role n-gram model, spatial information, speaker diarization, speaker interaction patterns, speaker turn expected duration, HMM-GMM, Social Roles, Speaker diarization
Field
Spatial analysis, Speech processing, Histogram, Pattern recognition, Computer science, Feature extraction, Speech recognition, NIST, Artificial intelligence, Speaker diarisation, Hidden Markov model, Mixture model
DocType
Conference
ISSN
1520-6149
Citations
0
PageRank
0.34
References
13
Authors
3
Name
Order
Citations
PageRank
Ashtosh Sapru1102.55
Sree Harsha Yella2284.02
Herve Bourlard315237.75