Title
Speaker Diarization with Lexical Information
Abstract
This work presents a novel speaker diarization approach that leverages lexical information provided by automatic speech recognition. We propose a speaker diarization system that incorporates word-level speaker turn probabilities together with speaker embeddings into the speaker clustering process to improve overall diarization accuracy. To integrate lexical and acoustic information comprehensively during clustering, we introduce an adjacency matrix integration for spectral clustering. Because the words and word boundary information used for word-level speaker turn probability estimation are provided by a speech recognition system, the proposed method requires no manual transcription or other human intervention. We show that the proposed method improves diarization performance on various evaluation datasets compared to a baseline diarization system that uses only the acoustic information in speaker embeddings.
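The abstract names the idea of integrating adjacency matrices for spectral clustering but does not spell out the details, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one speaker embedding per word, converts turn probabilities into a lexical adjacency by multiplying the "no turn" probabilities between two words, combines the two matrices with a hypothetical weight `alpha`, and handles only the two-speaker case by taking the sign of the Fiedler vector instead of running a full spectral clustering pipeline. All function names and the toy data are illustrative.

```python
import numpy as np


def acoustic_affinity(embeddings):
    """Cosine similarity between speaker embeddings, clipped to [0, 1]."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return np.clip(unit @ unit.T, 0.0, 1.0)


def lexical_adjacency(turn_probs):
    """Link words i < j by the probability that no speaker turn lies between them.

    Assumption: turn_probs[k] is the probability of a speaker change between
    word k and word k + 1, so there is one more word than probabilities.
    """
    n = len(turn_probs) + 1
    adjacency = np.eye(n)
    for i in range(n):
        no_turn = 1.0
        for j in range(i + 1, n):
            no_turn *= 1.0 - turn_probs[j - 1]
            adjacency[i, j] = adjacency[j, i] = no_turn
    return adjacency


def integrate(acoustic, lexical, alpha=0.5):
    """Assumed integration rule: a convex combination of the two matrices."""
    return alpha * acoustic + (1.0 - alpha) * lexical


def two_speaker_spectral_labels(adjacency):
    """Two-way spectral partition using the sign of the Fiedler vector."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency  # unnormalized graph Laplacian
    _, eigenvectors = np.linalg.eigh(laplacian)  # eigenvalues ascending
    fiedler = eigenvectors[:, 1]  # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int).tolist()


# Toy data (hypothetical): six words, a likely speaker turn after the third.
embeddings = np.array([
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.0],
])
turn_probs = [0.1, 0.1, 0.9, 0.1, 0.1]

combined = integrate(acoustic_affinity(embeddings), lexical_adjacency(turn_probs))
labels = two_speaker_spectral_labels(combined)
print(labels)  # first three words share one label, last three share the other
```

Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping is meaningful. For more than two speakers, a common choice is to pass a combined matrix like this to a general spectral clustering routine as a precomputed affinity.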
Year
2019
DOI
10.21437/Interspeech.2019-1947
Venue
INTERSPEECH
DocType
Conference
ISSN
Interspeech 2019, 391-395
Citations
1
PageRank
0.36
References
0
Authors
7
Name
Order
Citations
PageRank
Tae Jin Park 1 63.15
Kyu Jeong Han 2 859.10
Jing Huang 3 2464186.09
Xiaodong He 4 3858190.28
Bowen Zhou 5 2212246.21
Panayiotis Georgiou 6 42855.79
Shrikanth Narayanan 7 5558439.23