Robust anchorperson detection based on audio streams using a hybrid I-vector and DNN system - Citegraph

Paper Info

Title
Robust anchorperson detection based on audio streams using a hybrid I-vector and DNN system

Abstract
Anchorperson segment detection enables efficient video content indexing for information retrieval. Anchorperson detection based on audio analysis has gained popularity because of its lower computational complexity and satisfactory performance. This paper presents a robust framework that uses a hybrid I-vector and deep neural network (DNN) system to perform anchorperson detection on the audio streams of video content. The proposed system first uses I-vectors to extract speaker identity features from the audio data. A DNN classifier then uses these features to verify the claimed anchorperson identity. In addition, subspace feature normalization (SFN) is incorporated into the hybrid system for robust feature extraction, to compensate for the audio mismatch caused by different recording devices. An anchorperson verification experiment was conducted to evaluate the equal error rate (EER) of the proposed hybrid system. Experimental results demonstrate that the proposed system outperforms the state-of-the-art hybrid I-vector and support vector machine (SVM) system. Moreover, integrating SFN further improved the proposed system by compensating for the audio mismatch in anchorperson detection tasks.
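
The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER can be estimated from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that a trial is accepted when its score is at or above the threshold, and the sample scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER from two lists of verification scores.

    Assumption: higher scores mean "more likely the claimed anchorperson",
    and a trial is accepted when score >= threshold.
    Returns (eer, threshold) at the threshold where the false acceptance
    and false rejection rates are closest to each other.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("need at least one target and one impostor score")

    best_gap = None
    best = None
    # Candidate thresholds: every distinct observed score.
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scored at or above it.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # Report the midpoint in case the two rates never match exactly.
            best = ((far + frr) / 2.0, threshold)
    return best


# Hypothetical scores: one genuine trial (0.4) falls below one impostor (0.6).
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2f} at threshold {threshold}")  # EER = 0.25 at threshold 0.6
```

With finite trial lists the two error rates may never coincide exactly, so this sketch reports their midpoint at the closest threshold; evaluation toolkits often interpolate between neighbouring thresholds instead.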

Year: 2014
DOI: 10.1109/APSIPA.2014.7041717
Venue: APSIPA
Keywords: learning (artificial intelligence), dnn system, sfn, subspace feature normalization, information retrieval, deep neural network system, speaker recognition, computational complexity, recording devices, feature extraction, image classification, audio streams, audio analysis, audio signal processing, video content indexing, audio mismatch issues, hybrid i-vector, anchorperson verification experiment, speaker identity feature extraction, dnn classifier, equal error rate evaluation, video retrieval, robust anchorperson segment detection, neural nets, vectors, decision support systems, robustness, indexing, support vector machines
Field: Pattern recognition, Computer science, Support vector machine, Word error rate, Search engine indexing, Speech recognition, Feature extraction, Robustness (computer science), Audio analyzer, Artificial intelligence, Artificial neural network, Hybrid system
DocType: Conference
ISSN: 2309-9402
Citations: 3
PageRank: 0.40
References: 12
Authors: 9

Authors (9 rows)

Name	Order	Citations	PageRank
Yun-Fan Chang	1	21	2.76
Payton Lin	2	7	2.86
Shao-Hua Cheng	3	12	2.33
Kai-Hsuan Chan	4	10	3.23
Yi-Chong Zeng	5	95	15.33
Chia-Wei Liao	6	108	13.55
Wen-Tsung Chang	7	40	11.18
Yu-Chiang Frank Wang	8	914	61.63
Yu Tsao	9	60	16.52