Abstract | ||
---|---|---|
The iVector representation of speech utterances is widely used in speaker and language recognition tasks. This paper proposes an iVector extractor based on pre-trained neural networks for speaker verification, which can be viewed as an alternative to the classical total variability approach. In the proposed system, a neural network with a bottleneck layer is trained on speaker-labeled utterances, and the bottleneck features of the network are then used to represent the input utterance. As a new iVector representation, it performs comparably to the state-of-the-art Total Variability Model (TVM) based iVector extraction system on NIST 2008 SRE. Combining the proposed extraction system with the TVM system further reduces the equal error rate by 10%. |
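
The bottleneck idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the layer sizes, the `tanh` activations, the omission of bias terms, the averaging of frame-level activations into one utterance vector, and the cosine scoring are all assumptions made for illustration, and random weights stand in for a network that would already have been trained on speaker labels. The names `bottleneck_vector` and `cosine_score` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 39-dim acoustic frames, 512 hidden units, 40-dim bottleneck.
FEAT_DIM, HIDDEN_DIM, BOTTLENECK_DIM = 39, 512, 40

# Random weights stand in for a network already trained to classify speakers.
# The speaker classification layer after the bottleneck is not needed at
# extraction time, so it is left out of this sketch.
W1 = rng.standard_normal((FEAT_DIM, HIDDEN_DIM)) * 0.1
W2 = rng.standard_normal((HIDDEN_DIM, BOTTLENECK_DIM)) * 0.1


def bottleneck_vector(frames):
    """Map a (n_frames, FEAT_DIM) utterance to one fixed-length vector."""
    hidden = np.tanh(frames @ W1)
    bottleneck = np.tanh(hidden @ W2)   # per-frame bottleneck activations
    return bottleneck.mean(axis=0)      # assumption: average over frames


def cosine_score(a, b):
    """Cosine similarity between two utterance vectors (assumed scoring rule)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Two synthetic utterances of different lengths yield vectors of equal size.
utt_a = rng.standard_normal((200, FEAT_DIM))
utt_b = rng.standard_normal((350, FEAT_DIM))
vec_a = bottleneck_vector(utt_a)
vec_b = bottleneck_vector(utt_b)
score = cosine_score(vec_a, vec_b)
```

The point of the sketch is only that utterances of any length map to a vector of fixed dimension, which is what lets the bottleneck output play the role of an iVector in a verification back end.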
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/ISCSLP.2014.6936722 | ISCSLP |
Keywords | Field | DocType |
pretrained neural networks,total variability model,language recognition tasks,equal error rates,speech utterances,nist 2008 sre,ivector representation,speaker recognition,feature extraction,speaker recognition tasks,ivector extractor,tvm based ivector extraction system,bottleneck feature,neural nets,speaker verification,vectors,speaker labeled utterances,input utterance | Speaker verification,Bottleneck,Computer science,Utterance,Speaker recognition,Natural language processing,Artificial intelligence,Speaker diarisation,Artificial neural network,Pattern recognition,Speech recognition,NIST,Extractor | Conference |
Citations | PageRank | References |
1 | 0.36 | 3 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shanshan Zhang | 1 | 53 | 4.24 |
Rong Zheng | 2 | 14 | 3.83 |
Bo Xu | 3 | 241 | 36.59 |