Title |
---|
Combination of Cepstral and Phonetically Discriminative Features for Speaker Verification |
Abstract |
---|
Most speaker recognition systems rely on short-term acoustic cepstral features to extract speaker-relevant information from the signal. However, phonetically discriminant features, extracted by a bottle-neck multi-layer perceptron (MLP) over longer stretches of time, can provide complementary information and have been adopted in speech transcription systems. We compare speaker verification performance using cepstral features, discriminant features, and a concatenation of both followed by dimension reduction. We consider two speaker recognition systems, one based on maximum likelihood linear regression (MLLR) super-vectors and the other a state-of-the-art i-vector system with two session variability compensation schemes. Experiments are reported on a standard configuration of the NIST SRE 2008 and 2010 databases. The results show that the phonetically discriminative MLP features retain speaker-specific information that is complementary to the short-term cepstral features. A performance improvement is obtained with both score-domain and feature-domain fusion, and the speaker verification equal error rate (EER) is reduced by up to 50% relative compared to the best i-vector system using only cepstral features. |
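
The abstract names three operations without detailing them: feature-domain fusion (concatenation followed by dimension reduction), score-domain fusion, and EER as the evaluation metric. The following is a minimal sketch of each, not the authors' implementation. PCA as the reduction step, the equal-weight linear score fusion, and all function names are assumptions made for illustration only.

```python
import numpy as np


def equal_error_rate(target_scores, impostor_scores):
    """Approximate EER: operating point where false-accept and false-reject rates meet."""
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    # Candidate thresholds are the observed scores themselves.
    thresholds = np.sort(np.concatenate([target, impostor]))
    # False reject: target trial scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False accept: impostor trial scored at or above the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return 0.5 * (far[idx] + frr[idx])


def concat_and_reduce(cepstral, bottleneck, n_components):
    """Feature-domain fusion: concatenate per-frame features, then project with PCA.

    PCA is an assumed choice here; the abstract only says "dimension reduction".
    """
    stacked = np.hstack([cepstral, bottleneck])
    centered = stacked - stacked.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Score-domain fusion as a weighted sum (the weight is a hypothetical parameter)."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b


def relative_reduction(baseline_eer, new_eer):
    """Relative EER reduction, e.g. 0.5 means the EER was halved."""
    return (baseline_eer - new_eer) / baseline_eer
```

Under these definitions, a "50% relative" reduction means the fused system's EER is half that of the baseline, for example an EER falling from 4% to 2%.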
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/LSP.2014.2323432 | Signal Processing Letters, IEEE |
Keywords | Field | DocType |
cepstral analysis,feature extraction,multilayer perceptrons,speaker recognition,speech processing,EER,MLLR super-vectors,NIST SRE 2008-2010 databases,bottle-neck multilayer perceptron,feature domain fusion,i-vector system,maximum likelihood linear regression super-vectors,phonetically discriminative MLP features,phonetically discriminative features,score domain,short-term acoustic cepstral features,speaker recognition systems,speaker verification equal error rate,speaker-relevant information,speech transcription systems,Bottleneck features,LDA,PCA,PLDA,i-vector,multi-layer perceptron,speaker verification | Dimensionality reduction,Pattern recognition,Computer science,Word error rate,Cepstrum,Feature extraction,Speech recognition,Speaker recognition,Speaker diarisation,Artificial intelligence,Discriminative model,Perceptron | Journal |
Volume | Issue | ISSN |
21 | 9 | 1070-9908 |
Citations | PageRank | References |
7 | 0.58 | 17 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Achintya Kumar Sarkar | 1 | 23 | 7.81 |
Cong-Thanh Do | 2 | 19 | 3.94 |
Viet Bac Le | 3 | 140 | 12.62 |
Claude Barras | 4 | 449 | 70.53 |