MLLR transforms of self-organized units as features in speaker recognition - Citegraph

Paper Info

Title
MLLR transforms of self-organized units as features in speaker recognition

Abstract
Using speaker adaptation parameters, such as maximum likelihood linear regression (MLLR) adaptation matrices, as features for speaker recognition (SR) has been shown to perform well and can also provide complementary information for fusion with other acoustic-based SR systems, such as GMM-based systems. To estimate the adaptation parameters, a speech recognizer in the SR domain is required, which in turn requires transcribed data for recognizer training. This limits the approach to domains where training transcriptions are available. To generalize the adaptation parameter approach to domains without transcriptions, we propose the use of self-organized unit (SOU) recognizers, which can be trained without supervision (that is, without transcribed data). We report results on the 2002 NIST speaker recognition evaluation (SRE2002) extended data set and show that MLLR parameters estimated from SOU recognizers give performance comparable to systems using matched recognizers. Systems using SOU recognizers also outperform those using cross-lingual recognizers. When the SOU and word recognizers are fused, the SR equal error rate (EER) is reduced by a further 15%. This suggests that SOU recognizers can be useful whether or not transcribed data for recognizer training are available.
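
The abstract does not spell out how the adaptation matrices become SR features. As a minimal sketch, assume a single affine MLLR mean transform, mu' = A mu + b, whose coefficients are flattened row by row into one vector that a classifier could consume. The function names and the toy 2-dimensional values below are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def adapt_mean(A: Sequence[Sequence[float]], b: Sequence[float],
               mu: Sequence[float]) -> List[float]:
    """Apply an MLLR mean transform: mu' = A mu + b."""
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def mllr_feature_vector(A: Sequence[Sequence[float]],
                        b: Sequence[float]) -> List[float]:
    """Flatten the affine transform [A | b] row by row into d*(d+1) values."""
    d = len(A)
    if len(b) != d:
        raise ValueError("b must have one entry per row of A")
    features: List[float] = []
    for row, b_i in zip(A, b):
        if len(row) != d:
            raise ValueError("A must be square")
        features.extend(row)    # the d coefficients of this row of A
        features.append(b_i)    # followed by the matching bias term
    return features


# Toy 2-dimensional transform (illustrative values only).
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [0.1, -0.2]
mu = [1.0, 2.0]

adapted = adapt_mean(A, b, mu)         # speaker-adapted Gaussian mean
features = mllr_feature_vector(A, b)   # 2 * (2 + 1) = 6 coefficients
```

In this sketch the vector length grows as d*(d+1) with the acoustic feature dimension d. Any normalization or use of multiple regression classes would be a further design choice that the abstract does not describe.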

Year	DOI	Venue
2012	10.1109/ICASSP.2012.6288891	ICASSP
Keywords	Field	DocType
acoustic-based sr systems,self-organized unit recognizers,sou recognizers,nist speaker recognition evaluation extended data set,maximum likelihood linear regression adaptation matrices,regression analysis,gmm-based systems,maximum likelihood estimation,matrix algebra,sr equal error rate,cross-lingual recognizers,sre2002 extended data set,self-organized units,matched recognizers,speaker recognition,sr eer,word recognizers,complementary information,gaussian processes,mllr transforms,unsupervised learning,speaker adaptation parameters,gaussian mixture model,speech recognition,support vector machines,hidden markov models,acoustics	Transcription (linguistics),Pattern recognition,Computer science,Support vector machine,Word error rate,Speech recognition,Unsupervised learning,Speaker recognition,NIST,Artificial intelligence,Gaussian process,Hidden Markov model	Conference
ISSN	ISBN	Citations
1520-6149	978-1-4673-0044-5	2
PageRank	References	Authors
0.37	7	6

Authors (6 rows)

Name	Order	Citations	PageRank
Manhung Siu	1	464	61.40
Omer Lang	2	2	0.37
Herbert Gish	3	447	100.85
Stephen A. Lowe	4	8	1.27
Arthur Chan	5	239	15.28
Owen Kimball	6	83	17.82
