Title
MLP internal representation as discriminative features for improved speaker recognition
Abstract
Feature projection by non-linear discriminant analysis (NLDA) can substantially increase classification performance. In automatic speech recognition (ASR), the projection provided by the pre-squashed outputs from a one-hidden-layer multi-layer perceptron (MLP) trained to recognise speech sub-units (phonemes) has previously been shown to significantly increase ASR performance. An analogous approach cannot be applied directly to speaker recognition because there is no recognised set of "speaker sub-units" to provide a finite set of MLP target classes, and for many applications it is not practical to train an MLP with one output for each target speaker. In this paper we show that the output from the second hidden layer (compression layer) of an MLP with three hidden layers, trained to identify a subset of 100 speakers selected at random from a set of 300 training speakers in TIMIT, can provide a 77% relative error reduction for common Gaussian mixture model (GMM)-based speaker identification.
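The feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the weights are random rather than trained, and the layer sizes (39 inputs, 50/12/50 hidden units), the sigmoid and softmax non-linearities, and every function name are assumptions chosen for illustration. Only the three hidden layers, the use of the second (compression) layer as the feature vector, and the 100 output speakers come from the abstract. Whether the compression output should be taken before or after squashing is also an assumption here.

```python
import math
import random

# Hypothetical layer sizes; only the 100 speaker outputs are from the abstract.
N_INPUT, N_HIDDEN1, N_COMPRESS, N_HIDDEN3, N_SPEAKERS = 39, 50, 12, 50, 100


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases; a trained MLP would supply real values."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def affine(layer, x):
    """Compute W x + b for one layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def build_mlp(seed=0):
    """Build four weight layers: input -> h1 -> compression -> h3 -> speakers."""
    rng = random.Random(seed)
    sizes = [N_INPUT, N_HIDDEN1, N_COMPRESS, N_HIDDEN3, N_SPEAKERS]
    return [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(mlp, frame):
    """Return (compression-layer output, speaker posteriors) for one frame."""
    h1 = sigmoid(affine(mlp[0], frame))
    compressed = sigmoid(affine(mlp[1], h1))  # second hidden layer = features
    h3 = sigmoid(affine(mlp[2], compressed))
    posteriors = softmax(affine(mlp[3], h3))
    return compressed, posteriors


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error removed by the new system."""
    return (baseline_error - new_error) / baseline_error


# Usage: project one (all-zero, placeholder) acoustic frame to compact features.
mlp = build_mlp()
features, posteriors = forward(mlp, [0.0] * N_INPUT)
print(len(features), len(posteriors))
```

In a full system the compression-layer vectors would replace the original acoustic features as input to the GMM speaker models, and the speaker outputs would be used only during MLP training. The helper `relative_error_reduction` shows how a figure such as the reported 77% is defined: the drop in error divided by the baseline error.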
Year: 2005
DOI: 10.1007/11613107_5
Venue: NOLISP
Keywords: speaker sub-units, hidden layer, mlp target class, improved speaker recognition, finite set, compression layer, target speaker, discriminative feature, speaker identification, training speaker, mlp internal representation, speaker recognition, hidden layer multi-layer perceptron, relative error, gaussian mixture model, multi layer perceptron
Field: TIMIT, Pattern recognition, Computer science, Speech recognition, Speaker recognition, Speaker diarisation, Artificial intelligence, Linear discriminant analysis, Artificial neural network, Perceptron, Discriminative model, Mixture model
DocType: Conference
Volume: 3817
ISSN: 0302-9743
ISBN: 3-540-31257-9
Citations: 5
PageRank: 0.49
References: 8
Authors: 3
Name                Order  Citations  PageRank
Dalei Wu            1      523        50.87
Andrew Morris       2      54         4.12
Jacques C. Koreman  3      20         2.50