Title
Discriminative input stream combination for conditional random field phone recognition
Abstract
In recent studies, we and others have found that conditional random fields (CRFs) can be effectively used to perform phone classification and recognition tasks by combining non-Gaussian distributed representations of acoustic input. In previous work by I. Heintz et al. (Latent phonetic analysis: Use of singular value decomposition to determine features for CRF phone recognition, Proc. ICASSP, pp. 4541-4544, 2008), we experimented with combining phonological feature posterior estimators and phone posterior estimators within a CRF framework; we found that treating posterior estimates as terms in a "phoneme information retrieval" task allowed for more effective use of multiple posterior streams than directly feeding these acoustic representations to the CRF recognizer. In this paper, we examine some of the design choices in our previous work and extend our results to as many as six acoustic feature streams. We concentrate on feature design, rather than feature selection, to find the best way of combining features for introduction into a log-linear model. Improving upon our previous work, we find that several different dimensionality reduction techniques (SVD, PARAFAC2, KLT), followed by a nonlinear transform provided by a multilayer perceptron, provide a significant gain in phone recognition accuracy on the TIMIT task.
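To make the stream-combination idea concrete, the following is a minimal sketch of one of the reduction steps named in the abstract, a KLT (principal-component projection) applied to concatenated posterior streams. It is not the authors' implementation: the function names, the toy posterior values, and the use of power iteration with deflation are all assumptions made for illustration, and the multilayer perceptron transform and the CRF itself are not shown.

```python
import math
import random


def concat_streams(*streams):
    """Concatenate the per-frame vectors of several streams into one vector per frame."""
    return [[p for vec in frame for p in vec] for frame in zip(*streams)]


def mean_and_covariance(rows):
    """Return the column means and the sample covariance matrix of the rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def matvec(mat, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in mat]


def top_eigenvectors(cov, k, iters=500, seed=0):
    """Approximate the k leading eigenvectors by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(cov)
    mat = [row[:] for row in cov]
    basis = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = matvec(mat, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left to extract
                break
            v = [x / norm for x in w]
        eigval = sum(a * b for a, b in zip(v, matvec(mat, v)))
        basis.append(v)
        # Deflation: subtract the component just found so the next pass
        # converges to the next-largest eigenvector.
        mat = [[mat[i][j] - eigval * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return basis


def project(rows, mean, basis):
    """Center each row and project it onto the basis vectors."""
    return [[sum((r[j] - mean[j]) * b[j] for j in range(len(b))) for b in basis]
            for r in rows]


# Hypothetical toy data: 4 frames, a 3-class phone posterior stream and a
# 2-class phonological feature posterior stream (values invented).
phone_post = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7]]
feat_post = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]

frames = concat_streams(phone_post, feat_post)   # 4 frames x 5 dimensions
mean, cov = mean_and_covariance(frames)
basis = top_eigenvectors(cov, k=2)
reduced = project(frames, mean, basis)           # 4 frames x 2 dimensions
```

In a pipeline of the kind the abstract describes, the reduced vectors would then pass through a nonlinear transform before being used as CRF input features; in practice a linear algebra library would replace the hand-written power iteration.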
Year
2009
DOI
10.1109/TASL.2009.2022204
Venue
IEEE Transactions on Audio, Speech & Language Processing
Keywords
feature selection, conditional random field phone, crf phone recognition, discriminative input stream combination, previous work, phone classification, acoustic feature stream, phone posterior estimator, phonological feature posterior estimator, phone recognition accuracy, multiple posterior stream, feature design, feature extraction, hidden markov models, gaussian distribution, multilayer perceptron, singular value decomposition, automatic speech recognition, hmm, information retrieval, statistical distributions, matrix decomposition, log linear model, random processes, covariance matrix, conditional random field, stochastic processes, speech recognition
Field
Conditional random field, TIMIT, Dimensionality reduction, Feature selection, Pattern recognition, Computer science, Speech recognition, Feature extraction, Artificial intelligence, Linear discriminant analysis, Hidden Markov model, Discriminative model
DocType
Journal
Volume
17
Issue
8
ISSN
1558-7916
Citations
5
PageRank
0.89
References
29
Authors
3
Name | Order | Citations | PageRank
Ilana Heintz | 1 | 14 | 2.04
Eric Fosler-Lussier | 2 | 690 | 66.40
Chris Brew | 3 | 321 | 44.44