Title
Multi-View CCA-Based Acoustic Features for Phonetic Recognition Across Speakers and Domains
Abstract
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised learning of acoustic features when a second view (e.g., articulatory measurements) is available for some of the training data, and such projections have been used to improve phonetic frame classification. Here we study the behavior of CCA-based acoustic features on the task of phonetic recognition and investigate to what extent they are speaker-independent or domain-independent. The acoustic features are learned using data drawn from the University of Wisconsin X-ray Microbeam Database (XRMB). The features are evaluated within and across speakers on XRMB data, as well as on out-of-domain TIMIT and MOCHA-TIMIT data. Experimental results show consistent improvement with the learned acoustic features over baseline MFCCs and PCA projections. In both speaker-dependent and cross-speaker experiments, CCA-based features reduce phonetic error rates by 4-9% absolute (10-23% relative) compared with baseline MFCCs. In cross-domain phonetic recognition (training on XRMB and testing on MOCHA or TIMIT), the learned projections provide smaller improvements.
Year: 2013
Venue: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords: multi-view learning, canonical correlation analysis, articulatory measurements, XRMB, MOCHA-TIMIT, TIMIT, speaker-independence, domain-independence
Field: Kernel (linear algebra), Training set, TIMIT, Pattern recognition, Canonical correlation, Computer science, Speech recognition, Unsupervised learning, Speaker recognition, Artificial intelligence, Cepstral analysis
DocType: Conference
ISSN: 1520-6149
Citations: 34
PageRank: 1.16
References: 22
Authors: 2
Name            Order  Citations  PageRank
R. Arora        1      489        35.97
Karen Livescu   2      1254       71.43