Title | ||
---|---|---|
Multi-View CCA-Based Acoustic Features For Phonetic Recognition Across Speakers And Domains |
Abstract | ||
---|---|---|
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised learning of acoustic features when a second view (e.g., articulatory measurements) is available for some training data, and such projections have been used to improve phonetic frame classification. Here we study the behavior of CCA-based acoustic features on the task of phonetic recognition, and investigate to what extent they are speaker-independent or domain-independent. The acoustic features are learned using data drawn from the University of Wisconsin X-ray Microbeam Database (XRMB). The features are evaluated within and across speakers on XRMB data, as well as on out-of-domain TIMIT and MOCHA-TIMIT data. Experimental results show consistent improvement with the learned acoustic features over baseline MFCCs and PCA projections. In both speaker-dependent and cross-speaker experiments, phonetic error rates are improved by 4-9% absolute (10-23% relative) using CCA-based features over baseline MFCCs. In cross-domain phonetic recognition (training on XRMB and testing on MOCHA or TIMIT), the learned projections provide smaller improvements. |
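
As a rough illustration of the idea in the abstract, linear CCA learns a projection of one view (for example, acoustic frames) that is maximally correlated with a projection of a second view (for example, articulatory measurements); at test time only the first view's projection is needed. The sketch below is not the paper's pipeline: the function name `linear_cca`, the ridge term `reg`, the dimensions, and the synthetic two-view data are all assumptions chosen for illustration, and kernel CCA is not covered.

```python
import numpy as np


def linear_cca(X, Y, k, reg=1e-6):
    """Regularized linear CCA (illustrative sketch).

    X: (n, dx) first view, Y: (n, dy) second view.
    Returns A (dx, k), B (dy, k) and the top-k canonical correlations.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Sample covariances; a small ridge keeps them invertible.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each view with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, Vt = np.linalg.svd(T, full_matrices=False)

    # Map the singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U[:, :k])
    B = np.linalg.solve(Ly.T, Vt[:k].T)
    return A, B, s[:k]


# Synthetic stand-in for two views that share a 2-dimensional latent signal.
rng = np.random.default_rng(0)
n, dx, dy, k = 2000, 6, 4, 2
z = rng.standard_normal((n, 2))
X = z @ rng.standard_normal((2, dx)) + 0.1 * rng.standard_normal((n, dx))
Y = z @ rng.standard_normal((2, dy)) + 0.1 * rng.standard_normal((n, dy))

A, B, corrs = linear_cca(X, Y, k)

# Only the first view is needed to compute the learned features.
features = (X - X.mean(axis=0)) @ A
```

In this sketch the second view is used only to fit `A`; the learned `features` depend on the first view alone, which mirrors the setting where articulatory data is available for some training data but not at test time.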
Year | Venue | Keywords |
---|---|---|
2013 | 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | multi-view learning, canonical correlation analysis, articulatory measurements, XRMB, MOCHA-TIMIT, TIMIT, speaker-independence, domain-independence |
Field | DocType | ISSN |
Kernel (linear algebra), Training set, TIMIT, Pattern recognition, Canonical correlation, Computer science, Speech recognition, Unsupervised learning, Speaker recognition, Artificial intelligence, Cepstral analysis | Conference | 1520-6149 |
Citations | PageRank | References |
34 | 1.16 | 22 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
R. Arora | 1 | 489 | 35.97 |
Karen Livescu | 2 | 1254 | 71.43 |