Multi-View CCA-Based Acoustic Features For Phonetic Recognition Across Speakers And Domains - Citegraph

Paper Info

Title
Multi-View CCA-Based Acoustic Features For Phonetic Recognition Across Speakers And Domains

Abstract
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised learning of acoustic features when a second view (e.g., articulatory measurements) is available for some training data, and such projections have been used to improve phonetic frame classification. Here we study the behavior of CCA-based acoustic features on the task of phonetic recognition, and investigate to what extent they are speaker-independent or domain-independent. The acoustic features are learned using data drawn from the University of Wisconsin X-ray Microbeam Database (XRMB). The features are evaluated within and across speakers on XRMB data, as well as on out-of-domain TIMIT and MOCHA-TIMIT data. Experimental results show consistent improvement with the learned acoustic features over baseline MFCCs and PCA projections. In both speaker-dependent and cross-speaker experiments, phonetic error rates are improved by 4-9% absolute (10-23% relative) using CCA-based features over baseline MFCCs. In cross-domain phonetic recognition (training on XRMB and testing on MOCHA or TIMIT), the learned projections provide smaller improvements.
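As a rough illustration of the multi-view idea in the abstract, the sketch below fits plain linear CCA on two paired views and then projects acoustic-only test data with the learned first-view projection. It is not the authors' pipeline: it covers linear CCA only (not kernel CCA), the data are synthetic stand-ins for the acoustic and articulatory views rather than XRMB, and the function name `fit_cca`, the regularization value, and all dimensions are assumptions chosen for the example.

```python
import numpy as np

def fit_cca(X, Y, k, reg=1e-4):
    """Regularized linear CCA (illustrative sketch).

    X: (n, dx) first view (stand-in for acoustic frames).
    Y: (n, dy) second view (stand-in for articulatory frames).
    Returns the mean of X, a (dx, k) projection for X, and the
    top-k canonical correlations.
    """
    n = X.shape[0]
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    # Covariance estimates, with a small ridge term for stability.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten each view through Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # T = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    T = np.linalg.solve(Lx, Sxy)
    T = np.linalg.solve(Ly, T.T).T
    U, s, _ = np.linalg.svd(T, full_matrices=False)
    # Map the top-k left singular vectors back to the original X coordinates.
    A = np.linalg.solve(Lx.T, U[:, :k])
    return mx, A, s[:k]

# Synthetic paired data: both views are noisy linear functions of a shared
# two-dimensional latent signal (an assumption made only for this demo).
rng = np.random.default_rng(0)
n_train, n_test, dx, dy = 2000, 500, 10, 6
Wx = rng.normal(size=(2, dx))
Wy = rng.normal(size=(2, dy))

z_train = rng.normal(size=(n_train, 2))
X_train = z_train @ Wx + 0.3 * rng.normal(size=(n_train, dx))
Y_train = z_train @ Wy + 0.3 * rng.normal(size=(n_train, dy))

mx, A, corrs = fit_cca(X_train, Y_train, k=2)

# At test time only the first view is needed: the second view is used
# solely to learn the projection.
z_test = rng.normal(size=(n_test, 2))
X_test = z_test @ Wx + 0.3 * rng.normal(size=(n_test, dx))
F_train = (X_train - mx) @ A
F_test = (X_test - mx) @ A
```

The design point this mirrors is that the second view is required only while learning the projection; the projected features for new data are computed from the first view alone, which is what makes it possible to apply them to corpora that lack articulatory measurements.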

Year	Venue	Keywords
2013	2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)	multi-view learning, canonical correlation analysis, articulatory measurements, XRMB, MOCHA-TIMIT, TIMIT, speaker-independence, domain-independence
Field	DocType	ISSN
Kernel (linear algebra),Training set,TIMIT,Pattern recognition,Canonical correlation,Computer science,Speech recognition,Unsupervised learning,Speaker recognition,Artificial intelligence,Cepstral analysis	Conference	1520-6149
Citations	PageRank	References
34	1.16	22
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
R. Arora	1	489	35.97
Karen Livescu	2	1254	71.43
