Data augmentation and feature extraction using variational autoencoder for acoustic modeling. - Citegraph

Paper Info

Title
Data augmentation and feature extraction using variational autoencoder for acoustic modeling.

Abstract
A data augmentation and feature extraction method using a variational autoencoder (VAE) for acoustic modeling is described. A VAE is a generative model based on variational Bayesian learning using a deep learning framework. A VAE can extract latent values from its input variables to generate new information. VAEs are widely used to generate pictures and sentences. In this paper, a VAE is applied to speech corpus data augmentation and feature vector extraction from speech for acoustic modeling. First, the size of a speech corpus is doubled by encoding latent variables extracted from original utterances using a VAE framework. The latent variables extracted from speech waveforms carry latent "meanings" of the waveforms. Therefore, latent variables can be used as acoustic features for automatic speech recognition (ASR). This paper experimentally shows the effectiveness of data augmentation using a VAE framework and that latent variable-based features can be utilized in ASR.
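
The two uses of a VAE named in the abstract (generating extra data and reusing latent variables as features) can be illustrated with a minimal sketch. It is not the paper's implementation: the linear encoder and decoder, the random untrained weights, the dimensions `FEAT_DIM` and `LATENT_DIM`, and the use of random feature frames in place of real speech are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the abstract does not state any dimensions.
FEAT_DIM, LATENT_DIM = 40, 8

# Untrained, randomly initialised linear maps (assumption, illustration only).
W_mu = rng.normal(scale=0.1, size=(FEAT_DIM, LATENT_DIM))
W_logvar = rng.normal(scale=0.1, size=(FEAT_DIM, LATENT_DIM))
W_dec = rng.normal(scale=0.1, size=(LATENT_DIM, FEAT_DIM))


def encode(frames):
    """Map frames (n, FEAT_DIM) to the mean and log-variance of q(z|x)."""
    return frames @ W_mu, frames @ W_logvar


def sample_latent(mu, logvar):
    """Reparameterisation trick: z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


def decode(z):
    """Map latent vectors (n, LATENT_DIM) back to the feature space."""
    return z @ W_dec


def kl_to_standard_normal(mu, logvar):
    """Per-frame KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)


# Stand-in "corpus": 100 random frames instead of real speech features.
corpus = rng.standard_normal((100, FEAT_DIM))

mu, logvar = encode(corpus)
generated = decode(sample_latent(mu, logvar))

# Data augmentation: original frames plus one generated frame for each.
augmented = np.concatenate([corpus, generated], axis=0)

# Feature extraction: the posterior means serve as latent features.
latent_features = mu
```

In a trained model the encoder and decoder would be neural networks fitted by maximising the evidence lower bound, which combines a reconstruction term with the KL term computed by `kl_to_standard_normal`.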

Year: 2017
Venue: Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
Field: Speech corpus, Feature vector, Autoencoder, Bayesian inference, Pattern recognition, Computer science, Latent variable, Feature extraction, Artificial intelligence, Deep learning, Generative model
DocType: Conference
ISSN: 2309-9402
Citations: 0
PageRank: 0.34
References: 0
Authors: 1

Authors

Name	Order	Citations	PageRank
Hiromitsu Nishizaki	1	163	29.49
