Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis | 0 | 0.34 | 2022 |
Region-to-Region Kernel Interpolation of Acoustic Transfer Functions Constrained by Physical Properties | 0 | 0.34 | 2022 |
J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis | 0 | 0.34 | 2022 |
Physics-Informed Convolutional Neural Network with Bicubic Spline Interpolation for Sound Field Estimation | 0 | 0.34 | 2022 |
Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation | 0 | 0.34 | 2022 |
Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds | 0 | 0.34 | 2022 |
Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS | 0 | 0.34 | 2022 |
Head-Related Transfer Function Interpolation From Spatially Sparse Measurements Using Autoencoder With Source Position Conditioning | 0 | 0.34 | 2022 |
Region-Restricted Sensor Placement Based on Gaussian Process for Sound Field Estimation | 0 | 0.34 | 2022 |
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History | 0 | 0.34 | 2022 |
Convex and Differentiable Formulation for Inverse Problems in Hilbert Spaces with Nonlinear Clipping Effects | 0 | 0.34 | 2021 |
Disentangled Speaker and Language Representations Using Mutual Information Minimization and Domain Adaptation for Cross-Lingual TTS | 0 | 0.34 | 2021 |
Speech Enhancement by Noise Self-Supervised Rank-Constrained Spatial Covariance Matrix Estimation via Independent Deeply Learned Matrix Analysis | 0 | 0.34 | 2021 |
Real-Time Full-Band Voice Conversion With Sub-Band Modeling And Data-Driven Phase Estimation Of Spectral Differentials | 0 | 0.34 | 2021 |
DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching | 0 | 0.34 | 2021 |
Spatial Active Noise Control Based On Kernel Interpolation Of Sound Field | 0 | 0.34 | 2021 |
Amplitude Matching: Majorization–Minimization Algorithm for Sound Field Control Only with Amplitude Constraint | 0 | 0.34 | 2021 |
Independent deeply learned matrix analysis with automatic selection of stable microphone-wise update and fast sourcewise update of demixing matrix | 0 | 0.34 | 2021 |
Deficient Basis Estimation of Noise Spatial Covariance Matrix for Rank-Constrained Spatial Covariance Matrix Estimation Method in Blind Speech Extraction | 0 | 0.34 | 2021 |
Multichannel Audio Source Separation with Independent Deeply Learned Matrix Analysis Using Product of Source Models | 0 | 0.34 | 2021 |
Kernel interpolation of acoustic transfer function between regions considering reciprocity | 0 | 0.34 | 2020 |
Binaural Rendering From Distributed Microphone Signals Considering Loudspeaker Distance in Measurements | 0 | 0.34 | 2020 |
Joint-Diagonalizability-Constrained Multichannel Nonnegative Matrix Factorization Based On Multivariate Complex Sub-Gaussian Distribution | 0 | 0.34 | 2020 |
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain | 1 | 0.36 | 2020 |
Sensor Placement In Arbitrarily Restricted Region For Field Estimation Based On Gaussian Process | 0 | 0.34 | 2020 |
Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU | 0 | 0.34 | 2020 |
Phase reconstruction from amplitude spectrograms based on directional-statistics deep neural networks | 1 | 0.48 | 2020 |
DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus | 0 | 0.34 | 2020 |
Mutual-Information-Based Sensor Placement for Spatial Sound Field Recording | 0 | 0.34 | 2020 |
Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis | 0 | 0.34 | 2020 |
Generative Moment Matching Network-Based Neural Double-Tracking for Synthesized and Natural Singing Voices | 0 | 0.34 | 2020 |
End-to-End Text-to-Speech Synthesis with Unaligned Multiple Language Units Based on Attention | 1 | 0.37 | 2020 |
Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space | 1 | 0.35 | 2020 |
Robust Gridless Sound Field Decomposition Based on Structured Reciprocity Gap Functional in Spherical Harmonic Domain | 1 | 0.38 | 2019 |
Prosody Correction Preserving Speaker Individuality For Chinese-Accented Japanese HMM-Based Text-To-Speech Synthesis | 0 | 0.34 | 2019 |
Feedforward Spatial Active Noise Control Based on Kernel Interpolation of Sound Field | 1 | 0.37 | 2019 |
Generative Moment Matching Network-Based Random Modulation Post-Filter For DNN-Based Singing Voice Synthesis And Neural Double-Tracking | 0 | 0.34 | 2019 |
Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra | 0 | 0.34 | 2019 |
Evaluation of Multichannel Hearing Aid System by Rank-Constrained Spatial Covariance Matrix Estimation | 0 | 0.34 | 2019 |
Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation | 8 | 0.62 | 2019 |
Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method | 4 | 0.48 | 2019 |
Generalized independent low-rank matrix analysis using heavy-tailed distributions for blind source separation | 8 | 0.56 | 2018 |
Prosody-aware subword embedding considering Japanese intonation systems and its application to DNN-based multi-dialect speech synthesis | 0 | 0.34 | 2018 |
Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech | 0 | 0.34 | 2018 |
CPJD Corpus: Crowdsourced Parallel Speech Corpus of Japanese Dialects | 0 | 0.34 | 2018 |
Sound Field Recording Using Distributed Microphones Based on Harmonic Analysis of Infinite Order | 5 | 0.60 | 2018 |
Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks | 12 | 0.58 | 2018 |
Low Latency and High Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot | 2 | 0.37 | 2017 |
Voice Conversion Using Input-To-Output Highway Networks | 4 | 0.47 | 2017 |
Voice Conversion Using Sequence-To-Sequence Learning Of Context Posterior Probabilities | 5 | 0.38 | 2017 |