Name: JIANGYAN YI
Affiliation: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Papers: 52
Collaborators: 61
Citations: 19
PageRank: 17.99
Referers: 74
Referees: 416
References: 138
Search Limit: 100416
Title | Citations | PageRank | Year
Audio Deepfake Detection Based on a Combination of F0 Information and Real Plus Imaginary Spectrogram Features. | 0 | 0.34 | 2022
Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition | 0 | 0.34 | 2022
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation | 0 | 0.34 | 2022
ADD 2022: the first Audio Deep Synthesis Detection Challenge. | 0 | 0.34 | 2022
Continual Learning for Fake Audio Detection. | 1 | 0.36 | 2021
Rnn-transducer With Language Bias For End-to-end Mandarin-English Code-switching Speech Recognition | 0 | 0.34 | 2021
PROSODY AND VOICE FACTORIZATION FOR FEW-SHOT SPEAKER ADAPTATION IN THE CHALLENGE M2VOC 2021 | 0 | 0.34 | 2021
BI-LEVEL STYLE AND PROSODY DECOUPLING MODELING FOR PERSONALIZED END-TO-END SPEECH SYNTHESIS | 0 | 0.34 | 2021
Hierarchically Attending Time-Frequency and Channel Features for Improving Speaker Verification | 0 | 0.34 | 2021
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning | 0 | 0.34 | 2021
Half-Truth - A Partially Fake Audio Detection Dataset. | 0 | 0.34 | 2021
DECOUPLING PRONUNCIATION AND LANGUAGE FOR END-TO-END CODE-SWITCHING AUTOMATIC SPEECH RECOGNITION | 0 | 0.34 | 2021
Text Enhancement for Paragraph Processing in End-to-End Code-switching TTS | 0 | 0.34 | 2021
FSR - Accelerating the Inference Process of Transducer-Based Models by Applying Fast-Skip Regularization. | 0 | 0.34 | 2021
PATNET : A PHONEME-LEVEL AUTOREGRESSIVE TRANSFORMER NETWORK FOR SPEECH SYNTHESIS | 1 | 0.37 | 2021
Gated Recurrent Fusion With Joint Training Framework for Robust End-to-End Speech Recognition | 0 | 0.34 | 2021
Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition | 0 | 0.34 | 2020
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding. | 0 | 0.34 | 2020
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis. | 0 | 0.34 | 2020
Focal Loss for Punctuation Prediction. | 0 | 0.34 | 2020
Dynamic Speaker Representations Adjustment and Decoder Factorization for Speaker Adaptation in End-to-End Speech Synthesis. | 0 | 0.34 | 2020
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition | 0 | 0.34 | 2020
Gated Recurrent Fusion of Spatial and Spectral Features for Multi-Channel Speech Separation with Deep Embedding Representations. | 0 | 0.34 | 2020
Bi-Level Speaker Supervision for One-Shot Speech Synthesis. | 0 | 0.34 | 2020
Joint Training for Simultaneous Speech Denoising and Dereverberation with Deep Embedding Representations. | 0 | 0.34 | 2020
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation. | 0 | 0.34 | 2020
A Public Chinese Dataset for Language Model Adaptation | 0 | 0.34 | 2020
Self-Attention Transducers for End-to-End Speech Recognition | 1 | 0.36 | 2019
A Time Delay Neural Network with Shared Weight Self-Attention for Small-Footprint Keyword Spotting | 0 | 0.34 | 2019
Self-Attention Based Model For Punctuation Prediction Using Word And Speech Embeddings | 0 | 0.34 | 2019
Language-Invariant Bottleneck Features From Adversarial End-To-End Acoustic Models For Low Resource Speech Recognition | 0 | 0.34 | 2019
Forward–Backward Decoding Sequence for Regularizing End-to-End TTS | 1 | 0.37 | 2019
Language-Adversarial Transfer Learning for Low-Resource Speech Recognition. | 3 | 0.39 | 2019
Voice Activity Detection Based on Time-Delay Neural Networks | 1 | 0.35 | 2019
Distilling Knowledge for Distant Speech Recognition via Parallel Data | 0 | 0.34 | 2019
Batch Normalization based Unsupervised Speaker Adaptation for Acoustic Models | 0 | 0.34 | 2019
Focal Loss for End-to-end Short Utterances Chinese Dialect Identification | 0 | 0.34 | 2019
Noise Prior Knowledge Learning for Speech Enhancement via Gated Convolutional Generative Adversarial Network | 0 | 0.34 | 2019
Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features | 2 | 0.40 | 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition | 1 | 0.34 | 2019
Hypersphere Embedding and Additive Margin for Query-by-example Keyword Spotting | 0 | 0.34 | 2019
CLMAD: A Chinese Language Model Adaptation Dataset | 0 | 0.34 | 2018
Research on Dynamic and Static Fusion Polymorphic Gesture Recognition Algorithm for Interactive Teaching Interface. | 0 | 0.34 | 2018
Utterance-level Permutation Invariant Training with Discriminative Learning for Single Channel Speech Separation | 0 | 0.34 | 2018
Distilling Knowledge Using Parallel Data for Far-field Speech Recognition. | 0 | 0.34 | 2018
CTC regularized model adaptation for improving LSTM RNN based multi-accent Mandarin speech recognition. | 1 | 0.35 | 2018
Distilling Knowledge From An Ensemble Of Models For Punctuation Prediction | 0 | 0.34 | 2017
Continuous Multimodal Emotion Prediction Based on Long Short Term Memory Recurrent Neural Network. | 7 | 0.48 | 2017
Improving BLSTM RNN based Mandarin speech recognition using accent dependent bottleneck features. | 0 | 0.34 | 2016
Improving accented Mandarin speech recognition by using recurrent neural network based language model adaptation | 0 | 0.34 | 2016