Title
Performance Analysis Of Distributed Speech Recognition Using Analysis-By-Synthesis Frame Reduced Front End Under Packet Loss Conditions
Abstract
We proposed an analysis-by-synthesis (AbS) frame-dropping algorithm for the front end of a distributed speech recognition (DSR) system. The algorithm preserves rapidly changing frames, which are more relevant to speech perception, and discards slowly changing frames, which carry little information. When DSR is applied over error-prone packet-switched networks, speech data inevitably suffer frame loss, because packets may be lost or delayed by congestion at routers. We further employed a model-adaptation error-concealment decoder at the back end to compensate for the mismatch between the pre-trained models and the test data, which contain missing frames caused by both frame dropping at the front end and packet loss over the transmission channel. For convenience, this approach is denoted AbS-MA. In the AbS-MA decoding process, the transition probabilities of the hidden Markov models are dynamically adapted according to the time difference between successive observations. Experiments on Mandarin digit recognition were conducted to investigate the effectiveness of the proposed AbS-MA method over a wide range of combinations of frame rates and packet loss conditions. AbS-MA was compared with a baseline approach in which error concealment is implemented at the back end by interpolation, which estimates the missing frames from the received observations. The experimental results show that AbS-MA not only achieves higher word accuracy than the baseline but also significantly reduces computation time.
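The abstract does not give the exact adaptation rule, so the following is only a minimal sketch of the two back-end ideas it names. It assumes that when two successive received observations are `gap` frame periods apart, the decoder can use the `gap`-step transition matrix (the matrix power of the one-step matrix), and that the baseline fills each gap by linear interpolation. The function names, the 3-state matrix `A`, and the choice of matrix powers are illustrative assumptions, not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adapted_transitions(a: Matrix, gap: int) -> Matrix:
    """Return the gap-step transition matrix, i.e. a raised to the power gap.

    gap is the number of frame periods between two successive received
    observations; gap == 1 means no frame is missing in between.
    ASSUMPTION: matrix powers stand in for the paper's adaptation rule.
    """
    if gap < 1:
        raise ValueError("gap must be at least 1")
    result = a
    for _ in range(gap - 1):
        result = mat_mul(result, a)
    return result


def interpolate_missing(prev: List[float], nxt: List[float],
                        gap: int) -> List[List[float]]:
    """Baseline-style concealment: linearly interpolate the gap - 1
    missing feature vectors between two received ones."""
    return [[p + (q - p) * t / gap for p, q in zip(prev, nxt)]
            for t in range(1, gap)]


# Hypothetical 3-state left-to-right HMM (values chosen for illustration).
A = [[0.6, 0.4, 0.0],
     [0.0, 0.7, 0.3],
     [0.0, 0.0, 1.0]]

# One frame missing between two received observations, so gap == 2.
A2 = adapted_transitions(A, 2)

# Two frames missing between two received 2-dimensional feature vectors.
filled = interpolate_missing([0.0, 2.0], [3.0, 8.0], 3)
```

Under these assumptions, the adapted decoder evaluates only the frames that actually arrive and changes the transition matrix per gap, whereas the interpolation baseline first reconstructs every missing frame and then decodes the full-rate sequence. That difference is one plausible reading of why the abstract reports lower computation time for AbS-MA.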
Year
2015
DOI
10.1109/SMC.2015.347
Venue
2015 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2015): BIG DATA ANALYTICS FOR HUMAN-CENTRIC SYSTEMS
Keywords
Distributed speech recognition (DSR), hidden Markov model (HMM), variable frame rate (VFR), frame dropping, packet loss
Field
Front and back ends, Speech coding, Markov model, Voice activity detection, Computer science, Network packet, Packet loss, Speech recognition, Frame rate, Hidden Markov model
DocType
Conference
ISSN
1062-922X
Citations
0
PageRank
0.34
References
11
Authors
3
Name
Order
Citations
PageRank
Lee-Min Lee1468.10
Fu-Rong Jean2369.04
Tan-Hsu Tan32110.28