Abstract | ||
---|---|---|
We propose an audiovisual source separation algorithm for speech signals. The algorithm first extracts, from synchronous video recordings, the time segments in which the mouth region shows low activity. An automatically selected optimal classifier then detects silent intervals within these segments of low visual mouth activity. Next, the source separation problem is formulated and solved for the entire signal duration. The approach was tested on two challenging speech corpora, each with two speakers and two microphones: in the first corpus, separate source signals were mixed in a simulated room, while the second corpus contains recorded conversations. The results are promising on both corpora: with the visual silence detector, the performance of the source separation algorithm, measured by the signal to noise inference ratio, increases. |
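
The first step named in the abstract, extracting time segments with low mouth activity, can be illustrated with a minimal sketch. This is not the authors' method: the per-frame `activity` score, the fixed `threshold`, the `min_duration_s` parameter, and the function name `low_activity_segments` are all assumptions made for illustration, and the abstract does not say how mouth activity is measured or how the classifier is chosen.

```python
from typing import List, Tuple


def low_activity_segments(
    activity: List[float],
    fps: float,
    threshold: float,
    min_duration_s: float,
) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) intervals where activity stays below threshold.

    `activity` is a hypothetical per-frame mouth-motion score (one value per
    video frame); runs shorter than `min_duration_s` are discarded.
    """
    min_frames = max(1, round(min_duration_s * fps))
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current low-activity run
    for i, value in enumerate(activity):
        if value < threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at frame i (exclusive); keep it if long enough.
            if i - start >= min_frames:
                segments.append((start / fps, i / fps))
            start = None
    # Close a run that extends to the end of the recording.
    if start is not None and len(activity) - start >= min_frames:
        segments.append((start / fps, len(activity) / fps))
    return segments
```

In a pipeline like the one the abstract outlines, such candidate segments would then be passed to an audio-based silence classifier rather than treated as silent outright, since a still mouth does not guarantee that the speaker is silent.
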
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/ICIG.2009.146 | Xi'an, Shaanxi |
Keywords | Field | DocType |
low activity,visual silence detector constraining,entire signal duration,source separation algorithm,audiovisual source separation algorithm,mouth region,challenging speech corpus,source separation problem,corpus separate source signal,speech source separation,proposed algorithm,low visual mouth activity,frequency domain analysis,visualization,feature extraction,speech,speech processing,tv | Frequency domain,Speech processing,Pattern recognition,Visualization,Computer science,Signal-to-noise ratio,Feature extraction,Speech recognition,Artificial intelligence,Classifier (linguistics),Detector,Source separation | Conference |
ISBN | Citations | PageRank |
978-1-4244-5237-8 | 1 | 0.34 |
References | Authors | |
9 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Isabel Gonzalez | 1 | 54 | 5.10 |
Ravyse Ilse | 2 | 43 | 6.24 |
Henk Brouckxon | 3 | 11 | 1.31 |
Werner Verhelst | 4 | 431 | 51.55 |
Jiang Dongmei | 5 | 115 | 15.28 |
Hichem Sahli | 6 | 475 | 65.19 |