Separating three simultaneous speeches with two microphones by integrating auditory and visual processing - Citegraph

Paper Info

Title
Separating three simultaneous speeches with two microphones by integrating auditory and visual processing

Abstract
This paper addresses the problem of automatic recognition of three simultaneous speeches with two microphones, that is, sound source separation where the number of sound sources is greater than the number of microphones. The approach used is the direction-pass filter, which is implemented by hypothetical reasoning on the interaural phase difference (IPD) and interaural intensity difference (IID). Auditory processing calculates IPD and IID for each subband, and generates hypotheses from precalculated IPD and IID for every direction, including one obtained by visual processing. Then the system calculates the belief factor of each hypothesis by Dempster-Shafer theory and determines the direction of each subband. Subbands of the specific direction are collected and then converted to a waveform by inverse FFT. With 200 benchmarks of three simultaneous utterances of Japanese words, the average 1-best and 10-best recognition rates of extracted speeches are 60% and 81%, respectively.
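
The pipeline described in the abstract can be illustrated with a minimal single-frame sketch. It is not the authors' implementation: the function names, the table layout (`ipd_table` and `iid_table` indexed as `[direction, subband]`), the Gaussian mapping from distance to belief, and the `sigma_ipd` and `sigma_iid` values are all assumptions made for illustration. The combination step is Dempster's rule for two simple support functions that back the same hypothesis, which reduces to `b1 + b2 - b1 * b2`.

```python
import numpy as np


def dempster_combine(b1, b2):
    """Dempster's rule for two simple support functions on the same hypothesis.

    With masses (b1, 1 - b1) and (b2, 1 - b2) there is no conflict, so the
    combined belief is 1 - (1 - b1) * (1 - b2).
    """
    return b1 + b2 - b1 * b2


def direction_pass_filter(left, right, ipd_table, iid_table, target,
                          sigma_ipd=0.5, sigma_iid=2.0):
    """Keep only the subbands whose most believed direction is `target`.

    left, right : one frame of samples from each microphone (same length).
    ipd_table   : precalculated IPD in radians, shape (directions, subbands).
    iid_table   : precalculated IID in dB, shape (directions, subbands).
    target      : row index of the direction to extract.
    sigma_*     : hypothetical tolerances for the belief mapping (assumed).
    """
    L = np.fft.rfft(left)
    R = np.fft.rfft(right)
    eps = 1e-12  # avoids log of zero in silent subbands

    # Observed IPD and IID for each subband.
    ipd = np.angle(L * np.conj(R))
    iid = 20.0 * np.log10((np.abs(L) + eps) / (np.abs(R) + eps))

    # Distance between observation and each direction hypothesis.
    # The phase difference is wrapped back into (-pi, pi].
    d_ipd = np.angle(np.exp(1j * (ipd[None, :] - ipd_table)))
    d_iid = iid[None, :] - iid_table

    # Assumed mapping from distance to a belief factor in [0, 1].
    b_ipd = np.exp(-0.5 * (d_ipd / sigma_ipd) ** 2)
    b_iid = np.exp(-0.5 * (d_iid / sigma_iid) ** 2)

    # Combine both cues, then pick the best direction per subband.
    belief = dempster_combine(b_ipd, b_iid)
    best = np.argmax(belief, axis=0)

    # Collect the subbands of the target direction and resynthesize.
    masked = np.where(best == target, L, 0.0)
    return np.fft.irfft(masked, n=len(left))
```

A full system would apply this to overlapping windowed frames and overlap-add the results. In the setting of the abstract, the target direction could come from visual processing, with the tables supplying the expected IPD and IID for that direction.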

Year	Venue	Keywords
2001	INTERSPEECH	Dempster-Shafer theory
Field	DocType	Citations
Sound source separation, Visual processing, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Inverse FFT	Conference	6
PageRank	References	Authors
0.77	7	4

Authors (4 rows)

Name	Order	Citations	PageRank
Hiroshi G. Okuno	1	2092	233.19
Kazuhiro Nakadai	2	1342	155.91
Tino Lourens	3	304	34.25
Hiroaki Kitano	4	3515	539.37
