Detection and localization of 3D audio-visual objects using unsupervised clustering - Citegraph

Paper Info

Title
Detection and localization of 3D audio-visual objects using unsupervised clustering

Abstract
This paper addresses the issues of detecting and localizing objects in a scene that are both seen and heard. We explain the benefits of a human-like configuration of sensors (binaural and binocular) for gathering auditory and visual observations. It is shown that the detection and localization problem can be recast as the task of clustering the audio-visual observations into coherent groups. We propose a probabilistic generative model that captures the relations between audio and visual observations. This model maps the data into a common audio-visual 3D representation via a pair of mixture models. Inference is performed by a version of the expectation-maximization algorithm, which is formally derived, and which provides cooperative estimates of both the auditory activity and the 3D position of each object. We describe several experiments with single- and multiple-speaker detection and localization, in the presence of other audio sources.
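
The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the paper's model, which couples a pair of audio and visual mixture models; it is a plain expectation-maximization fit of an isotropic Gaussian mixture to 3D points, assuming each cluster stands for one object. The function name `em_gmm`, its parameters, and the farthest-point initialisation are hypothetical choices made for illustration only.

```python
import numpy as np


def em_gmm(points, n_clusters, n_iters=50):
    """Fit an isotropic Gaussian mixture to 3D points with plain EM.

    Illustrative only: the paper's model links audio and visual
    observations, whereas this sketch clusters generic 3D points.
    """
    n, d = points.shape

    # Deterministic farthest-point initialisation (an assumption of this
    # sketch): start at the first point, then repeatedly pick the point
    # farthest from all means chosen so far.
    means = [points[0]]
    for _ in range(1, n_clusters):
        diff = points[:, None, :] - np.array(means)[None, :, :]
        nearest_sq = (diff ** 2).sum(axis=2).min(axis=1)
        means.append(points[np.argmax(nearest_sq)])
    means = np.array(means, dtype=float)

    variances = np.full(n_clusters, points.var())
    weights = np.full(n_clusters, 1.0 / n_clusters)

    for _ in range(n_iters):
        # E-step: responsibilities, computed in the log domain for stability.
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_prob = (
            np.log(weights)
            - 0.5 * d * np.log(2.0 * np.pi * variances)
            - 0.5 * sq_dist / variances
        )
        log_prob -= log_prob.max(axis=1, keepdims=True)
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate weights, means, and per-cluster variances.
        counts = resp.sum(axis=0) + 1e-12
        weights = counts / counts.sum()
        means = (resp.T @ points) / counts[:, None]
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq_dist).sum(axis=0) / (d * counts) + 1e-9

    return weights, means, variances, resp
```

In this sketch the rows of `means` play the role of estimated 3D object positions, and `resp.argmax(axis=1)` assigns each observation to a cluster. The number of clusters is fixed in advance here, which is a simplification.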

Year	DOI	Venue
2008	10.1145/1452392.1452438	ICMI

Keywords	Field	DocType
probabilistic generative model, coherent group, localization problem, model map, auditory activity, audio-visual observation, mixture model, unsupervised clustering, visual observation, audio-visual object, multiple-speaker detection, audio source, expectation maximization algorithm, mixture models, binaural hearing, stereo vision	Computer vision, Visual Objects, Pattern recognition, Stereopsis, Computer science, Inference, Probabilistic generative model, Sound localization, Artificial intelligence, Binaural recording, Cluster analysis, Mixture model	Conference

Citations	PageRank	References	Authors
6	0.54	16	5

Authors (5 rows)

Name	Order	Citations	PageRank
Vasil Khalidov	1	74	7.35
Florence Forbes	2	115	9.87
Miles Hansard	3	46	4.34
Elise Arnaud	4	126	10.05
Radu Horaud	5	2776	261.99
