Title
Dynamic Dependency Tests for Audio-Visual Speaker Association
Abstract
We formulate the problem of audio-visual speaker association as a dynamic dependency test. That is, given an audio stream and multiple video streams, we wish to determine their dependency structure as it evolves over time. To this end, we propose the use of a hidden factorization Markov model in which the hidden state encodes a finite number of possible dependency structures. Each dependency structure has an explicit semantic meaning, namely "who is speaking". This model takes advantage of both structural and parametric changes associated with changes in speaker. This is contrasted with standard sliding window based dependence analysis. Using this model we obtain state-of-the-art performance on an audio-visual association task without benefit of training data.
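To make the contrast in the abstract concrete, the sketch below scores each window of audio against every video stream and then decodes a "who is speaking" state sequence with a sticky Markov chain. It is an illustrative simplification, not the authors' hidden factorization Markov model. The 1-D features, the joint-Gaussian dependence score, the window length, and the names `gaussian_mi`, `window_scores`, `viterbi_sticky`, `demo`, and `stay_prob` are all assumptions introduced here.

```python
import numpy as np


def gaussian_mi(x, y):
    """Mutual information (nats) of two 1-D signals, assuming they are jointly Gaussian."""
    rho = np.corrcoef(x, y)[0, 1]
    rho2 = min(rho * rho, 1.0 - 1e-12)  # guard against log(0) when |rho| == 1
    return -0.5 * np.log1p(-rho2)


def window_scores(audio, videos, win):
    """Sliding-window dependence analysis (non-overlapping windows).

    Returns an array of shape (n_windows, n_videos); entry [w, k] is the
    window length times the estimated audio/video-k mutual information,
    i.e. a Gaussian log-likelihood ratio of "dependent" vs "independent".
    """
    n_win = len(audio) // win
    scores = np.empty((n_win, len(videos)))
    for w in range(n_win):
        sl = slice(w * win, (w + 1) * win)
        for k, v in enumerate(videos):
            scores[w, k] = win * gaussian_mi(audio[sl], v[sl])
    return scores


def viterbi_sticky(scores, stay_prob=0.9):
    """Most likely state path when state k means "video k is the speaker".

    Assumes at least two states; transitions favour staying in the same state.
    """
    n_win, n_states = scores.shape
    switch = (1.0 - stay_prob) / (n_states - 1)
    log_trans = np.full((n_states, n_states), np.log(switch))
    np.fill_diagonal(log_trans, np.log(stay_prob))

    delta = scores[0].copy()
    back = np.zeros((n_win, n_states), dtype=int)
    for t in range(1, n_win):
        cand = delta[:, None] + log_trans  # cand[i, j]: best score ending in i, then moving to j
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + scores[t]

    path = np.empty(n_win, dtype=int)
    path[-1] = delta.argmax()
    for t in range(n_win - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path


def demo():
    """Synthetic example: speaker 0 for the first half, speaker 1 for the second."""
    rng = np.random.default_rng(0)
    n, win = 600, 50
    videos = [rng.standard_normal(n), rng.standard_normal(n)]
    speaker = np.repeat([0, 1], n // 2)
    audio = np.where(speaker == 0, videos[0], videos[1]) + 0.5 * rng.standard_normal(n)
    scores = window_scores(audio, videos, win)
    return scores.argmax(axis=1), viterbi_sticky(scores)
```

Calling `demo()` returns two label sequences: the per-window argmax (the plain sliding-window decision) and the Viterbi path. The second couples neighbouring windows through the transition prior, which is the role the hidden state plays in the abstract. Unlike the model described above, this sketch only captures the structural change between states and does not model parametric changes.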
Year: 2007
DOI: 10.1109/ICASSP.2007.366271
Venue: Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference
Keywords: Markov processes, audio signal processing, speaker recognition, video signal processing, audio stream, audio-visual speaker association, dynamic dependency tests, hidden factorization Markov model, multiple video streams, Pattern clustering methods
Field: Markov process, Sliding window protocol, Multivalued dependency, Pattern recognition, Markov model, Computer science, Join dependency, Speech recognition, Speaker recognition, Artificial intelligence, Audio signal processing, Hidden Markov model
DocType: Conference
Volume: 2
ISSN: 1520-6149
ISBN: 1-4244-0727-3
Citations: 5
PageRank: 0.51
References: 7
Authors: 3
Name | Order | Citations | PageRank
Michael R. Siracusa | 1 | 13 | 1.62
John W. Fisher III | 2 | 878 | 74.44
Fisher, J.W. | 3 | 5 | 0.51