Title
Deep Boltzmann Machines for i-Vector Based Audio-Visual Person Identification.
Abstract
We propose an approach that uses DBM-DNNs for i-vector based audio-visual person identification. Two Deep Boltzmann Machines, DBM$_{\text{speech}}$ and DBM$_{\text{face}}$, are trained without supervision on unlabeled audio and visual data from a set of background subjects. The DBMs are then used to initialize two corresponding DNNs for classification, referred to in this paper as DBM-DNN$_{\text{speech}}$ and DBM-DNN$_{\text{face}}$. The DBM-DNNs are discriminatively fine-tuned using back-propagation on a set of training data and evaluated on a set of test data from the target subjects. We compared their performance with a cosine distance (cosDist) classifier and the state-of-the-art DBN-DNN classifier. We also tested three different configurations of the DBM-DNNs. We show that DBM-DNNs with two hidden layers and 800 units in each hidden layer achieved the best identification performance with 400-dimensional i-vectors as input. Our experiments were carried out on the challenging MOBIO dataset.
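The two kinds of classifier named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The first part scores a test i-vector against enrolled i-vectors by cosine similarity (the cosDist baseline). The second part shows only the shapes of a network with 400 inputs and two hidden layers of 800 units. Assumptions made here for illustration: the function names, the sigmoid hidden units, the softmax output, the number of target subjects (10), and the random weights, which stand in for the DBM-initialized weights described in the abstract.

```python
import numpy as np


def cosine_scores(test_ivec, enrolled):
    """Cosine similarity between one test i-vector and each enrolled i-vector (one per row)."""
    test_unit = test_ivec / np.linalg.norm(test_ivec)
    enrolled_unit = enrolled / np.linalg.norm(enrolled, axis=1, keepdims=True)
    return enrolled_unit @ test_unit


def identify_cosine(test_ivec, enrolled):
    """Index of the enrolled subject whose i-vector is most similar to the test i-vector."""
    return int(np.argmax(cosine_scores(test_ivec, enrolled)))


def init_mlp(n_in=400, n_hidden=800, n_subjects=10, seed=0):
    """Random parameters for an n_in -> n_hidden -> n_hidden -> n_subjects network.

    Assumption: random values are used only to show the layer shapes; in the
    paper the hidden layers are initialized from a pretrained DBM.
    """
    rng = np.random.default_rng(seed)
    sizes = [n_in, n_hidden, n_hidden, n_subjects]
    return [(rng.normal(0.0, 0.01, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(ivec, params):
    """Forward pass: sigmoid hidden layers, softmax over subjects (both assumed)."""
    h = ivec
    for W, b in params[:-1]:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))
    W, b = params[-1]
    logits = h @ W + b
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    enrolled = rng.normal(size=(5, 400))          # 5 synthetic enrolled i-vectors
    test = enrolled[3] + 0.1 * rng.normal(size=400)
    print("cosDist decision:", identify_cosine(test, enrolled))
    print("posterior shape:", forward(test, init_mlp()).shape)
```

With an untrained network the posterior is close to uniform; the sketch is meant to show the data flow, not identification accuracy.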
Year: 2015
DOI: 10.1007/978-3-319-29451-3_50
Venue: PSIVT
Field: I vector, Boltzmann machine, Pattern recognition, Computer science, Cosine Distance, Deep belief network, Speaker recognition, Test data, Artificial intelligence, Boltzmann constant, Classifier (linguistics)
DocType: Conference
Volume: 9431
ISSN: 0302-9743
Citations: 2
PageRank: 0.36
References: 19
Authors: 4
Name                    Order  Citations  PageRank
Mohammad Rafiqul Alam   1      8          2.54
M. Bennamoun            2      3197       167.23
Roberto Togneri         3      814        48.33
Ferdous Ahmed Sohel     4      623        31.78