Deep Boltzmann Machines for i-Vector Based Audio-Visual Person Identification. - Citegraph

Paper Info

Title
Deep Boltzmann Machines for i-Vector Based Audio-Visual Person Identification.

Abstract
We propose an approach using DBM-DNNs for i-vector based audio-visual person identification. Two Deep Boltzmann Machines, DBM$_{\text{speech}}$ and DBM$_{\text{face}}$, are trained without supervision on unlabeled audio and visual data from a set of background subjects. The DBMs are then used to initialize two corresponding DNNs for classification, referred to in this paper as DBM-DNN$_{\text{speech}}$ and DBM-DNN$_{\text{face}}$. The DBM-DNNs are discriminatively fine-tuned using back-propagation on a set of training data and evaluated on a set of test data from the target subjects. We compared their performance with the cosine distance (cosDist) and the state-of-the-art DBN-DNN classifier. We also tested three different configurations of the DBM-DNNs. We show that DBM-DNNs with two hidden layers and 800 units in each hidden layer achieved the best identification performance with 400-dimensional i-vectors as input. Our experiments were carried out on the challenging MOBIO dataset.
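
The cosine distance baseline mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 4-dimensional vectors (the paper uses 400-dimensional i-vectors), and the single-enrollment-vector-per-subject setup are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify(test_ivector, enrolled):
    """Return the enrolled subject whose i-vector is most similar to the probe.

    `enrolled` is a hypothetical mapping from subject name to one i-vector.
    """
    return max(enrolled, key=lambda s: cosine_similarity(test_ivector, enrolled[s]))


# Toy data (assumed): real i-vectors in the paper have 400 dimensions.
enrolled = {
    "subject_a": [1.0, 0.0, 0.5, 0.0],
    "subject_b": [0.0, 1.0, 0.0, 0.5],
}
probe = [0.9, 0.1, 0.4, 0.0]
best = identify(probe, enrolled)
```

A classifier such as the DBM-DNN replaces this nearest-match rule with a learned mapping from the i-vector to a subject label.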

Year	2015
DOI	10.1007/978-3-319-29451-3_50
Venue	PSIVT
Field	I vector, Boltzmann machine, Pattern recognition, Computer science, Cosine Distance, Deep belief network, Speaker recognition, Test data, Artificial intelligence, Boltzmann constant, Classifier (linguistics)
DocType	Conference
Volume	9431
ISSN	0302-9743
Citations	2
PageRank	0.36
References	19
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Mohammad Rafiqul Alam	1	8	2.54
M. Bennamoun	2	3197	167.23
Roberto Togneri	3	814	48.33
Ferdous Ahmed Sohel	4	623	31.78
