Audio-Visual Kinship Verification in the Wild - Citegraph

Paper Info

Title
Audio-Visual Kinship Verification in the Wild

Abstract
Kinship verification is a challenging problem, where recognition systems are trained to establish a kin relation between two individuals based on facial images or videos. However, due to variations in capture conditions (background, pose, expression, illumination and occlusion), state-of-the-art systems currently provide a low level of accuracy. As in many visual recognition and affective computing applications, kinship verification may benefit from a combination of discriminant information extracted from both video and audio signals. In this paper, we investigate for the first time the fusion of audio-visual information from both face and voice modalities to improve kinship verification accuracy. First, we propose a new multi-modal kinship dataset called TALking KINship (TALKIN), which comprises several pairs of video sequences with subjects talking. State-of-the-art conventional and deep learning models are assessed and compared for kinship verification using this dataset. Finally, we propose a deep Siamese network for multi-modal fusion of kinship relations. Experiments with the TALKIN dataset indicate that the proposed Siamese network provides a significantly higher level of accuracy than baseline uni-modal and multi-modal fusion techniques for kinship verification. Results also indicate that audio (vocal) information is complementary and useful for the kinship verification problem.

Year	2019
DOI	10.1109/ICB45273.2019.8987241
Venue	2019 International Conference on Biometrics (ICB)
Keywords	multimodal kinship dataset, TALking KINship, deep learning models, kinship relations, multimodal fusion techniques, audio-visual kinship verification, facial images, visual recognition, audio signals, fusion audio-visual information, TALKIN dataset, discriminant information extraction, video signals, video sequences, Siamese network
Field	Modalities, Audio signal, Pattern recognition, Computer science, Kinship, Verification problem, Speech recognition, Visual recognition, Artificial intelligence, Deep learning, Affective computing
DocType	Conference
ISSN	2376-4201
ISBN	978-1-7281-3641-7
Citations	2
PageRank	0.37
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Xiaoting Wu	1	2	0.37
Eric Granger	2	168	17.40
Tomi Kinnunen	3	1323	86.67
Xiaoyi Feng	4	229	38.15
Abdenour Hadid	5	3305	146.00
