Title
Leveraging transfer learning techniques for classifying infant vocalizations
Abstract
Infant vocalizations serve various communicative functions and are related to several developmental factors. Different types of vocalizations exhibit distinct spectro-temporal patterns, which can be recovered and learned using emerging end-to-end machine learning systems. A common problem in such systems is the limited availability of labelled data, which prevents reliable training. Transfer learning can mitigate this problem by taking advantage of additional data resources relevant to the problem of interest. We propose a transfer learning framework that relies on neural network fine-tuning, and explore various types of architectures, such as a convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural networks with and without an attention mechanism. Our target data come from the Cry Recognition In Early Development (CRIED) dataset, while the source data come from three publicly available resources: the Oxford Vocal (OxVoc) Sounds database, the Google AudioSet, and the Freesound repository. Our results indicate that the neural network architectures trained with the proposed transfer learning approach outperform the corresponding networks trained solely on the target data, as well as neural networks pre-trained on large-scale image datasets and adapted to the target data (e.g., VGG16). These results suggest the effectiveness of adaptation techniques combined with appropriate publicly available datasets for mitigating the limited availability of labelled data in human-related applications.
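The fine-tuning idea described in the abstract (train on plentiful source data, then continue training the same parameters on scarce target data) can be illustrated with a toy sketch. This is not the authors' CNN or LSTM pipeline: the logistic-regression model, the synthetic data, and the names `train`, `make_data`, and `mean_log_loss` are all assumptions made for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def mean_log_loss(weights, data):
    # Average binary cross-entropy of a linear model over (features, label) pairs.
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train(weights, data, lr, epochs):
    # Full-batch gradient descent on the log loss, starting from `weights`.
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def make_data(n, boundary_shift, rng):
    # Hypothetical synthetic task: features are [bias, x1, x2];
    # the label is 1 when x1 + x2 exceeds `boundary_shift`.
    data = []
    for _ in range(n):
        x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        data.append(([1.0, x1, x2], 1 if x1 + x2 > boundary_shift else 0))
    return data

rng = random.Random(0)
source = make_data(500, 0.0, rng)   # large, related source task
target = make_data(20, 0.3, rng)    # small target task with a shifted boundary

# Step 1: pre-train on the source data from a zero initialization.
pretrained = train([0.0, 0.0, 0.0], source, lr=0.5, epochs=200)

# Step 2: fine-tune on the target data, starting from the pre-trained
# weights and using a smaller learning rate.
fine_tuned = train(pretrained, target, lr=0.1, epochs=50)
```

In this sketch, comparing `mean_log_loss(pretrained, target)` with `mean_log_loss(fine_tuned, target)` shows how much the adaptation step changes the fit on the target data. It says nothing about the results reported in the paper.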
Year
2019
DOI
10.1109/BHI.2019.8834666
Venue
2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI)
Keywords
Infant vocalization, transfer learning, neural network fine-tuning, Google AudioSet, OxVoc Sounds
Field
Source data, Convolutional neural network, Data resources, Computer science, Transfer of learning, Recurrent neural network, Artificial intelligence, Artificial neural network, Limited availability, Machine learning, Infant Vocalization
DocType
Conference
ISSN
2641-3590
ISBN
978-1-7281-0849-0
Citations
0
PageRank
0.34
References
0
Authors
5
Name, Order, Citations, PageRank
Aditya Gujral, 1, 1, 0.69
Kexin Feng, 2, 2, 3.53
Gulshan Mandhyan, 3, 0, 0.34
Nfn Snehil, 4, 0, 0.34
Theodora Chaspari, 5, 38, 19.43