Title
Native Language Identification from Raw Waveforms Using Deep Convolutional Neural Networks with Attentive Pooling
Abstract
Automatic detection of an individual's native language (L1) based on speech data from their second language (L2) can be useful for informing a variety of speech applications such as automatic speech recognition (ASR), speaker recognition, voice biometrics, and computer-assisted language learning (CALL). Previously proposed systems for native language identification from L2 acoustic signals rely on traditional feature extraction pipelines to extract relevant features such as mel-filterbanks, cepstral coefficients, and i-vectors. In this paper, we present a fully convolutional neural network approach that is trained end-to-end to predict the native language of the speaker directly from raw waveforms, thereby removing the feature extraction step altogether. Experimental results using this approach on a database of 11 different L1s suggest that the learnable convolutional layers of our proposed attention-based end-to-end model extract meaningful features from raw waveforms. Further, the attentive pooling mechanism in our proposed network enables the model to focus on the most discriminative features, leading to improvements over the conventional baseline.
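The attentive pooling idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so this shows only a generic softmax-weighted pooling over frame-level features, not the authors' implementation. The function name `attentive_pool`, the `query` vector (which would be learned in a real model), and the toy inputs are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def attentive_pool(frames: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse T frame vectors into one vector via softmax-weighted averaging.

    `query` is a hypothetical attention vector; a trained model would learn it.
    """
    if not frames:
        raise ValueError("need at least one frame")
    # One scalar relevance score per frame: dot product with the attention vector.
    scores = [sum(q * x for q, x in zip(query, frame)) for frame in frames]
    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the frames, one feature dimension at a time.
    dim = len(frames[0])
    return [sum(w * frame[d] for w, frame in zip(weights, frames))
            for d in range(dim)]


# Toy frame-level features (assumed values, not data from the paper).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uniform = attentive_pool(frames, [0.0, 0.0])    # zero query: reduces to mean pooling
focused = attentive_pool(frames, [10.0, 10.0])  # third frame scores highest and dominates
```

With a zero query every frame receives the same weight, so the result equals plain average pooling. A non-zero query shifts weight toward the frames it scores highest, which is the sense in which such a layer can "focus" on discriminative frames.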
Year: 2019
DOI: 10.1109/ASRU46091.2019.9003872
Venue: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords: end-to-end learning, native language identification, raw waveform processing, deep convolutional networks, attentive pooling
DocType: Conference
ISBN: 978-1-7281-0307-5
Citations: 0
PageRank: 0.34
References: 0
Authors (6)
Name | Order | Citations | PageRank
Rutuja Ubale | 1 | 2 | 3.17
Vikram Ramanarayanan | 2 | 70 | 13.97
Qian Yao | 3 | 527 | 51.55
Keelan Evanini | 4 | 79 | 20.23
Chee Wee Leong | 5 | 153 | 15.10
Chong Min Lee | 6 | 49 | 6.84