Improving native language (L1) identification with better VAD and TDNN trained separately on native and non-native English corpora - Citegraph

Paper Info

Title
Improving native language (L1) identification with better VAD and TDNN trained separately on native and non-native English corpora

Abstract
Identifying a speaker's native language (L1), i.e., mother tongue, from non-native English (L2) speech input is both challenging and useful for many human-machine voice interface applications, e.g., computer-assisted language learning (CALL). In this paper, we improve our sub-phone TDNN-based i-vector approach to L1 recognition with a more accurate TDNN-derived VAD and a highly discriminative classifier. Two TDNNs are trained separately on native and non-native English LVCSR corpora so that their corresponding sub-phone posteriors and the resulting supervectors can be contrasted. The derived i-vectors are then exploited to improve performance further. Experimental results on a database of 25 L1s show a 3.1% absolute improvement in identification rate, from 78.7% to 81.8%, over a high-performance baseline system that had already achieved the best published results on the 2016 ComParE corpus of only 11 L1s. Statistical analysis of the features used in our system provides useful findings, e.g., pronunciation similarity among non-native English speakers with different L1s, for research on second-language (L2) learning and assessment.
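
The reported gain can be read in more than one way, so a minimal sketch of the arithmetic may help. It uses only the two identification rates quoted in the abstract (78.7% and 81.8%); the function name `improvement` and the dictionary keys are hypothetical, chosen for illustration, and the relative figures are derived here rather than stated by the authors.

```python
def improvement(baseline_pct: float, improved_pct: float) -> dict:
    """Compare two identification rates given in percent (hypothetical helper)."""
    # Absolute gain, in percentage points.
    absolute = improved_pct - baseline_pct
    # Gain relative to the baseline identification rate.
    relative_gain = 100.0 * absolute / baseline_pct
    # Reduction relative to the baseline error rate (100 - accuracy).
    error_reduction = 100.0 * absolute / (100.0 - baseline_pct)
    return {
        "absolute_points": round(absolute, 2),
        "relative_gain_pct": round(relative_gain, 2),
        "relative_error_reduction_pct": round(error_reduction, 2),
    }


# Rates quoted in the abstract: baseline 78.7%, improved system 81.8%.
stats = improvement(78.7, 81.8)
print(stats)
```

Under these figures, the 3.1 quoted in the abstract matches the absolute difference in percentage points; the same change corresponds to a relative gain of about 3.9% and a relative error reduction of roughly 14.6%.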

Year	DOI	Venue
2017	10.1109/ASRU.2017.8268992	2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords	DocType	ISBN
native language identification,i-vector,time delay deep neural networks (TDNN)	Conference	978-1-5090-4789-5
Citations	PageRank	References
0	0.34	0
Authors
6

Authors (6 rows)

Name	Order	Citations	PageRank
Qian Yao	1	527	51.55
Keelan Evanini	2	79	20.23
Patrick Lange	3	9	8.42
Robert A. Pugh	4	0	0.68
Rutuja Ubale	5	2	3.17
Frank K. Soong	6	1395	268.29
