Abstract | ||
---|---|---|
Spoken language identification is the automatic recognition of the language in a spoken utterance. It is commonly used in speech translation systems, multilingual speech recognition, and speaker diarization. This paper presents spoken language identification based on deep learning (DL) and the i-vector paradigm. Specifically, it reports a comparative study of language identification using deep neural networks (DNN) and convolutional neural networks (CNN), and investigates the integration of the two methods into a complete system. Previous studies demonstrated the effectiveness of DNN in spoken language identification, but to date the integration of CNN and i-vectors in language identification has not been investigated. The main advantage of CNN is that it requires fewer parameters than DNN and is therefore cheaper in memory and computation. The proposed methods are evaluated on the NIST 2015 i-vector Machine Learning Challenge task for the recognition of 50 in-set languages. The DNN achieved an equal error rate (EER) of 3.55% and the CNN an EER of 3.48%; fusing the DNN and CNN systems gave an EER of 3.3%. These results show the effectiveness of using CNN and i-vectors in spoken language identification. The proposed methods also performed significantly better than a baseline method based on support vector machines (SVM). |
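
The equal error rate (EER) reported in the abstract is the operating point at which the false acceptance rate and the false rejection rate coincide. The following is a minimal sketch of how such a figure could be computed from per-trial scores, together with a simple score-level fusion of two systems. The function names `equal_error_rate` and `fuse_scores`, the exhaustive threshold sweep, and the equal-weight averaging are illustrative assumptions; the abstract does not describe the authors' scoring or fusion procedure.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of trials where the hypothesized language is correct.
    nontarget_scores: scores of trials where it is not.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))

    # False rejection rate: fraction of target trials scored below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance rate: fraction of non-target trials at or above it.
    far = np.array([(nontarget >= t).mean() for t in thresholds])

    # Pick the threshold where the two error rates are closest and average them.
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Hypothetical score-level fusion: a weighted average of two systems."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b


if __name__ == "__main__":
    # Toy scores only; these are not data from the paper.
    targets = [0.9, 0.8, 0.3, 0.7]
    nontargets = [0.1, 0.2, 0.75, 0.4]
    print(equal_error_rate(targets, nontargets))
```

With real systems, the fused scores from `fuse_scores` would be passed to `equal_error_rate` in the same way as the scores of either individual system, which makes it possible to compare single-system and fused error rates on the same trials.
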
Year | DOI | Venue |
---|---|---|
2018 | 10.23919/EUSIPCO.2018.8553347 | European Signal Processing Conference |
Field | DocType | ISSN |
Convolutional neural network, Computer science, Word error rate, Support vector machine, Utterance, Speech recognition, Language identification, Speaker diarisation, Artificial intelligence, Deep learning, Speech translation | Conference | 2076-1465 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Panikos Heracleous | 1 | 68 | 16.27 |
Kohichi Takai | 2 | 0 | 0.68 |
Keiji Yasuda | 3 | 85 | 16.50 |
Yasser Mohammad | 4 | 0 | 0.34 |
Akio Yoneyama | 5 | 117 | 17.49 |