Abstract |
---|
The task of Mask-Speech Identification (MSI) is to judge whether a chunk of speech was produced while the speaker was wearing a facial mask. Most existing related research investigates the influence of wearing a mask, which applies only to certain cases of speech analysis. To generalise research on MSI, we propose an MSI approach using deep networks on Low-Level Aggregation (LLA) for speech chunks. The proposed approach benefits from data augmentation on Low-Level Descriptors (LLDs), which makes it better suited to deep models by supplying many more training samples without relying on pre-trained knowledge. Experiments are performed on the Mask Augsburg Speech Corpus (MASC) used in the INTERSPEECH 2020 ComParE challenge, considering the influence of different strategies. The experimental results show the effectiveness of the proposed approach compared with the ComParE challenge baselines. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3412841.3441938 | Symposium on Applied Computing |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xinzhou Xu | 1 | 0 | 0.34 |
Jun Deng | 2 | 278 | 18.59 |
Zixing Zhang | 3 | 397 | 31.73 |
Chen Wu | 4 | 7 | 1.13 |
Björn Schuller | 5 | 6749 | 463.50 |