Abstract
---
Depression detection research has increased over the last few decades; one major bottleneck is limited data availability and representation learning. Recently, self-supervised learning has seen success in pretraining text embeddings and has been applied broadly to related tasks with sparse data, while pretrained audio embeddings based on self-supervised learning remain rarely investigated. This paper proposes DEPA, a self-supervised, pretrained depression audio embedding method for depression detection. An encoder-decoder network is used to extract DEPA on in-domain depression datasets (DAIC and MDD) and out-of-domain datasets (Switchboard, Alzheimer's). With DEPA as the audio embedding extracted at the response level, a significant performance gain is achieved on downstream tasks, evaluated on both sparse datasets such as DAIC and the large major depressive disorder dataset (MDD). This paper not only presents a novel embedding extraction method capturing response-level representations for depression detection but, more significantly, explores self-supervised learning for a specific task within audio processing.
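The abstract describes encoder-decoder pretraining for audio embeddings only at a high level. Below is a minimal, hypothetical NumPy sketch of the general idea: mask part of a spectrogram, have a toy encoder-decoder reconstruct the masked frames (the self-supervised pretext task), and then pool the encoder outputs into a fixed-size response-level embedding. All dimensions, layer choices, and the mean-pooling step are illustrative assumptions, not the paper's actual DEPA architecture.

```python
import numpy as np

# Illustrative sketch only: the real DEPA model and its hyperparameters
# are not specified in this record; everything below is an assumption.
rng = np.random.default_rng(0)

T, F, D = 100, 40, 64                 # frames, mel bins, embedding size
spec = rng.standard_normal((T, F))    # stand-in for a log-mel spectrogram

# Pretext task: hide the centre frames and reconstruct them from context.
mask = np.zeros(T, dtype=bool)
mask[T // 2 - 5 : T // 2 + 5] = True
context = spec.copy()
context[mask] = 0.0

# Toy "encoder": a single untrained linear layer applied per frame.
W_enc = rng.standard_normal((F, D)) / np.sqrt(F)
hidden = np.tanh(context @ W_enc)     # (T, D)

# Toy "decoder": project back to the spectrogram space and score
# reconstruction of the masked frames (the self-supervised loss).
W_dec = rng.standard_normal((D, F)) / np.sqrt(D)
recon = hidden @ W_dec                # (T, F)
loss = float(np.mean((recon[mask] - spec[mask]) ** 2))

# After pretraining, one fixed-size response-level embedding could be
# obtained by pooling the encoder outputs over time (assumed here).
embedding = hidden.mean(axis=0)       # (D,)
print(embedding.shape, loss > 0)
```

In a real system the encoder and decoder would be trained jointly to minimise the reconstruction loss, and only the encoder would be kept to extract embeddings for the downstream depression classifier.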
| Year | DOI | Venue |
|---|---|---|
| 2021 | 10.1145/3474085.3479236 | International Multimedia Conference |
| DocType | Citations | PageRank | References | Authors |
|---|---|---|---|---|
| Conference | 0 | 0.34 | 0 | 4 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Pingyue Zhang | 1 | 0 | 0.34 |
| Mengyue Wu | 2 | 0 | 4.73 |
| Heinrich Dinkel | 3 | 23 | 5.79 |
| Kai Yu | 4 | 1082 | 90.58 |