Title | ||
---|---|---|
Automatic detection of speaker state: Lexical, prosodic, and phonetic approaches to level-of-interest and intoxication classification |
Abstract | ||
---|---|---|
Traditional studies of speaker state focus primarily upon one-stage classification techniques using standard acoustic features. In this article, we investigate multiple novel features and approaches to two recent tasks in speaker state detection: level-of-interest (LOI) detection and intoxication detection. In the task of LOI prediction, we propose a novel Discriminative TFIDF feature to capture important lexical information and a novel Prosodic Event detection approach using AuToBI; we combine these with acoustic features for this task using a new multilevel multistream prediction feedback and similarity-based hierarchical fusion learning approach. Our experimental results outperform published results of all systems in the 2010 Interspeech Paralinguistic Challenge - Affect Subchallenge. In the intoxication detection task, we evaluate the performance of Prosodic Event-based, phone duration-based, phonotactic, and phonetic-spectral-based approaches, finding that a combination of the phonotactic and phonetic-spectral approaches achieves a significant improvement over the 2011 Interspeech Speaker State Challenge - Intoxication Subchallenge baseline. We discuss our results using these new features and approaches and their implications for future research. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1016/j.csl.2012.03.004 | Computer Speech & Language |
Keywords | Field | DocType |
intoxication detection task,automatic detection,novel discriminative tfidf feature,intoxication classification,interspeech paralinguistic challenge,recent task,interspeech speaker state challenge,affect subchallenge,multiple novel feature,novel prosodic event detection,phonetic approach,speaker state detection,intoxication detection,paralinguistic | Phonotactics,Paralanguage,tf–idf,Computer science,Speech recognition,Phone,Natural language processing,Artificial intelligence,Speaker diarisation,Discriminative model | Journal |
Volume | Issue | ISSN |
27 | 1 | 0885-2308 |
Citations | PageRank | References |
1 | 0.37 | 31 |
Authors | ||
---|---|---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
William Yang Wang | 1 | 493 | 59.64 |
Fadi Biadsy | 2 | 207 | 15.14 |
Andrew Rosenberg | 3 | 422 | 24.67 |
Julia Hirschberg | 4 | 2982 | 448.62 |