Title
TEO-based speaker stress assessment using hybrid classification and tracking schemes
Abstract
Speaker variability is known to have an adverse impact on speech systems that process linguistic content, such as speech and language recognition. Changes in an individual's speech production due to stress and emotion have a similarly detrimental effect on speaker recognition, because they introduce a mismatch with the speaker models, which are typically trained on modal speech. This study focuses on the analysis of stress-induced variations in speech and on the design of an automatic stress level assessment scheme that could be used to direct stress-dependent acoustic models or normalization strategies. Current stress detection methods typically make a binary decision on whether the speaker is or is not under stress. In reality, the amount of stress in an individual varies and can change gradually. Using speech and biometric data collected in a real-world law enforcement training scenario with variable stress levels, this study considers two methods for stress level assessment. The first approach uses a nearest neighbor clustering scheme at the vowel token and sentence levels to classify speech data into three levels of stress. The second approach employs Euclidean distance metrics within the multi-dimensional feature space to provide real-time stress level tracking. Evaluations on audio data confirmed by biometric readings show both methods to be effective in assessing stress level within a speaker (average accuracy of 55.6 % in a 3-way classification task). In addition, the impact of high-level stress on in-set speaker recognition is evaluated and shown to reduce accuracy from 91.7 % (low/mid stress) to 21.4 % (high-level stress).
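The abstract names two building blocks, the Teager energy operator (TEO) from the title and Euclidean distances in a feature space. The sketch below illustrates both under stated assumptions and is not the authors' implementation: the standard discrete TEO definition is used, and a simple nearest-centroid rule stands in for the paper's nearest neighbor clustering scheme. The function names, the 2-D feature vectors, and the centroid values are all hypothetical; the abstract does not specify the actual features or classifier details.

```python
import math


def teager_energy(x):
    """Standard discrete Teager energy operator:
    psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for interior samples only."""
    return [x[n] ** 2 - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_stress_level(feature, centroids):
    """Return the label whose centroid is closest to `feature`.

    Assumption: a nearest-centroid rule, used here as a simplified
    stand-in for the nearest neighbor clustering named in the abstract.
    """
    return min(centroids, key=lambda label: euclidean(feature, centroids[label]))


# Hypothetical 2-D centroids for the three stress levels (illustrative values only).
CENTROIDS = {"low": (0.0, 0.0), "mid": (1.0, 1.0), "high": (2.0, 2.0)}

if __name__ == "__main__":
    # For a unit-amplitude sinusoid cos(w*n), the TEO output is the constant sin(w)^2.
    tone = [math.cos(0.3 * n) for n in range(50)]
    print(teager_energy(tone)[:3])
    print(nearest_stress_level((1.8, 2.1), CENTROIDS))
```

Calling `nearest_stress_level` on each incoming feature vector in turn would give a crude frame-by-frame stress track, which is one way to read the "real-time tracking" idea; the paper's actual tracking scheme may differ.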
Year: 2012
DOI: 10.1007/s10772-012-9165-1
Venue: I. J. Speech Technology
Keywords: Stress assessment from speech, FLETC Corpus, TEO operator
Field: Feature vector, Normalization (statistics), Pattern recognition, Computer science, Euclidean distance, Speech recognition, Speaker recognition, Speaker diarisation, Artificial intelligence, Biometrics, Cluster analysis, Speech production
DocType: Journal
Volume: 15
Issue: 3
ISSN: 1381-2416
Citations: 3
PageRank: 0.38
References: 13
Authors: 4
Name                 Order  Citations  PageRank
John H. L. Hansen    1      3215       365.75
Evan Ruzanski        2      21         3.89
Hynek Boril          3      4          1.06
James Meyerhoff      4      38         3.88