Abstract |
---|
Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice-operated systems do not work. Current speech recognition systems are trained primarily with data from fluent speakers and, as a consequence, do not generalize well to speech with dysfluencies such as sound or word repetitions, sound prolongations, or audible blocks. The focus of this work is a quantitative analysis of a consumer speech recognition system on individuals who stutter, and production-oriented approaches for improving performance on common voice assistant tasks (e.g., "what is the weather?"). At baseline, this system introduces a significant number of insertion and substitution errors, resulting in intended speech Word Error Rates (isWER) that are 13.64% worse (absolute) for individuals with fluency disorders. We show that by simply tuning the decoding parameters in an existing hybrid speech recognition system, one can improve isWER by 24% (relative) for individuals with fluency disorders. Tuning these parameters translates to 3.6% better domain recognition and 1.7% better intent recognition relative to the default setup for the 18 study participants across all stuttering severities. |
Year | DOI | Venue |
---|---|---|
2021 | 10.21437/Interspeech.2021-2006 | Interspeech |

DocType | Citations | PageRank |
---|---|---|
Conference | 1 | 0.40 |
References | Authors |
---|---|
0 | 11 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vikramjit Mitra | 1 | 299 | 24.83 |
Zifang Huang | 2 | 1 | 0.73 |
Colin S. Lea | 3 | 3 | 2.13 |
Lauren Tooley | 4 | 1 | 0.40 |
Sarah Wu | 5 | 2 | 0.80 |
Darren Botten | 6 | 1 | 0.40 |
Ashwini Palekar | 7 | 1 | 0.40 |
Shrinath Thelapurath | 8 | 1 | 0.40 |
Panayiotis Georgiou | 9 | 1 | 0.40 |
Sachin Kajarekar | 10 | 1 | 2.09 |
Jeffrey P. Bigham | 11 | 2647 | 189.29 |