Title
Predicting the Understandability of Imperfect English Captions for People Who Are Deaf or Hard of Hearing
Abstract
Automatic Speech Recognition (ASR) technology has seen major advancements in its accuracy and speed in recent years, making it a possible mechanism for supporting communication between people who are Deaf or Hard-of-Hearing (DHH) and their hearing peers. However, state-of-the-art ASR technology is still imperfect in many realistic settings. Researchers who evaluate ASR performance often focus on improving the Word Error Rate (WER) metric, but WER has been found to have little correlation with human-subject performance for many applications. This article describes and evaluates several new captioning-focused evaluation metrics for predicting the impact of ASR errors on the understandability of automatically generated captions for people who are DHH. Through experimental studies with DHH users, we have found that our new metric (based on word-importance and semantic-difference scoring) is more closely correlated with DHH users' judgements of caption quality than pre-existing metrics for ASR evaluation.
Year
2019
DOI
10.1145/3325862
Venue
ACM Transactions on Accessible Computing (TACCESS)
Keywords
Accessibility for people who are deaf or hard-of-hearing, automatic speech recognition, caption understandability evaluation, real-time captioning system
Field
Imperfect, Computer science, Word error rate, Speech recognition, Human–computer interaction
DocType
Journal
Volume
12
Issue
2
ISSN
1936-7228
Citations
0
PageRank
0.34
References
0
Authors
2
Name, Order, Citations, PageRank
Sushant Kafle, 1, 10, 4.03
Matt Huenerfauth, 2, 428, 51.83