Title
Multimodal Attention Network for Trauma Activity Recognition from Spoken Language and Environmental Sound
Abstract
Trauma activity recognition aims to detect, recognize, and predict the activities (or tasks) performed during trauma resuscitation. Previous work has mainly focused on using various sensor data, including images, RFID, and vital signs, to generate the trauma event log. However, spoken language and environmental sound, which contain rich communication and contextual information necessary for trauma team cooperation, have been largely ignored. In this paper, we propose a multimodal attention network (MAN) that uses both verbal transcripts and the environmental audio stream as input; the model extracts textual and acoustic features using a multi-level multi-head attention module and forms a final shared representation for trauma activity classification. We evaluated the proposed architecture on 75 actual trauma resuscitation cases collected from a hospital, achieving 71.8% accuracy with an F1 score of 0.702 and demonstrating that the proposed architecture is effective and efficient. These results also show that, compared to previous approaches, using spoken language and environmental audio helps identify hard-to-recognize activities. We also provide a detailed analysis of the performance and generalization of the proposed multimodal attention network.
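As a rough illustration of the fusion scheme the abstract describes, the sketch below combines transcript and audio features with multi-head attention and classifies a pooled shared representation. It is not the authors' implementation: the module name MultimodalAttentionSketch, all dimensions, the two attention levels shown, and the mean pooling are assumptions made only to keep the example self-contained and runnable.

import torch
import torch.nn as nn

class MultimodalAttentionSketch(nn.Module):
    """Illustrative (assumed) fusion of textual and acoustic features."""

    def __init__(self, text_dim=300, audio_dim=128, hidden_dim=256,
                 num_heads=4, num_classes=10):
        super().__init__()
        # Project both modalities into a shared hidden size.
        self.text_proj = nn.Linear(text_dim, hidden_dim)
        self.audio_proj = nn.Linear(audio_dim, hidden_dim)
        # Per-modality multi-head self-attention (first "level").
        self.text_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.audio_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        # Joint attention over the concatenated sequences (second "level").
        self.fusion_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, text_feats, audio_feats):
        # text_feats: (batch, text_len, text_dim); audio_feats: (batch, audio_len, audio_dim)
        t = self.text_proj(text_feats)
        a = self.audio_proj(audio_feats)
        t, _ = self.text_attn(t, t, t)
        a, _ = self.audio_attn(a, a, a)
        fused = torch.cat([t, a], dim=1)      # join modalities along the time axis
        fused, _ = self.fusion_attn(fused, fused, fused)
        shared = fused.mean(dim=1)            # pooled shared representation
        return self.classifier(shared)        # activity logits

# Toy usage with random tensors standing in for transcript and audio features.
model = MultimodalAttentionSketch()
logits = model(torch.randn(2, 20, 300), torch.randn(2, 50, 128))

The abstract's multi-level multi-head attention module presumably applies attention at more than one granularity; the per-modality self-attention followed by joint attention above only stands in for that idea.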
Year
2019
DOI
10.1109/ICHI.2019.8904713
Venue
2019 IEEE International Conference on Healthcare Informatics (ICHI)
Keywords
trauma activity recognition, spoken language, environmental sound, multimodal attention network
Field
Activity recognition, Communication, Psychology, Spoken language
DocType
Conference
ISSN
2575-2626
ISBN
978-1-5386-9139-7
Citations
0
PageRank
0.34
References
7
Authors
8
Name                Order    Citations    PageRank
Yue Gu              1        39           6.08
Ruiyu Zhang         2        0            0.34
Xinwei Zhao         3        0            0.34
Shuhong Chen        4        49           10.21
Jalal Abdulbaqi     5        0            1.35
Ivan Marsic         6        716          91.96
Megan Cheng         7        1            1.42
Randall S. Burd     8        122          21.53