Title
Multimodal analysis of vocal collaborative search: a public corpus and results.
Abstract
Intelligent agents have the potential to help with many tasks. Information seeking via voice-enabled search assistants is becoming very common. However, questions remain as to the extent to which these agents should sense and respond to emotional signals. We designed a set of information-seeking tasks and recruited participants to complete them with a human intermediary. In total we collected data from 22 pairs of individuals, each completing five search tasks. The participants could communicate only by voice, over a VoIP service. Using automated methods we extracted facial action, voice prosody, and linguistic features from the audio-visual recordings. We analyzed the characteristics of these interactions that correlated with successful communication and understanding between the pairs. We found that participants who were expressive in modalities absent from the communication channel (e.g., facial actions and gaze) were rated as communicating poorly and as less helpful and understanding. A way of reinstating nonverbal cues into these interactions would improve the experience, even when the tasks are purely information-seeking exercises. The dataset used for this analysis contains over 15 hours of video and audio, along with transcripts and reported ratings. It is publicly available to researchers at: http://aka.ms/MISCv1.
Year
2017
Venue
ICMI
Field
Prosody, Intelligent agent, Gaze, Computer science, Information seeking, Nonverbal communication, Multimodal analysis, Human–computer interaction, Voice over IP, Sense and respond
DocType
Conference
ISBN
978-1-4503-5543-8
Citations
0
PageRank
0.34
References
11
Authors
4
Name | Order | Citations | PageRank
Daniel J McDuff | 1 | 672 | 61.67
Paul Thomas | 2 | 705 | 51.00
Mary Czerwinski | 3 | 5028 | 421.65
Nick Craswell | 4 | 3942 | 279.60