| Abstract |
| --- |
| Many users obtain content from a screen and want to make requests of a system based on items that they have seen. Eye-gaze information is a valuable signal for speech recognition and spoken-language understanding (SLU) because it provides context for a user's next utterance: what the user says next is probably conditioned on what they have seen. This paper investigates three types of features for connecting eye-gaze information to an SLU system: lexical features and two types of eye-gaze features. These features help us to understand which object (i.e., a link) a user is referring to on a screen. We show a 17% absolute improvement in referenced-object F-score from adding eye-gaze features to conventional methods based on a lexical comparison of the spoken utterance and the text on the screen. |
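As a rough sketch of the approach the abstract describes, the Python below scores each candidate on-screen object by combining a lexical-overlap feature (utterance words vs. the object's visible text) with eye-gaze features such as fixation dwell time. This is not the authors' implementation: the `ScreenObject` fields, the Jaccard overlap, and the hand-set linear weights are all illustrative assumptions standing in for the paper's gaze heat maps and trained classifier.

```python
# Illustrative sketch only (hypothetical names, not the paper's code):
# resolve a referring expression by scoring on-screen objects with a
# lexical-overlap feature plus two eye-gaze features.

from dataclasses import dataclass

@dataclass
class ScreenObject:
    text: str            # visible text of the object (e.g., a link)
    gaze_dwell: float    # fraction of total fixation time spent on the object
    gaze_recency: float  # how recently it was fixated, scaled to [0, 1]

def lexical_overlap(utterance: str, obj_text: str) -> float:
    """Jaccard overlap between utterance words and the object's text."""
    u, t = set(utterance.lower().split()), set(obj_text.lower().split())
    return len(u & t) / len(u | t) if u | t else 0.0

def score(utt: str, obj: ScreenObject,
          w_lex: float = 1.0, w_dwell: float = 1.0, w_rec: float = 0.5) -> float:
    # Hand-set linear weights stand in for the paper's trained classifier.
    return (w_lex * lexical_overlap(utt, obj.text)
            + w_dwell * obj.gaze_dwell
            + w_rec * obj.gaze_recency)

if __name__ == "__main__":
    objects = [
        ScreenObject("cheap flights to Boston", gaze_dwell=0.6, gaze_recency=0.9),
        ScreenObject("hotels in Boston", gaze_dwell=0.1, gaze_recency=0.2),
    ]
    utterance = "open the flights link"
    referent = max(objects, key=lambda o: score(utterance, o))
    print(referent.text)  # -> cheap flights to Boston
```

Picking the top-scoring object resolves the referring expression; in the paper, the feature weighting would come from a learned classifier rather than being set by hand.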
Year | Venue | Keywords
---|---|---
2015 | 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | Spoken language understanding, referring expression resolution, eye gaze, heat maps, classification
Field | DocType | ISSN
---|---|---
Computer science, Utterance, Speech recognition, Eye tracking, Natural language processing, Artificial intelligence, Probabilistic logic, Spoken language, Performance improvement | Conference | 1520-6149
Citations | PageRank | References
---|---|---
2 | 0.37 | 11
| Authors |
| --- |
| 3 |
Name | Order | Citations | PageRank
---|---|---|---
Anna Prokofieva | 1 | 2 | 0.37
Malcolm Slaney | 2 | 1797 | 212.76
Dilek Hakkani-Tür | 3 | 282 | 17.30