Abstract |
---|
This paper addresses situated language understanding in a moving car. We propose a reference resolution method that identifies which object in the driver's surroundings a user query refers to. We investigate which target object is likely to be queried given a visual scene, and what kinds of linguistic cues users naturally provide to describe a target object in a situated environment. We propose methods that incorporate the visual saliency of the scene as a prior, and also use crowdsourced statistics of how people describe an object as a prior. We collected situated utterances from drivers using our research system, which was embedded in a real vehicle, and demonstrate that the proposed algorithms improve the target identification rate by 15.1%. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1145/2818346.2820748 | ICMI |
Keywords | Field | DocType
---|---|---
Situated dialog, In-car interaction, Visual saliency, Crowdsourcing, Multimodal interaction | Situated, Computer vision, Multimodal interaction, Computer science, Crowdsourcing, Human–computer interaction, Artificial intelligence, Dialog system, Prior probability, Language understanding, Visual saliency | Conference
Citations | PageRank | References
---|---|---
2 | 0.38 | 18
Authors |
---|
1 |
Name | Order | Citations | PageRank |
---|---|---|---|
Teruhisa Misu | 1 | 19 | 5.89 |