Multimodal Learning for Situated Language Understanding


Motivations and Objectives

Using situated dialogue in a virtual world and conversational interfaces as our settings, we have investigated the use of non-verbal modalities (e.g., eye gaze and deictic gestures) in language processing and in conversation grounding. The virtual world setting not only has important applications in education, training, and entertainment, but also provides a simplified simulation environment for studying situated language processing, with the longer-term goal of physical-world interaction.

Selected Recent Papers

Language & 3D Vision

Language & 2D Vision

Language & Eye Gaze